I woke up to an alert from our Hadoop cluster, which is managed by Cloudera Manager:
Error:
Aug 15, 2:46:01.285 PM WARN org.apache.hadoop.security.UserGroupInformation PriviledgedActionException as:hdfs (auth:SIMPLE) cause:org.apache.hadoop.hdfs.qjournal.protocol.JournalNotFormattedException: Journal Storage Directory /space1/dfs/jn/nameservice1 not formatted
Aug 15, 2:46:01.286 PM INFO org.apache.hadoop.ipc.Server IPC Server handler 3 on 8485, call org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.getEditLogManifest from 10.170.176.10:37369 Call#83 Retry#0 org.apache.hadoop.hdfs.qjournal.protocol.JournalNotFormattedException: Journal Storage Directory /space1/dfs/jn/nameservice1 not formatted
    at org.apache.hadoop.hdfs.qjournal.server.Journal.checkFormatted(Journal.java:472)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.getEditLogManifest(Journal.java:655)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.getEditLogManifest(JournalNodeRpcServer.java:186)
    ...
Solution: The JournalNode's storage directory was missing its VERSION file, the metadata file that marks the directory as formatted. Recreate the file under current/ in the storage directory and fill it with the contents of the VERSION file from another JournalNode in the same cluster (which reminds me to schedule config backups):
</div>
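<div># create the storage directory</div>
<div>sudo mkdir -p /space1/dfs/jn/nameservice1/current/
# create the VERSION file and paste in the contents copied from a healthy JournalNode
sudo vi /space1/dfs/jn/nameservice1/current/VERSION
# make the hdfs user the owner of the journal directory and everything in it
sudo chown -R hdfs:hdfs /space1/dfs/jn/nameservice1/</div>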
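Since the file is pasted in by hand, a quick sanity check before restarting the JournalNode can catch a truncated copy. The sketch below is only an illustration: `check_version_file` is a hypothetical helper name, and the list of keys is an assumption based on what JournalNode VERSION files typically contain, so compare it against the file on your healthy node first.

```shell
#!/bin/sh
# Sketch: confirm that a VERSION file exists and contains the expected keys.
# ASSUMPTION: the key list reflects a typical JournalNode VERSION file;
# adjust it to match the file on a healthy JournalNode in your cluster.
check_version_file() {
    version_file="$1"

    # The file must exist and be a regular file.
    if [ ! -f "$version_file" ]; then
        echo "missing: $version_file"
        return 1
    fi

    # Each expected key must appear at the start of a line as key=value.
    for key in namespaceID clusterID cTime storageType layoutVersion; do
        if ! grep -q "^${key}=" "$version_file"; then
            echo "missing key: $key"
            return 1
        fi
    done

    echo "ok: $version_file"
}
```

For example, `check_version_file /space1/dfs/jn/nameservice1/current/VERSION` prints an `ok:` line when every key is present and returns a non-zero status otherwise. Run it as a user that can read the journal directory.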
<div>