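While starting the HDFS processes on a pseudo-distributed cluster, I found that HDFS had entered safe mode, and the NameNode (NN) log showed "NameNode is in safe mode". Checking the health of HDFS with hdfs fsck / then revealed missing blocks. The output was as follows: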
[ruoze@ruozedata001 ~]$ hdfs fsck /
20/01/03 21:50:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://ruozedata001:50070/fsck?ugi=ruoze&path=%2F
FSCK started by ruoze (auth:SIMPLE) from /192.168.0.50 for path / at Fri Jan 03 21:50:56 CST 2020
.
/ruozedata/rz.log: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741825
/ruozedata/rz.log: MISSING 1 blocks of total size 17 B..
/tmp/hadoop-yarn/staging/history/done_intermediate/ruoze/job_1576908651279_0001-1576914012029-ruoze-select+h.id%2Ch.name%2Ch.members%5B%27brothe...null%29+%28Stag-1576914029123-1-0-SUCCEEDED-root.ruoze-1576914020583.jhist: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741839
/tmp/hadoop-yarn/staging/history/done_intermediate/ruoze/job_1576908651279_0001-1576914012029-ruoze-select+h.id%2Ch.name%2Ch.members%5B%27brothe...null%29+%28Stag-1576914029123-1-0-SUCCEEDED-root.ruoze-1576914020583.jhist: MISSING 1 blocks of total size 22732 B..
/tmp/hadoop-yarn/staging/history/done_intermediate/ruoze/job_1576908651279_0001.summary: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741838
/tmp/hadoop-yarn/staging/history/done_intermediate/ruoze/job_1576908651279_0001.summary: MISSING 1 blocks of total size 379 B..
/tmp/hadoop-yarn/staging/history/done_intermediate/ruoze/job_1576908651279_0001_conf.xml: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741840
/tmp/hadoop-yarn/staging/history/done_intermediate/ruoze/job_1576908651279_0001_conf.xml: MISSING 1 blocks of total size 276955 B.................................................................................................
.....................................................
/user/hive/warehouse/lsk_1.db/hive_map/hive_map.txt: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741826
/user/hive/warehouse/lsk_1.db/hive_map/hive_map.txt: MISSING 1 blocks of total size 222 B..
/user/hive/warehouse/lsk_1.db/hive_struct/hive_struct.txt: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741841
/user/hive/warehouse/lsk_1.db/hive_struct/hive_struct.txt: MISSING 1 blocks of total size 88 B............................................Status: CORRUPT
Total size: 11800033 B
Total dirs: 62
Total files: 197
Total symlinks: 0
Total blocks (validated): 170 (avg. block size 69411 B)
********************************
UNDER MIN REPL'D BLOCKS: 6 (3.5294118 %)
dfs.namenode.replication.min: 1
CORRUPT FILES: 6
MISSING BLOCKS: 6
MISSING SIZE: 300393 B
CORRUPT BLOCKS: 6
********************************
Minimally replicated blocks: 164 (96.47059 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 1
Average block replication: 0.9647059
Corrupt blocks: 6
Missing replicas: 0 (0.0 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Fri Jan 03 21:50:56 CST 2020 in 55 milliseconds

The filesystem under path '/' is CORRUPT
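With six corrupt files scattered through the output, it can help to pull out just the affected paths and block IDs. The following is a minimal sketch, not part of the original troubleshooting: the function name `corrupt_blocks` and the regular expression are illustrative assumptions, written against the line format seen in the log above, and it assumes paths contain no spaces.

```python
import re

# Matches per-file lines of the form seen in the fsck output above, e.g.
#   /ruozedata/rz.log: CORRUPT blockpool BP-... block blk_1073741825
# Assumption: file paths contain no whitespace.
CORRUPT_LINE = re.compile(
    r"(?P<path>/\S+): CORRUPT blockpool (?P<pool>\S+) block (?P<block>blk_\d+)"
)


def corrupt_blocks(fsck_output):
    """Map each corrupt file path to the list of block IDs reported for it."""
    found = {}
    for match in CORRUPT_LINE.finditer(fsck_output):
        found.setdefault(match.group("path"), []).append(match.group("block"))
    return found


# Two of the six corrupt files, copied from the log above.
sample = """\
/ruozedata/rz.log: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741825
/ruozedata/rz.log: MISSING 1 blocks of total size 17 B..
/user/hive/warehouse/lsk_1.db/hive_map/hive_map.txt: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741826
/user/hive/warehouse/lsk_1.db/hive_map/hive_map.txt: MISSING 1 blocks of total size 222 B..
"""

for path, blocks in corrupt_blocks(sample).items():
    print(path, blocks)
```

In practice the text would come from a saved copy of the hdfs fsck / output rather than an inline string.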
Verification: healthy state
-bash-4.2$ hdfs fsck /
Connecting to namenode via http://yws76:50070/fsck?ugi=hdfs&path=%2F
FSCK started by hdfs (auth:SIMPLE) from /192.168.0.76 for path / at Sun Mar 03 14:44:44 CST 2019
...............................................................................Status: HEALTHY
Total size: 50194618424 B
Total dirs: 354
Total files: 1079
Total symlinks: 0
Total blocks (validated): 992 (avg. block size 50599413 B)
Minimally replicated blocks: 992 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 3
Number of racks: 1
FSCK ended at Sun Mar 03 14:44:45 CST 2019 in 76 milliseconds
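To compare a healthy run against a damaged one without reading the whole report, the status and a few counters can be extracted programmatically. This is a rough sketch under stated assumptions: `fsck_summary` is a hypothetical helper, and the field names it looks for are the ones that appear in the output above.

```python
import re


def fsck_summary(fsck_output):
    """Return the overall status and a few integer counters from fsck output."""
    status = re.search(r"Status:\s*(\w+)", fsck_output)
    summary = {"Status": status.group(1) if status else None}
    # Field names are taken from the fsck report shown above.
    for key in ("Total blocks (validated)", "Corrupt blocks", "Under-replicated blocks"):
        match = re.search(re.escape(key) + r":\s*(\d+)", fsck_output)
        if match:
            summary[key] = int(match.group(1))
    return summary


# Excerpt of the healthy report above.
healthy = """\
...Status: HEALTHY
Total blocks (validated): 992 (avg. block size 50599413 B)
Under-replicated blocks: 0 (0.0 %)
Corrupt blocks: 0
"""

print(fsck_summary(healthy))
```

A check such as `fsck_summary(text)["Status"] == "HEALTHY"` could then gate a monitoring script, although the last line of the fsck report already states the overall result.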
Restart HDFS directly to simulate the damage, then check with fsck:
-bash-4.2$ hdfs fsck /
Connecting to namenode via http://yws77:50070/fsck?ugi=hdfs&path=%2F
FSCK started by hdfs (auth:SIMPLE) from /192.168.0.76 for path / at Sun Mar 03 16:02:04 CST 2019
.
/blockrecover/ruozedata.md: Under replicated BP-1513979236-192.168.0.76-1514982530341:blk_1075808214_2068515. Target Replicas is 3 but found 2 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).
...............................................................................Status: HEALTHY
Total size: 50194618424 B
Total dirs: 354
Total files: 1079
Total symlinks: 0
Total blocks (validated): 992 (avg. block size 50599413 B)
Minimally replicated blocks: 992 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 1 (0.10080645 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.998992
Corrupt blocks: 0
Missing replicas: 1 (0.033602152 %)
Number of data-nodes: 3
Number of racks: 1
FSCK ended at Sun Mar 03 16:02:04 CST 2019 in 148 milliseconds
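The percentages in this summary can be cross-checked by hand. The sketch below is an illustration only, and assumes that the missing-replica percentage is taken relative to the expected replica count (blocks × replication factor), which is consistent with the figures reported above.

```python
# Values read from the fsck summary above.
total_blocks = 992
replication_factor = 3
under_replicated = 1
missing_replicas = 1

# Assumption: expected replicas = validated blocks x default replication factor.
expected_replicas = total_blocks * replication_factor
live_replicas = expected_replicas - missing_replicas

under_replicated_pct = 100 * under_replicated / total_blocks
missing_replicas_pct = 100 * missing_replicas / expected_replicas
avg_replication = live_replicas / total_blocks

print(f"Under-replicated blocks:   {under_replicated_pct:.6f} %")
print(f"Missing replicas:          {missing_replicas_pct:.6f} %")
print(f"Average block replication: {avg_replication:.6f}")
```

The results agree with the reported 0.10080645 %, 0.033602152 % and 2.998992 up to rounding. The same arithmetic explains the corrupt run earlier: 6 of 170 blocks is about 3.529 %, and 164 live replicas over 170 blocks gives an average replication of about 0.9647.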