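Recovering Corrupt and Missing HDFS Blocks

Goal

  1. Document how to handle corrupt and missing HDFS blocks in production

Background

While starting the HDFS daemons on a pseudo-distributed cluster, I found that HDFS had entered safe mode, and the NN log showed "NameNode is in safe mode". Checking the health of HDFS with hdfs fsck / then revealed missing blocks. The output was: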

[ruoze@ruozedata001 ~]$ hdfs fsck /
20/01/03 21:50:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://ruozedata001:50070/fsck?ugi=ruoze&path=%2F
FSCK started by ruoze (auth:SIMPLE) from /192.168.0.50 for path / at Fri Jan 03 21:50:56 CST 2020
.
/ruozedata/rz.log: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741825

/ruozedata/rz.log: MISSING 1 blocks of total size 17 B..
/tmp/hadoop-yarn/staging/history/done_intermediate/ruoze/job_1576908651279_0001-1576914012029-ruoze-select+h.id%2Ch.name%2Ch.members%5B%27brothe...null%29+%28Stag-1576914029123-1-0-SUCCEEDED-root.ruoze-1576914020583.jhist: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741839

/tmp/hadoop-yarn/staging/history/done_intermediate/ruoze/job_1576908651279_0001-1576914012029-ruoze-select+h.id%2Ch.name%2Ch.members%5B%27brothe...null%29+%28Stag-1576914029123-1-0-SUCCEEDED-root.ruoze-1576914020583.jhist: MISSING 1 blocks of total size 22732 B..
/tmp/hadoop-yarn/staging/history/done_intermediate/ruoze/job_1576908651279_0001.summary: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741838

/tmp/hadoop-yarn/staging/history/done_intermediate/ruoze/job_1576908651279_0001.summary: MISSING 1 blocks of total size 379 B..
/tmp/hadoop-yarn/staging/history/done_intermediate/ruoze/job_1576908651279_0001_conf.xml: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741840

/tmp/hadoop-yarn/staging/history/done_intermediate/ruoze/job_1576908651279_0001_conf.xml: MISSING 1 blocks of total size 276955 B.................................................................................................
.....................................................
/user/hive/warehouse/lsk_1.db/hive_map/hive_map.txt: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741826

/user/hive/warehouse/lsk_1.db/hive_map/hive_map.txt: MISSING 1 blocks of total size 222 B..
/user/hive/warehouse/lsk_1.db/hive_struct/hive_struct.txt: CORRUPT blockpool BP-1773600700-192.168.0.50-1576600441225 block blk_1073741841

/user/hive/warehouse/lsk_1.db/hive_struct/hive_struct.txt: MISSING 1 blocks of total size 88 B............................................Status: CORRUPT
Total size: 11800033 B
Total dirs: 62
Total files: 197
Total symlinks: 0
Total blocks (validated): 170 (avg. block size 69411 B)
********************************
UNDER MIN REPL'D BLOCKS: 6 (3.5294118 %)
dfs.namenode.replication.min: 1
CORRUPT FILES: 6
MISSING BLOCKS: 6
MISSING SIZE: 300393 B
CORRUPT BLOCKS: 6
********************************
Minimally replicated blocks: 164 (96.47059 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 1
Average block replication: 0.9647059
Corrupt blocks: 6
Missing replicas: 0 (0.0 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Fri Jan 03 21:50:56 CST 2020 in 55 milliseconds

The filesystem under path '/' is CORRUPT
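With a longer report it helps to reduce the fsck output to just the affected paths. The following is a minimal sketch, assuming the "<path>: MISSING <n> blocks of total size ..." line shape seen in the output above; the function name missing_files is made up for this example.

```shell
# Print the unique HDFS paths that fsck reports as having MISSING blocks.
# Reads fsck output on stdin and expects lines shaped like
#   <path>: MISSING <n> blocks of total size <bytes> B..
missing_files() {
  grep -E ': MISSING [0-9]+ blocks of total size' \
    | sed -E 's/: MISSING [0-9]+ blocks of total size.*$//' \
    | sort -u
}

# Typical use on a live cluster (not run here):
#   hdfs fsck / | missing_files
#   hdfs fsck / -list-corruptfileblocks   # fsck's own listing of corrupt files
```

The function only parses text, so it can be tried against a saved copy of the fsck report before being pointed at a live cluster.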

Block repair

Manual repair

Repair command: hdfs debug recoverLease -path <path> [-retries <num-retries>]

[ruoze@ruozedata001 ~]$ hdfs debug recoverLease -path /user/hive/warehouse/lsk_1.db/hive_map/hive_map.txt -retries 10
recoverLease SUCCEEDED on /user/hive/warehouse/lsk_1.db/hive_map/hive_map.txt
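When several files are affected, the same command can be run once per path. This is only a sketch: the function name recover_leases and its optional retry argument are assumptions for this example, with 10 as the default to match the run above.

```shell
# Run "hdfs debug recoverLease" for every HDFS path read from stdin,
# one path per line. The first argument is the retry count (default 10).
recover_leases() {
  local retries="${1:-10}" path
  while IFS= read -r path; do
    [ -n "$path" ] || continue    # skip blank lines
    # </dev/null keeps the hdfs command from consuming the remaining paths
    hdfs debug recoverLease -path "$path" -retries "$retries" </dev/null
  done
}

# Typical use (not run here):
#   printf '%s\n' /user/hive/warehouse/lsk_1.db/hive_map/hive_map.txt | recover_leases 10
```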

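Automatic repair

After a block is damaged, the DN does not notice it until it runs its next directory scan;
the directory scan interval is 6h (`dfs.datanode.directoryscan.interval : 21600`, in seconds).

The block is not recovered until the DN sends its next block report to the NN;
the block report interval is also 6h (`dfs.blockreport.intervalMsec : 21600000`, in milliseconds).

Only when the NN receives the block report does it start the recovery.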
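The two settings use different units, which is easy to misread. As a quick check of the arithmetic, the sketch below converts both quoted values to hours; the helper names are made up for this example.

```shell
# dfs.datanode.directoryscan.interval is given in seconds,
# dfs.blockreport.intervalMsec in milliseconds.
seconds_to_hours() { echo $(( $1 / 3600 )); }
millis_to_hours()  { echo $(( $1 / 1000 / 3600 )); }

seconds_to_hours 21600       # directory scan interval -> 6
millis_to_hours 21600000     # block report interval   -> 6

# On a live cluster the effective values can be read with (not run here):
#   hdfs getconf -confKey dfs.datanode.directoryscan.interval
#   hdfs getconf -confKey dfs.blockreport.intervalMsec
```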

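Postscript

After the manual repair, checking the HDFS health again showed that the blocks were still Missing. I had deployed a pseudo-distributed cluster with a replication factor of 1, so there was no healthy replica to copy from and the Missing blocks could not be recovered.

Test on a multi-node cluster

File: ruozedata.md

Upload: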
-bash-4.2$ hdfs dfs -mkdir /blockrecover
-bash-4.2$ echo "www.ruozedata.com" > ruozedata.md

-bash-4.2$ hdfs dfs -put ruozedata.md /blockrecover
-bash-4.2$ hdfs dfs -ls /blockrecover
Found 1 items
-rw-r--r-- 3 hdfs supergroup 18 2019-03-03 14:42 /blockrecover/ruozedata.md
-bash-4.2$

Verify the health status:
-bash-4.2$ hdfs fsck /
Connecting to namenode via http://yws76:50070/fsck?ugi=hdfs&path=%2F
FSCK started by hdfs (auth:SIMPLE) from /192.168.0.76 for path / at Sun Mar 03 14:44:44 CST 2019
...............................................................................Status: HEALTHY
Total size: 50194618424 B
Total dirs: 354
Total files: 1079
Total symlinks: 0
Total blocks (validated): 992 (avg. block size 50599413 B)
Minimally replicated blocks: 992 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 3
Number of racks: 1
FSCK ended at Sun Mar 03 14:44:45 CST 2019 in 76 milliseconds

The filesystem under path '/' is HEALTHY

Delete one replica of one of the file's blocks directly on a DN (replication factor 3)

Delete the block file and its meta file:
[root@yws87 subdir135]# rm -rf blk_1075808214 blk_1075808214_2068515.meta
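The rm above was run from inside the subdir135 directory that holds the replica. When that directory is not known in advance, the block file and its meta file can be searched for by block ID. This is a sketch only: the function name find_block_files is made up, and the data directory argument is whatever dfs.datanode.data.dir points to on that DN.

```shell
# List the block file and its .meta companion for a block ID under a
# DataNode data directory. Both arguments are supplied by the caller.
find_block_files() {
  local data_dir="$1" block_id="$2"
  find "$data_dir" -type f \( -name "$block_id" -o -name "${block_id}_*.meta" \) | sort
}

# Typical use (not run here; /path/to/dfs/data is a placeholder):
#   hdfs fsck /blockrecover/ruozedata.md -files -blocks -locations   # which DNs hold the block
#   find_block_files /path/to/dfs/data blk_1075808214
```

Searching by exact block ID plus the "_*.meta" suffix avoids matching other blocks that merely share a prefix.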

Restart HDFS right away to simulate the damage, then check with fsck:
-bash-4.2$ hdfs fsck /
Connecting to namenode via http://yws77:50070/fsck?ugi=hdfs&path=%2F
FSCK started by hdfs (auth:SIMPLE) from /192.168.0.76 for path / at Sun Mar 03 16:02:04 CST 2019
.
/blockrecover/ruozedata.md: Under replicated BP-1513979236-192.168.0.76-1514982530341:blk_1075808214_2068515. Target Replicas is 3 but found 2 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).
...............................................................................Status: HEALTHY
Total size: 50194618424 B
Total dirs: 354
Total files: 1079
Total symlinks: 0
Total blocks (validated): 992 (avg. block size 50599413 B)
Minimally replicated blocks: 992 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 1 (0.10080645 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.998992
Corrupt blocks: 0
Missing replicas: 1 (0.033602152 %)
Number of data-nodes: 3
Number of racks: 1
FSCK ended at Sun Mar 03 16:02:04 CST 2019 in 148 milliseconds


The filesystem under path '/' is HEALTHY

Manual repair

Repair command:
-bash-4.2$ hdfs debug recoverLease -path /blockrecover/ruozedata.md -retries 10
recoverLease SUCCEEDED on /blockrecover/ruozedata.md
-bash-4.2$

Check directly on the DN whether the block file and the meta file have been restored:
- Before recovery
[root@yws87 subdir135]# ll
-rw-r--r-- 1 hdfs hdfs 56 Mar 3 14:28 blk_1075808202
-rw-r--r-- 1 hdfs hdfs 11 Mar 3 14:28 blk_1075808202_2068503.meta
- After recovery
[root@yws87 subdir135]# ll
-rw-r--r-- 1 hdfs hdfs 56 Mar 3 14:28 blk_1075808202
-rw-r--r-- 1 hdfs hdfs 11 Mar 3 14:28 blk_1075808202_2068503.meta
-rw-r--r-- 1 hdfs hdfs 18 Mar 3 15:23 blk_1075808214
-rw-r--r-- 1 hdfs hdfs 11 Mar 3 15:23 blk_1075808214_2068515.meta

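Summary

Note: the hdfs debug command only helps when the cluster keeps multiple replicas.
In production I generally prefer the manual repair, but it requires deleting the damaged block by hand first.
Remember: what you delete is the damaged block file and its meta file on the DN, not the HDFS file. Alternatively, you can download the file with get, delete it from HDFS, and then upload it again.
Remember: do not run hdfs fsck / -delete. It deletes the corrupt files themselves, so the data is lost. It is only appropriate when losing the data does not matter, or when you are confident the data can be backfilled into HDFS from another source.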
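The download, delete, and re-upload alternative mentioned above could look roughly like the sketch below. The function name reupload_file and the staging directory default are assumptions for this example. It stops at the first failing step, so the HDFS copy is only removed once the download has succeeded.

```shell
# Replace an HDFS file by downloading it, removing it, and uploading it again.
# $1: HDFS path; $2: local staging directory (hypothetical default below).
reupload_file() {
  local hdfs_path="$1" staging_dir="${2:-/tmp/hdfs-reupload}"
  local local_copy="$staging_dir/$(basename "$hdfs_path")"
  mkdir -p "$staging_dir"                  || return 1
  hdfs dfs -get "$hdfs_path" "$local_copy" || return 1   # needs at least one readable replica
  hdfs dfs -rm "$hdfs_path"                || return 1
  hdfs dfs -put "$local_copy" "$hdfs_path"
}

# Typical use (not run here):
#   reupload_file /blockrecover/ruozedata.md
```

It is worth comparing the local copy against a known-good size or checksum before the delete step when such a reference exists.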

Author: Red
Link: http://yoursite.com/2018/09/23/archives/hdfs/hdfs-3/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.