1. Discovering that blocks are missing
2. Detecting the missing blocks
(1) hdfs fsck -list-corruptfileblocks
root@kylin1:~# hdfs fsck -list-corruptfileblocks
18/03/08 09:52:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://kylin2:50070/fsck?ugi=root&listcorruptfileblocks=1&path=/
The list of corrupt files under path '/' are:
blk_1073741825 /hbase/hbase.version
blk_1073741826 /hbase/hbase.id
blk_1073741827 /hbase/data/hbase/meta/1588230740/.regioninfo
blk_1073741829 /hbase/data/hbase/meta/.tabledesc/.tableinfo.0000000001
blk_1073741834 /hbase/data/hbase/namespace/.tabledesc/.tableinfo.0000000001
blk_1073741835 /hbase/data/hbase/namespace/68b26ebda68daa41d66237c2da92f90b/.regioninfo
blk_1073741837 /hbase/data/hbase/meta/1588230740/info/82e4549009424f9cac91412a97ef242a
blk_1073741843 /hbase/data/hbase/namespace/68b26ebda68daa41d66237c2da92f90b/info/d2d30a99382c4eb8a220c81fe8cb906c
blk_1073741846 /hbase/data/hbase/meta/1588230740/info/5d7965bf2c914a99864192b2ef00665c
The filesystem under path '/' has 9 CORRUPT files
root@kylin1:~#
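To act on this list in a script, the file paths can be separated from the block IDs. The following is a minimal sketch, assuming the report keeps the layout shown above (a `blk_<id>` token, whitespace, then the path). The function name `corrupt_paths` is hypothetical and not part of HDFS.

```shell
# Hypothetical helper: read `hdfs fsck -list-corruptfileblocks` output on stdin
# and print only the paths of the affected files, one per line.
corrupt_paths() {
  # Keep lines that start with a block ID, then strip the ID and the separator.
  awk '/^blk_/ { sub(/^blk_[0-9_]+[ \t]+/, ""); print }'
}

# Usage on a live cluster (not executed here):
#   hdfs fsck -list-corruptfileblocks | corrupt_paths
```

The header and summary lines of the report do not start with `blk_`, so they are dropped without any extra filtering.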
(2) hdfs fsck / | egrep -v '^\.+$' | grep -v eplica
root@kylin1:~# hdfs fsck / | egrep -v '^\.+$' | grep -v eplica
18/03/08 09:53:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://kylin2:50070/fsck?ugi=root&path=/
FSCK started by root (auth:SIMPLE) from /192.168.1.161 for path / at Thu Mar 08 09:53:35 CST 2018
/hbase/data/hbase/meta/.tabledesc/.tableinfo.0000000001: CORRUPT blockpool BP-433695712-192.168.1.162-1520298438803 block blk_1073741829
/hbase/data/hbase/meta/.tabledesc/.tableinfo.0000000001: MISSING 1 blocks of total size 372 B..
/hbase/data/hbase/meta/1588230740/.regioninfo: CORRUPT blockpool BP-433695712-192.168.1.162-1520298438803 block blk_1073741827
/hbase/data/hbase/meta/1588230740/.regioninfo: MISSING 1 blocks of total size 32 B..
/hbase/data/hbase/meta/1588230740/info/5d7965bf2c914a99864192b2ef00665c: CORRUPT blockpool BP-433695712-192.168.1.162-1520298438803 block blk_1073741846
/hbase/data/hbase/meta/1588230740/info/5d7965bf2c914a99864192b2ef00665c: MISSING 1 blocks of total size 5519 B..
/hbase/data/hbase/meta/1588230740/info/82e4549009424f9cac91412a97ef242a: CORRUPT blockpool BP-433695712-192.168.1.162-1520298438803 block blk_1073741837
/hbase/data/hbase/meta/1588230740/info/82e4549009424f9cac91412a97ef242a: MISSING 1 blocks of total size 5317 B...
/hbase/data/hbase/namespace/.tabledesc/.tableinfo.0000000001: CORRUPT blockpool BP-433695712-192.168.1.162-1520298438803 block blk_1073741834
/hbase/data/hbase/namespace/.tabledesc/.tableinfo.0000000001: MISSING 1 blocks of total size 312 B..
/hbase/data/hbase/namespace/68b26ebda68daa41d66237c2da92f90b/.regioninfo: CORRUPT blockpool BP-433695712-192.168.1.162-1520298438803 block blk_1073741835
/hbase/data/hbase/namespace/68b26ebda68daa41d66237c2da92f90b/.regioninfo: MISSING 1 blocks of total size 42 B..
/hbase/data/hbase/namespace/68b26ebda68daa41d66237c2da92f90b/info/d2d30a99382c4eb8a220c81fe8cb906c: CORRUPT blockpool BP-433695712-192.168.1.162-1520298438803 block blk_1073741843
/hbase/data/hbase/namespace/68b26ebda68daa41d66237c2da92f90b/info/d2d30a99382c4eb8a220c81fe8cb906c: MISSING 1 blocks of total size 5023 B...
/hbase/hbase.id: CORRUPT blockpool BP-433695712-192.168.1.162-1520298438803 block blk_1073741826
/hbase/hbase.id: MISSING 1 blocks of total size 42 B..
/hbase/hbase.version: CORRUPT blockpool BP-433695712-192.168.1.162-1520298438803 block blk_1073741825
/hbase/hbase.version: MISSING 1 blocks of total size 7 B.Status: CORRUPT
Total size: 16666 B (Total open files size: 332 B)
Total dirs: 33
Total files: 12
Total symlinks: 0 (Files currently being written: 4)
Total blocks (validated): 9 (avg. block size 1851 B) (Total open file blocks (not validated): 4)
********************************
UNDER MIN REPL'D BLOCKS: 9 (100.0 %)
CORRUPT FILES: 9
MISSING BLOCKS: 9
MISSING SIZE: 16666 B
CORRUPT BLOCKS: 9
********************************
Corrupt blocks: 9
Number of data-nodes: 4
Number of racks: 1
FSCK ended at Thu Mar 08 09:53:35 CST 2018 in 6 milliseconds
The filesystem under path '/' is CORRUPT
root@kylin1:~#
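The per-file `MISSING` lines in this report can be cross-checked against the `MISSING SIZE` total. The following is a rough sketch, assuming each affected file has a line of the form `<path>: MISSING <n> blocks of total size <bytes> B` as in the output above. The function name `missing_summary` is hypothetical.

```shell
# Hypothetical helper: read full `hdfs fsck <path>` output on stdin and print
# "<bytes><TAB><path>" for every file with missing blocks, then a total line.
missing_summary() {
  awk '
    /: MISSING [0-9]+ blocks of total size [0-9]+ B/ {
      path = $0; sub(/: MISSING .*/, "", path)       # text before ": MISSING"
      size = $0; sub(/.* total size /, "", size)     # drop everything up to the size
      sub(/ B.*/, "", size)                          # drop the unit and trailing dots
      total += size; files++
      printf "%s\t%s\n", size, path
    }
    END { printf "total\t%d B in %d files\n", total, files }
  '
}

# Usage on a live cluster (not executed here):
#   hdfs fsck / | missing_summary
```

If the total printed by the helper differs from the `MISSING SIZE` value in the report, the line format assumption above probably does not hold for that Hadoop version.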
(3) Inspect one of the files listed above
root@kylin1:~# hdfs fsck /hbase/hbase.version -locations -blocks -files
18/03/08 10:01:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://kylin2:50070/fsck?ugi=root&locations=1&blocks=1&files=1&path=/hbase/hbase.version
FSCK started by root (auth:SIMPLE) from /192.168.1.161 for path /hbase/hbase.version at Thu Mar 08 10:01:42 CST 2018
/hbase/hbase.version 7 bytes, 1 block(s):
/hbase/hbase.version: CORRUPT blockpool BP-433695712-192.168.1.162-1520298438803 block blk_1073741825
MISSING 1 blocks of total size 7 B
0. BP-433695712-192.168.1.162-1520298438803:blk_1073741825_1001 len=7 MISSING!
Status: CORRUPT
Total size: 7 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 7 B)
********************************
UNDER MIN REPL'D BLOCKS: 1 (100.0 %)
dfs.namenode.replication.min: 1
CORRUPT FILES: 1
MISSING BLOCKS: 1
MISSING SIZE: 7 B
CORRUPT BLOCKS: 1
********************************
Minimally replicated blocks: 0 (0.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 0.0
Corrupt blocks: 1
Missing replicas: 0
Number of data-nodes: 4
Number of racks: 1
FSCK ended at Thu Mar 08 10:01:42 CST 2018 in 1 milliseconds
The filesystem under path '/hbase/hbase.version' is CORRUPT
root@kylin1:~#
The output points to node 192.168.1.162 (the address embedded in the block pool ID) as the place where the blocks were lost.
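The address can also be extracted from the fsck output mechanically. This is a small sketch, assuming the block pool ID keeps the `BP-<number>-<IP>-<timestamp>` shape seen above and that the address is IPv4. The function name `blockpool_ips` is hypothetical.

```shell
# Hypothetical helper: read fsck output on stdin and print each distinct
# IPv4 address found inside a block pool ID (BP-<number>-<IP>-<timestamp>).
blockpool_ips() {
  # Capture the dotted address between the second and third hyphen of the ID,
  # then collapse repeated hits into one line per address.
  sed -n 's/.*BP-[0-9]*-\([0-9.]*\)-[0-9]*.*/\1/p' | sort -u
}

# Usage on a live cluster (not executed here):
#   hdfs fsck /hbase/hbase.version -locations -blocks -files | blockpool_ips
```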
(4) Having identified the machine, log in to it and check its logs.
root@kylin2:/var/log/hdfs# vi hadoop-hdfs-datanode-kylin2.log
The log reveals the cause: this node had been formatted.
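Instead of paging through the whole file in vi, the log can be narrowed to its warning and error entries first. The following is a sketch only, assuming the log uses the common `<date> <time> <LEVEL> <class>: <message>` line layout; if the log4j pattern on the cluster is configured differently, the expression needs adjusting. The function name `log_problems` is hypothetical.

```shell
# Hypothetical helper: print only WARN, ERROR and FATAL entries from a Hadoop
# daemon log. Reads the files given as arguments, or stdin when none are given.
log_problems() {
  # Assumed line layout: "YYYY-MM-DD HH:MM:SS,mmm LEVEL class: message"
  grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:,]+ (WARN|ERROR|FATAL) ' "$@"
}

# Usage on the affected node (not executed here):
#   log_problems /var/log/hdfs/hadoop-hdfs-datanode-kylin2.log | less
```

Stack-trace continuation lines do not start with a timestamp, so they are skipped; open the file at the reported timestamp to read them in full.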