
HDFS How to Fix Corrupt or Missing Blocks

Blocks with corrupt replicas - These blocks have at least one live, intact replica and at least one corrupt replica.
Missing blocks - These blocks are gone. They have no replicas.

You can run this to get statistics on missing or corrupt blocks:

hdfs dfsadmin -report
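The report is long, so it can help to keep only the block-health lines. Here is a minimal sketch. The function name block_health_summary is made up for this example, and it assumes the report labels these lines "Missing blocks", "Blocks with corrupt replicas", and "Under replicated blocks", which may differ between Hadoop versions.

```shell
# Keep only the block-health summary lines from a dfsadmin report read on stdin.
# Assumption: the labels below match your Hadoop version's report output.
block_health_summary() {
  grep -iE 'missing blocks|blocks with corrupt replicas|under replicated blocks'
}

# Usage (needs a running cluster):
#   hdfs dfsadmin -report | block_health_summary
```

Because the function reads from stdin, you can also run it against a saved copy of the report.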

List the files that have missing or corrupt blocks:

hdfs fsck / -list-corruptfileblocks
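If you only want the affected file paths, you can pull them out of that listing. This is a rough sketch. It assumes each corrupt block is printed as a block ID starting with "blk_", then a tab, then the file path, so check this against your own output first. The function name corrupt_file_paths is made up for this example.

```shell
# Print the unique file paths from `-list-corruptfileblocks` output read on stdin.
# Assumption: data lines look like "blk_<id><TAB><path>"; other lines are ignored.
corrupt_file_paths() {
  awk -F'\t' '/^blk_/ { print $2 }' | sort -u
}

# Usage (needs a running cluster):
#   hdfs fsck / -list-corruptfileblocks | corrupt_file_paths
```

A file with several corrupt blocks shows up once per block in the raw listing, which is why the paths are deduplicated.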

HDFS How to Fix Corrupt or Missing Blocks - Another Method

If you like to live life on the edge, you can let the fsck command delete the corrupted files for you. Note that this removes the affected files themselves from HDFS, not just the bad blocks.

hdfs fsck / -delete
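If that feels a little too close to the edge, one option is to look at the list first and only delete after typing a confirmation. The sketch below does this. The helper confirm_and_delete is hypothetical and not part of Hadoop; it just wraps the two fsck commands shown on this page.

```shell
# Show the corrupt file list, then delete only after an explicit "y".
# confirm_and_delete is a made-up helper name for this example.
confirm_and_delete() {
  path="${1:-/}"
  hdfs fsck "$path" -list-corruptfileblocks
  printf 'Delete the corrupted files listed above under %s? [y/N] ' "$path"
  read -r answer
  if [ "$answer" = "y" ]; then
    hdfs fsck "$path" -delete
  else
    echo "Aborted; nothing deleted."
  fi
}

# Usage (needs a running cluster):
#   confirm_and_delete /
```

fsck also has a -move option, which moves corrupted files to /lost+found instead of deleting them.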

Fix Under Replicated Blocks

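You might find it useful to raise the replication factor of any files that have under-replicated blocks. The commands below collect the affected file paths from fsck and then set each file's replication factor to 3.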

hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' > /tmp/under_replicated_files

while IFS= read -r hdfsfile; do echo "Fixing $hdfsfile:"; hadoop fs -setrep 3 "$hdfsfile" < /dev/null; done < /tmp/under_replicated_files
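The same idea can be wrapped in a couple of small functions that skip the temp file, deduplicate paths, and take the target replication factor as an argument. Treat this as a sketch. The function names are made up, and it assumes fsck prints under-replicated blocks as "<path>: Under replicated ...", so paths containing a colon would be cut short.

```shell
# Extract unique file paths from fsck output read on stdin.
# Assumption: lines look like "<path>:  Under replicated <block info>".
under_replicated_paths() {
  grep 'Under replicated' | awk -F':' '{ print $1 }' | sort -u
}

# Set the replication factor (default 3) on every path found on stdin.
# -w makes setrep wait until replication completes, which can take a while.
fix_under_replicated() {
  target="${1:-3}"
  under_replicated_paths | while IFS= read -r hdfsfile; do
    hadoop fs -setrep -w "$target" "$hdfsfile" < /dev/null
  done
}

# Usage (needs a running cluster):
#   hdfs fsck / | fix_under_replicated 3
```

Drop the -w flag if you would rather queue the changes and check on them later with another fsck run.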
