Introduction to hbase hbck (1)

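hbck (HBaseFsck) is a tool for checking the consistency of an HBase cluster.
The name comes from fsck, short for "file system consistency check".
hbase hbck verifies the contents of the .META. table, the mapping between Regions and RegionServers,
and the state of the data on HDFS, to confirm that the data in HBase is consistent.
Functionally, hbase hbck has two main jobs:
repairing the mapping between Regions and HDFS (by merging or creating Regions),
and repairing the mapping between Regions and the .META. table.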

  The general repair strategy works in these steps.
  1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  2) Repair Region Consistency with META and assignments


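When the tool runs, it compares the in-memory state of the data on the Master and RegionServers
with the state of the data on HDFS and, when run with a repair option, fixes the inconsistencies it finds.
For example, suppose we delete a Region from the .META. table.
Because the HFiles that Region is responsible for are not removed along with it,
HDFS and .META. become inconsistent.
In that case, "hbase hbck PlatformData -repair" can be used to fix the table.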
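
The inconsistency in the example above can be pictured as a set difference. The following is only a toy sketch, not how hbck is implemented: the region names and the function compare_meta_and_hdfs are hypothetical, and real hbck reads .META. rows and region directories rather than plain lists.

```python
def compare_meta_and_hdfs(meta_regions, hdfs_region_dirs):
    """Toy model: report regions that exist on only one side.

    Returns (only_in_hdfs, only_in_meta) as sorted lists.
    """
    meta = set(meta_regions)
    hdfs = set(hdfs_region_dirs)
    return sorted(hdfs - meta), sorted(meta - hdfs)


# Hypothetical region names. The row for region-b was deleted from .META.,
# but its directory (and HFiles) are still present on HDFS.
meta_regions = ["region-a", "region-c"]
hdfs_region_dirs = ["region-a", "region-b", "region-c"]

only_in_hdfs, only_in_meta = compare_meta_and_hdfs(meta_regions, hdfs_region_dirs)
print("on HDFS but not in .META.:", only_in_hdfs)
print("in .META. but not on HDFS:", only_in_meta)
```

Under this simplified view, a region that shows up only on the HDFS side is the kind of mismatch the -repair run above is meant to resolve.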

In HBase 0.94, hbck can repair HBase online,
and can automatically patch holes in a table that were caused by operator error.
Below is the description of what hbck (0.94) can do:

The updated version now has been augmented with the ability repair region consistency problems in .META. (by patching holes), repair overlapping regions (via merging), patch region holes (by fabricating new regions), and detecting and adopting orphaned regions (by fabricating new .regioninfo file if it is missing in a region's dir). 
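
To make "holes" and "overlapping regions" concrete, here is a minimal sketch that scans a table's regions in key order. It is an illustration under simplifying assumptions, not hbck's actual algorithm: keys are plain non-empty strings, each region covers the half-open range [start, end), and the function find_holes_and_overlaps and the sample keys are hypothetical. Real HBase keys are byte arrays, and the first and last regions use empty keys as open boundaries.

```python
def find_holes_and_overlaps(regions):
    """Toy model: regions is a list of (start_key, end_key) tuples.

    Returns (holes, overlaps), each a list of (from_key, to_key) ranges.
    """
    holes, overlaps = [], []
    ordered = sorted(regions)
    if not ordered:
        return holes, overlaps

    covered_end = ordered[0][1]  # furthest key covered so far
    for start, end in ordered[1:]:
        if start > covered_end:
            # No region covers [covered_end, start): a hole.
            holes.append((covered_end, start))
        elif start < covered_end:
            # This region starts inside already covered key space: an overlap.
            overlaps.append((start, min(end, covered_end)))
        covered_end = max(covered_end, end)
    return holes, overlaps


# Hypothetical table layout: a gap between "f" and "h",
# and two regions that both cover "k" to "m".
regions = [("a", "c"), ("c", "f"), ("h", "m"), ("k", "p")]
holes, overlaps = find_holes_and_overlaps(regions)
print("holes:", holes)
print("overlaps:", overlaps)
```

In the terms of the quoted description, a hole is patched by fabricating a new region for the missing range, and an overlap is resolved by merging the regions involved.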

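However, going by that description, hbase hbck
appears to create extra Regions while it runs, and these can only be cleaned up once the table is offline.
If that is really the case, hbck gives us the freedom to redesign keys by deleting Regions,
and deleting HFiles also reduces HDFS space usage,
but at the cost of an undesirable increase in the number of Regions.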

Below is the usage output of hbase hbck:

hbase hbck -h
Usage: fsck [opts] {only tables}
 where [opts] are:
   -help Display help options (this)
   -details Display full report of all regions.
   -timelag {timeInSeconds}  Process only regions that have not experienced any metadata updates in the last {timeInSeconds} seconds.
   -sleepBeforeRerun {timeInSeconds} Sleep this many seconds before checking if the fix worked if run with -fix
   -summary Print only summary of the tables and status.
   -metaonly Only check the state of ROOT and META tables.
  Repair options: (expert features, use with caution!)
   -fix              Try to fix region assignments.  This is for backwards compatiblity
   -fixAssignments   Try to fix region assignments.  Replaces the old -fix
   -fixMeta          Try to fix meta problems.  This assumes HDFS region info is good.
   -fixHdfsHoles     Try to fix region holes in hdfs.
   -fixHdfsOrphans   Try to fix region dirs with no .regioninfo file in hdfs
   -fixHdfsOverlaps  Try to fix region overlaps in hdfs.
   -fixVersionFile   Try to fix missing hbase.version file in hdfs.
   -maxMerge <n>     When fixing region overlaps, allow at most <n> regions to merge. (n=5 by default)
   -sidelineBigOverlaps  When fixing region overlaps, allow to sideline big overlaps
   -maxOverlapsToSideline <n>  When fixing region overlaps, allow at most <n> regions to sideline per group. (n=2 by default)

   -repair           Shortcut for -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans -fixHdfsOverlaps -fixVersionFile -sidelineBigOverlaps
   -repairHoles      Shortcut for -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans
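
The two shortcuts at the end of the help text can be spelled out mechanically. The sketch below expands them into the individual flags listed above and prints the resulting command line. The helper expand_shortcuts and the SHORTCUTS table are hypothetical conveniences for illustration, not part of HBase; the flag lists are copied from the help output, and PlatformData is the table name used earlier in this post.

```python
# Shortcut definitions copied from the hbck help output above.
SHORTCUTS = {
    "-repair": [
        "-fixAssignments", "-fixMeta", "-fixHdfsHoles", "-fixHdfsOrphans",
        "-fixHdfsOverlaps", "-fixVersionFile", "-sidelineBigOverlaps",
    ],
    "-repairHoles": [
        "-fixAssignments", "-fixMeta", "-fixHdfsHoles", "-fixHdfsOrphans",
    ],
}


def expand_shortcuts(args):
    """Replace each shortcut with its flags; pass everything else through."""
    expanded = []
    for arg in args:
        expanded.extend(SHORTCUTS.get(arg, [arg]))
    return expanded


# Build (but do not run) the command for a holes-only repair of one table.
command = ["hbase", "hbck"] + expand_shortcuts(["-repairHoles", "PlatformData"])
print(" ".join(command))
```

Seeing the expanded form makes it clear that -repairHoles leaves out the overlap, version-file, and sideline fixes that -repair includes, so it is the narrower of the two options.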


References:
http://stackoverflow.com/questions/6407531/hbase-how-to-delete-region
https://issues.apache.org/jira/browse/HBASE-5128
http://www.sohu.io/article/3770.html
https://gist.github.com/elazarl/10018530
http://prafull-blog.blogspot.tw/2012/06/how-to-delete-hbase-region-including.html
http://hbase.apache.org/0.94/book/hbck.in.depth.html
https://www.linkedin.com/groups/How-merge-regions-in-HBase-988957.S.134472095
