[TRACE] org.apache.hadoop.hbase.util.Merge
org.apache.hadoop.hbase.util.Merge,
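is the tool that implements Region merging in HBase 0.94.
It is invoked as follows:
    $ ./bin/hbase org.apache.hadoop.hbase.util.Merge
    Usage: bin/hbase merge <table-name> <region-1> <region-2>
At startup,
org.apache.hadoop.hbase.util.Merge first checks three things:
the input arguments, whether the file system (HDFS) is available, and whether HBase is shut down: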
    if (parseArgs(args) != 0) {
      return -1;
    }

    // Verify file system is up.
    FileSystem fs = FileSystem.get(getConf());              // get DFS handle
    LOG.info("Verifying that file system is available...");
    try {
      FSUtils.checkFileSystemAvailable(fs);
    } catch (IOException e) {
      LOG.fatal("File system is not available", e);
      return -1;
    }

    // Verify HBase is down
    LOG.info("Verifying that HBase is not running...");
    try {
      HBaseAdmin.checkHBaseAvailable(getConf());
      LOG.fatal("HBase cluster must be off-line, and is not. Aborting.");
      return -1;
    } catch (ZooKeeperConnectionException zkce) {
      // If no zk, presume no master.
    } catch (MasterNotRunningException e) {
      // Expected. Ignore.
    }
Only when the input arguments are valid (table name, region-1, region-2),
HDFS is up, and HBase is down does merging begin (via mergeTwoRegions()).
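The one unusual part of these checks is that the HBase probe is inverted: a successful availability check is the failure case, and the exception is the expected path. Below is a minimal, self-contained sketch of that control flow. It is not HBase code: MergePreflightSketch, Probe, and preflight() are hypothetical names, and the argument check is reduced to a count of three, which is an assumption (the real parseArgs() may validate more).

```java
import java.io.IOException;

// Hypothetical stand-in for the checks at the top of the Merge tool; not HBase code.
final class MergePreflightSketch {
    /** Hypothetical probe: returns normally if the service is reachable, throws otherwise. */
    interface Probe {
        void check() throws IOException;
    }

    /** Returns 0 when merging may proceed and -1 otherwise, like the excerpt above. */
    static int preflight(String[] args, Probe fileSystem, Probe hbase) {
        // Assumption: only the argument count (table name, region-1, region-2) is checked.
        if (args == null || args.length != 3) {
            return -1;
        }
        try {
            fileSystem.check();
        } catch (IOException e) {
            return -1; // file system is not available
        }
        try {
            hbase.check();
            return -1; // the probe succeeded, so HBase is running: abort
        } catch (IOException expected) {
            // HBase is down, which is exactly the precondition for an offline merge
        }
        return 0;
    }

    public static void main(String[] unused) {
        String[] args = {"my_table", "region-1", "region-2"};
        Probe reachable = () -> { };
        Probe unreachable = () -> { throw new IOException("master not running"); };
        System.out.println(preflight(args, reachable, unreachable)); // prints 0
        System.out.println(preflight(args, reachable, reachable));   // prints -1
    }
}
```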
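mergeTwoRegions() first obtains the .META. region (through utils.getMetaRegion(); since HBase is down, no RegionServer is involved),
then looks up the Region information for region-1 and region-2 by name and stores it in info1 and info2.
Once both are confirmed, it calls merge() to produce the merged Region
and adds the result to the .META. table. The code of mergeTwoRegions() is as follows: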
    /*
     * Merges two regions from a user table.
     */
    private void mergeTwoRegions() throws IOException {
      LOG.info("Merging regions " + Bytes.toStringBinary(this.region1) + " and " +
          Bytes.toStringBinary(this.region2) + " in table " + this.tableName);
      HRegion meta = this.utils.getMetaRegion();
      Get get = new Get(region1);
      get.addColumn(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER);
      Result result1 = meta.get(get);
      Preconditions.checkState(!result1.isEmpty(),
          "First region cells can not be null");
      HRegionInfo info1 = HRegionInfo.getHRegionInfo(result1);
      if (info1 == null) {
        throw new NullPointerException("info1 is null using key " +
            Bytes.toStringBinary(region1) + " in " + meta);
      }
      get = new Get(region2);
      get.addColumn(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER);
      Result result2 = meta.get(get);
      Preconditions.checkState(!result2.isEmpty(),
          "Second region cells can not be null");
      HRegionInfo info2 = HRegionInfo.getHRegionInfo(result2);
      if (info2 == null) {
        throw new NullPointerException("info2 is null using key " + meta);
      }
      TableDescriptor htd = FSTableDescriptors.getTableDescriptorFromFs(FileSystem.get(getConf()),
          this.rootdir, this.tableName);
      HRegion merged = merge(htd.getHTableDescriptor(), meta, info1, info2);

      LOG.info("Adding " + merged.getRegionInfo() + " to " +
          meta.getRegionInfo());

      HRegion.addRegionToMETA(meta, merged);
      merged.close();
    }
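To make the catalog bookkeeping easier to follow, here is a small in-memory sketch of the net effect on .META.: two rows are looked up, a merged row is produced, the two old rows disappear and one new row is added. Everything in it is hypothetical (MetaBookkeepingSketch, Region, and a TreeMap standing in for the .META. table). In particular, the line that builds the merged region is only a placeholder for HRegion.merge() and assumes the first region directly precedes the second.

```java
import java.util.TreeMap;

// Hypothetical in-memory model of the .META. bookkeeping; not HBase code.
final class MetaBookkeepingSketch {
    /** Simplified region descriptor: a name plus a [startKey, endKey) range. */
    static final class Region {
        final String name;
        final String startKey;
        final String endKey;

        Region(String name, String startKey, String endKey) {
            this.name = name;
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Fails fast when the row is missing, like the checkState() calls in the excerpt. */
    static Region lookup(TreeMap<String, Region> meta, String name) {
        Region info = meta.get(name);
        if (info == null) {
            throw new IllegalStateException("No row for " + name + " in meta");
        }
        return info;
    }

    static Region mergeTwoRegions(TreeMap<String, Region> meta, String name1, String name2) {
        Region info1 = lookup(meta, name1);
        Region info2 = lookup(meta, name2);
        // Placeholder for HRegion.merge(r1, r2); assumes info1 directly precedes info2.
        Region merged = new Region(info1.name + "+" + info2.name, info1.startKey, info2.endKey);
        // The old rows are removed (done inside merge() in the real tool) ...
        meta.remove(info1.name);
        meta.remove(info2.name);
        // ... and the merged row is added (done by mergeTwoRegions() in the real tool).
        meta.put(merged.name, merged);
        return merged;
    }

    public static void main(String[] args) {
        TreeMap<String, Region> meta = new TreeMap<>();
        meta.put("r1", new Region("r1", "", "g"));
        meta.put("r2", new Region("r2", "g", "p"));
        meta.put("r3", new Region("r3", "p", ""));
        mergeTwoRegions(meta, "r1", "r2");
        System.out.println(meta.keySet()); // prints [r1+r2, r3]
    }
}
```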
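merge() does two things.
First, it merges region-1 and region-2 through HRegion.merge(r1, r2).
Second, it deletes the entries for region-1 and region-2 from the .META. table,
so that mergeTwoRegions() only has to add the new entry to .META.
The source of merge() is as follows: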
    /*
     * Actually merge two regions and update their info in the meta region(s)
     * Returns HRegion object for newly merged region
     */
    private HRegion merge(final HTableDescriptor htd, HRegion meta,
        HRegionInfo info1, HRegionInfo info2)
    throws IOException {
      if (info1 == null) {
        throw new IOException("Could not find " + Bytes.toStringBinary(region1) + " in " +
            Bytes.toStringBinary(meta.getRegionName()));
      }
      if (info2 == null) {
        throw new IOException("Could not find " + Bytes.toStringBinary(region2) + " in " +
            Bytes.toStringBinary(meta.getRegionName()));
      }
      HRegion merged = null;
      HRegion r1 = HRegion.openHRegion(info1, htd, utils.getLog(info1), getConf());
      try {
        HRegion r2 = HRegion.openHRegion(info2, htd, utils.getLog(info2), getConf());
        try {
          merged = HRegion.merge(r1, r2);
        } finally {
          if (!r2.isClosed()) {
            r2.close();
          }
        }
      } finally {
        if (!r1.isClosed()) {
          r1.close();
        }
      }

      // Remove the old regions from meta.
      // HRegion.merge has already deleted their files

      removeRegionFromMeta(meta, info1);
      removeRegionFromMeta(meta, info2);

      this.mergeInfo = merged.getRegionInfo();
      return merged;
    }
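The nested try/finally in merge() is what guarantees that both source regions are closed even if opening the second region or the merge step itself throws. The sketch below isolates just that pattern. NestedCloseSketch, FakeRegion, and mergeAndClose() are hypothetical names for illustration, and the merge step is passed in as a function rather than calling any HBase API.

```java
import java.util.function.BinaryOperator;
import java.util.function.Supplier;

// Hypothetical illustration of the nested try/finally pattern; not HBase code.
final class NestedCloseSketch {
    /** Minimal stand-in for a region that only tracks whether it has been closed. */
    static final class FakeRegion {
        final String name;
        private boolean closed;

        FakeRegion(String name) {
            this.name = name;
        }

        boolean isClosed() {
            return closed;
        }

        void close() {
            closed = true;
        }
    }

    static FakeRegion mergeAndClose(FakeRegion r1, Supplier<FakeRegion> openR2,
                                    BinaryOperator<FakeRegion> mergeStep) {
        FakeRegion merged;
        try {
            // If opening r2 throws, the outer finally still closes r1.
            FakeRegion r2 = openR2.get();
            try {
                merged = mergeStep.apply(r1, r2);
            } finally {
                if (!r2.isClosed()) {
                    r2.close();
                }
            }
        } finally {
            if (!r1.isClosed()) {
                r1.close();
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        FakeRegion r1 = new FakeRegion("r1");
        FakeRegion r2 = new FakeRegion("r2");
        try {
            mergeAndClose(r1, () -> r2, (a, b) -> {
                throw new IllegalStateException("merge failed");
            });
        } catch (IllegalStateException e) {
            // Both inputs were closed even though the merge step failed.
            System.out.println(r1.isClosed() + " " + r2.isClosed()); // prints true true
        }
    }
}
```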
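The merge tool in HBase 0.94 can only merge Regions offline.
However, the trace so far does not explain why the offline restriction exists
(why not simply modify the .META. table online?).
Next, we will continue by tracing HRegion.merge()
to observe how the Regions are actually merged.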
References:
https://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/util/Merge.html
https://hbase.apache.org/xref/org/apache/hadoop/hbase/util/Merge.html
http://blog.csdn.net/macyang/article/details/6624482