HBase: how wide and narrow tables affect region splits

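The hbase.hregion.max.filesize property sets the threshold at which a region is split. In the HBase version used here its default is 268435456 bytes (256 MB); when a store file of any column family grows beyond this value, the hosting region is split into two regions.
An HBase row can hold a very large number of columns, so a table can be designed in one of two ways: as a wide table (many columns per row) or as a narrow table (many rows with few columns each).
Take a table that stores users' emails as an example.
In the wide-table design, all of a user's emails are stored in a single row:
userid1 email1 email2 email3 ... ... ... ... ... emailn
userid2 email1 email2 email3 ... ... ... ... ... emailn
useridn
In the narrow-table design, the rowkey is composed of the user ID and the email ID:
userid1_emailid1  email1
userid1_emailid2  email2
userid1_emailid3  email3
userid1_emailidn  emailn
userid2_emailid1  email1
userid2_emailid2  email2
userid2_emailid3  email3
userid2_emailidn  emailn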
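
To make the two layouts concrete, here is a minimal sketch in plain Java (no HBase dependency) that flattens one wide row into the equivalent narrow rows. The class name RowKeyLayouts, its methods, and the "_" separator are illustrative assumptions taken from the example above, not part of any HBase API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RowKeyLayouts {
    // Hypothetical helper: a narrow-table rowkey is userId + "_" + emailId.
    public static String narrowKey(String userId, String emailId) {
        return userId + "_" + emailId;
    }

    // Flatten one wide row (emailId -> body) into narrow rows (rowkey -> body).
    public static Map<String, String> toNarrowRows(String userId, Map<String, String> wideRow) {
        Map<String, String> narrow = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : wideRow.entrySet()) {
            narrow.put(narrowKey(userId, e.getKey()), e.getValue());
        }
        return narrow;
    }

    public static void main(String[] args) {
        // One wide row for userid1 holding two emails.
        Map<String, String> wide = new LinkedHashMap<>();
        wide.put("emailid1", "email1");
        wide.put("emailid2", "email2");
        // Each email becomes its own row in the narrow layout.
        System.out.println(toNarrowRows("userid1", wide));
    }
}
```

In the wide layout every email of a user shares one rowkey, while in the narrow layout each email gets a distinct rowkey; that difference is what matters for splitting.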
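These two designs affect region splitting differently. While reading the HFileOutputFormat code today, I noticed that the RecordWriter it creates places a restriction on splitting: it starts a new writer, and therefore a new split boundary, only when the rowkey changes. As long as the rowkey stays the same, nothing is split, even if the size has already exceeded hbase.hregion.max.filesize.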
RecordWriter code:

public void write(ImmutableBytesWritable row, KeyValue kv)
    throws IOException {
  long length = kv.getLength();
  byte [] family = kv.getFamily();
  WriterLength wl = this.writers.get(family);
  // Start a new writer if none exists for this family yet, or if the size
  // limit is reached AND the row differs from the previously written row.
  if (wl == null || ((length + wl.written) >= maxsize) &&
      Bytes.compareTo(this.previousRow, 0, this.previousRow.length,
        kv.getBuffer(), kv.getRowOffset(), kv.getRowLength()) != 0) {
    // Get a new writer.
    Path basedir = new Path(outputdir, Bytes.toString(family));
    if (wl == null) {
      wl = new WriterLength();
      this.writers.put(family, wl);
      if (this.writers.size() > 1) throw new IOException("One family only");
      // If wl == null, first file in family.  Ensure family dir exists.
      if (!fs.exists(basedir)) fs.mkdirs(basedir);
    }
    wl.writer = getNewWriter(wl.writer, basedir);
    LOG.info("Writer=" + wl.writer.getPath() +
      ((wl.written == 0)? "": ", wrote=" + wl.written));
    wl.written = 0;
  }
  kv.updateLatestStamp(this.now);
  wl.writer.append(kv);
  wl.written += length;
  // Copy the row so we know when a row transition.
  this.previousRow = kv.getRow();
}

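The condition of the if statement (shown in bold red in the original post) means that a split point is taken only when the size written reaches hbase.hregion.max.filesize and the current row differs from the previously inserted row. It follows that:
1. In a wide table, a single row larger than hbase.hregion.max.filesize is not split.
2. When many versions are inserted under the same rowkey, the data is not split either, even if its size exceeds hbase.hregion.max.filesize.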
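
The rule can be tried out without a cluster. The following is a minimal sketch in plain Java that mirrors only the if condition of the write method above and counts how many writers would be opened for a sequence of cells. The class RollSimulator and the method countFiles are hypothetical names introduced for illustration, and the cell sizes are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RollSimulator {
    // Counts how many writers the roll-over rule would open.
    // rows[i] is the rowkey of the i-th cell, lengths[i] its size in bytes.
    public static int countFiles(byte[][] rows, long[] lengths, long maxSize) {
        int files = 0;
        long written = 0;
        boolean hasWriter = false;          // stands in for "wl == null"
        byte[] previousRow = new byte[0];
        for (int i = 0; i < rows.length; i++) {
            boolean sizeReached = lengths[i] + written >= maxSize;
            boolean rowChanged = !Arrays.equals(previousRow, rows[i]);
            // Same shape as the original condition: roll only on a row change.
            if (!hasWriter || (sizeReached && rowChanged)) {
                files++;
                hasWriter = true;
                written = 0;
            }
            written += lengths[i];
            previousRow = rows[i];
        }
        return files;
    }

    public static void main(String[] args) {
        long maxSize = 10;                  // assumed limit in bytes
        long[] lengths = new long[10];
        Arrays.fill(lengths, 8L);           // assumed size of every cell
        byte[][] sameRow = new byte[10][];
        byte[][] distinctRows = new byte[10][];
        for (int i = 0; i < 10; i++) {
            sameRow[i] = "row1".getBytes(StandardCharsets.UTF_8);
            distinctRows[i] = ("row" + i).getBytes(StandardCharsets.UTF_8);
        }
        System.out.println("same rowkey: " + countFiles(sameRow, lengths, maxSize));
        System.out.println("distinct rowkeys: " + countFiles(distinctRows, lengths, maxSize));
    }
}
```

With a single rowkey the counter never goes past one, however much is written, whereas distinct rowkeys allow a roll as soon as the size limit is reached.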

Let's verify this.
To see the effect as early as possible, two configuration parameters need to be changed in hbase-site.xml. Both values are in bytes, so 5 and 10 force a flush and a split almost immediately:

<property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>5</value>
    <description>
    Memstore will be flushed to disk if size of the memstore
    exceeds this number of bytes.  Value is checked by a thread that runs
    every hbase.server.thread.wakefrequency.
    </description>
  </property>
<property>
    <name>hbase.hregion.max.filesize</name>
    <value>10</value>
    <description>
    Maximum HStoreFile size. If any one of a column families' HStoreFiles has
    grown to exceed this value, the hosting HRegion is split in two.
    Default: 256M.
    </description>
  </property>

Create the test tables t1 and t2:

hbase(main):076:0* create 't1','f1'
0 row(s) in 1.6460 seconds

hbase(main):077:0> create 't2','f1'
0 row(s) in 1.1790 seconds

Check the system table .META.:

hbase(main):081:0* scan '.META.'
ROW                                                COLUMN+CELL
t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad column=info:regioninfo,timestamp=1314720667384, value=REGION => {NAME =>'t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.', STARTKEY => '', ENDK
.                                                  EY => '', ENCODED =>d8acd6bc659ac8326b88850d645a90ad, TABLE => {{NAME => 't1', FAMILIES => [{NAME =>'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE
                                                   => '0', COMPRESSION => 'NONE',VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',BLOCKCACHE => 'true'}]}}
t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad column=info:server,timestamp=1314720667941,value=yinjie:60020
.
t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad column=info:serverstartcode,timestamp=1314720667941,value=1314716290123
.
t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71 column=info:regioninfo,timestamp=1314720672241, value=REGION => {NAME =>'t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.', STARTKEY => '', ENDK
.                                                  EY => '', ENCODED =>16bb3d2563eab3b4e25477c64e007e71, TABLE => {{NAME => 't2', FAMILIES => [{NAME =>'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE
                                                   => '0', COMPRESSION => 'NONE',VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',BLOCKCACHE => 'true'}]}}
t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71 column=info:server,timestamp=1314720672346,value=yinjie:60020
.
t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71 column=info:serverstartcode,timestamp=1314720672346,value=1314716290123
.
2 row(s) in 0.0230 seconds

At this point t1 and t2 each have exactly one region.
First, insert 10 records into t1, all with the same rowkey:

hbase(main):086:0* for i in 0..9 do\
hbase(main):087:1* put 't1','row1',"f1:c#{i}","swallow#{i}"\
hbase(main):088:1* end
0 row(s) in 0.0180 seconds

0 row(s) in 0.0070 seconds

0 row(s) in 0.0420 seconds

0 row(s) in 0.0620 seconds

0 row(s) in 0.0120 seconds

0 row(s) in 0.0770 seconds

0 row(s) in 0.0150 seconds

0 row(s) in 0.1290 seconds

0 row(s) in 10.0740 seconds

0 row(s) in 0.1230 seconds

=> 0..9
hbase(main):089:0>

View the records in t1:

hbase(main):089:0> scan 't1'
ROW                                                COLUMN+CELL
row1                                               column=f1:c0,timestamp=1314720946495,value=swallow0
row1                                               column=f1:c1,timestamp=1314720946507,value=swallow1
row1                                               column=f1:c2,timestamp=1314720946903,value=swallow2
row1                                               column=f1:c3,timestamp=1314720946939,value=swallow3
row1                                               column=f1:c4,timestamp=1314720946976,value=swallow4
row1                                               column=f1:c5,timestamp=1314720947055,value=swallow5
row1                                               column=f1:c6,timestamp=1314720947070,value=swallow6
row1                                               column=f1:c7,timestamp=1314720947198,value=swallow7
row1                                               column=f1:c8,timestamp=1314720957272,value=swallow8
row1                                               column=f1:c9,timestamp=1314720957392,value=swallow9
1 row(s) in 0.0300 seconds

Check .META. again:

hbase(main):090:0> scan '.META.'
ROW                                                COLUMN+CELL
t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad column=info:regioninfo,timestamp=1314720667384, value=REGION => {NAME =>'t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.', STARTKEY => '', ENDK
.                                                  EY => '', ENCODED =>d8acd6bc659ac8326b88850d645a90ad, TABLE => {{NAME => 't1', FAMILIES => [{NAME =>'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE
                                                   => '0', COMPRESSION => 'NONE',VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',BLOCKCACHE => 'true'}]}}
t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad column=info:server,timestamp=1314720667941,value=yinjie:60020
.
t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad column=info:serverstartcode,timestamp=1314720667941,value=1314716290123
.
t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71 column=info:regioninfo,timestamp=1314720672241, value=REGION => {NAME =>'t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.', STARTKEY => '', ENDK
.                                                  EY => '', ENCODED =>16bb3d2563eab3b4e25477c64e007e71, TABLE => {{NAME => 't2', FAMILIES => [{NAME =>'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE
                                                   => '0', COMPRESSION => 'NONE',VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',BLOCKCACHE => 'true'}]}}
t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71 column=info:server,timestamp=1314720672346,value=yinjie:60020
.
t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71 column=info:serverstartcode,timestamp=1314720672346,value=1314716290123
.
2 row(s) in 0.0210 seconds

As the output shows, t1 still has only one region.

Next, insert the same 10 records into t2, but this time with a different rowkey for each:

hbase(main):091:0> for i in 0..9 do\
hbase(main):092:1* put 't2',"row#{i}","f1:c#{i}","swallow#{i}"\
hbase(main):093:1* end
0 row(s) in 0.1140 seconds

0 row(s) in 0.0080 seconds

0 row(s) in 0.0410 seconds

0 row(s) in 0.0820 seconds

0 row(s) in 0.0210 seconds

0 row(s) in 0.0410 seconds

0 row(s) in 0.0200 seconds

0 row(s) in 0.1210 seconds

0 row(s) in 0.0140 seconds

0 row(s) in 0.0360 seconds

=> 0..9

View the records in t2:

hbase(main):097:0* scan 't2'
ROW                                                COLUMN+CELL
row0                                               column=f1:c0,timestamp=1314721110769,value=swallow0
row1                                               column=f1:c1,timestamp=1314721110787,value=swallow1
row2                                               column=f1:c2,timestamp=1314721110830,value=swallow2
row3                                               column=f1:c3,timestamp=1314721110916,value=swallow3
row4                                               column=f1:c4,timestamp=1314721110932,value=swallow4
row5                                               column=f1:c5,timestamp=1314721110971,value=swallow5
row6                                               column=f1:c6,timestamp=1314721110989,value=swallow6
row7                                               column=f1:c7,timestamp=1314721111121,value=swallow7
row8                                               column=f1:c8,timestamp=1314721111130,value=swallow8
row9                                               column=f1:c9,timestamp=1314721111172,value=swallow9
10 row(s) in 1.0450 seconds

Check .META. once more:

hbase(main):102:0> scan '.META.'
ROW                                                COLUMN+CELL
t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad column=info:regioninfo,timestamp=1314720667384, value=REGION => {NAME =>'t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.', STARTKEY => '', ENDK
.                                                  EY => '', ENCODED =>d8acd6bc659ac8326b88850d645a90ad, TABLE => {{NAME => 't1', FAMILIES => [{NAME =>'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE
                                                   => '0', COMPRESSION => 'NONE',VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',BLOCKCACHE => 'true'}]}}
t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad column=info:server,timestamp=1314720667941,value=yinjie:60020
.
t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad column=info:serverstartcode,timestamp=1314720667941,value=1314716290123
.
t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71 column=info:regioninfo,timestamp=1314721112130, value=REGION => {NAME =>'t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.', STARTKEY => '', ENDK
.                                                  EY => '', ENCODED =>16bb3d2563eab3b4e25477c64e007e71, OFFLINE => true, SPLIT => true, TABLE => {{NAME=> 't2', FAMILIES => [{NAME => 'f1', BLOOMFILT
                                                   ER => 'NONE', REPLICATION_SCOPE=> '0', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE =>'65536', IN_MEMORY => 'false', BLOC
                                                   KCACHE =>'true'}]}}
t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71 column=info:server,timestamp=1314720672346,value=yinjie:60020
.
t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71 column=info:serverstartcode,timestamp=1314720672346,value=1314716290123
.
t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71 column=info:splitA,timestamp=1314721112130, value=REGION => {NAME =>'t2,,1314721111490.71df02214242923574b71fe5e2a19360.', STARTKEY => '', ENDKEY =
.                                                  > 'row0', ENCODED =>71df02214242923574b71fe5e2a19360, TABLE => {{NAME => 't2', FAMILIES => [{NAME =>'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE
                                                   => '0', VERSIONS => '3',COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY =>'false', BLOCKCACHE => 'true'}]}}
t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71 column=info:splitB,timestamp=1314721112130, value=REGION => {NAME =>'t2,row0,1314721111490.915ee8d4a32c59a4ec3960e335b061ca.', STARTKEY => 'row0',
.                                                  ENDKEY => '', ENCODED =>915ee8d4a32c59a4ec3960e335b061ca, TABLE => {{NAME => 't2', FAMILIES => [{NAME =>'f1', BLOOMFILTER => 'NONE', REPLICATION_SC
                                                   OPE => '0', VERSIONS => '3',COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY =>'false', BLOCKCACHE => 'true'}]}}
t2,,1314721111490.71df02214242923574b71fe5e2a19360 column=info:regioninfo,timestamp=1314721112267, value=REGION => {NAME =>'t2,,1314721111490.71df02214242923574b71fe5e2a19360.', STARTKEY => '', ENDK
.                                                  EY => 'row0', ENCODED =>71df02214242923574b71fe5e2a19360, TABLE => {{NAME => 't2', FAMILIES => [{NAME =>'f1', BLOOMFILTER => 'NONE', REPLICATION_SC
                                                   OPE => '0', VERSIONS => '3',COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY =>'false', BLOCKCACHE => 'true'}]}}
t2,,1314721111490.71df02214242923574b71fe5e2a19360 column=info:server,timestamp=1314721112267,value=yinjie:60020
.
t2,,1314721111490.71df02214242923574b71fe5e2a19360 column=info:serverstartcode,timestamp=1314721112267,value=1314716290123
.
t2,row0,1314721111490.915ee8d4a32c59a4ec3960e335b0 column=info:regioninfo,timestamp=1314721112627, value=REGION => {NAME =>'t2,row0,1314721111490.915ee8d4a32c59a4ec3960e335b061ca.', STARTKEY => 'row
61ca.                                              0', ENDKEY => '', ENCODED =>915ee8d4a32c59a4ec3960e335b061ca, TABLE => {{NAME => 't2', FAMILIES => [{NAME =>'f1', BLOOMFILTER => 'NONE', REPLICATIO
                                                   N_SCOPE => '0', VERSIONS =>'3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY=> 'false', BLOCKCACHE => 'true'}]}}
t2,row0,1314721111490.915ee8d4a32c59a4ec3960e335b0 column=info:server,timestamp=1314721112627,value=yinjie:60020
61ca.
t2,row0,1314721111490.915ee8d4a32c59a4ec3960e335b0 column=info:serverstartcode,timestamp=1314721112627,value=1314716290123
61ca.
4 row(s) in 0.0380 seconds

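The output shows that t2's region has been split: the parent region is marked OFFLINE and SPLIT, and two daughter regions (splitA and splitB) now appear in .META.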