HBase API Operation Optimization

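1. Put optimization

The HBase API provides a client-side write buffer. The buffer collects put operations and then sends them to the server together in a single RPC call. By default the write buffer is disabled; call table.setAutoFlush(false) to enable it: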

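@Test
public void testWriteBuffer() throws Exception {
    long start = System.currentTimeMillis();
    table.setAutoFlushTo(false); // enable the write buffer; comment out to compare
    // ... create and submit the Put objects here (omitted) ...
    table.close(); // close() also flushes any puts still in the buffer
    System.out.println(System.currentTimeMillis() - start); // elapsed ms
}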

Test results:

1. With table.setAutoFlushTo(false): 864 ms. Note: even without calling table.flushCommits(), the buffered puts are still committed when table.close() runs.

2. Without table.setAutoFlushTo(false): 25443 ms.
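The gap between the two timings comes mostly from the number of round trips to the server. The following is a minimal, self-contained sketch of that effect. It is a simulation, not HBase code: the class SimulatedBuffer, the method simulate, and the figures of 10,000 puts at 1 KB each are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class WriteBufferSketch {

    /** Hypothetical stand-in for a client-side write buffer (not HBase's implementation). */
    static class SimulatedBuffer {
        private final boolean autoFlush;      // true = send every put immediately
        private final long bufferSizeBytes;   // flush threshold when autoFlush is false
        private final List<byte[]> pending = new ArrayList<>();
        private long pendingBytes = 0;
        private int roundTrips = 0;           // simulated RPC count

        SimulatedBuffer(boolean autoFlush, long bufferSizeBytes) {
            this.autoFlush = autoFlush;
            this.bufferSizeBytes = bufferSizeBytes;
        }

        void put(byte[] row) {
            pending.add(row);
            pendingBytes += row.length;
            if (autoFlush || pendingBytes >= bufferSizeBytes) {
                flush();
            }
        }

        void flush() {
            if (pending.isEmpty()) {
                return;
            }
            roundTrips++;                     // one simulated RPC carries all pending puts
            pending.clear();
            pendingBytes = 0;
        }

        void close() {
            flush();                          // closing also sends whatever is still buffered
        }

        int roundTrips() {
            return roundTrips;
        }
    }

    /** Returns the number of simulated round trips needed to write the given puts. */
    public static int simulate(boolean autoFlush, int puts, int putSizeBytes, long bufferSizeBytes) {
        SimulatedBuffer buffer = new SimulatedBuffer(autoFlush, bufferSizeBytes);
        for (int i = 0; i < puts; i++) {
            buffer.put(new byte[putSizeBytes]);
        }
        buffer.close();
        return buffer.roundTrips();
    }

    public static void main(String[] args) {
        long twoMb = 2L * 1024 * 1024;
        System.out.println("auto-flush on:  " + simulate(true, 10_000, 1024, twoMb) + " round trips");
        System.out.println("auto-flush off: " + simulate(false, 10_000, 1024, twoMb) + " round trips");
    }
}
```

Under these assumptions the buffered run needs only a handful of round trips, while the auto-flush run needs one per put. Real timings also depend on network latency and server load, which this sketch ignores.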

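The write buffer defaults to 2 MB. If you need to store a larger amount of data at once, consider increasing this value.

Method 1: change the write buffer size temporarily, in code

table.setWriteBufferSize(writeBufferSize);

Method 2: change it once for all clients in hbase-site.xml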

<property>
  <name>hbase.client.write.buffer</name>
  <value>2097152</value>
</property>

Passing a List of puts in a single call can also optimize put operations. The code below finished in 614 ms:

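@Test
public void testPubList() throws Exception {
    long start = System.currentTimeMillis();
    // ... fill pubList (a List<Put>) with the Put objects here (omitted) ...
    table.put(pubList); // submit the whole list in one call
    table.close();
    System.out.println(System.currentTimeMillis() - start); // elapsed ms
}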

2. Scan optimization

Setting the scanner caching size can improve scanner performance:

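@Test
public void testScanCache() throws Exception {
    long start = System.currentTimeMillis();
    // ... create the Scan, call scan.setCaching(100), and iterate the results here (omitted) ...
    table.close();
    System.out.println(System.currentTimeMillis() - start); // elapsed ms
}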

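scan.setCaching(int value): value is the number of rows fetched per RPC. The default is taken from hbase.client.scanner.caching in hbase-site.xml, which is 2147483647 (Integer.MAX_VALUE). That is why calling scan.setCaching(100) in the example above actually lowered performance.

Setting the scanner caching value too high also has drawbacks, such as RPC timeouts, or returning more data to the client than its heap can hold.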
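The trade-off can be made concrete with some simple arithmetic. The sketch below is a simplified model, not HBase code: the helper names rpcCount and batchBytes, the table of 1,000,000 rows, and the average row size of 1 KB are all hypothetical. It also ignores any extra call a real scanner makes to detect the end of the scan, and any size-based limit the client or server applies to a batch.

```java
public class ScanCachingSketch {

    /** Simulated round trips needed to read totalRows when each RPC returns at most `caching` rows. */
    public static long rpcCount(long totalRows, int caching) {
        if (totalRows < 0 || caching <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and caching must be > 0");
        }
        return (totalRows + caching - 1) / caching; // ceiling division, done in long arithmetic
    }

    /** Rough upper bound on the bytes the client must hold for a single batch. */
    public static long batchBytes(long totalRows, int caching, long avgRowBytes) {
        return Math.min(totalRows, (long) caching) * avgRowBytes;
    }

    public static void main(String[] args) {
        long totalRows = 1_000_000L;   // hypothetical table size
        long avgRowBytes = 1024L;      // hypothetical average row size
        int[] cachingValues = {1, 100, 10_000, Integer.MAX_VALUE};
        for (int caching : cachingValues) {
            System.out.printf("caching=%d rpcs=%d bytesPerBatch=%d%n",
                    caching,
                    rpcCount(totalRows, caching),
                    batchBytes(totalRows, caching, avgRowBytes));
        }
    }
}
```

In this model a small caching value multiplies the number of round trips, while a very large one pushes nearly the whole result set into a single batch, which is where the timeout and heap risks come from.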
