Working with HBase data from Python

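Python package used: thrift

I use PyCharm Community Edition as my Python IDE. In the project settings, open Project Interpreter, select the relevant project, go to the package list, click "+" to add a package, search for hbase-thrift (Python client for HBase Thrift interface), and install it.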

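Install Thrift on the server side.

Follow the official site. You can also install it on your local machine and use it from the terminal.

Thrift: Getting Started

For the installation steps, you can also refer to: Python calling HBase, an example

Then, in the HBase source package, locate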

src/main/resources/org/apache/hadoop/hbase/thrift/

Run thrift --gen py Hbase.thrift

mv gen-py/hbase/ /usr/lib/python2.4/site-packages/ (the path may differ depending on your Python version)
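Once the generated hbase package has been moved into site-packages, a quick import check can confirm that the interpreter sees both it and the thrift package. The sketch below is only an illustration: the helper name missing_modules is hypothetical and is not part of thrift or hbase-thrift.

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is not visible to this interpreter
            missing.append(name)
    return missing


# Example call (commented out so the sketch has no side effects):
# print(missing_modules(["thrift", "hbase"]))
```

An empty list from missing_modules(["thrift", "hbase"]) suggests both packages are on the path of the interpreter you are running. If "hbase" is reported, the generated directory probably landed in the site-packages of a different Python version.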

# coding:utf-8
import csv

from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase
# from hbase.ttypes import ColumnDescriptor, Mutation, BatchMutation
from hbase.ttypes import *


def client_conn(host='localhost', port=9090):
    # host and port are placeholders; 9090 is the usual HBase Thrift port,
    # so change both to match your server.
    # Make socket
    transport = TSocket.TSocket(host, port)
    # Buffering is critical. Raw sockets are very slow
    transport = TTransport.TBufferedTransport(transport)
    # Wrap in a protocol
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    # Create a client to use the protocol encoder
    client = Hbase.Client(protocol)
    # Connect!
    transport.open()
    # Return the transport as well so the caller can close it later
    return client, transport


if __name__ == "__main__":
    client, transport = client_conn()
    # r = client.getRowWithColumns('table name', 'row name', ['column name'])
    # print(r[0].columns.get('column name'), type(r[0].columns.get('column name')))
    result = client.getRow("table name", "row name")
    data = []
    # print(result[0].columns.items())
    for k, v in result[0].columns.items():  # .keys()
        # print(type(k), type(v), v.value, v.timestamp)
        data.append([v.timestamp, v.value])
    # "wb" is for Python 2; on Python 3 use open(path, "w", newline="")
    csvfile = open("data_xy_******.csv", "wb")
    writer = csv.writer(csvfile)
    writer.writerow(["timestamp", "value"])
    writer.writerows(data)
    csvfile.close()
    transport.close()
    print("finished")

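Anyone with basic Python will recognize that result is a list and that result[0].columns.items() yields the key-value pairs of a dict: each key is a column name, and each value is a cell object with value and timestamp attributes. You can look this up in the relevant documentation, or print the variables and inspect their values and types.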
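To make that structure concrete, here is a minimal sketch that flattens such a result into plain tuples. It does not talk to HBase: the TCell and TRowResult namedtuples are hypothetical stand-ins that only mimic the field names of the Thrift-generated types, and flatten_rows is an illustrative helper name.

```python
from collections import namedtuple

# Hypothetical stand-ins that mimic the shape of the Thrift-generated types;
# with a real connection these objects come back from client.getRow(...)
TCell = namedtuple("TCell", ["value", "timestamp"])
TRowResult = namedtuple("TRowResult", ["row", "columns"])


def flatten_rows(result):
    """Turn a list of row results into (row, column, timestamp, value) tuples."""
    rows = []
    for row_result in result:
        # columns is a dict mapping column name -> cell; sort for stable order
        for column, cell in sorted(row_result.columns.items()):
            rows.append((row_result.row, column, cell.timestamp, cell.value))
    return rows


sample = [
    TRowResult(
        row="row1",
        columns={"cf:x": TCell("1.5", 1000), "cf:y": TCell("2.5", 1001)},
    )
]
flat = flatten_rows(sample)
```

Each tuple in flat can be passed straight to csv.writer.writerow, which is roughly what the loop in the program above is doing by hand.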

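Note: in the program above, transport.open() opens the connection. Once you are done, you also need to close it with transport.close().

So far this only covers reading data. I will keep updating this post with other HBase operations.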
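One way to make sure the close always happens, even if a read raises an exception, is to wrap the open/close pair in a context manager. This is a sketch under assumptions: opened is a hypothetical helper, and FakeTransport is a stand-in used only so the example runs without a Thrift server. A real transport object exposing open() and close() would be passed in the same way.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open the transport, hand it to the caller, and always close it."""
    transport.open()
    try:
        yield transport
    finally:
        # Runs on normal exit and when the body raises
        transport.close()


class FakeTransport(object):
    """Hypothetical stand-in for a Thrift transport, for demonstration only."""

    def __init__(self):
        self.is_open = False
        self.calls = []

    def open(self):
        self.is_open = True
        self.calls.append("open")

    def close(self):
        self.is_open = False
        self.calls.append("close")


demo = FakeTransport()
with opened(demo):
    # Reads such as client.getRow(...) would go here
    pass
```

The design choice is simply to tie the connection's lifetime to a with block, so a forgotten transport.close() cannot leave the connection open.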
