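Basic HBase operations from Python via Thrift
Thrift is a binary communication middleware developed and open-sourced by Facebook. With Thrift we can take advantage of the strengths of each language and write efficient code.
Install Thrift: http://thrift.apache.org/docs/install/ubuntu/
After installation, go to the HBase directory and locate Hbase.thrift. The file can be found under
hbase-0.94.4/src/main/resources/org/apache/hadoop/hbase/thrift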
Running thrift --gen py Hbase.thrift generates a gen-py folder; rename it to hbase so the scripts below can import it.
Install the Python thrift library:
sudo pip install thrift
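Before running the scripts in this post, it can help to confirm that both the thrift library and the generated hbase package are importable. The following is only a minimal sketch, assuming Python 3.4 or later; the helper name missing_modules is hypothetical and not part of Thrift or HBase.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "thrift" is installed with pip; "hbase" is the folder generated from Hbase.thrift.
    missing = missing_modules(["thrift", "hbase"])
    print("missing modules: %s" % (missing or "none"))
```

If hbase is reported as missing, the generated folder is probably not in the directory the script is run from.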
Create an HBase table:
    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    # Connect to the HBase Thrift server on localhost:9090
    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()

    # Create table 'test' with one column family 'cf' that keeps a single version
    contents = ColumnDescriptor(name='cf:', maxVersions=1)
    client.createTable('test', [contents])

    print(client.getTableNames())
Run the code. Once it succeeds, open the HBase shell and run the list command; the newly created test table should appear.
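Running the creation script a second time will attempt to create a table that already exists. One way to make it repeatable is to check getTableNames() first. This is a sketch only: the helper name ensure_table is hypothetical, and it assumes the client exposes the getTableNames and createTable calls used in the script above.

```python
def ensure_table(client, table_name, column_descriptors):
    """Create table_name unless getTableNames() already lists it.

    Returns True if the table was created, False if it already existed.
    """
    if table_name in client.getTableNames():
        return False
    client.createTable(table_name, column_descriptors)
    return True


# Usage with the client from the script above (not executed here):
#   ensure_table(client, 'test', [ColumnDescriptor(name='cf:', maxVersions=1)])
```

The check and the create are two separate calls, so this does not protect against another process creating the table in between.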
Insert data:
    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    # Connect to the HBase Thrift server on localhost:9090
    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()

    row = 'row-key1'

    # Write the value "1" to column cf:a of row 'row-key1'
    mutations = [Mutation(column="cf:a", value="1")]
    client.mutateRow('test', row, mutations, None)
Once the insert succeeds, you can check the result with the scan command in the HBase shell.
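When a row has several columns, the mutation list can be built from a plain dict. The sketch below assumes the generated Mutation class accepts column and value keyword arguments, as it does in the insert script; the helper name to_mutations is hypothetical, and the class is passed in as a parameter so the function itself does not depend on the generated code.

```python
def to_mutations(cells, make_mutation):
    """Build one mutation per (column, value) pair, ordered by column name."""
    return [make_mutation(column=column, value=value)
            for column, value in sorted(cells.items())]


# Usage with the client and Mutation class from the insert script (not executed here):
#   mutations = to_mutations({'cf:a': '1', 'cf:b': '2'}, Mutation)
#   client.mutateRow('test', 'row-key1', mutations, None)
```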
Get a single row:
    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    # Connect to the HBase Thrift server on localhost:9090
    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()

    tableName = 'test'
    rowKey = 'row-key1'

    # getRow returns a list, so iterate over it even for a single row key
    result = client.getRow(tableName, rowKey, None)
    print(result)
    for r in result:
        print('the row is %s' % r.row)
        print('the value is %s' % r.columns.get('cf:a').value)
getRow returns a list of TRowResult objects.
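The returned objects are easier to work with once flattened into plain dicts. The sketch below relies only on the row and columns attributes and the per-cell value attribute that the getRow script reads; the helper name rows_to_dicts is hypothetical.

```python
def rows_to_dicts(results):
    """Map each row key to a {column: value} dict."""
    return {r.row: {column: cell.value for column, cell in r.columns.items()}
            for r in results}


# Usage with the client from the getRow script (not executed here):
#   data = rows_to_dicts(client.getRow('test', 'row-key1', None))
#   value = data.get('row-key1', {}).get('cf:a')
```

Looking values up with .get on the plain dict avoids the AttributeError that r.columns.get('cf:a').value raises when the column is absent.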
To return multiple rows, use a scan:
    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    # Connect to the HBase Thrift server on localhost:9090
    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()

    # An empty TScan scans the whole table
    scan = TScan()
    tableName = 'test'
    scanner_id = client.scannerOpenWithScan(tableName, scan, None)

    # Fetch up to 10 rows in one call
    result2 = client.scannerGetList(scanner_id, 10)

    print(result2)
Here scannerGetList fetches up to 10 rows, and the result is then printed.
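To read a whole table rather than one batch, the same call can be repeated until it returns an empty list. The sketch below wraps that loop in a generator and closes the scanner when iteration ends. It assumes the client also provides scannerClose, which the Hbase.thrift interface defines; the helper name scan_rows is hypothetical.

```python
def scan_rows(client, scanner_id, batch_size=10):
    """Yield rows batch by batch until the scanner is exhausted, then close it."""
    try:
        while True:
            batch = client.scannerGetList(scanner_id, batch_size)
            if not batch:
                break
            for row in batch:
                yield row
    finally:
        # Runs on exhaustion and also when the caller stops iterating early.
        client.scannerClose(scanner_id)


# Usage with the client from the scan script (not executed here):
#   scanner_id = client.scannerOpenWithScan('test', TScan(), None)
#   for row in scan_rows(client, scanner_id):
#       print(row)
```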
scannerGet, by contrast, fetches only one row per call:
    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    # Connect to the HBase Thrift server on localhost:9090
    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()

    scan = TScan()
    tableName = 'test'
    scanner_id = client.scannerOpenWithScan(tableName, scan, None)

    # scannerGet returns an empty list once the scanner is exhausted
    result = client.scannerGet(scanner_id)
    while result:
        print(result)
        result = client.scannerGet(scanner_id)
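Each script in this post opens a transport by hand. A small context manager can pair the open() call with a close() so the connection is released even if an operation raises. This is a sketch under the assumption that the transport object has open() and close() methods, as Thrift transports do; the name opened is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open the transport for the duration of the with-block, then close it."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


# Usage with the objects from the scripts in this post (not executed here):
#   with opened(transport):
#       print(client.getTableNames())
```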
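Categories: hadoop, hbase
Tags: python, thrift
Author: Leo_wl
Source: http://www.cnblogs.com/Leo_wl/
The copyright of this article is shared by the author and cnblogs (博客园). Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original article must be placed prominently on the page; otherwise the right to pursue legal liability is reserved.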