HBase Java API Examples

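DDL (Creating and Deleting Tables)

This section shows how to create and delete tables in HBase. Both operations can be done from Java or from the HBase shell.

Creating a table

In Java, a table is created by calling createTable() on an HBaseAdmin object.
An HTableDescriptor describes the table's schema, and column families are added to it with addFamily().
The following Java code creates a simple HBase table, 'table1', with two column families, family1 and family2.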

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableExistsException;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.io.compress.Compression.Algorithm;
    import org.apache.hadoop.hbase.io.encoding.DataBlockEncoding;
    import org.apache.hadoop.hbase.regionserver.BloomType;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CreateTable {
        public static void main(String[] args) {
            Configuration config = HBaseConfiguration.create();
            config.addResource("core-site.xml");
            config.addResource("hdfs-site.xml");
            config.addResource("yarn-site.xml");
            config.addResource("mapred-site.xml");
            // try-with-resources closes the admin even if creation fails
            try (HBaseAdmin ha = new HBaseAdmin(config)) {
                // create table descriptor
                String tableName = "table1";
                HTableDescriptor htd = new HTableDescriptor(Bytes.toBytes(tableName));
                // create and configure column families
                HColumnDescriptor hcd1 = new HColumnDescriptor(Bytes.toBytes("family1"));
                hcd1.setBlocksize(65536);
                hcd1.setMaxVersions(1);
                hcd1.setBloomFilterType(BloomType.ROW);
                hcd1.setCompressionType(Algorithm.SNAPPY);
                hcd1.setDataBlockEncoding(DataBlockEncoding.PREFIX);
                hcd1.setTimeToLive(36000);
                hcd1.setInMemory(false);
                HColumnDescriptor hcd2 = new HColumnDescriptor(Bytes.toBytes("family2"));
                hcd2.setBlocksize(65536);
                hcd2.setMaxVersions(1);
                hcd2.setBloomFilterType(BloomType.ROW);
                hcd2.setCompressionType(Algorithm.SNAPPY);
                hcd2.setDataBlockEncoding(DataBlockEncoding.PREFIX);
                hcd2.setTimeToLive(36000);
                hcd2.setInMemory(false);
                // add column families to table descriptor
                htd.addFamily(hcd1);
                htd.addFamily(hcd2);
                // create table
                ha.createTable(htd);
                System.out.println("HBase table created.");
            } catch (TableExistsException e) {
                System.out.println("Error: attempting to create existing table!");
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

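In the HBase shell, a table is created with create 'table_name', 'family_name', ...
For example, create 'table1', 'family1', 'family2' creates the same table as the Java code above.

Deleting a table

Tables are also deleted through HBaseAdmin. A table must be disabled before it can be deleted. This is a relatively time-consuming operation, so deleting tables frequently is not recommended.
The following Java code deletes the table 'table1':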

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class DeleteTable {
        public static void main(String[] args) {
            Configuration config = HBaseConfiguration.create();
            config.addResource("core-site.xml");
            config.addResource("hdfs-site.xml");
            config.addResource("yarn-site.xml");
            config.addResource("mapred-site.xml");
            // try-with-resources closes the admin even if the drop fails
            try (HBaseAdmin ha = new HBaseAdmin(config)) {
                String tableName = "table1";
                // only an existing table can be dropped
                if (ha.tableExists(tableName)) {
                    // disabling the table denies reads and writes
                    ha.disableTable(tableName);
                    ha.deleteTable(tableName);
                    System.out.println("HBase table dropped!");
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

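In the HBase shell, a table is deleted with drop 'table_name'.
For example, running disable 'table1' and then drop 'table1' deletes the same table.

Inserting data

In Java, data is inserted with the put method.
put accepts a single Put object: public void put(Put put) throws IOException, or a list of Put objects for batch insertion: public void put(List<Put> puts) throws IOException
The following Java code performs a batch insert into table1. After the insert the table has 10000 rows; column families family1 and family2 each contain two columns, q1 and q2. family1 holds integer data (int) and family2 holds strings (String).
Note: HBase stores every value as a byte array, so values of many types can be written, but for performance-optimized use of HBase it is recommended to store all table values consistently as strings. In this example the values for family1:q1 are integers and must be converted to strings before being written to the table.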

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class InsertTable {
        private static Configuration config;

        public static void main(String[] args) throws IOException {
            config = HBaseConfiguration.create();
            config.addResource("core-site.xml");
            config.addResource("hdfs-site.xml");
            config.addResource("yarn-site.xml");
            config.addResource("mapred-site.xml");
            String tableName = "table1";
            HTable table = new HTable(config, tableName);
            // set autoflush
            table.setAutoFlush(true);
            int count = 10000;
            String familyName1 = "family1";
            String familyName2 = "family2";
            String qualifier1 = "q1";
            String qualifier2 = "q2";
            // data to be inserted
            String[] f1q1 = new String[count];
            String[] f1q2 = new String[count];
            String[] f2q1 = new String[count];
            String[] f2q2 = new String[count];
            for (int i = 0; i ...

In the HBase shell, a single value is inserted with put 'table_name', 'rowkey', 'family:qualifier', 'value'.
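The query examples in this article use row keys such as row01230 and row01000, which suggests fixed-width, zero-padded keys. As a minimal sketch in plain Java (no HBase dependency), the code below shows how such keys could be generated and how an int can be turned into string bytes before being stored. The `row%05d` pattern, the class name `RowKeySketch` and its helper methods are assumptions made for illustration; they are not taken from the insert listing.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class RowKeySketch {
    // Hypothetical key layout: "row" followed by a 5-digit, zero-padded index.
    static String rowKey(int i) {
        return String.format(Locale.ROOT, "row%05d", i);
    }

    // Store an int as its decimal string, encoded as UTF-8 bytes.
    static byte[] encodeInt(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.UTF_8);
    }

    // Reverse of encodeInt: bytes -> decimal string -> int.
    static int decodeInt(byte[] bytes) {
        return Integer.parseInt(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(rowKey(1230));                              // row01230
        // Padded keys sort in numeric order when compared as strings.
        System.out.println(rowKey(999).compareTo(rowKey(1000)) < 0);   // true
        // Unpadded keys do not: "row999" sorts after "row1000".
        System.out.println("row999".compareTo("row1000") < 0);         // false
        System.out.println(decodeInt(encodeInt(42)));                  // 42
    }
}
```

Zero padding matters because rows are ordered by key bytes, so a range such as row01000 to row01100 only covers the expected rows when every key has the same width.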
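
Querying data

Queries against an HBase table fall into two kinds: single-row queries and batch queries.

Single-row query

A single-row query looks up one row in the table by matching its rowkey. In Java this is done with the get() method.
The following Java code fetches all columns of the row with a given rowkey from table1: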

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class GetFromTable {
        private static Configuration config;

        public static void main(String[] args) throws IOException {
            String tableName = "table1";
            config = HBaseConfiguration.create();
            config.addResource("core-site.xml");
            config.addResource("hdfs-site.xml");
            config.addResource("yarn-site.xml");
            config.addResource("mapred-site.xml");
            HTable table = new HTable(config, tableName);
            Get get = new Get(Bytes.toBytes("row01230"));
            // add target columns for the get
            get.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
            get.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q2"));
            get.addColumn(Bytes.toBytes("family2"), Bytes.toBytes("q1"));
            get.addColumn(Bytes.toBytes("family2"), Bytes.toBytes("q2"));
            Result result = table.get(get);
            // read the results
            byte[] rowKey = result.getRow();
            byte[] val1 = result.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
            byte[] val2 = result.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q2"));
            byte[] val3 = result.getValue(Bytes.toBytes("family2"), Bytes.toBytes("q1"));
            byte[] val4 = result.getValue(Bytes.toBytes("family2"), Bytes.toBytes("q2"));
            System.out.println("row key: " + Bytes.toString(rowKey));
            System.out.println("value1: " + Bytes.toString(val1));
            System.out.println("value2: " + Bytes.toString(val2));
            System.out.println("value3: " + Bytes.toString(val3));
            System.out.println("value4: " + Bytes.toString(val4));
            table.close();
        }
    }

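In the HBase shell, a single-row lookup is done with get 'table_name', 'rowkey', 'family:qualifier'.

Batch query

A batch query retrieves rows by specifying a range of rowkeys. In Java this is done with the getScanner() method.
The following Java code fetches all columns of the rows within a given rowkey range from table1: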

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ScanFromTable {
        private static Configuration config;

        public static void main(String[] args) throws IOException {
            config = HBaseConfiguration.create();
            config.addResource("core-site.xml");
            config.addResource("hdfs-site.xml");
            config.addResource("yarn-site.xml");
            config.addResource("mapred-site.xml");
            String tableName = "table1";
            HTable table = new HTable(config, tableName);
            // scan according to rowkey range
            Scan scan = new Scan();
            // starting row (included); if not set, start from the first row
            scan.setStartRow(Bytes.toBytes("row01000"));
            // stopping row (excluded); if not set, stop at the last row
            scan.setStopRow(Bytes.toBytes("row01100"));
            // columns to scan; if none are specified, all columns are returned
            scan.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
            scan.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q2"));
            scan.addColumn(Bytes.toBytes("family2"), Bytes.toBytes("q1"));
            scan.addColumn(Bytes.toBytes("family2"), Bytes.toBytes("q2"));
            // maximum versions per cell: called without arguments, get all versions;
            // if not called, get only the latest version
            scan.setMaxVersions();
            // cap the number of cells per Result to avoid an OutOfMemoryError
            // caused by a huge amount of data in a single row
            scan.setBatch(10000);
            ResultScanner rs = table.getScanner(scan);
            for (Result r : rs) {
                byte[] rowKey = r.getRow();
                byte[] val1 = r.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
                byte[] val2 = r.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q2"));
                byte[] val3 = r.getValue(Bytes.toBytes("family2"), Bytes.toBytes("q1"));
                byte[] val4 = r.getValue(Bytes.toBytes("family2"), Bytes.toBytes("q2"));
                System.out.print(Bytes.toString(rowKey) + ": ");
                System.out.print(Bytes.toString(val1) + " ");
                System.out.print(Bytes.toString(val2) + " ");
                System.out.print(Bytes.toString(val3) + " ");
                System.out.println(Bytes.toString(val4));
            }
            rs.close();
            table.close();
        }
    }

In the HBase shell, a batch lookup is done with scan 'table_name', {COLUMNS => 'family:qualifier', STARTROW => 'start rowkey', STOPROW => 'stop rowkey'}.
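A scan over a rowkey range includes the start row, excludes the stop row, and returns rows in rowkey order. The sketch below is a rough in-memory model of that behaviour, using a sorted map in place of a table. It is plain Java and does not call HBase; the class name `ScanRangeSketch`, its methods, and the `row%05d` key pattern are illustrative assumptions.

```java
import java.util.Locale;
import java.util.SortedMap;
import java.util.TreeMap;

final class ScanRangeSketch {
    // Builds a stand-in "table": sorted row keys mapped to a dummy value.
    // For ASCII keys, String ordering matches byte-wise ordering.
    static TreeMap<String, String> buildTable(int rows) {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < rows; i++) {
            table.put(String.format(Locale.ROOT, "row%05d", i), "value" + i);
        }
        return table;
    }

    // Mimics a range scan: startRow is included, stopRow is excluded.
    static SortedMap<String, String> scan(TreeMap<String, String> table,
                                          String startRow, String stopRow) {
        return table.subMap(startRow, true, stopRow, false);
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = buildTable(10000);
        SortedMap<String, String> result = scan(table, "row01000", "row01100");
        System.out.println(result.size());     // 100
        System.out.println(result.firstKey()); // row01000
        System.out.println(result.lastKey());  // row01099
    }
}
```

With the range row01000 to row01100, the model returns 100 rows and row01100 itself is not among them, which is the result to expect from the stop row being exclusive.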
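
Filtering with filters

Filters perform the filtering on the HBase server side and can be applied to row keys (RowFilter), column qualifiers (QualifierFilter), and cell values (ValueFilter).
Two commonly used filters are shown here: RowFilter and SingleColumnValueFilter.

RowFilter

RowFilter selects data by row key (rowkey).
In the code below, BinaryComparator compares two byte arrays directly. The available comparison operators (CompareOp) are EQUAL, NOT_EQUAL, GREATER, GREATER_OR_EQUAL, LESS, and LESS_OR_EQUAL.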
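
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.BinaryComparator;
    import org.apache.hadoop.hbase.filter.CompareFilter;
    import org.apache.hadoop.hbase.filter.Filter;
    import org.apache.hadoop.hbase.filter.RowFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    // Named RowFilterExample so the class does not shadow HBase's RowFilter.
    public class RowFilterExample {
        public static void main(String[] args) throws IOException {
            String tableName = "table1";
            Configuration config = HBaseConfiguration.create();
            config.addResource("core-site.xml");
            config.addResource("hdfs-site.xml");
            config.addResource("yarn-site.xml");
            config.addResource("mapred-site.xml");
            HTable table = new HTable(config, tableName);
            Scan scan = new Scan();
            scan.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
            // keep only the row whose key equals "row01234"
            Filter filter = new RowFilter(CompareFilter.CompareOp.EQUAL,
                    new BinaryComparator(Bytes.toBytes("row01234")));
            scan.setFilter(filter);
            ResultScanner scanner = table.getScanner(scan);
            for (Result res : scanner) {
                byte[] value = res.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
                System.out.println(Bytes.toString(res.getRow()) + " value is: " + Bytes.toString(value));
            }
            scanner.close();
            table.close();
        }
    }
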
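The byte-by-byte comparison that a binary comparator performs, and the way each comparison operator turns its result into a keep-or-drop decision, can be pictured with a small stand-alone sketch. It is plain Java and is not HBase's implementation: the class `ByteCompareSketch`, the local `Op` enum, and the helpers `compareBytes`, `matches` and `utf8` are hypothetical names that only mirror the six operators listed above.

```java
import java.nio.charset.StandardCharsets;

final class ByteCompareSketch {
    // Local stand-in for the six comparison operators; not the HBase enum.
    enum Op { EQUAL, NOT_EQUAL, GREATER, GREATER_OR_EQUAL, LESS, LESS_OR_EQUAL }

    // Lexicographic comparison of two byte arrays, treating bytes as unsigned.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        // All shared bytes are equal: the shorter array sorts first.
        return a.length - b.length;
    }

    // True when "rowKey op reference" holds, e.g. rowKey GREATER reference.
    static boolean matches(Op op, byte[] rowKey, byte[] reference) {
        int c = compareBytes(rowKey, reference);
        switch (op) {
            case EQUAL:            return c == 0;
            case NOT_EQUAL:        return c != 0;
            case GREATER:          return c > 0;
            case GREATER_OR_EQUAL: return c >= 0;
            case LESS:             return c < 0;
            case LESS_OR_EQUAL:    return c <= 0;
            default: throw new IllegalArgumentException("unknown op: " + op);
        }
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] reference = utf8("row01234");
        System.out.println(matches(Op.EQUAL, utf8("row01234"), reference)); // true
        System.out.println(matches(Op.LESS, utf8("row01233"), reference));  // true
        // Byte-wise, "row2" is greater than "row01234" because '2' > '0'.
        System.out.println(matches(Op.GREATER, utf8("row2"), reference));   // true
    }
}
```

The last line is a reminder that the comparison is on bytes, not on numeric values, so range-style operators behave intuitively only with fixed-width keys.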
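
SingleColumnValueFilter

SingleColumnValueFilter filters rows on the value of one specific column.
In the code below, SubstringComparator checks whether a given string is a substring of the column value. The available comparison operators (CompareOp) are EQUAL and NOT_EQUAL.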
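
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.filter.SubstringComparator;
    import org.apache.hadoop.hbase.util.Bytes;

    // Named ...Example so the class does not shadow HBase's SingleColumnValueFilter.
    public class SingleColumnValueFilterExample {
        public static void main(String[] args) throws IOException {
            Configuration config = HBaseConfiguration.create();
            config.addResource("core-site.xml");
            config.addResource("hdfs-site.xml");
            config.addResource("yarn-site.xml");
            config.addResource("mapred-site.xml");
            String tableName = "table1";
            HTable table = new HTable(config, tableName);
            // keep rows whose family2:q1 value does not contain "45"
            SingleColumnValueFilter filter = new SingleColumnValueFilter(
                    Bytes.toBytes("family2"), Bytes.toBytes("q1"),
                    CompareFilter.CompareOp.NOT_EQUAL, new SubstringComparator("45"));
            // with setFilterIfMissing(true), rows that lack the tested column are filtered out
            filter.setFilterIfMissing(true);
            Scan scan = new Scan();
            scan.setFilter(filter);
            ResultScanner scanner = table.getScanner(scan);
            for (Result res : scanner) {
                byte[] val = res.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
                System.out.println(Bytes.toString(res.getRow()));
                System.out.println("value: " + Bytes.toString(val));
            }
            scanner.close();
            table.close();
        }
    }
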
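The combination used here (NOT_EQUAL, a substring comparator for "45", and setFilterIfMissing(true)) keeps the rows whose tested column does not contain "45" and drops rows that lack the column altogether. The following plain-Java sketch models that decision over an in-memory map. It does not use HBase; `SubstringFilterSketch` and `keepRows` are hypothetical names, and the sketch lower-cases both sides on the assumption that the substring match is case-insensitive.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class SubstringFilterSketch {
    // rows: row key -> value of the tested column, or null if the row lacks it.
    // Returns the keys of rows that pass a NOT_EQUAL substring test.
    static List<String> keepRows(Map<String, String> rows, String substr,
                                 boolean filterIfMissing) {
        List<String> kept = new ArrayList<>();
        String needle = substr.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> e : rows.entrySet()) {
            String value = e.getValue();
            if (value == null) {
                // Column missing: the row is dropped only when filterIfMissing is set.
                if (!filterIfMissing) {
                    kept.add(e.getKey());
                }
                continue;
            }
            boolean contains = value.toLowerCase(Locale.ROOT).contains(needle);
            if (!contains) {
                kept.add(e.getKey()); // NOT_EQUAL: keep when substr is not found
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("row00001", "abc45def"); // contains "45": dropped
        rows.put("row00002", "abcdef");   // no "45": kept
        rows.put("row00003", null);       // column missing
        System.out.println(keepRows(rows, "45", true));  // [row00002]
        System.out.println(keepRows(rows, "45", false)); // [row00002, row00003]
    }
}
```

The second call shows why setFilterIfMissing matters: without it, a row that never stored the tested column still comes back.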

Source: "HBase Java API Examples". Thanks to the original author for sharing.