hadoop - How can I read from one HBase instance but write to another?


I currently have two HBase tables (let's call them tableA and tableB). Using a single-stage MapReduce job, the data in tableA is read, processed, and saved to tableB. Both tables currently reside on the same HBase cluster. However, I need to relocate tableB to a different cluster.

Is it possible to configure a single-stage MapReduce job in Hadoop to read from and write to separate instances of HBase?

It is possible: HBase's CopyTable MapReduce job does exactly this. Using TableMapReduceUtil.initTableReducerJob() you can set an alternative quorumAddress in case you need to write to a remote cluster:

public static void initTableReducerJob(String table,
                                       Class<? extends TableReducer> reducer,
                                       org.apache.hadoop.mapreduce.Job job,
                                       Class partitioner,
                                       String quorumAddress,
                                       String serverClass,
                                       String serverImpl)

quorumAddress - Distant cluster to write to; default is null for output to the cluster that is designated in hbase-site.xml. Set this String to the ZooKeeper ensemble of an alternate remote cluster when you would have the reduce write a cluster that is other than the default; e.g. when copying tables between clusters, the source would be designated by hbase-site.xml and this param would have the ensemble address of the remote cluster. The format to pass is particular: pass <hbase.zookeeper.quorum>:<hbase.zookeeper.client.port>:<zookeeper.znode.parent> such as server,server2,server3:2181:/hbase.
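To make that concrete, here is a minimal job-setup sketch modeled on what CopyTable does. The table names (tableA, tableB), the CopyMapper class, and the remote ensemble address are assumptions for illustration; the essential part is the quorumAddress argument passed to initTableReducerJob():

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;

public class RemoteCopyJob {

    // Hypothetical mapper: re-emits every cell of the source row as a Put.
    public static class CopyMapper extends TableMapper<ImmutableBytesWritable, Put> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
            Put put = new Put(row.get());
            for (Cell cell : value.rawCells()) {
                put.add(cell); // copy the cell unchanged
            }
            context.write(row, put);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // source cluster from hbase-site.xml
        Job job = Job.getInstance(conf, "tableA-to-remote-tableB");
        job.setJarByClass(RemoteCopyJob.class);

        Scan scan = new Scan();
        scan.setCaching(500);       // larger scanner caching for MapReduce scans
        scan.setCacheBlocks(false); // don't fill the block cache from a full scan

        // Read from tableA on the cluster defined in hbase-site.xml.
        TableMapReduceUtil.initTableMapperJob("tableA", scan, CopyMapper.class,
                ImmutableBytesWritable.class, Put.class, job);

        // Write to tableB on the remote cluster identified by the quorumAddress
        // (format: <quorum>:<client port>:<znode parent>). The reducer is null, so
        // the Puts emitted by the mapper go straight to TableOutputFormat, as in CopyTable.
        TableMapReduceUtil.initTableReducerJob("tableB", null, job, null,
                "server1,server2,server3:2181:/hbase", null, null);
        job.setNumReduceTasks(0);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

With zero reduce tasks the mapper output is handed directly to TableOutputFormat, which connects to the cluster named by quorumAddress rather than the one in hbase-site.xml.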


Another option is to implement your own custom reducer that writes directly to the remote table instead of writing to the Context. It would look similar to this:

public static class MyReducer extends Reducer<Text, Result, Text, Text> {

    protected Table remoteTable;
    protected Connection connection;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        super.setup(context);
        // Clone the configuration and point it at the ZooKeeper quorum of the remote cluster
        Configuration config = HBaseConfiguration.create(context.getConfiguration());
        config.set("hbase.zookeeper.quorum", "quorum1,quorum2,quorum3");
        connection = ConnectionFactory.createConnection(config);    // HBase 0.99+
        //connection = HConnectionManager.createConnection(config); // HBase <0.99
        remoteTable = connection.getTable(TableName.valueOf("myTable"));
        //remoteTable.setAutoFlush(false);                  // HBase <1.0 (HTableInterface)
        remoteTable.setWriteBufferSize(1024L * 1024L * 10L); // 10MB client-side write buffer
    }

    @Override
    public void reduce(Text boardKey, Iterable<Result> results, Context context)
            throws IOException, InterruptedException {
        /* Build Puts from the incoming Results and write them to remoteTable */
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        super.cleanup(context);
        if (remoteTable != null) {
            //remoteTable.flushCommits(); // HBase <1.0; on 1.0+ close() flushes pending writes
            remoteTable.close();
        }
        if (connection != null) {
            connection.close();
        }
    }
}
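For completeness, a sketch of how such a reducer might be wired into the job: the source table is still read through TableMapReduceUtil, while the job's own output format can be NullOutputFormat because MyReducer writes to the remote table itself. SourceMapper, the table names, and the scan settings are illustrative assumptions, not part of the answer above:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class RemoteWriteJob {

    // Assumed mapper: forwards each source row's Result, keyed by its row key as Text.
    public static class SourceMapper extends TableMapper<Text, Result> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
            context.write(new Text(row.get()), value);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // source cluster from hbase-site.xml
        Job job = Job.getInstance(conf, "tableA-to-remote-tableB-custom-reducer");
        job.setJarByClass(RemoteWriteJob.class);

        Scan scan = new Scan();
        scan.setCaching(500);
        scan.setCacheBlocks(false);

        // Read tableA from the local cluster; emit <Text, Result> pairs for the reducer.
        TableMapReduceUtil.initTableMapperJob("tableA", scan, SourceMapper.class,
                Text.class, Result.class, job);

        job.setReducerClass(MyReducer.class);             // MyReducer from the snippet above
        job.setOutputFormatClass(NullOutputFormat.class); // nothing is written via the context
        job.setNumReduceTasks(1);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}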
