HBase: Writing to Two Tables with a Single MapReduce Job
Published: 2019-02-07
The raw input data is as follows:
fansy,22,blog.csdu.net/fansy1990
tom,25,blog.csdu.net/tom1987
kate,23,blog.csdu.net/kate1989
jake,20,blog.csdu.net/jake1992
john,35,blog.csdu.net/john1977
ben,30,blog.csdu.net/ben1982
The first column is name, the second is age, and the third is webPage. The goal is to store name and age in table 1, and name and webPage in table 2. The code is below. ImportToHB.java:
package org.fansy.multipletables;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.MultiTableOutputFormat;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

/**
 * Driver for a map-only job that writes to multiple HBase tables.
 * @author fansy
 */
public class ImportToHB extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new ImportToHB(), args);
        System.exit(exitCode);
    }

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 7) {
            System.err.println("wrong args length: " + args.length);
            System.out.println("Usage: <input> <table1> <table1-fam> <table1-qua> "
                    + "<table2> <table2-fam> <table2-qua>");
            return -1;
        }
        // HBaseConfiguration.create() loads hbase-site.xml; a plain
        // Configuration would miss the ZooKeeper quorum settings.
        Configuration conf = HBaseConfiguration.create();
        // Pass the table names, column families and qualifiers to the mapper.
        conf.set("TABLE1", args[1]);
        conf.set("T1-FAM", args[2]);
        conf.set("T1-QUA", args[3]);
        conf.set("TABLE2", args[4]);
        conf.set("T2-FAM", args[5]);
        conf.set("T2-QUA", args[6]);

        Job job = new Job(conf);
        job.setJarByClass(ImportToHB.class);
        job.setMapperClass(MapperHB.class);
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(Writable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        // MultiTableOutputFormat routes each (tableName, mutation) pair to its table.
        job.setOutputFormatClass(MultiTableOutputFormat.class);
        // Map-only job: the mappers write directly to HBase.
        job.setNumReduceTasks(0);
        return job.waitForCompletion(true) ? 0 : -1;
    }
}
MapperHB.java:
package org.fansy.multipletables;

import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Mapper;

public class MapperHB extends Mapper<LongWritable, Text, ImmutableBytesWritable, Writable> {

    private byte[] table1;
    private byte[] table2;
    private byte[] t1_fam;
    private byte[] t1_qua;
    private byte[] t2_fam;
    private byte[] t2_qua;

    @Override
    public void setup(Context context) {
        // Read the table/family/qualifier names set by the driver.
        table1 = Bytes.toBytes(context.getConfiguration().get("TABLE1"));
        table2 = Bytes.toBytes(context.getConfiguration().get("TABLE2"));
        t1_fam = Bytes.toBytes(context.getConfiguration().get("T1-FAM"));
        t1_qua = Bytes.toBytes(context.getConfiguration().get("T1-QUA"));
        t2_fam = Bytes.toBytes(context.getConfiguration().get("T2-FAM"));
        t2_qua = Bytes.toBytes(context.getConfiguration().get("T2-QUA"));
    }

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] info = value.toString().split(",");
        if (info.length != 3) {
            return; // skip malformed lines
        }
        String name = info[0];
        String age = info[1];
        String webPage = info[2];

        // Write to the first table: row key = name,age; cell value = age.
        ImmutableBytesWritable putTable = new ImmutableBytesWritable(table1);
        Put put = new Put(Bytes.toBytes(name + "," + age));
        put.add(t1_fam, t1_qua, Bytes.toBytes(age));
        context.write(putTable, put);

        // Write to the second table: row key = name,webPage; cell value = webPage.
        putTable = new ImmutableBytesWritable(table2);
        put = new Put(Bytes.toBytes(name + "," + webPage));
        put.add(t2_fam, t2_qua, Bytes.toBytes(webPage));
        context.write(putTable, put);
    }
}
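A side note: Put.add(byte[], byte[], byte[]) is the pre-1.0 client API. If you build against HBase 1.x or later, the equivalent method is Put.addColumn. A sketch of the two writes in map() using that API, assuming the same fields and local variables as above:

// Drop-in replacement for the two writes above when compiling
// against HBase 1.x+, where Put.add was superseded by Put.addColumn.
Put put1 = new Put(Bytes.toBytes(name + "," + age));
put1.addColumn(t1_fam, t1_qua, Bytes.toBytes(age));
context.write(new ImmutableBytesWritable(table1), put1);

Put put2 = new Put(Bytes.toBytes(name + "," + webPage));
put2.addColumn(t2_fam, t2_qua, Bytes.toBytes(webPage));
context.write(new ImmutableBytesWritable(table2), put2);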
The code above uses just one Mapper and writes to two HBase tables at the same time. The key points are to set the Mapper's output key and value types, which in the code above are ImmutableBytesWritable and Writable, and to set the output format when declaring the job: job.setOutputFormatClass(MultiTableOutputFormat.class); The output format is not limited to Put values, as the sketch below shows.
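MultiTableOutputFormat routes each record by its ImmutableBytesWritable key and accepts Delete values as well as Put, so the same job layout can also remove rows. A minimal sketch; the class name and the <tableName>,<rowKey> input format are hypothetical:

package org.fansy.multipletables;

import java.io.IOException;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical companion mapper: emits a Delete instead of a Put,
// routed to whichever table the input line names.
public class DeleteFromHB extends Mapper<LongWritable, Text, ImmutableBytesWritable, Writable> {

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each input line is "<tableName>,<rowKey>" (assumed format).
        String[] parts = value.toString().split(",", 2);
        if (parts.length != 2) {
            return;
        }
        ImmutableBytesWritable table = new ImmutableBytesWritable(Bytes.toBytes(parts[0]));
        context.write(table, new Delete(Bytes.toBytes(parts[1])));
    }
}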
How do you run the program above?
(1) Create the two tables in HBase:
create 'table1','info'
create 'table2','info'
(2) Pass ImportToHB the following arguments:
hdfs://master:9000/user/fansy/input/info.dat table1 info age table2 info webPage
(3) Run it directly from Eclipse (a command-line alternative is sketched below).
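If you prefer the command line to Eclipse, something like the following should work once the job is packaged as a jar; the jar name and the local file path are illustrative:

# Upload the sample data to HDFS.
hadoop fs -put info.dat hdfs://master:9000/user/fansy/input/info.dat
# Put the HBase client jars on the job classpath.
export HADOOP_CLASSPATH=$(hbase classpath)
# Submit the job with the seven arguments listed above.
hadoop jar multipletables.jar org.fansy.multipletables.ImportToHB \
    hdfs://master:9000/user/fansy/input/info.dat table1 info age table2 info webPage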
After the job runs, you can inspect the output data in HBase.
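One quick check is a scan from the HBase shell. Given the row-key scheme in MapperHB, the output should look roughly like this (illustrative, timestamps elided):

hbase> scan 'table1'
ROW                              COLUMN+CELL
 ben,30                          column=info:age, value=30
 fansy,22                        column=info:age, value=22
 ...

hbase> scan 'table2'
ROW                              COLUMN+CELL
 ben,blog.csdu.net/ben1982       column=info:webPage, value=blog.csdu.net/ben1982
 fansy,blog.csdu.net/fansy1990   column=info:webPage, value=blog.csdu.net/fansy1990
 ...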