mapreduce 將hdfs資料逐行寫入mysql
阿新 • • 發佈:2019-01-25
code
package hdfsToSQL;
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
public class hdfsToSQL {
static String driver = "com.mysql.jdbc.Driver";
// static String url = "jdbc:mysql://192.168.1.58:3306/powerloaddata?user=dbuser&password=lfmysql";
static String url = "jdbc:mysql://master:3306/test?user=root";
static Connection conn = null;
static Statement stmt = null;
static ResultSet rs = null;
public static class hdfsToSQLMapper extends Mapper<Object, Text, Text, IntWritable>{
public void map(Object key , Text value, Context context) throws IOException, InterruptedException {
// get lines
String line = value.toString();
String [] words = line.split(",");
if (words.length == 3){
try {
// write sql
Class.forName(driver);
conn = DriverManager.getConnection(url);
stmt = conn.createStatement();
String sql = "insert into DataPowerPrediction values("+words[0]+","+words[1]+","+words[2]+")";
stmt.executeUpdate(sql);
} catch (SQLException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (ClassNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}finally {
try {
conn.close();
} catch (SQLException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
}
}
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
if (otherArgs.length != 2) {
System.err.println("Usage: wordcount <in> <out>");
System.exit(2);
}
Job job = Job.getInstance(conf, "hdfsToSQL");
job.setJarByClass(hdfsToSQL.class);
job.setMapperClass(hdfsToSQLMapper.class);
// job.setCombinerClass(IntSumReducer.class);
// job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}
執行程式碼
/usr/hadoop/bin/hadoop jar hdfsToSQL.jar hdfsToSQL.hdfsToSQL hdfs://master:9000/user/root/data/foreastdatatest.csv hdfs://master:9000/user/root/output/hdfsToSQL4
結果
15/08/27 02:02:08 INFO mapreduce.Job: map 19% reduce 0%
15/08/27 02:02:11 INFO mapreduce.Job: map 33% reduce 0%
15/08/27 02:02:14 INFO mapreduce.Job: map 47% reduce 0%
15/08/27 02:02:17 INFO mapreduce.Job: map 62% reduce 0%
15/08/27 02:02:19 INFO mapreduce.Job: map 100% reduce 0%
15/08/27 02:02:24 INFO mapreduce.Job: map 100% reduce 100%
15/08/27 02:02:24 INFO mapreduce.Job: Job job_1440638983382_0001 completed successfully