Implementing a Custom MapReduce Driver Class (Step 3)

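1. The Driver class: configures the Mapper and Reducer
WordCountApp.java wires the Mapper and Reducer together into a single job
The job uses MapReduce to count word frequencies in files stored on HDFS
The job is submitted for local execution, which is the mode used during development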
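To make the goal of the job concrete, here is a minimal plain-Java sketch of the computation a word-count job performs: a map step that emits (word, 1) pairs, followed by a group-and-sum step. It uses no Hadoop classes. The class name LocalWordCountSketch and the whitespace-based tokenization are assumptions made for illustration; the real WordCountMapper may split lines differently.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; no Hadoop classes are used here.
class LocalWordCountSketch {

    // "Map" step: emit a (word, 1) pair for every whitespace-separated token.
    // Assumption: tokens are separated by whitespace.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // "Shuffle" and "reduce" steps: group the pairs by word and sum the ones.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>(); // keys kept in sorted order
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                countWords(Arrays.asList("hello world", "hello hadoop"));
        // Print one "word<TAB>count" line per distinct word.
        counts.forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

In the real job, the map step lives in the Mapper class, the summing lives in the Reducer class, and the framework does the grouping in between; the Driver only tells the framework which classes and paths to use.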

2. WordCountApp.java

package com.imooc.bigdata.hadoop.mapreduce.wordcount;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/*
 * Driver class: configures the Mapper and Reducer.
 * WordCountApp.java wires the Mapper and Reducer together.
 * Uses MapReduce to count word frequencies in files on HDFS.
 *
 * Submitted for local execution: used during development.
 */
public class WordCountApp {

    public static void main(String[] args) throws Exception {

        // Permissions: access HDFS as the user "hadoop"
        System.setProperty("HADOOP_USER_NAME", "hadoop");

        Configuration configuration = new Configuration();
        // Point the configuration at the HDFS NameNode
        configuration.set("fs.defaultFS", "hdfs://192.168.126.101:8020");

        // Create a Job from the configuration
        Job job = Job.getInstance(configuration);

        // Job parameter: the main class
        job.setJarByClass(WordCountApp.class);

        // Job parameters: the custom Mapper and Reducer classes
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);

        // Job parameters: the Mapper output key and value types
        // (the Mapper input types do not need to be declared here)
        // Mapper<LongWritable, Text, Text, IntWritable>
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        // Job parameters: the Reducer output key and value types
        // (the Reducer input types do not need to be declared here)
        // Reducer<Text, IntWritable, Text, IntWritable>
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Job parameters: the input and output paths of the job
        FileInputFormat.setInputPaths(job, new Path("/wordcount/input"));
        FileOutputFormat.setOutputPath(job, new Path("/wordcount/output"));

        // Submit the job and wait for it to finish
        boolean result = job.waitForCompletion(true);
        System.exit(result ? 0 : -1);
    }

    // If writing the output fails, add the following block,
    // which loads the native hadoop.dll library explicitly
    static {
        try {
            // G:\BaiduNetdiskDownload\hadoop2.7.6\bin\hadoop.dll
            System.load("G:\\BaiduNetdiskDownload\\hadoop2.7.6\\bin\\hadoop.dll");
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Native code library failed to load.\n" + e);
            System.exit(1);
        }
    }
}
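The driver above hard-codes its input and output paths. Hadoop refuses to start a job whose output directory already exists, so during repeated test runs it can be handy to pass a different output path each time. The following is a minimal sketch of one way to do that, assuming the paths arrive as the first two command-line arguments. The class JobPathsSketch and its resolve method are hypothetical helpers, not part of the original WordCountApp.

```java
// Hypothetical helper, not part of the original WordCountApp.
class JobPathsSketch {

    // Defaults taken from the paths used in WordCountApp.
    static final String DEFAULT_INPUT = "/wordcount/input";
    static final String DEFAULT_OUTPUT = "/wordcount/output";

    // Returns {input, output}: args[0] and args[1] when present, else the defaults.
    static String[] resolve(String[] args) {
        String input = args.length > 0 ? args[0] : DEFAULT_INPUT;
        String output = args.length > 1 ? args[1] : DEFAULT_OUTPUT;
        return new String[] {input, output};
    }

    public static void main(String[] args) {
        String[] paths = resolve(args);
        System.out.println("input=" + paths[0] + ", output=" + paths[1]);
    }
}
```

In the driver, the two resolved strings would replace the literals passed to new Path(...) in the FileInputFormat.setInputPaths and FileOutputFormat.setOutputPath calls.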

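3. log4j.properties

Without this file the log does not show any errors; once it is added, the cause of a failure becomes visible in the console output.

Create a new file under resources with the following content: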

log4j.rootLogger=INFO,stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%-5p] method:%l%n%m%n
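To read the ConversionPattern: %-5p is the log level left-justified and padded to five characters, %l is the location of the logging call, %m is the message, and %n is a line break. The sketch below imitates that layout with String.format so the shape of each log entry is easy to see. It does not use log4j itself, and the class name Log4jPatternSketch and the sample location string are made up for illustration.

```java
// Illustration only: imitates the layout with String.format, without using log4j.
class Log4jPatternSketch {

    // level stands in for %-5p, location for %l, message for %m;
    // %n in String.format is the platform line separator, as in the log4j pattern.
    static String render(String level, String location, String message) {
        return String.format("[%-5s] method:%s%n%s%n", level, location, message);
    }

    public static void main(String[] args) {
        // The location string below is a made-up example value.
        System.out.print(render("INFO", "com.example.Demo.main(Demo.java:10)", "job started"));
    }
}
```

Each entry therefore takes two lines: the level and call location first, then the message on a line of its own.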

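4. On the command line, create the nested input directory (the job's input path, /wordcount/input) and put the file h.txt into it

5. Run the job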