Hadoop Program Execution Explained: helloworld

阿新 • Published: 2019-02-09

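Make sure your Hadoop cluster is configured correctly; a single-node setup is fine. Create a working directory, for example /home/admin/WordCount.

Compile the WordCount.java program.

1. javac -classpath /home/admin/hadoop/hadoop-0.19.1-core.jar WordCount.java -d /home/admin/WordCount (the output directory)

2. After compiling, the /home/admin/WordCount directory contains three class files: WordCount.class plus one file per nested class, named WordCount$<NestedClass>.class. These are WordCount$Map.class and WordCount$Reduce.class when the nested classes are called Map and Reduce; the listing below names them TokenizerMapper and IntSumReducer.
cd into the /home/admin/WordCount directory, then run: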

jar cvf WordCount.jar *.class

This produces the WordCount.jar file.


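3. Common commands:
      /bin/hadoop fs -ls
      /bin/hadoop fs -mkdir input
      /bin/hadoop fs -rmr input
      /bin/hadoop jar <jar to run> <main class in the jar (wordcount)> <input directory, e.g. input> <output directory, e.g. output>

   Before running the program, always delete the existing output directory; otherwise the job fails with an error.

      /bin/hadoop fs -rmr output

   View the results:
      /bin/hadoop fs -cat output/part-r-00000
      /bin/hadoop fs -cat output/*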
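
To make the -cat output easier to picture: with Hadoop's default text output format, each line of part-r-00000 holds a word and its count separated by a tab. The sketch below parses such lines in plain Java. The class name OutputLineParser and the sample lines are made up for illustration; they are not part of Hadoop and not taken from a real run.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Hadoop): parses "word<TAB>count" lines
// such as those printed by `hadoop fs -cat output/part-r-00000`.
final class OutputLineParser {

    // Turns each "word\tcount" line into a map entry, keeping the input order.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            int tab = line.indexOf('\t');
            if (tab < 0) {
                throw new IllegalArgumentException("Expected word<TAB>count: " + line);
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines, for illustration only.
        List<String> sample = Arrays.asList("hadoop\t1", "hello\t2", "world\t1");
        System.out.println(parse(sample));
    }
}
```

In practice the lines would come from the command's standard output rather than from a hard-coded list.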

package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

/**
 * Description: WordCount explained by York
 *
 * @author Hadoop Dev Group
 */
public class WordCount {

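    /**
     * TokenizerMapper extends the generic class Mapper.
     * Mapper class: the base class that implements the map functionality.
     * WritableComparable interface: classes that implement it can be compared
     * with one another. Every class used as a key should implement it.
     * Reporter (part of the older mapred API) can report the progress of the
     * whole application; it is not used in this example.
     */
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        /**
         * IntWritable and Text are Hadoop classes that wrap Java data types.
         * They implement the WritableComparable interface and can be serialized,
         * which makes it easy to exchange data in a distributed environment.
         * Think of them as substitutes for int and String respectively.
         * Declares the constant one and the variable word that holds each word.
         */
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        /**
         * The map method of Mapper:
         *   void map(K1 key, V1 value, Context context)
         * Maps a single input k/v pair to intermediate k/v pairs.
         * Output pairs need not have the same types as the input pair, and one
         * input pair may map to zero or more output pairs.
         * Context: collects the <k,v> pairs output by the Mapper.
         * Context.write(k, v): adds a (k, v) pair to the context.
         * Programmers mainly write the Map and Reduce functions. This map function
         * splits the input string with StringTokenizer, stores each token in word
         * via set, and calls write to put the pair (word, 1) into the context.
         */
        public void map(Object key, Text value, Context context
                        ) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }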

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        /**
         * The reduce method of Reducer:
         *   void reduce(Text key, Iterable<IntWritable> values, Context context)
         * Its k/v pairs come from the context written by the map function,
         * possibly after further processing (a combiner). Results are likewise
         * output through context.
         */
        public void reduce(Text key, Iterable<IntWritable> values,
                           Context context
                           ) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        /*
         * Configuration: the map/reduce job configuration class. It describes
         * the map-reduce work to be executed to the Hadoop framework.
         */
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");                      // set a user-defined job name
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);                  // set the Mapper class for the job
        job.setCombinerClass(IntSumReducer.class);                  // set the Combiner class for the job
        job.setReducerClass(IntSumReducer.class);                   // set the Reducer class for the job
        job.setOutputKeyClass(Text.class);                          // set the key class of the job output
        job.setOutputValueClass(IntWritable.class);                 // set the value class of the job output
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));  // set the input path for the job
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1])); // set the output path for the job
        System.exit(job.waitForCompletion(true) ? 0 : 1);           // run the job
    }
}
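
The data flow described in the comments above (map emits (word, 1) pairs, an optional combiner pre-sums them, reduce produces the final totals) can be tried without a cluster. The following is a minimal plain-Java sketch of that flow, not Hadoop code. The names LocalWordCount, map, sum, and run, as well as the two sample input splits, are hypothetical and chosen for illustration; the grouping step only imitates what the framework's shuffle does.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java imitation of the WordCount data flow; no Hadoop classes are used.
final class LocalWordCount {

    // Map step: split one line on whitespace and emit a (word, 1) pair per token,
    // mirroring what TokenizerMapper.map does with context.write(word, one).
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new SimpleEntry<String, Integer>(itr.nextToken(), 1));
        }
        return pairs;
    }

    // Combine/reduce step: group pairs by key and sum the values per key,
    // like IntSumReducer.reduce. The TreeMap keeps the keys sorted.
    static Map<String, Integer> sum(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    // Runs map on every line of every split, combines per split, then reduces.
    static Map<String, Integer> run(List<List<String>> splits) {
        List<Map.Entry<String, Integer>> combined = new ArrayList<>();
        for (List<String> split : splits) {
            List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
            for (String line : split) {
                mapped.addAll(map(line));
            }
            // Combiner: pre-sum within one split so fewer pairs move on.
            for (Map.Entry<String, Integer> partial : sum(mapped).entrySet()) {
                combined.add(new SimpleEntry<String, Integer>(partial));
            }
        }
        // Reducer: sum the partial counts coming from all splits.
        return sum(combined);
    }

    public static void main(String[] args) {
        // Two made-up input splits, for illustration only.
        List<List<String>> splits = Arrays.asList(
                Arrays.asList("hello world", "hello hadoop"),
                Arrays.asList("hello again"));
        System.out.println(run(splits)); // prints {again=1, hadoop=1, hello=3, world=1}
    }
}
```

The sketch also suggests why the job can reuse IntSumReducer as its combiner: addition is associative and commutative, so summing per-split partial sums gives the same totals as summing the raw 1s directly.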
