MapReduce Code Notes

阿新 • Published: 2018-12-30

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

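// WordCount: counts how many times each whitespace-separated token appears in the input.
public class WordCount{
        // Mapper: emits (token, 1) for every token of each input line.
        public static class TokenizerMapper extends Mapper<Object,Text,Text,IntWritable>{
                // Writable objects are reused across calls instead of allocating one per token.
                private final static IntWritable one = new IntWritable(1);
                private Text word = new Text();

                @Override
                public void map(Object key, Text value, Context context) throws IOException,InterruptedException{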
                        StringTokenizer itr = new StringTokenizer(value.toString());
                        while (itr.hasMoreTokens()){
                                word.set(itr.nextToken());
                                context.write(word,one);
                        }
                }
        }


        // Reducer (also registered as the combiner): sums the counts received for each token.
        public static class IntSumReducer extends Reducer<Text,IntWritable,Text,IntWritable>{
                private IntWritable result = new IntWritable();

                @Override
                public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException,InterruptedException{
                        int sum = 0;
                        for (IntWritable val : values)
                        {
                                sum += val.get();
                        }
                        result.set(sum);
                        context.write(key,result);
                }
        }

        public static void main(String[] args) throws Exception{
                Configuration conf = new Configuration();
                String[] otherArgs = new GenericOptionsParser(conf,args).getRemainingArgs();
                if (otherArgs.length != 2)
                {
                        System.err.println("Usage: wordcount <in> <out>");
                        System.exit(2);
                }
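                // Works with hadoop-core-1.2.1; on Hadoop 2.x and later this constructor
                // is deprecated in favor of Job.getInstance(conf, "word count").
                Job job = new Job(conf,"word count");
                job.setJarByClass(WordCount.class);
                job.setMapperClass(TokenizerMapper.class);
                // Summing is associative and commutative, so the reducer can also pre-aggregate as a combiner.
                job.setCombinerClass(IntSumReducer.class);
                job.setReducerClass(IntSumReducer.class);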
                job.setOutputKeyClass(Text.class);
                job.setOutputValueClass(IntWritable.class);
                FileInputFormat.addInputPath(job,new Path(otherArgs[0]));
                FileOutputFormat.setOutputPath(job,new Path(otherArgs[1]));
                System.exit(job.waitForCompletion(true)?0:1);
        }
}
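To see what the job computes without a cluster, the same map, shuffle, and reduce steps can be imitated in plain Java. This is only a minimal sketch: the class LocalWordCount and its method names are made up for illustration and are not part of Hadoop, and a sorted in-memory map stands in for the framework's shuffle and sort.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical helper, not a Hadoop class: mimics the WordCount data flow in memory.
class LocalWordCount {

    // Map step: emit a (token, 1) pair for each whitespace-separated token,
    // using StringTokenizer just like TokenizerMapper.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(itr.nextToken(), 1));
        }
        return pairs;
    }

    // Shuffle step (assumption: a TreeMap stands in for Hadoop's group-and-sort by key).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: sum all values of one key, like IntSumReducer.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int val : values) {
            sum += val;
        }
        return sum;
    }

    // Runs the three steps over all input lines and returns token -> count.
    static Map<String, Integer> count(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            result.put(group.getKey(), reduce(group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(Arrays.asList("hello world", "hello hadoop"));
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
        // Prints:
        // hadoop	1
        // hello	2
        // world	1
    }
}
```

Because the reduce step is a plain sum, applying it to partial groups first (which is what the combiner does on each map task) and then again to the partial sums gives the same totals.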

Compile the Java source (in the code directory)

$ javac -classpath /usr/local/hadoop/hadoop-core-1.2.1.jar:/usr/local/hadoop/lib/commons-cli-1.2.jar WordCount.java

Package the classes into a jar

$ jar -cvf wordcount.jar *.class

Run the job (from the directory that contains bin/ and code/)

$ ./bin/hadoop jar ./code/wordcount.jar WordCount input/ output/

View the results (from the same directory)

$ ./bin/hadoop fs -cat output/*
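The driver above does not set an output format, so the job uses the default TextOutputFormat, which writes one record per line as the key, a tab, and the value. If the printed result needs further processing, it can be parsed along these lines. This is a rough sketch only: the class WordCountOutput and the sample lines are made up for illustration, not taken from a real run.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Hadoop class: parses "<word>TAB<count>" lines.
class WordCountOutput {

    // Turns output lines into a word -> count map, keeping the order of the lines.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            int tab = line.indexOf('\t');
            if (tab < 0) {
                throw new IllegalArgumentException("Expected <word>TAB<count>: " + line);
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample lines are invented; real input would be the text printed by "hadoop fs -cat".
        List<String> sample = Arrays.asList("hadoop\t1", "hello\t2", "world\t1");
        System.out.println(parse(sample)); // prints {hadoop=1, hello=2, world=1}
    }
}
```

Splitting at the first tab is safe here because the mapper's StringTokenizer treats tabs as delimiters, so a word can never contain one.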
