Sorting by the Mapper's Output Value During the MapReduce Shuffle Phase

阿新 • Published: 2020-07-03

ZKe

-----------------

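In the MapReduce framework, the Mapper's output is partitioned and sorted by key during the Shuffle phase, and then grouped by key before it reaches the Reducer. This is why the results we see in each Reducer's output are ordered by key.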
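
To make the key ordering concrete, here is a minimal sketch that imitates the Shuffle phase with a plain `TreeMap`, without any Hadoop classes. The class name `ShuffleSketch` and the method `shuffle` are hypothetical names chosen for this illustration; the real framework sorts serialized records across machines, which this sketch does not attempt to model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the Shuffle phase; not part of Hadoop.
class ShuffleSketch {
    // Groups (key, value) pairs by key and returns the groups in ascending key order.
    static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> mapOutput) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : mapOutput) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapOutput = List.of(
                Map.entry("pear", 1), Map.entry("apple", 1), Map.entry("pear", 1));
        // A reducer would be called once per key, in this order.
        shuffle(mapOutput).forEach((key, values) -> System.out.println(key + "\t" + values));
    }
}
```

The mapper emitted "pear" first, but "apple" is handed to the reducer first, because the order comes from the keys and not from the order of emission.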

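We can make the output ordered by Value in the same way. The Mapper emits the value we want to sort on as its output key, and the job is given a custom comparator with job.setSortComparatorClass(IntWritableComparator.class); (the key type and the sort rule are whatever you define yourself).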

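The entity class must implement not only compareTo from the Comparable interface but also the readFields and write methods, which together make up Hadoop's WritableComparable interface. We then define a comparator for the sort.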

Here we define an entity class with a String id and an int count as its fields, and we sort by count.

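// Requires java.io.DataInput, java.io.DataOutput, java.io.IOException and org.apache.hadoop.io.WritableComparable
static class Record implements WritableComparable<Record>{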
        
        private String personalId;
        private int count;
        
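        // No-argument constructor: Hadoop needs it to create an instance before calling readFields
        public Record(){
        }

        public Record(String id, int count){
            this.personalId = id;
            this.count = count;
        }

        // Parses one tab-separated input line of the form "id<TAB>count"
        public Record(String line){
            String[] fields = line.split("\t");
            this.personalId = fields[0];
            this.count = Integer.parseInt(fields[1]);
        }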
        
        /*
         * Deserialization method
         * @author 180512235 ZhaoKe
         */
        public void readFields(DataInput arg0) throws IOException {
            this.personalId = arg0.readUTF();
            this.count = arg0.readInt();
        }

        // Serialization method: fields are written in the same order readFields reads them
        public void write(DataOutput arg0) throws IOException {
            arg0.writeUTF(this.personalId);
            arg0.writeInt(this.count);
        }
        
        public int compareTo(Record o) {
            // Descending by count; returns 0 for equal counts, as the Comparable contract requires
            return Integer.compare(o.count, this.count);
        }
        public String getPersonalId(){
            return this.personalId;
        }
        
        public int getCount(){
            return this.count;
        }
        
    }
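
The write/readFields contract can be checked locally, since `DataInput` and `DataOutput` are plain `java.io` interfaces. The following is a minimal sketch under that assumption: `RecordSketch` is a hypothetical Hadoop-free copy of the entity class, and `roundTrip` is a helper invented for this illustration. It serializes a record to a byte array and reads it back into an empty instance, which is roughly what happens to a record when it crosses the Shuffle phase.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical Hadoop-free copy of the entity class, for illustration only.
class RecordSketch implements Comparable<RecordSketch> {
    String personalId;
    int count;

    RecordSketch() { }  // empty instance, to be filled in by readFields

    RecordSketch(String personalId, int count) {
        this.personalId = personalId;
        this.count = count;
    }

    void write(DataOutput out) throws IOException {
        out.writeUTF(personalId);
        out.writeInt(count);
    }

    void readFields(DataInput in) throws IOException {
        personalId = in.readUTF();  // same order as in write
        count = in.readInt();
    }

    @Override
    public int compareTo(RecordSketch o) {
        return Integer.compare(o.count, this.count);  // descending by count, 0 when equal
    }

    // Serializes a record to bytes and reads it back into a fresh instance.
    static RecordSketch roundTrip(RecordSketch r) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            r.write(new DataOutputStream(bytes));
            RecordSketch copy = new RecordSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<RecordSketch> records = new ArrayList<>();
        records.add(roundTrip(new RecordSketch("u1", 3)));
        records.add(roundTrip(new RecordSketch("u2", 9)));
        Collections.sort(records);  // uses compareTo, so the larger count comes first
        for (RecordSketch r : records) {
            System.out.println(r.personalId + "\t" + r.count);
        }
    }
}
```

If readFields read the int before the string, the round trip would return garbage or throw, which is why the two methods have to mirror each other field by field.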

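The comparator is as follows. It compares the IntWritable keys emitted by the Mapper and reverses their natural order, so larger counts come first.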

    static class IntWritableComparator extends WritableComparator {
     
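        /*
         * Constructor: registers IntWritable as the key class to compare.
         * Passing true lets WritableComparator create key instances, so keys can be deserialized before compare is called.
         */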
        public IntWritableComparator() {
            super(IntWritable.class, true);
        }
        /*
         * Override compare to define the custom ordering
         */
        @Override
        public int compare(WritableComparable a, WritableComparable b) {
            // Downcast to the concrete key type
            IntWritable ia = (IntWritable) a;
            IntWritable ib = (IntWritable) b;
            // The operands are swapped to get descending order
            return ib.compareTo(ia);
        }
    }
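
During the Shuffle phase the keys exist as serialized bytes, and the comparator above works because the framework first turns those bytes back into IntWritable objects. The sketch below imitates that step with the standard library only. It assumes the key is stored as a 4-byte big-endian int, which is what `DataOutput.writeInt` produces; the class `DescendingIntBytes` and its methods are hypothetical names for this illustration and are not Hadoop APIs.

```java
import java.nio.ByteBuffer;

// Illustrative only: compares two serialized int keys without Hadoop.
class DescendingIntBytes {
    // Serializes an int the way DataOutput.writeInt does: 4 bytes, big-endian.
    static byte[] toBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Decodes both keys and compares them in reverse, so larger values sort first.
    static int compare(byte[] a, byte[] b) {
        int left = ByteBuffer.wrap(a).getInt();
        int right = ByteBuffer.wrap(b).getInt();
        return Integer.compare(right, left);
    }

    public static void main(String[] args) {
        System.out.println(compare(toBytes(7), toBytes(3)));  // negative: 7 sorts before 3
    }
}
```

A negative result means the first key is placed earlier in the sorted stream, so with the operands reversed the largest count is the first one the Reducer sees.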

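The Mapper and Reducer are shown below. Neither does any sorting itself, because the Shuffle phase calls the comparator on its own: the Mapper only emits count as the key and the id as the value, and the Reducer swaps them back.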

    static class SortMapper extends Mapper<LongWritable, Text, IntWritable, Text>{
        private Record r;
        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException{
            // Each input line is "id<TAB>count"; emit count as the key so the Shuffle phase sorts by it
            r = new Record(value.toString());
            context.write(new IntWritable(r.getCount()), new Text(r.getPersonalId()));
        }
    }
    static class SortReducer extends Reducer<IntWritable, Text, Text, IntWritable>{

        @Override
        protected void reduce(IntWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException{
            // Keys arrive already sorted; write each pair back out as (id, count)
            for(Text value:values){
                context.write(value, key);
            }
        }
    }
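
The whole flow can be tried out locally before submitting a job. This is a rough simulation under stated assumptions, not the job itself: `ValueSortSketch` and `run` are hypothetical names, a `TreeMap` with a reversed comparator stands in for the Shuffle phase, and the input lines are assumed to have the form "id<TAB>count".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local simulation of the job; no Hadoop classes are used.
class ValueSortSketch {
    static List<String> run(List<String> lines) {
        // "Map" and "Shuffle": key each id by its count; the TreeMap keeps keys in descending order.
        TreeMap<Integer, List<String>> shuffled = new TreeMap<>(Comparator.reverseOrder());
        for (String line : lines) {
            String[] fields = line.split("\t");
            int count = Integer.parseInt(fields[1]);
            shuffled.computeIfAbsent(count, k -> new ArrayList<>()).add(fields[0]);
        }
        // "Reduce": swap each (count, id) pair back to "id<TAB>count".
        List<String> output = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> group : shuffled.entrySet()) {
            for (String id : group.getValue()) {
                output.add(id + "\t" + group.getKey());
            }
        }
        return output;
    }

    public static void main(String[] args) {
        run(List.of("u1\t3", "u2\t9", "u3\t3")).forEach(System.out::println);
    }
}
```

Ids that share a count end up in the same group, just as they arrive in a single reduce call. The sketch keeps them in input order, whereas a real job makes no promise about the order of values within one key.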

The main class is below; feel free to use it as a template.

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        // Input and output paths are hard-coded here; adjust them for your own cluster
        String inputFile = "hdfs://master:9000/user/root/finalClassDesign/originData/submitTop10output/";
        
        String outputFile = "hdfs://master:9000/user/root/finalClassDesign/originData/sortedSubmitTop10/";
        BasicConfigurator.configure();
        Configuration conf = new Configuration();
//        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
//        if(otherArgs.length != 2){
//            System.err.println("Usage:wordcount<in><out>");
//            System.exit(2);
//        }
        
        Job job = Job.getInstance(conf, "WordCount");
        
        job.setJarByClass(SortByMapReduce.class);
        
        job.setMapperClass(SortMapper.class);
        job.setReducerClass(SortReducer.class);
        
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(Text.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        
        job.setSortComparatorClass(IntWritableComparator.class);  // Important: the sort comparator must be set here
        
//        Path path = new Path(otherArgs[1]);
        Path path = new Path(outputFile);
        FileSystem fileSystem = path.getFileSystem(conf);
        if(fileSystem.exists(path)){
            fileSystem.delete(path, true);
        }
        
//        FileInputFormat.setInputPaths(job, new Path(args[0]));
//        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        FileInputFormat.setInputPaths(job, new Path(inputFile));
        FileOutputFormat.setOutputPath(job, new Path(outputFile));
        
        boolean res = job.waitForCompletion(true);
        if(res)
            System.out.println("===========waitForCompletion:"+res+"==========");
        System.exit(res?0:1);
    }
