
Sorting MapReduce output by key in descending order

 

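While learning Hadoop recently, I ran into a problem with MapReduce: finding the top 10 of a given data set. As we know, MapReduce has a default sorting mechanism, which orders keys in ascending order, from smallest to largest.

The top-10 problem, however, needs the keys in descending order. It took me a long time searching online before I got it solved. The solution is as follows: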

 

Define a custom comparator that extends the WritableComparator class. The code is as follows:

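package Sort; // same package as SortRunner, which uses DescSort without an import

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.WritableComparator;

public class DescSort extends WritableComparator {

    public DescSort() {
        // Register LongWritable as the key type. Passing true makes the parent
        // class create key instances, which the byte-level compare needs.
        super(LongWritable.class, true);
    }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Note the minus sign: it turns the default ascending order into descending.
        return -super.compare(b1, s1, l1, b2, s2, l2);
    }

    @Override
    public int compare(Object a, Object b) {
        // Note the minus sign here as well, so both comparison paths agree.
        return -super.compare(a, b);
    }
}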
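To see why the minus sign is enough, the same trick can be tried outside Hadoop. The following is only a plain-Java sketch, not Hadoop code: the class name NegatedComparatorDemo and its members are made up for illustration, and Long.compare stands in for the ascending comparison that LongWritable keys get by default.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-alone demo; not part of Hadoop or of the original job.
class NegatedComparatorDemo {

    // Ascending comparison, standing in for the default key order.
    static final Comparator<Long> ASCENDING = Long::compare;

    // Same idea as DescSort: flip the sign of the ascending result.
    // Negation is fine as long as the wrapped comparison never returns
    // Integer.MIN_VALUE, whose negation is still negative.
    static final Comparator<Long> DESCENDING = (a, b) -> -ASCENDING.compare(a, b);

    // Returns a new list with the keys ordered from largest to smallest.
    static List<Long> sortDescending(List<Long> keys) {
        List<Long> copy = new ArrayList<>(keys);
        copy.sort(DESCENDING);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortDescending(List.of(3L, 42L, 7L, 19L))); // prints [42, 19, 7, 3]
    }
}
```

In a real job the comparison runs on LongWritable keys inside the framework, but the sign flip works the same way.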

In the driver's main function, declare the comparator class when configuring the job. The code is as follows:

package Sort;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SortRunner {

    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://192.168.252.200:9000");

        Job job = Job.getInstance(conf);
        job.setJarByClass(SortRunner.class);

        // Declare the custom comparator so keys are sorted in descending order
        job.setSortComparatorClass(DescSort.class);

        job.setMapperClass(SortMapper.class);
        job.setReducerClass(SortReducer.class);

        job.setMapOutputKeyClass(LongWritable.class);
        job.setMapOutputValueClass(NullWritable.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(NullWritable.class);

        // Input and output paths
        FileInputFormat.setInputPaths(job, new Path("/sort/srcdata/"));
        FileOutputFormat.setOutputPath(job, new Path("/sort/output3"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

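Note: the job.setSortComparatorClass(DescSort.class) call (highlighted in red in the original post) is the line that declares the comparator.

With this in place, the output comes out in descending order.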
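Once the keys arrive in descending order, the top 10 is simply the first ten keys. SortReducer is not shown in this post, so the following is only a plain-Java sketch of what that last step might look like. The class TopNDemo and the method firstN are hypothetical names, and the sketch assumes all keys pass through a single reduce task so that the order is global.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper for illustration; not the SortReducer from the job above.
class TopNDemo {

    // Takes keys that are already in descending order and keeps the first n.
    // If fewer than n keys are available, all of them are returned.
    static List<Long> firstN(Iterator<Long> sortedKeys, int n) {
        List<Long> top = new ArrayList<>();
        while (top.size() < n && sortedKeys.hasNext()) {
            top.add(sortedKeys.next());
        }
        return top;
    }

    public static void main(String[] args) {
        List<Long> descending = List.of(90L, 75L, 60L, 42L, 19L);
        System.out.println(firstN(descending.iterator(), 3)); // prints [90, 75, 60]
    }
}
```

In a reducer, the same effect could be had with a counter field that stops writing keys once ten have been emitted.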

There are already many posts online about sorting output by custom key types, so I will not go into detail on that here. Hope this helps!