Hadoop Beginner Tutorial: A Small Program Example
阿新 • Published: 2019-01-01
Friend recommendation, a feature we see all the time in WeChat and QQ, is built on top of big data: if two people are not yet friends but share a common friend, they are candidates to recommend to each other. Here is a concrete MapReduce implementation:
Add three classes under src: the driver JobRun.java, the mapper Test2Mapper.java, and the reducer Test2Reduce.java (each file name must match its public class name).
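Before the code, it helps to pin down the input format: each line of the input file holds one existing (direct) friendship, two names separated by a tab. A minimal sample file, with made-up names used here purely for illustration, might look like this:

A	B
A	C
B	D

So A is already friends with B and C, and B is also friends with D. The driver, JobRun.java, wires the job together: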
package com.lftgb.mr;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JobRun {

    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Hadoop 1.x-style addresses of the JobTracker and the NameNode
        conf.set("mapred.job.tracker", "192.168.152.128:9001");
        conf.set("fs.default.name", "hdfs://192.168.152.128:9000");
        // Local path of the job jar, shipped to the cluster with the job
        conf.set("mapred.jar", "C:\\Users\\志鵬\\Desktop\\hadoop程式\\qq.jar");
        try {
            Job job = new Job(conf);
            job.setJobName("qq");
            job.setJarByClass(JobRun.class);
            job.setMapperClass(Test2Mapper.class);
            job.setReducerClass(Test2Reduce.class);
            // Map output key/value classes must match what Test2Mapper emits: (Text, Text)
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(Text.class);
            // Directory or file holding the MapReduce input data
            FileInputFormat.addInputPath(job, new Path("/usr/input/qq/"));
            // Directory for the job output; it must not exist yet
            FileOutputFormat.setOutputPath(job, new Path("/usr/output/qq"));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        } catch (IOException | ClassNotFoundException | InterruptedException e) {
            e.printStackTrace();
        }
    }
}
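With the driver in place, the job can be submitted from the command line. A rough sketch, assuming the project has been exported from Eclipse as qq.jar and the sample file saved as qq.txt (both file names are placeholders, not from the original setup):

hadoop fs -put qq.txt /usr/input/qq/
hadoop jar qq.jar com.lftgb.mr.JobRun
hadoop fs -cat /usr/output/qq/part-r-00000

Note that /usr/output/qq must not exist before the run; FileOutputFormat refuses to write into an existing output directory.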
Next, the mapper, Test2Mapper.java:

package com.lftgb.mr;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class Test2Mapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each input line is one direct friendship: "user<TAB>friend"
        String line = value.toString();
        String[] ss = line.split("\t");
        // Emit the pair in both directions so that the reducer receives,
        // for every person, the full set of people directly connected to them
        context.write(new Text(ss[0]), new Text(ss[1]));
        context.write(new Text(ss[1]), new Text(ss[0]));
    }
}
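Tracing the sample input through this mapper: every friendship is emitted in both directions, so the shuffle phase groups under each person the set of everyone directly connected to them.

map output: (A, B) (B, A) (A, C) (C, A) (B, D) (D, B)
grouped for reduce: A → {B, C}, B → {A, D}, C → {A}, D → {B}

The reducer shown next pairs up the distinct members of each set: B and C share the common friend A, so they are recommended to each other, and likewise A and D through B; C and D produce nothing because their sets hold only one person.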
Finally, the reducer, Test2Reduce.java:

package com.lftgb.mr;

import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class Test2Reduce extends Reducer<Text, Text, Text, Text> {

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // Collect the set of people directly connected to this key
        Set<String> set = new HashSet<String>();
        for (Text t : values) {
            set.add(t.toString());
        }
        // Any two *different* people in the set share the key as a common
        // friend, so recommend them to each other (in both directions)
        if (set.size() > 1) {
            for (String name : set) {
                for (String other : set) {
                    if (!name.equals(other)) {
                        context.write(new Text(name), new Text(other));
                    }
                }
            }
        }
    }
}

Developing in Eclipse together with Hadoop makes this kind of big-data processing much more effective; for a deeper dive, stay tuned for the next post!