Building a Project with Maven (repost)
Since 2011, China has entered an era of booming big data, and the Hadoop family of software has come to dominate the big-data processing landscape. Open-source projects and commercial vendors alike have aligned virtually every data product with Hadoop. Hadoop has grown from a niche playground for the privileged few into the de facto standard for big-data development. On top of the original Hadoop technology, a whole family of Hadoop products has emerged, continually innovating around the concept of "big data" and pushing the technology forward.
As developers in the IT industry, we should keep pace, seize the opportunity, and rise together with Hadoop!
Preface
Hadoop's MapReduce environment is a complex one to program in, so we want to simplify the process of setting up a MapReduce project as much as possible. Maven is an excellent automated build tool: it frees us from complicated environment configuration and standardizes the development process. So before writing any MapReduce code, let's take a little time to sharpen our tools! Of course, Maven is not the only option.
Several upcoming articles on MapReduce development will all depend on the Maven-built MapReduce environment described in this post.
Contents
- Introduction to Maven
- Installing Maven (Windows)
- The Hadoop development environment
- Building the Hadoop environment with Maven
- MapReduce program development
- Uploading the template project to GitHub
1. Introduction to Maven
Apache Maven is a project management and automated build tool for Java, provided by the Apache Software Foundation. Built on the concept of the Project Object Model (POM), Maven uses a central piece of information to manage a project's build, reporting, documentation, and other steps. It was once a subproject of the Jakarta project and is now a standalone Apache project.
Maven's developers state on the project site that Maven's goal is to make project builds easier. It ties together the different stages of development, such as compiling, packaging, testing, and releasing, and produces consistent, high-quality project information so that team members get timely feedback. Maven effectively supports test-first development and continuous integration, embodying the software development philosophy of encouraging communication and prompt feedback. If Ant's notion of reuse is built on "copy and paste", Maven's plugin mechanism achieves genuine reuse of project build logic.
2. Installing Maven (Windows)
Download the latest xxx-bin.zip file and unzip it on Windows to D:\toolkit\maven3,
then add the maven/bin directory to the PATH environment variable:
Next, open a command prompt and type mvn to see the command in action:
```
~ C:\Users\Administrator>mvn
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 0.086s
[INFO] Finished at: Mon Sep 30 18:26:58 CST 2013
[INFO] Final Memory: 2M/179M
[INFO] ------------------------------------------------------------------------
[ERROR] No goals have been specified for this build. You must specify a valid lifecycle phase or a goal in the format <plugin-prefix>:<goal> or <plugin-group-id>:<plugin-artifact-id>[:<plugin-version>]:<goal>. Available lifecycle phases are: validate, initialize, generate-sources, process-sources, generate-resources, process-resources, compile, process-classes, generate-test-sources, process-test-sources, generate-test-resources, process-test-resources, test-compile, process-test-classes, test, prepare-package, package, pre-integration-test, integration-test, post-integration-test, verify, install, deploy, pre-clean, clean, post-clean, pre-site, site, post-site, site-deploy. -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/NoGoalSpecifiedException
```
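The BUILD FAILURE above is expected: mvn was invoked without a goal. The phases listed in the error message form Maven's default build lifecycle, which always runs in a fixed order, and invoking a phase runs every earlier phase as well. A rough sketch of that ordering (an abbreviated phase list and a made-up helper class, not Maven's actual implementation):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of Maven's default lifecycle ordering (abbreviated; the real
// lifecycle has many more phases, as the error message above lists).
public class Lifecycle {
    static final List<String> PHASES = Arrays.asList(
            "validate", "compile", "test", "package", "verify", "install", "deploy");

    // Running "mvn <phase>" executes every phase up to and including it.
    static List<String> phasesUpTo(String phase) {
        int i = PHASES.indexOf(phase);
        if (i < 0) throw new IllegalArgumentException("unknown phase: " + phase);
        return PHASES.subList(0, i + 1);
    }

    public static void main(String[] args) {
        System.out.println("mvn package runs: " + phasesUpTo("package"));
    }
}
```

This is why `mvn install`, used later in this post, also compiles, tests, and packages the project in one go.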
Configuring the Maven plugin for Eclipse
3. The Hadoop Development Environment
As shown in the figure above, we can develop either on Windows or on Linux, running Hadoop locally or calling a remote Hadoop cluster; in either setup the standard tools are Maven and Eclipse.
Hadoop cluster environment:
- Linux: Ubuntu 12.04.2 LTS 64bit Server
- Java: 1.6.0_29
- Hadoop: hadoop-1.0.3, single node, IP: 192.168.1.210
4. Building the Hadoop Environment with Maven
- 1. Create a standard Java project with Maven
- 2. Import the project into Eclipse
- 3. Add the Hadoop dependency by editing pom.xml
- 4. Download the dependencies
- 5. Copy the Hadoop configuration files from the cluster
- 6. Configure the local hosts file
1). Create a standard Java project with Maven
```
~ D:\workspace\java>mvn archetype:generate -DarchetypeGroupId=org.apache.maven.archetypes -DgroupId=org.conan.myhadoop.mr -DartifactId=myHadoop -DpackageName=org.conan.myhadoop.mr -Dversion=1.0-SNAPSHOT -DinteractiveMode=false
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Maven Stub Project (No POM) 1
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] >>> maven-archetype-plugin:2.2:generate (default-cli) @ standalone-pom >>>
[INFO]
[INFO] <<< maven-archetype-plugin:2.2:generate (default-cli) @ standalone-pom <<<
[INFO]
[INFO] --- maven-archetype-plugin:2.2:generate (default-cli) @ standalone-pom ---
[INFO] Generating project in Batch mode
[INFO] No archetype defined. Using maven-archetype-quickstart (org.apache.maven.archetypes:maven-archetype-quickstart:1.0)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/archetypes/maven-archetype-quickstart/1.0/maven-archetype-quickstart-1.0.jar
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/archetypes/maven-archetype-quickstart/1.0/maven-archetype-quickstart-1.0.jar (5 KB at 4.3 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/archetypes/maven-archetype-quickstart/1.0/maven-archetype-quickstart-1.0.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/archetypes/maven-archetype-quickstart/1.0/maven-archetype-quickstart-1.0.pom (703 B at 1.6 KB/sec)
[INFO] ----------------------------------------------------------------------------
[INFO] Using following parameters for creating project from Old (1.x) Archetype: maven-archetype-quickstart:1.0
[INFO] ----------------------------------------------------------------------------
[INFO] Parameter: groupId, Value: org.conan.myhadoop.mr
[INFO] Parameter: packageName, Value: org.conan.myhadoop.mr
[INFO] Parameter: package, Value: org.conan.myhadoop.mr
[INFO] Parameter: artifactId, Value: myHadoop
[INFO] Parameter: basedir, Value: D:\workspace\java
[INFO] Parameter: version, Value: 1.0-SNAPSHOT
[INFO] project created from Old (1.x) Archetype in dir: D:\workspace\java\myHadoop
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8.896s
[INFO] Finished at: Sun Sep 29 20:57:07 CST 2013
[INFO] Final Memory: 9M/179M
[INFO] ------------------------------------------------------------------------
```
Change into the project directory and run the mvn command:
```
~ D:\workspace\java>cd myHadoop
~ D:\workspace\java\myHadoop>mvn clean install
[INFO]
[INFO] --- maven-jar-plugin:2.3.2:jar (default-jar) @ myHadoop ---
[INFO] Building jar: D:\workspace\java\myHadoop\target\myHadoop-1.0-SNAPSHOT.jar
[INFO]
[INFO] --- maven-install-plugin:2.3.1:install (default-install) @ myHadoop ---
[INFO] Installing D:\workspace\java\myHadoop\target\myHadoop-1.0-SNAPSHOT.jar to C:\Users\Administrator\.m2\repository\org\conan\myhadoop\mr\myHadoop\1.0-SNAPSHOT\myHadoop-1.0-SNAPSHOT.jar
[INFO] Installing D:\workspace\java\myHadoop\pom.xml to C:\Users\Administrator\.m2\repository\org\conan\myhadoop\mr\myHadoop\1.0-SNAPSHOT\myHadoop-1.0-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4.348s
[INFO] Finished at: Sun Sep 29 20:58:43 CST 2013
[INFO] Final Memory: 11M/179M
[INFO] ------------------------------------------------------------------------
```
2). Import the project into Eclipse
We have now created a basic Maven project; next, import it into Eclipse. Ideally, the Maven plugin for Eclipse is already installed.
3). Add the Hadoop dependency
I am using hadoop-1.0.3 here. Edit pom.xml:
~ vi pom.xml

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.conan.myhadoop.mr</groupId>
    <artifactId>myHadoop</artifactId>
    <packaging>jar</packaging>
    <version>1.0-SNAPSHOT</version>
    <name>myHadoop</name>
    <url>http://maven.apache.org</url>
    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
            <version>1.0.3</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.4</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>
```
4). Download the dependencies
Download the dependencies:

```
~ mvn clean install
```

Refresh the project in Eclipse:
The project's dependencies are automatically loaded into the library path.
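To double-check that the Hadoop classes really are visible to the project after the refresh, a throwaway probe like the following can help (`ClasspathCheck` is a hypothetical scratch class, not part of the template project):

```java
// Throwaway classpath probe: reports whether a class can be loaded.
public class ClasspathCheck {

    // Returns true when the named class is resolvable on the classpath.
    static boolean onClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // org.apache.hadoop.mapred.JobConf ships in hadoop-core-1.0.3.jar,
        // so this should print true once the dependency has been downloaded.
        System.out.println("hadoop-core present: "
                + onClasspath("org.apache.hadoop.mapred.JobConf"));
    }
}
```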
5). Copy the Hadoop configuration files from the cluster
- core-site.xml
- hdfs-site.xml
- mapred-site.xml
View core-site.xml:

```xml
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/conan/hadoop/tmp</value>
    </property>
    <property>
        <name>io.sort.mb</name>
        <value>256</value>
    </property>
</configuration>
```
View hdfs-site.xml:

```xml
<configuration>
    <property>
        <name>dfs.data.dir</name>
        <value>/home/conan/hadoop/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>
```
View mapred-site.xml:

```xml
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>hdfs://master:9001</value>
    </property>
</configuration>
```
Save these files under the src/main/resources/hadoop directory.
Delete the auto-generated files App.java and AppTest.java.
6). Configure the local hosts file, adding an entry for the master hostname

```
~ vi c:/Windows/System32/drivers/etc/hosts
192.168.1.210 master
```
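If `master` does not resolve, job submission will typically fail with an unknown-host error, so it is worth verifying the hosts entry took effect. A quick stdlib-only check (`HostCheck` is a hypothetical throwaway class, not part of the project):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Quick sanity check that a hostname from the hosts file resolves.
public class HostCheck {

    // Returns the resolved IP address, or null when resolution fails.
    static String resolveOrNull(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Expect 192.168.1.210 once the hosts entry above is in place.
        System.out.println("master -> " + resolveOrNull("master"));
    }
}
```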
5. MapReduce Program Development
Let's write a simple MapReduce program that implements word count.
Create a new Java file: WordCount.java
```java
package org.conan.myhadoop.mr;

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class WordCount {

    public static class WordCountMapper extends MapReduceBase
            implements Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                output.collect(word, one);
            }
        }
    }

    public static class WordCountReducer extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterator<IntWritable> values,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            result.set(sum);
            output.collect(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        String input = "hdfs://192.168.1.210:9000/user/hdfs/o_t_account";
        String output = "hdfs://192.168.1.210:9000/user/hdfs/o_t_account/result";

        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("WordCount");
        conf.addResource("classpath:/hadoop/core-site.xml");
        conf.addResource("classpath:/hadoop/hdfs-site.xml");
        conf.addResource("classpath:/hadoop/mapred-site.xml");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(WordCountMapper.class);
        conf.setCombinerClass(WordCountReducer.class);
        conf.setReducerClass(WordCountReducer.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(input));
        FileOutputFormat.setOutputPath(conf, new Path(output));

        JobClient.runJob(conf);
        System.exit(0);
    }
}
```
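Stripped of the Hadoop plumbing, the mapper and reducer above just compute per-token counts. The same core logic in plain Java is handy for sanity-checking the tokenization locally before submitting a job (a sketch; the class and method names here are made up for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Plain-Java equivalent of the mapper + reducer logic above:
// whitespace-tokenize each line, then sum a count per token.
public class WordCountCore {

    public static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (String line : lines) {
            // Same splitting rule as the mapper: StringTokenizer on whitespace.
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                String word = itr.nextToken();
                Integer c = counts.get(word);
                counts.put(word, c == null ? 1 : c + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(java.util.Arrays.asList("a b a", "b c")));
    }
}
```

In the real job the summing happens twice, once in the combiner and once in the reducer, which is safe here because addition is associative.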
Launch it as a Java application.
Console error:
```
2013-9-30 19:25:02 org.apache.hadoop.util.NativeCodeLoader WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2013-9-30 19:25:02 org.apache.hadoop.security.UserGroupInformation doAs SEVERE: PriviledgedActionException as:Administrator cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1702422322\.staging to 0700
Exception in thread "main" java.io.IOException: Failed to set permissions of path: \tmp\hadoop-Administrator\mapred\staging\Administrator1702422322\.staging to 0700
    at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
    at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
    at org.conan.myhadoop.mr.WordCount.main(WordCount.java:78)
```
This error is specific to developing on Windows: it is a file permission problem. On Linux the program runs fine.
The fix is to modify the file /hadoop-1.0.3/src/core/org/apache/hadoop/fs/FileUtil.java,
comment out lines 688-692, then recompile the source and rebuild the hadoop jar.
```java
685    private static void checkReturnValue(boolean rv, File p,
686                                         FsPermission permission
687                                         ) throws IOException {
688        /*if (!rv) {
689            throw new IOException("Failed to set permissions of path: " + p +
690                    " to " +
691                    String.format("%04o", permission.toShort()));
692        }*/
693    }
```
I built my own hadoop-core-1.0.3.jar this way and put it under lib.
We also need to replace the Hadoop library in the local Maven repository:

```
~ cp lib/hadoop-core-1.0.3.jar C:\Users\Administrator\.m2\repository\org\apache\hadoop\hadoop-core\1.0.3\hadoop-core-1.0.3.jar
```

Launch the Java application again. Console output:
```
2013-9-30 19:50:49 org.apache.hadoop.util.NativeCodeLoader WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2013-9-30 19:50:49 org.apache.hadoop.mapred.JobClient copyAndConfigureFiles WARNING: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2013-9-30 19:50:49 org.apache.hadoop.mapred.JobClient copyAndConfigureFiles WARNING: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
2013-9-30 19:50:49 org.apache.hadoop.io.compress.snappy.LoadSnappy WARNING: Snappy native library not loaded
2013-9-30 19:50:49 org.apache.hadoop.mapred.FileInputFormat listStatus INFO: Total input paths to process : 4
2013-9-30 19:50:50 org.apache.hadoop.mapred.JobClient monitorAndPrintJob INFO: Running job: job_local_0001
2013-9-30 19:50:50 org.apache.hadoop.mapred.Task initialize INFO: Using ResourceCalculatorPlugin : null
2013-9-30 19:50:50 org.apache.hadoop.mapred.MapTask runOldMapper INFO: numReduceTasks: 1
2013-9-30 19:50:50 org.apache.hadoop.mapred.MapTask$MapOutputBuffer INFO: io.sort.mb = 100
2013-9-30 19:50:50 org.apache.hadoop.mapred.MapTask$MapOutputBuffer INFO: data buffer = 79691776/99614720
2013-9-30 19:50:50 org.apache.hadoop.mapred.MapTask$MapOutputBuffer INFO: record buffer = 262144/327680
2013-9-30 19:50:50 org.apache.hadoop.mapred.MapTask$MapOutputBuffer flush INFO: Starting flush of map output
2013-9-30 19:50:50 org.apache.hadoop.mapred.MapTask$MapOutputBuffer sortAndSpill INFO: Finished spill 0
2013-9-30 19:50:50 org.apache.hadoop.mapred.Task done INFO: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
2013-9-30 19:50:51 org.apache.hadoop.mapred.JobClient monitorAndPrintJob INFO: map 0% reduce 0%
2013-9-30 19:50:53 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate INFO: hdfs://192.168.1.210:9000/user/hdfs/o_t_account/part-m-00003:0+119
2013-9-30 19:50:53 org.apache.hadoop.mapred.Task sendDone INFO: Task 'attempt_local_0001_m_000000_0' done.
2013-9-30 19:50:53 org.apache.hadoop.mapred.Task initialize INFO: Using ResourceCalculatorPlugin : null
2013-9-30 19:50:53 org.apache.hadoop.mapred.MapTask runOldMapper INFO: numReduceTasks: 1
2013-9-30 19:50:53 org.apache.hadoop.mapred.MapTask$MapOutputBuffer INFO: io.sort.mb = 100
2013-9-30 19:50:53 org.apache.hadoop.mapred.MapTask$MapOutputBuffer INFO: data buffer = 79691776/99614720
2013-9-30 19:50:53 org.apache.hadoop.mapred.MapTask$MapOutputBuffer INFO: record buffer = 262144/327680
2013-9-30 19:50:53 org.apache.hadoop.mapred.MapTask$MapOutputBuffer flush INFO: Starting flush of map output
2013-9-30 19:50:53 org.apache.hadoop.mapred.MapTask$MapOutputBuffer sortAndSpill INFO: Finished spill 0
2013-9-30 19:50:53 org.apache.hadoop.mapred.Task done INFO: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
2013-9-30 19:50:54 org.apache.hadoop.mapred.JobClient monitorAndPrintJob INFO: map 100% reduce 0%
2013-9-30 19:50:56 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate INFO: hdfs://192.168.1.210:9000/user/hdfs/o_t_account/part-m-00000:0+113
2013-9-30 19:50:56 org.apache.hadoop.mapred.Task sendDone INFO: Task 'attempt_local_0001_m_000001_0' done.
2013-9-30 19:50:56 org.apache.hadoop.mapred.Task initialize INFO: Using ResourceCalculatorPlugin : null
2013-9-30 19:50:56 org.apache.hadoop.mapred.MapTask runOldMapper INFO: numReduceTasks: 1
2013-9-30 19:50:56 org.apache.hadoop.mapred.MapTask$MapOutputBuffer INFO: io.sort.mb = 100
2013-9-30 19:50:56 org.apache.hadoop.mapred.MapTask$MapOutputBuffer INFO: data buffer = 79691776/99614720
2013-9-30 19:50:56 org.apache.hadoop.mapred.MapTask$MapOutputBuffer INFO: record buffer = 262144/327680
2013-9-30 19:50:56 org.apache.hadoop.mapred.MapTask$MapOutputBuffer flush INFO: Starting flush of map output
2013-9-30 19:50:56 org.apache.hadoop.mapred.MapTask$MapOutputBuffer sortAndSpill INFO: Finished spill 0
2013-9-30 19:50:56 org.apache.hadoop.mapred.Task done INFO: Task:attempt_local_0001_m_000002_0 is done. And is in the process of commiting
2013-9-30 19:50:59 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate INFO: hdfs://192.168.1.210:9000/user/hdfs/o_t_account/part-m-00001:0+110
2013-9-30 19:50:59 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate INFO: hdfs://192.168.1.210:9000/user/hdfs/o_t_account/part-m-00001:0+110
2013-9-30 19:50:59 org.apache.hadoop.mapred.Task sendDone INFO: Task 'attempt_local_0001_m_000002_0' done.
2013-9-30 19:50:59 org.apache.hadoop.mapred.Task initialize INFO: Using ResourceCalculatorPlugin : null
2013-9-30 19:50:59 org.apache.hadoop.mapred.MapTask runOldMapper INFO: numReduceTasks: 1
2013-9-30 19:50:59 org.apache.hadoop.mapred.MapTask$MapOutputBuffer INFO: io.sort.mb = 100
2013-9-30 19:50:59 org.apache.hadoop.mapred.MapTask$MapOutputBuffer INFO: data buffer = 79691776/99614720
2013-9-30 19:50:59 org.apache.hadoop.mapred.MapTask$MapOutputBuffer INFO: record buffer = 262144/327680
2013-9-30 19:50:59 org.apache.hadoop.mapred.MapTask$MapOutputBuffer flush INFO: Starting flush of map output
2013-9-30 19:50:59 org.apache.hadoop.mapred.MapTask$MapOutputBuffer sortAndSpill INFO: Finished spill 0
2013-9-30 19:50:59 org.apache.hadoop.mapred.Task done INFO: Task:attempt_local_0001_m_000003_0 is done. And is in the process of commiting
2013-9-30 19:51:02 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate INFO: hdfs://192.168.1.210:9000/user/hdfs/o_t_account/part-m-00002:0+79
2013-9-30 19:51:02 org.apache.hadoop.mapred.Task sendDone INFO: Task 'attempt_local_0001_m_000003_0' done.
2013-9-30 19:51:02 org.apache.hadoop.mapred.Task initialize INFO: Using ResourceCalculatorPlugin : null
2013-9-30 19:51:02 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate INFO:
2013-9-30 19:51:02 org.apache.hadoop.mapred.Merger$MergeQueue merge INFO: Merging 4 sorted segments
2013-9-30 19:51:02 org.apache.hadoop.mapred.Merger$MergeQueue merge INFO: Down to the last merge-pass, with 4 segments left of total size: 442 bytes
2013-9-30 19:51:02 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate INFO:
2013-9-30 19:51:02 org.apache.hadoop.mapred.Task done INFO: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
2013-9-30 19:51:02 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate INFO:
2013-9-30 19:51:02 org.apache.hadoop.mapred.Task commit INFO: Task attempt_local_0001_r_000000_0 is allowed to commit now
2013-9-30 19:51:02 org.apache.hadoop.mapred.FileOutputCommitter commitTask INFO: Saved output of task 'attempt_local_0001_r_000000_0' to hdfs://192.168.1.210:9000/user/hdfs/o_t_account/result
2013-9-30 19:51:05 org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate INFO: reduce > reduce
2013-9-30 19:51:05 org.apache.hadoop.mapred.Task sendDone INFO: Task 'attempt_local_0001_r_000000_0' done.
2013-9-30 19:51:06 org.apache.hadoop.mapred.JobClient monitorAndPrintJob INFO: map 100% reduce 100%
2013-9-30 19:51:06 org.apache.hadoop.mapred.JobClient monitorAndPrintJob INFO: Job complete: job_local_0001
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Counters: 20
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: File Input Format Counters
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Bytes Read=421
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: File Output Format Counters
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Bytes Written=348
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: FileSystemCounters
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: FILE_BYTES_READ=7377
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: HDFS_BYTES_READ=1535
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: FILE_BYTES_WRITTEN=209510
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: HDFS_BYTES_WRITTEN=348
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Map-Reduce Framework
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Map output materialized bytes=458
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Map input records=11
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Reduce shuffle bytes=0
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Spilled Records=30
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Map output bytes=509
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Total committed heap usage (bytes)=1838546944
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Map input bytes=421
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: SPLIT_RAW_BYTES=452
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Combine input records=22
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Reduce input records=15
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Reduce input groups=13
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Combine output records=15
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Reduce output records=13
2013-9-30 19:51:06 org.apache.hadoop.mapred.Counters log INFO: Map output records=22
```
The wordcount program ran successfully. Check the output with the following commands:
```
~ hadoop fs -ls hdfs://192.168.1.210:9000/user/hdfs/o_t_account/result
Found 2 items
-rw-r--r--   3 Administrator supergroup          0 2013-09-30 19:51 /user/hdfs/o_t_account/result/_SUCCESS
-rw-r--r--   3 Administrator supergroup        348 2013-09-30 19:51 /user/hdfs/o_t_account/result/part-00000

~ hadoop fs -cat hdfs://192.168.1.210:9000/user/hdfs/o_t_account/result/part-00000
1,[email protected],2013-04-22    1
10,[email protected],2013-04-23   1
11,[email protected],2013-04-23   1
17:21:24.0    5
2,[email protected],2013-04-22    1
20:21:39.0    6
3,[email protected],2013-04-22    1
4,[email protected],2013-04-22    1
5,[email protected],2013-04-22    1
6,[email protected],2013-04-22    1
7,[email protected],2013-04-23    1
8,[email protected],2013-04-23    1
9,[email protected],2013-04-23    1
```
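The result looks odd because each input line is a CSV record, while the mapper's StringTokenizer splits only on whitespace, never on commas. A record whose timestamp contains a space therefore yields two "words", which is why fragments like 17:21:24.0 get their own counts. A small demonstration with a made-up record of the same shape as the o_t_account rows (the email address is invented):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Shows how the mapper tokenizes one CSV record: whitespace splitting
// turns "id,email,date time" into exactly two tokens.
public class TokenDemo {

    // Whitespace-split a record exactly as the mapper does.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<String>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            out.add(itr.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        for (String t : tokens("11,user@example.com,2013-04-23 17:21:24.0")) {
            System.out.println(t);
        }
    }
}
```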
With that, we have achieved development on Windows 7: Maven builds the Hadoop dependency environment, we develop the MapReduce program in Eclipse, and then run it as a Java application. The Hadoop application automatically packages our MR program into a jar, submits it to the remote Hadoop environment for execution, and returns the logs to the Eclipse console.
6. Uploading the Template Project to GitHub
You can download this project as a starting point for your own development.

```
~ git clone https://github.com/bsspirit/maven_hadoop_template.git
```

That completes the first step; next we will get into real MapReduce development practice.
Reposted from: http://cache.baiducontent.com/c?m=9d78d513d9d430dc4f9d9f697b17c017184381136384944223c3923884145f563160f4ba5778465092d13b275fa0131aacb22173441e3de7c595dd5dddccc96e6dcf7723706bda1654ce19abcd4d22cb249147adf44ea2fbb261c2f9c5d3a95752ca59077b82ed&p=87759a45d5c717fc57efce3d555f80&newp=c978dd5386cc41ac5cb2c7710f4f98231610db2151d7d21620&user=baidu&fm=sc&query=%CA%B9%D3%C3maven%B6%D4hdfs&qid=8f358a930005d4ec&p1=3