Big Data IMF Legendary Action: Java Maven project (pom.xml configuration), running word count in local mode

1. Download Eclipse
    Go to www.eclipse.org/downloads
    Download the "Eclipse IDE for Java EE Developers" edition

2. Java version 1.8
   Scala 2.10.4

3. Unpack Eclipse IDE for Java

4. Create a new Maven project: File > New > Other > Maven Project

5. Select the archetype maven-archetype-quickstart 1.1

6. Enter
Group Id: com.dt.spark
Artifact Id: SparkApps

7. Change the JRE System Library (J2SE-1.5):
   Build Path > Configure Build Path > Workspace default JRE (1.8.0-65)
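
To confirm that the project really runs on the 1.8 JRE after this change, a small check like the following can be run from Eclipse. This is only a sketch: the class name JreCheck and the helper majorVersion are made up for illustration and are not part of the original project. The system properties it reads are standard JDK properties.

```java
class JreCheck {
    /** Returns the major Java version for a spec string such as "1.5", "1.8" or "11". */
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Releases up to Java 8 report "1.x"; later releases drop the "1." prefix.
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        System.out.println("java.specification.version = " + spec);
        System.out.println("java.home = " + System.getProperty("java.home"));
        if (majorVersion(spec) < 8) {
            System.out.println("This JRE is older than 1.8; check the build path setting.");
        }
    }
}
```

If the reported version is still 1.5, the workspace default JRE has most likely not been applied to the project yet.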

8. Create the package com.dt.spark.SparkApps.cores

9. Create the class WordCount

11. Configure pom.xml; Maven will then download the dependency packages

12. If the run fails for lack of memory
In Eclipse, go to Window > Preferences > Java > Installed JREs, click the Edit button on the right, and in the "Default VM Arguments" field of the edit dialog enter:

-Xms128m -Xmx512m
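
One way to verify that the VM arguments took effect is to print the heap limits from inside the JVM. The following is a minimal sketch, assuming a hypothetical class HeapCheck that is not part of the original project; Runtime.maxMemory() and Runtime.totalMemory() are standard JDK calls.

```java
class HeapCheck {
    /** Converts a byte count to whole megabytes. */
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the upper heap limit the JVM will try to use (set via -Xmx).
        System.out.println("max heap (MB): " + toMegabytes(rt.maxMemory()));
        // totalMemory() is the heap currently reserved by the JVM.
        System.out.println("current heap (MB): " + toMegabytes(rt.totalMemory()));
    }
}
```

With -Xmx512m the reported maximum should be close to 512 MB; depending on the garbage collector it may be somewhat lower than the configured figure.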

13. In WordCount, read the input file:
 JavaRDD<String> lines = sc.textFile("G://IMFBigDataSpark2016//Bigdata_Software//spark-1.6.0-bin-hadoop2.6//spark-1.6.0-bin-hadoop2.6//spark-1.6.0-bin-hadoop2.6//README.md");
 
The run succeeds:

16/01/16 20:20:19 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 129 ms
package : 1
For : 2
Programs : 1
processing. : 1
Because : 1
The : 1
cluster. : 1
its : 1
[run : 1
APIs : 1
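
The output contains tokens such as "processing." and "[run" with punctuation still attached, which is consistent with each line being split on single spaces only. The post does not include the WordCount source, so the following is a plain-Java sketch of the same counting logic without Spark. The class name, method name and sample lines are assumptions made for illustration. In a Spark version, the inner loop would roughly correspond to flatMap, and the counting to mapToPair followed by reduceByKey.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LocalWordCountSketch {
    /** Splits every line on single spaces and counts how often each token occurs. */
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                // Blank lines and repeated spaces yield empty tokens; skipping them
                // is a choice made in this sketch, not something the post specifies.
                if (word.isEmpty()) {
                    continue;
                }
                Integer previous = counts.get(word);
                counts.put(word, previous == null ? 1 : previous + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not the real README.md.
        List<String> lines = Arrays.asList("For processing. For [run", "", "The cluster.");
        for (Map.Entry<String, Integer> entry : countWords(lines).entrySet()) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }
}
```

The sketch deliberately avoids lambdas and the diamond operator so that it still compiles with the 1.6 source level set for maven-compiler-plugin in the pom.xml.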

14. pom.xml configuration

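<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.dt.spark</groupId>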
  <artifactId>SparkApps</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>SparkApps</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-graphx_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
   
  </dependencies>
 
   <build>
    <sourceDirectory>src/main/java</sourceDirectory>
    <testSourceDirectory>src/test/java</testSourceDirectory>

    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
          <archive>
            <manifest>
              <mainClass></mainClass>
            </manifest>
          </archive>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>

      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>1.2.1</version>
        <executions>
          <execution>
            <goals>
              <goal>exec</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <executable>java</executable>
          <includeProjectDependencies>true</includeProjectDependencies>
          <includePluginDependencies>false</includePluginDependencies>
          <classpathScope>compile</classpathScope>
          <mainClass>com.dt.spark.App</mainClass>
        </configuration>
      </plugin>

      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.6</source>
          <target>1.6</target>
        </configuration>
      </plugin>

    </plugins>
  </build>
</project>