Spark (Java) Local Development Environment on Windows
Install IntelliJ IDEA
Choose the Community edition and install it.
After installation, launch it; I pick a UI theme here.
Keep the default plugins.
Install the Scala plugin.
Configure the Hadoop environment variables
Download winutils.exe
I use the build for hadoop 2.7.1 here.
Place it on the D drive as D:\hadoop-2.7.1\bin\winutils.exe.
Configure the Windows environment variables
User variables:
Add HADOOP_HOME=D:\hadoop-2.7.1
System variables:
Append %HADOOP_HOME%\bin to Path
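If you'd rather not edit the Windows environment (IDEA only picks up changes after a restart anyway), the same effect can be had in code. A minimal sketch, assuming the D:\hadoop-2.7.1 layout above; put it at the top of main(), before any Spark call:

// Hadoop's winutils lookup reads the hadoop.home.dir system property first
// (falling back to the HADOOP_HOME environment variable), so setting it here
// avoids touching the Windows environment. Path matches the layout above.
System.setProperty("hadoop.home.dir", "D:\\hadoop-2.7.1");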
Create a new Maven project with the following pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.spark</groupId>
    <artifactId>sparktest</artifactId>
    <version>2.2.0</version>
    <packaging>jar</packaging>

    <name>sparktest</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <spark.version>2.2.0</spark.version>
        <hadoop.version>2.7.1</hadoop.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql-kafka-0-10_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>
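Note that every Spark artifact carries the _2.11 suffix: all of them must target the same Scala binary version, so don't mix _2.10 and _2.11 artifacts in one project. To sanity-check that the dependencies resolve and a local session can start, a quick smoke test helps. This is my own minimal sketch, not part of the original setup (the class name SparkSmokeTest is arbitrary); it uses the JUnit 3 TestCase style matching the junit 3.8.1 dependency above, and goes under src/test/java/com/spark:

package com.spark;

import junit.framework.TestCase;
import org.apache.spark.sql.SparkSession;

public class SparkSmokeTest extends TestCase {
    public void testLocalSessionStarts() {
        // Single-threaded local session; no cluster required.
        SparkSession spark = SparkSession.builder()
                .appName("smoke-test")
                .master("local[1]")
                .getOrCreate();
        try {
            // range(5) builds a tiny in-memory Dataset; count() forces a job to run.
            assertEquals(5L, spark.range(5).count());
        } finally {
            spark.stop();
        }
    }
}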
Test code
package com.spark;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class App {
    public static void main(String[] args) {
        // Local session with 3 worker threads; no cluster needed.
        SparkSession spark = SparkSession.builder()
                .appName("spark-test")
                .master("local[3]")
                .getOrCreate();

        // Read line-delimited JSON from the working directory.
        Dataset<Row> result = spark.read().json("employees.json");
        result.show();         // print the rows as a table
        result.printSchema();  // print the inferred schema

        spark.stop();
    }
}
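The program expects an employees.json in the working directory (the project root when run from IDEA). If you don't have one handy, the line-delimited JSON file that ships with the Spark distribution under examples/src/main/resources/employees.json works; it looks like this:

{"name":"Michael", "salary":3000}
{"name":"Andy", "salary":4500}
{"name":"Justin", "salary":3500}
{"name":"Berta", "salary":4000}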
Run results
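With the sample file above, show() and printSchema() should print something close to:

+-------+------+
|   name|salary|
+-------+------+
|Michael|  3000|
|   Andy|  4500|
| Justin|  3500|
|  Berta|  4000|
+-------+------+

root
 |-- name: string (nullable = true)
 |-- salary: long (nullable = true)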
Done!