Spark Learning 14: Using Maven to quickly switch the local Spark version for debugging
By 阿新 • Published: 2019-01-08

1. Explanation

Sometimes a cluster already has one Spark version installed and you want to try another without reinstalling anything. The simplest option is to switch the dependency version locally, using Maven inside IDEA. This post switches from Spark 1.5.2 to Spark 1.6.1.
2. Code

pom.xml for Spark 1.5.2:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.apache.spark.version</groupId>
    <artifactId>sparkVersion</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <scala.version>2.10.4</scala.version>
        <spark.version>1.5.2</spark.version>
        <scala.version.prefix>2.10</scala.version.prefix>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.version.prefix}</artifactId>
            <version>${spark.version}</version>
            <!-- Uncomment when submitting to a cluster that already provides Spark: -->
            <!--<scope>provided</scope>-->
        </dependency>
    </dependencies>
</project>
pom.xml for Spark 1.6.1 (identical except for the spark.version property):
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.apache.spark.version</groupId>
    <artifactId>sparkVersion</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <scala.version>2.10.4</scala.version>
        <spark.version>1.6.1</spark.version>
        <scala.version.prefix>2.10</scala.version.prefix>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.version.prefix}</artifactId>
            <version>${spark.version}</version>
            <!-- Uncomment when submitting to a cluster that already provides Spark: -->
            <!--<scope>provided</scope>-->
        </dependency>
    </dependencies>
</project>
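Since the two POMs differ only in the spark.version property, a single pom.xml with Maven profiles can replace the two files. This is a sketch of that alternative, not part of the original setup; the profile ids (spark-1.5, spark-1.6) are names chosen here for illustration:

```xml
<!-- Add inside <project> of one shared pom.xml. -->
<profiles>
    <profile>
        <id>spark-1.5</id>
        <properties>
            <spark.version>1.5.2</spark.version>
        </properties>
    </profile>
    <profile>
        <!-- Default profile: builds against 1.6.1 unless overridden. -->
        <id>spark-1.6</id>
        <activation>
            <activeByDefault>true</activeByDefault>
        </activation>
        <properties>
            <spark.version>1.6.1</spark.version>
        </properties>
    </profile>
</profiles>
```

Then switch versions from the command line with `mvn -Pspark-1.5 compile`, or pick the profile in IDEA's Maven tool window.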
Test code:
import org.apache.spark.{SparkConf, SparkContext}

/**
  * Created by xubo on 2016/5/23.
  */
object sparkTest {
  def main(args: Array[String]) {
    // val conf = new SparkConf().setAppName(this.getClass().getSimpleName().filter(!_.equals('$'))).setMaster("local[4]")
    val conf = new SparkConf().setAppName("test").setMaster("local[4]")
    val sc = new SparkContext(conf)
    val rdd = sc.parallelize(Array((1, 2), (3, 1), (3, 3)))
    rdd.foreach(println)
    println(rdd.partitions.length)
    // since 1.6.0
    println(rdd.getNumPartitions)
    sc.stop()
  }
}
3. Results

Spark 1.5.2: does not compile. RDD.getNumPartitions was only introduced in Spark 1.6.0, so with the 1.5.2 dependency the line

// since 1.6.0
println(rdd.getNumPartitions)

fails to resolve.
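Code that must compile against both versions can stick to rdd.partitions.length, which exists in 1.5.x and 1.6.x and returns the same value. A minimal sketch of that portable variant (the object name SparkPortableTest is hypothetical; it assumes spark-core is on the classpath):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Version-portable variant: compiles on both Spark 1.5.x and 1.6.x,
// because RDD.partitions is available in both (getNumPartitions is 1.6.0+ only).
object SparkPortableTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("portableTest").setMaster("local[4]")
    val sc = new SparkContext(conf)
    val rdd = sc.parallelize(Array((1, 2), (3, 1), (3, 3)))
    // partitions.length works everywhere; getNumPartitions would break 1.5.x builds.
    println(rdd.partitions.length)
    sc.stop()
  }
}
```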
Spark 1.6.1: runs successfully. Full log:
com.intellij.rt.execution.application.AppMain sparkTest
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/05/23 11:28:20 INFO SparkContext: Running Spark version 1.6.1
16/05/23 11:28:36 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/05/23 11:28:43 INFO SecurityManager: Changing view acls to: xubo
16/05/23 11:28:43 INFO SecurityManager: Changing modify acls to: xubo
16/05/23 11:28:43 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(xubo); users with modify permissions: Set(xubo)
16/05/23 11:29:48 INFO Utils: Successfully started service 'sparkDriver' on port 55625.
16/05/23 11:30:20 INFO Slf4jLogger: Slf4jLogger started
16/05/23 11:30:23 INFO Remoting: Starting remoting
16/05/23 11:30:25 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@211.86.159.133:55638]
16/05/23 11:30:25 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 55638.
16/05/23 11:30:27 INFO SparkEnv: Registering MapOutputTracker
16/05/23 11:30:31 INFO SparkEnv: Registering BlockManagerMaster
16/05/23 11:30:32 INFO DiskBlockManager: Created local directory at C:\Users\xubo\AppData\Local\Temp\blockmgr-5e5515f9-540a-4b6c-98f4-3a3775f5e72f
16/05/23 11:30:34 INFO MemoryStore: MemoryStore started with capacity 789.7 MB
16/05/23 11:30:38 INFO SparkEnv: Registering OutputCommitCoordinator
16/05/23 11:30:55 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/05/23 11:30:55 INFO SparkUI: Started SparkUI at http://211.86.159.133:4040
16/05/23 11:31:03 INFO Executor: Starting executor ID driver on host localhost
16/05/23 11:31:04 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55674.
16/05/23 11:31:04 INFO NettyBlockTransferService: Server created on 55674
16/05/23 11:31:05 INFO BlockManagerMaster: Trying to register BlockManager
16/05/23 11:31:05 INFO BlockManagerMasterEndpoint: Registering block manager localhost:55674 with 789.7 MB RAM, BlockManagerId(driver, localhost, 55674)
16/05/23 11:31:05 INFO BlockManagerMaster: Registered BlockManager
16/05/23 11:31:30 INFO SparkContext: Starting job: foreach at sparkTest.scala:12
16/05/23 11:31:32 INFO DAGScheduler: Got job 0 (foreach at sparkTest.scala:12) with 4 output partitions
16/05/23 11:31:32 INFO DAGScheduler: Final stage: ResultStage 0 (foreach at sparkTest.scala:12)
16/05/23 11:31:32 INFO DAGScheduler: Parents of final stage: List()
16/05/23 11:31:33 INFO DAGScheduler: Missing parents: List()
16/05/23 11:31:34 INFO DAGScheduler: Submitting ResultStage 0 (ParallelCollectionRDD[0] at parallelize at sparkTest.scala:11), which has no missing parents
16/05/23 11:31:56 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1112.0 B, free 1112.0 B)
16/05/23 11:31:56 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 783.0 B, free 1895.0 B)
16/05/23 11:31:56 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:55674 (size: 783.0 B, free: 789.7 MB)
16/05/23 11:31:56 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
16/05/23 11:31:57 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 0 (ParallelCollectionRDD[0] at parallelize at sparkTest.scala:11)
16/05/23 11:31:57 INFO TaskSchedulerImpl: Adding task set 0.0 with 4 tasks
16/05/23 11:31:59 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 2076 bytes)
16/05/23 11:31:59 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, partition 1,PROCESS_LOCAL, 2194 bytes)
16/05/23 11:31:59 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, partition 2,PROCESS_LOCAL, 2194 bytes)
16/05/23 11:31:59 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, partition 3,PROCESS_LOCAL, 2194 bytes)
16/05/23 11:31:59 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
16/05/23 11:31:59 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
16/05/23 11:31:59 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
16/05/23 11:31:59 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
(1,2)
(3,3)
(3,1)
16/05/23 11:32:01 INFO Executor: Finished task 3.0 in stage 0.0 (TID 3). 915 bytes result sent to driver
16/05/23 11:32:01 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 915 bytes result sent to driver
16/05/23 11:32:01 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 915 bytes result sent to driver
16/05/23 11:32:01 INFO Executor: Finished task 2.0 in stage 0.0 (TID 2). 915 bytes result sent to driver
16/05/23 11:32:01 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 1828 ms on localhost (1/4)
16/05/23 11:32:01 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 3055 ms on localhost (2/4)
16/05/23 11:32:01 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 1938 ms on localhost (3/4)
16/05/23 11:32:01 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 1936 ms on localhost (4/4)
16/05/23 11:32:01 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/05/23 11:32:01 INFO DAGScheduler: ResultStage 0 (foreach at sparkTest.scala:12) finished in 3.961 s
16/05/23 11:32:02 INFO DAGScheduler: Job 0 finished: foreach at sparkTest.scala:12, took 31.532340 s
4
4
16/05/23 11:32:02 WARN QueuedThreadPool: 6 threads could not be stopped
16/05/23 11:32:02 INFO SparkUI: Stopped Spark web UI at http://211.86.159.133:4040
16/05/23 11:32:04 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/05/23 11:32:04 INFO MemoryStore: MemoryStore cleared
16/05/23 11:32:04 INFO BlockManager: BlockManager stopped
16/05/23 11:32:04 INFO BlockManagerMaster: BlockManagerMaster stopped
16/05/23 11:32:04 INFO SparkContext: Successfully stopped SparkContext
16/05/23 11:32:04 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/05/23 11:32:04 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/05/23 11:32:04 INFO ShutdownHookManager: Shutdown hook called
16/05/23 11:32:04 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/05/23 11:32:04 INFO ShutdownHookManager: Deleting directory C:\Users\xubo\AppData\Local\Temp\spark-69d1558f-c3a5-4820-863b-9b7b8986b668
Process finished with exit code 0