Running WordCount by calling Spark's API from Java
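The log below comes from a run in local mode against Spark 1.6.2. The original JavaWordCount.java is not reproduced in this post, so the following is a minimal sketch consistent with the source line numbers the log reports (textFile at line 21, mapToPair at lines 30 and 45, saveAsTextFile at line 59). The input and output paths (D:/china.txt, D:/1.txt) are taken from the log; the sort-by-count step (the second mapToPair plus sortByKey, which would account for ShuffleMapStage 1) is an assumption, not confirmed source.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;

import scala.Tuple2;

public class JavaWordCount {
    public static void main(String[] args) {
        // Local mode, matching "Starting executor ID driver on host localhost" in the log.
        SparkConf conf = new SparkConf().setAppName("JavaWordCount").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // textFile at JavaWordCount.java:21 in the log; input split is file:/D:/china.txt.
        JavaRDD<String> lines = sc.textFile("D:/china.txt");

        // Split each line into words (delimiter is an assumption).
        JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
            @Override
            public Iterable<String> call(String line) {
                return Arrays.asList(line.split(" "));
            }
        });

        // First mapToPair (JavaWordCount.java:30, registered as RDD 3): emit (word, 1),
        // then reduceByKey triggers the shuffle behind ShuffleMapStage 0.
        JavaPairRDD<String, Integer> counts = words
                .mapToPair(new PairFunction<String, String, Integer>() {
                    @Override
                    public Tuple2<String, Integer> call(String word) {
                        return new Tuple2<String, Integer>(word, 1);
                    }
                })
                .reduceByKey(new Function2<Integer, Integer, Integer>() {
                    @Override
                    public Integer call(Integer a, Integer b) {
                        return a + b;
                    }
                });

        // Second mapToPair (JavaWordCount.java:45, registered as RDD 5): swap to
        // (count, word) so sortByKey can order by frequency -- ShuffleMapStage 1.
        // This sorting step is an assumption inferred from the two shuffle stages.
        JavaPairRDD<String, Integer> sorted = counts
                .mapToPair(new PairFunction<Tuple2<String, Integer>, Integer, String>() {
                    @Override
                    public Tuple2<Integer, String> call(Tuple2<String, Integer> t) {
                        return new Tuple2<Integer, String>(t._2(), t._1());
                    }
                })
                .sortByKey(false)
                .mapToPair(new PairFunction<Tuple2<Integer, String>, String, Integer>() {
                    @Override
                    public Tuple2<String, Integer> call(Tuple2<Integer, String> t) {
                        return new Tuple2<String, Integer>(t._2(), t._1());
                    }
                });

        // saveAsTextFile at JavaWordCount.java:59; output directory is file:/D:/1.txt.
        sorted.saveAsTextFile("D:/1.txt");

        sc.stop();
    }
}
```

With a single input partition this produces exactly the three stages the log shows: ShuffleMapStage 0 for the reduceByKey, ShuffleMapStage 1 for the sortByKey, and ResultStage 2 (MapPartitionsRDD[8]) for the save.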
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/12/04 21:02:04 INFO SparkContext: Running Spark version 1.6.2
16/12/04 21:02:06 INFO SecurityManager: Changing view acls to: Administrator
16/12/04 21:02:06 INFO SecurityManager: Changing modify acls to: Administrator
16/12/04 21:02:06 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Administrator); users with modify permissions: Set(Administrator)
16/12/04 21:02:07 INFO Utils: Successfully started service 'sparkDriver' on port 55299.
16/12/04 21:02:07 INFO Slf4jLogger: Slf4jLogger started
16/12/04 21:02:07 INFO Remoting: Starting remoting
16/12/04 21:02:07 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://
16/12/04 21:02:07 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 55312.
16/12/04 21:02:07 INFO SparkEnv: Registering MapOutputTracker
16/12/04 21:02:07 INFO SparkEnv: Registering BlockManagerMaster
16/12/04 21:02:07 INFO DiskBlockManager: Created local directory at C:\Users\Administrator.WIN-20160809ARI\AppData\Local\Temp\blockmgr-335551fd-4327-4871-b594-28c901093e15
16/12/04 21:02:08 INFO MemoryStore: MemoryStore started with capacity 1807.0 MB
16/12/04 21:02:08 INFO SparkEnv: Registering OutputCommitCoordinator
16/12/04 21:02:08 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/12/04 21:02:08 INFO SparkUI: Started SparkUI at http://192.168.164.1:4040
16/12/04 21:02:08 INFO Executor: Starting executor ID driver on host localhost
16/12/04 21:02:08 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55319.
16/12/04 21:02:08 INFO NettyBlockTransferService: Server created on 55319
16/12/04 21:02:08 INFO BlockManagerMaster: Trying to register BlockManager
16/12/04 21:02:08 INFO BlockManagerMasterEndpoint: Registering block manager localhost:55319 with 1807.0 MB RAM, BlockManagerId(driver, localhost, 55319)
16/12/04 21:02:08 INFO BlockManagerMaster: Registered BlockManager
16/12/04 21:02:09 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 156.3 KB, free 156.3 KB)
16/12/04 21:02:09 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 14.1 KB, free 170.3 KB)
16/12/04 21:02:09 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:55319 (size: 14.1 KB, free: 1807.0 MB)
16/12/04 21:02:09 INFO SparkContext: Created broadcast 0 from textFile at JavaWordCount.java:21
16/12/04 21:02:11 WARN : Your hostname, WIN-20160809ARI resolves to a loopback/non-reachable address: fe80:0:0:0:0:5efe:c0a8:c789%20, but we couldn't find any external IP address!
16/12/04 21:02:11 INFO FileInputFormat: Total input paths to process : 1
16/12/04 21:02:12 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
16/12/04 21:02:12 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
16/12/04 21:02:12 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
16/12/04 21:02:12 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
16/12/04 21:02:12 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
16/12/04 21:02:12 INFO SparkContext: Starting job: saveAsTextFile at JavaWordCount.java:59
16/12/04 21:02:12 INFO DAGScheduler: Registering RDD 3 (mapToPair at JavaWordCount.java:30)
16/12/04 21:02:12 INFO DAGScheduler: Registering RDD 5 (mapToPair at JavaWordCount.java:45)
16/12/04 21:02:12 INFO DAGScheduler: Got job 0 (saveAsTextFile at JavaWordCount.java:59) with 1 output partitions
16/12/04 21:02:12 INFO DAGScheduler: Final stage: ResultStage 2 (saveAsTextFile at JavaWordCount.java:59)
16/12/04 21:02:12 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 1)
16/12/04 21:02:12 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 1)
16/12/04 21:02:12 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[3] at mapToPair at JavaWordCount.java:30), which has no missing parents
16/12/04 21:02:12 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.7 KB, free 175.1 KB)
16/12/04 21:02:12 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.6 KB, free 177.7 KB)
16/12/04 21:02:12 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:55319 (size: 2.6 KB, free: 1807.0 MB)
16/12/04 21:02:12 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
16/12/04 21:02:12 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[3] at mapToPair at JavaWordCount.java:30)
16/12/04 21:02:12 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
16/12/04 21:02:12 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 2109 bytes)
16/12/04 21:02:12 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
16/12/04 21:02:12 INFO HadoopRDD: Input split: file:/D:/china.txt:0+19
16/12/04 21:02:12 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 2253 bytes result sent to driver
16/12/04 21:02:12 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 261 ms on localhost (1/1)
16/12/04 21:02:12 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/12/04 21:02:12 INFO DAGScheduler: ShuffleMapStage 0 (mapToPair at JavaWordCount.java:30) finished in 0.287 s
16/12/04 21:02:12 INFO DAGScheduler: looking for newly runnable stages
16/12/04 21:02:12 INFO DAGScheduler: running: Set()
16/12/04 21:02:12 INFO DAGScheduler: waiting: Set(ShuffleMapStage 1, ResultStage 2)
16/12/04 21:02:12 INFO DAGScheduler: failed: Set()
16/12/04 21:02:12 INFO DAGScheduler: Submitting ShuffleMapStage 1 (MapPartitionsRDD[5] at mapToPair at JavaWordCount.java:45), which has no missing parents
16/12/04 21:02:12 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 4.0 KB, free 181.7 KB)
16/12/04 21:02:12 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 2.3 KB, free 184.0 KB)
16/12/04 21:02:12 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:55319 (size: 2.3 KB, free: 1807.0 MB)
16/12/04 21:02:12 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
16/12/04 21:02:12 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 1 (MapPartitionsRDD[5] at mapToPair at JavaWordCount.java:45)
16/12/04 21:02:12 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
16/12/04 21:02:12 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, partition 0,NODE_LOCAL, 1883 bytes)
16/12/04 21:02:12 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
16/12/04 21:02:12 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
16/12/04 21:02:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 8 ms
16/12/04 21:02:12 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 1374 bytes result sent to driver
16/12/04 21:02:12 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 96 ms on localhost (1/1)
16/12/04 21:02:12 INFO DAGScheduler: ShuffleMapStage 1 (mapToPair at JavaWordCount.java:45) finished in 0.096 s
16/12/04 21:02:12 INFO DAGScheduler: looking for newly runnable stages
16/12/04 21:02:12 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
16/12/04 21:02:12 INFO DAGScheduler: running: Set()
16/12/04 21:02:12 INFO DAGScheduler: waiting: Set(ResultStage 2)
16/12/04 21:02:12 INFO DAGScheduler: failed: Set()
16/12/04 21:02:12 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[8] at saveAsTextFile at JavaWordCount.java:59), which has no missing parents
16/12/04 21:02:12 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 66.1 KB, free 250.1 KB)
16/12/04 21:02:12 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 23.1 KB, free 273.2 KB)
16/12/04 21:02:12 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on localhost:55319 (size: 23.1 KB, free: 1807.0 MB)
16/12/04 21:02:12 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1006
16/12/04 21:02:12 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (MapPartitionsRDD[8] at saveAsTextFile at JavaWordCount.java:59)
16/12/04 21:02:12 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
16/12/04 21:02:12 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, localhost, partition 0,NODE_LOCAL, 1894 bytes)
16/12/04 21:02:12 INFO Executor: Running task 0.0 in stage 2.0 (TID 2)
16/12/04 21:02:12 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
16/12/04 21:02:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
16/12/04 21:02:13 INFO FileOutputCommitter: Saved output of task 'attempt_201612042102_0002_m_000000_2' to file:/D:/1.txt/_temporary/0/task_201612042102_0002_m_000000
16/12/04 21:02:13 INFO SparkHadoopMapRedUtil: attempt_201612042102_0002_m_000000_2: Committed
16/12/04 21:02:13 INFO Executor: Finished task 0.0 in stage 2.0 (TID 2). 2080 bytes result sent to driver
16/12/04 21:02:13 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 181 ms on localhost (1/1)
16/12/04 21:02:13 INFO DAGScheduler: ResultStage 2 (saveAsTextFile at JavaWordCount.java:59) finished in 0.182 s
16/12/04 21:02:13 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
16/12/04 21:02:13 INFO DAGScheduler: Job 0 finished: saveAsTextFile at JavaWordCount.java:59, took 0.818282 s
16/12/04 21:02:13 INFO SparkUI: Stopped Spark web UI at http://192.168.164.1:4040
16/12/04 21:02:13 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/12/04 21:02:13 INFO MemoryStore: MemoryStore cleared
16/12/04 21:02:13 INFO BlockManager: BlockManager stopped
16/12/04 21:02:13 INFO BlockManagerMaster: BlockManagerMaster stopped
16/12/04 21:02:13 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/12/04 21:02:13 INFO SparkContext: Successfully stopped SparkContext
16/12/04 21:02:13 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/12/04 21:02:13 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/12/04 21:02:13 INFO ShutdownHookManager: Shutdown hook called
16/12/04 21:02:13 INFO ShutdownHookManager: Deleting directory C:\Users\Administrator.WIN-20160809ARI\AppData\Local\Temp\spark-5409914b-5d2d-40cf-95ce-e1383c67fbcd
Process finished with exit code 0
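Note that saveAsTextFile treats D:/1.txt as a directory rather than a file: as the FileOutputCommitter line above shows, the task first writes under D:/1.txt/_temporary, and the committed result lands in D:/1.txt/part-00000 (alongside a _SUCCESS marker), with each pair rendered as (word,count).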