Spark SQL job fails with a GC error

阿新 • Published: 2018-11-30

java.lang.OutOfMemoryError: GC overhead limit exceeded
	at org.apache.spark.unsafe.types.UTF8String.fromAddress(UTF8String.java:102)
	at org.apache.spark.sql.catalyst.expressions.UnsafeRow.getUTF8String(UnsafeRow.java:419)
	at org.apache.spark.sql.catalyst.expressions.JoinedRow.getUTF8String(JoinedRow.scala:102)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate.eval(Unknown Source)
	at org.apache.spark.sql.execution.joins.SortMergeJoinExec$$anonfun$doExecute$1$$anonfun$1$$anonfun$apply$1.apply(SortMergeJoinExec.scala:114)
	at org.apache.spark.sql.execution.joins.SortMergeJoinExec$$anonfun$doExecute$1$$anonfun$1$$anonfun$apply$1.apply(SortMergeJoinExec.scala:114)
	at org.apache.spark.sql.execution.joins.OneSideOuterIterator.advanceBufferUntilBoundConditionSatisfied(SortMergeJoinExec.scala:874)
	at org.apache.spark.sql.execution.joins.OneSideOuterIterator.advanceStream(SortMergeJoinExec.scala:855)
	at org.apache.spark.sql.execution.joins.OneSideOuterIterator.advanceNext(SortMergeJoinExec.scala:881)
	at org.apache.spark.sql.execution.RowIteratorToScala.hasNext(RowIterator.scala:68)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at org.apache.spark.sql.hive.SparkHiveWriterContainer.writeToFile(hiveWriterContainers.scala:184)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:210)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:210)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:99)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
18/11/26 17:24:37 INFO executor.Executor: Not reporting error to driver during JVM shutdown.
18/11/26 17:24:37 ERROR util.SparkUncaughtExceptionHandler: [Container in shutdown] Uncaught exception in thread Thread[Executor task launch worker for task 451,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
	at org.apache.spark.unsafe.types.UTF8String.fromAddress(UTF8String.java:102)
	at org.apache.spark.sql.catalyst.expressions.UnsafeRow.getUTF8String(UnsafeRow.java:419)
	at org.apache.spark.sql.catalyst.expressions.JoinedRow.getUTF8String(JoinedRow.scala:102)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate.eval(Unknown Source)
	at org.apache.spark.sql.execution.joins.SortMergeJoinExec$$anonfun$doExecute$1$$anonfun$1$$anonfun$apply$1.apply(SortMergeJoinExec.scala:114)
	at org.apache.spark.sql.execution.joins.SortMergeJoinExec$$anonfun$doExecute$1$$anonfun$1$$anonfun$apply$1.apply(SortMergeJoinExec.scala:114)
	at org.apache.spark.sql.execution.joins.OneSideOuterIterator.advanceBufferUntilBoundConditionSatisfied(SortMergeJoinExec.scala:874)
	at org.apache.spark.sql.execution.joins.OneSideOuterIterator.advanceStream(SortMergeJoinExec.scala:855)
	at org.apache.spark.sql.execution.joins.OneSideOuterIterator.advanceNext(SortMergeJoinExec.scala:881)
	at org.apache.spark.sql.execution.RowIteratorToScala.hasNext(RowIterator.scala:68)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at org.apache.spark.sql.hive.SparkHiveWriterContainer.writeToFile(hiveWriterContainers.scala:184)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:210)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:210)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:99)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

Cluster nodes: 20
Available memory: 1024G
Cores per node: 48
The job uses Spark SQL to read data and write it into a Hive table.

spark2-submit \
--class com.lhx.dac.test \
--master yarn \
dac-test.jar

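Run with the default settings, the job used 800G of cluster memory and failed with an OOM error.
Check the Hive configuration in CM (Cloudera Manager):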

--conf spark.yarn.driver.memoryOverhead=2048m \
--conf spark.yarn.executor.memoryOverhead=2048m \
--conf spark.dynamicAllocation.enabled=false \

Check the memory-to-core ratio: 3:1
Raise the memory settings to relieve GC pressure:
increase spark.executor.memory (default 512M, adjusted to 2G).
The Spark submit command after the changes:

spark2-submit \
--class com.lhx.dac.test \
--master yarn \
--deploy-mode cluster \
--driver-memory 6g \
--executor-memory 6g \
--executor-cores 2 \
--conf spark.yarn.driver.memoryOverhead=2048m \
--conf spark.yarn.executor.memoryOverhead=2048m \
dac-test.jar

With the cluster using a little over 900G of memory, the program ran successfully.
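
As a rough check on the settings in the submit command above, the following Python sketch (not part of the original post) works out the memory each executor asks YARN for and the heap available per core. It assumes the container request is approximately executor memory plus memoryOverhead and ignores any rounding done by the YARN scheduler. The helper names `to_mb` and `executor_footprint` are hypothetical, introduced only for illustration.

```python
# Rough sizing sketch for the spark2-submit settings above.
# Assumption: the YARN container request per executor is roughly
# executor memory + memoryOverhead (scheduler rounding is ignored).


def to_mb(size: str) -> int:
    """Convert a size such as '6g' or '2048m' to megabytes."""
    value, unit = int(size[:-1]), size[-1].lower()
    if unit == "g":
        return value * 1024
    if unit == "m":
        return value
    raise ValueError(f"unsupported unit in {size!r}")


def executor_footprint(executor_memory: str, memory_overhead: str,
                       executor_cores: int) -> dict:
    """Return the container size in MB and the heap in GB per core."""
    heap_mb = to_mb(executor_memory)
    overhead_mb = to_mb(memory_overhead)
    return {
        "container_mb": heap_mb + overhead_mb,
        "heap_gb_per_core": heap_mb / 1024 / executor_cores,
    }


# Values taken from the modified command:
# --executor-memory 6g, memoryOverhead=2048m, --executor-cores 2
footprint = executor_footprint("6g", "2048m", 2)
print(footprint)
```

With these values each executor requests about 8192 MB in total and has 3 GB of heap per core, which lines up with the 3:1 memory-to-core ratio noted earlier if that ratio is read as GB per core (an assumption on my part).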

Summary: overall memory allocation is dynamic, whereas heap and stack memory is allocated statically.
