
Spark Development: Custom Configuration of the Spark SQL Engine

Extending Spark Catalyst

org.apache.spark.sql
	SparkSession.Builder
	  Inject extensions into the [[SparkSession]].
	  This allows a user to add Analyzer rules, Optimizer rules, Planning Strategies or a customized parser.
	  @since 2.2.0
		def withExtensions(f: SparkSessionExtensions => Unit): Builder = {
		  f(extensions)
		  this
		}

org.apache.spark.sql
	SparkSessionExtensions
	  This currently provides the following extension points:
	  - Analyzer Rules.
	  - Check Analysis Rules.
	  - Optimizer Rules.
	  - Planning Strategies.
	  - Customized Parser.
	  - (External) Catalog listeners.

	  The extensions can be used by calling withExtensions on the [[SparkSession.Builder]], for example (see the sketch below):
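As an illustration, here is a minimal sketch of injecting an Optimizer rule through withExtensions. The rule name and its no-op body are hypothetical; injectOptimizerRule itself is part of the SparkSessionExtensions API.

	import org.apache.spark.sql.SparkSession
	import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
	import org.apache.spark.sql.catalyst.rules.Rule

	// Hypothetical no-op Optimizer rule, for illustration only.
	object MyNoopRule extends Rule[LogicalPlan] {
	  override def apply(plan: LogicalPlan): LogicalPlan = plan
	}

	val spark = SparkSession.builder()
	  .master("local[*]")
	  .withExtensions { extensions =>
	    // The builder receives the owning SparkSession and returns the rule.
	    extensions.injectOptimizerRule { session => MyNoopRule }
	  }
	  .getOrCreate()

A rule injected this way runs alongside Catalyst's built-in optimizer rules for every query planned by that session.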

Usage:

  01. Via the Spark configuration parameter spark.sql.extensions.
	   The user can implement the customization as a class, e.g. MyExtensions, and pass its fully qualified class name as the configuration value.
	   For example:

	class MyExtensions extends (SparkSessionExtensions => Unit) {
	  override def apply(extensions: SparkSessionExtensions): Unit = {
	    // inject Analyzer/Optimizer rules, strategies, or a parser here
	  }
	}

	SparkSession.builder()
	  .config("spark.sql.extensions", classOf[MyExtensions].getCanonicalName)
	  .getOrCreate()
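A concrete (hypothetical) version of MyExtensions, wiring in the no-op rule from the sketch above plus a check-analysis rule, to show that one class can register several extension points:

	import org.apache.spark.sql.SparkSessionExtensions
	import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

	class MyExtensions extends (SparkSessionExtensions => Unit) {
	  override def apply(extensions: SparkSessionExtensions): Unit = {
	    // Hypothetical: register MyNoopRule (defined earlier) with the optimizer.
	    extensions.injectOptimizerRule { session => MyNoopRule }
	    // A check rule runs after analysis and may throw to reject a plan;
	    // this placeholder accepts everything.
	    extensions.injectCheckRule { session => (plan: LogicalPlan) => () }
	  }
	}

Note that Spark instantiates this class reflectively from the spark.sql.extensions value, so it needs a public no-argument constructor.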
  02. Alternatively, withExtensions accepts a function that takes a SparkSessionExtensions as its argument:

	// Alias defined here for readability.
	type ExtensionsBuilder = SparkSessionExtensions => Unit

	val bui: ExtensionsBuilder = { extensions => /* inject rules here */ }

	SparkSession.builder()
	  .withExtensions(bui)
	  .getOrCreate()
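A concrete (hypothetical) builder that contributes a Planning Strategy, another of the extension points listed above. A strategy that returns Nil declines to plan the given operator and defers to Spark's built-in strategies:

	import org.apache.spark.sql.{SparkSession, SparkSessionExtensions, Strategy}
	import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
	import org.apache.spark.sql.execution.SparkPlan

	// Hypothetical strategy that plans nothing itself.
	object MyStrategy extends Strategy {
	  override def apply(plan: LogicalPlan): Seq[SparkPlan] = Nil
	}

	val bui: SparkSessionExtensions => Unit = { extensions =>
	  extensions.injectPlannerStrategy { session => MyStrategy }
	}

Both styles end up in the same place: spark.sql.extensions suits deployments that control session creation externally, while withExtensions is convenient when the application builds the session itself.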

References:

 http://spark.apache.org/releases/spark-release-2-2-0.html (SPARK-18127: Add hooks and extension points to Spark)