[Spark Basics] -- Option Names for Spark's Built-in Data Sources
阿新 • Posted: 2018-11-09
The options supported in Spark 2.1.0 and later are as follows:
--------- JDBC's options ---------
user, password, url, dbtable, driver, partitionColumn, lowerBound, upperBound, numPartitions, fetchsize, truncate, createTableOptions, batchsize, isolationLevel

--------- CSV's options ---------
path, sep, delimiter, mode, encoding, charset, quote, escape, comment, header, inferSchema, ignoreLeadingWhiteSpace, ignoreTrailingWhiteSpace, nullValue, nanValue, positiveInf, negativeInf, compression, codec, dateFormat, timestampFormat, maxColumns, maxCharsPerColumn, escapeQuotes, quoteAll

--------- JSON's options ---------
path, samplingRatio, primitivesAsString, prefersDecimal, allowComments, allowUnquotedFieldNames, allowSingleQuotes, allowNumericLeadingZeros, allowNonNumericNumbers, allowBackslashEscapingAnyCharacter, compression, mode, columnNameOfCorruptRecord, dateFormat, timestampFormat

--------- Parquet's options ---------
path, compression, mergeSchema

--------- ORC's options ---------
path, compression, orc.compress

--------- FileStream's options ---------
path, maxFilesPerTrigger, maxFileAge, latestFirst

--------- Text's options ---------
path, compression

--------- LibSVM's options ---------
path, vectorType, numFeatures
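The options above are passed to a `DataFrameReader` via `.option(key, value)` before calling the format-specific load method. As a minimal sketch (the file path, JDBC URL, table name, and credentials below are hypothetical placeholders, not from the original post):

```scala
import org.apache.spark.sql.SparkSession

object DataSourceOptionsExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("options-example")
      .master("local[*]")
      .getOrCreate()

    // CSV source: header, inferSchema, sep, nullValue are among the CSV options listed above.
    val csvDF = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .option("sep", ",")
      .option("nullValue", "NA")
      .csv("/path/to/data.csv")  // hypothetical path

    // JDBC source: url, dbtable, user, password, plus partitioned-read options.
    val jdbcDF = spark.read
      .format("jdbc")
      .option("url", "jdbc:mysql://localhost:3306/testdb")  // hypothetical connection
      .option("dbtable", "orders")                          // hypothetical table
      .option("user", "root")
      .option("password", "secret")
      .option("partitionColumn", "id")
      .option("lowerBound", "1")
      .option("upperBound", "100000")
      .option("numPartitions", "4")
      .load()

    spark.stop()
  }
}
```

Note that `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` must be specified together for JDBC reads: Spark uses them to split the table scan into parallel range queries.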
Note: prior to Spark 2.1.0, these option names were case-sensitive.
Reference: https://spark.apache.org/docs/2.3.0/api/java/org/apache/spark/sql/DataFrameReader.html