Writing data to MySQL with Spark SQL
阿新 • Published: 2020-03-02
First define the schema (the table header), then prepare the content. The content must be wrapped in Row objects and converted into a DataFrame, which binds the rows to the schema; the resulting DataFrame is then written into MySQL.
#!/usr/bin/env python3
from pyspark.sql import Row
from pyspark.sql.types import *
from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession

spark = SparkSession.builder.config(conf=SparkConf()).getOrCreate()

# The third argument (True) means the field is nullable
schema = StructType([StructField("id", IntegerType(), True),
                     StructField("name", StringType(), True),
                     StructField("gender", StringType(), True),
                     StructField("age", IntegerType(), True)])

# Parse each space-separated line into a list of fields
studentRDD = spark.sparkContext.parallelize(["3 HuangYukai M 26"]).map(lambda x: x.split(" "))
# Convert each field list into a Row, casting id and age to int
rowRDD = studentRDD.map(lambda x: Row(int(x[0].strip()), x[1].strip(), x[2].strip(), int(x[3].strip())))
studentDF = spark.createDataFrame(rowRDD, schema)

# JDBC connection properties
prop = {}
prop['user'] = 'hadoop'
prop['password'] = 'hadoop'
prop['driver'] = "com.mysql.jdbc.Driver"
# Append the DataFrame to the `student` table in the `spark` database
studentDF.write.jdbc("jdbc:mysql://localhost:3306/spark", 'student', 'append', prop)
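The core of the Row-conversion step above is the per-line parsing: split on spaces, strip each field, and cast `id` and `age` to integers. A minimal sketch of that logic in plain Python, runnable without a Spark session or a MySQL server (the helper name `parse_student` is hypothetical, not part of the original script):

```python
def parse_student(line):
    # Mirror the RDD pipeline: split on spaces, strip each field,
    # and cast the id and age columns to int, matching the schema
    # (id: int, name: str, gender: str, age: int).
    fields = line.split(" ")
    return (int(fields[0].strip()), fields[1].strip(),
            fields[2].strip(), int(fields[3].strip()))

record = parse_student("3 HuangYukai M 26")
print(record)  # (3, 'HuangYukai', 'M', 26)
```

If a field cannot be cast (e.g. a non-numeric age), `int()` raises a `ValueError`; in the Spark job that error would surface when the RDD is materialized, not when the `map` is defined, since transformations are lazy.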