
Exception: Java gateway process exited before sending its port number

Problem: whenever I ran my code in the Spark environment, it failed with the error: Exception: Java gateway process exited before sending its port number
It took me a long time to solve. Eventually I found someone via Google who had run into the same problem; he said the error was caused by Anaconda and that it was better to uninstall it. I did not take his advice at first, but I could not find any other solution, so I did the following steps:
1. Uninstall Anaconda.
2. Remove the path I had added in PyCharm (Run > Edit Configurations > Environment variables).
3. Change the strange names I had used in the .py file, such as 1, 2, 3, to normal names.
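Before uninstalling anything, it may help to see what the Python process that launches Spark actually sees. The sketch below is only a diagnostic aid and not part of the original fix: the helper names `spark_env_report` and `script_name_is_identifier` are hypothetical, the choice of environment variables to inspect is an assumption about what commonly affects the PySpark JVM launch, and the file-name check assumes step 3 refers to the script's file name, which is only one reading of it.

```python
import os
import shutil
import sys


def spark_env_report(environ=None):
    """Collect settings that commonly influence how PySpark starts the JVM."""
    env = os.environ if environ is None else environ
    return {
        # Which interpreter is running (an Anaconda one or a system one)
        "python_executable": sys.executable,
        # Full path of the `java` binary found on PATH, or None if absent
        "java_on_path": shutil.which("java", path=env.get("PATH")),
        "JAVA_HOME": env.get("JAVA_HOME"),
        "SPARK_HOME": env.get("SPARK_HOME"),
        "PYSPARK_PYTHON": env.get("PYSPARK_PYTHON"),
        "PYSPARK_SUBMIT_ARGS": env.get("PYSPARK_SUBMIT_ARGS"),
    }


def script_name_is_identifier(path):
    """Return True if the file name without its extension is a valid Python identifier."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem.isidentifier()


if __name__ == "__main__":
    for key, value in spark_env_report().items():
        print(f"{key}: {value}")
    print("script name looks normal:", script_name_is_identifier(sys.argv[0]))
```

Running this from the same PyCharm run configuration as the failing script shows whether the interpreter, `java`, or a leftover environment variable differs from what was expected. A value of `None` for `java_on_path` would suggest the JVM cannot be started at all.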
Finally, my code ran successfully in the Spark environment. Here is my code:

import time
from urllib import parse

from pyspark.sql import SparkSession

start_time = time.time()
# URL-encode the password so special characters are safe inside the URI
password = parse.quote_plus('[email protected]')
uri = "mongodb://gouuse:{}@192.168.5.113:27017/Flight.test".format(password)

# MongoDB connection parameters
input_uri = "mongodb://username:[email protected]:27017/Flight"
database = "Flight"
collection = "info"

spark = SparkSession.builder.master('local[*]').appName("read").getOrCreate()
mongdbDf = spark.read.format('com.mongodb.spark.sql').options(fetchsize=1000, uri=input_uri, database=database, collection=collection).load()
mongdbDf.printSchema()
mongdbDf.show()  # shows 20 rows by default
print('before delete the duplicated:')
print(mongdbDf.count())
mongdbDf.registerTempTable("temp_table")

# mngDF = mongdbDf.dropDuplicates(['customer_name','website'])  # deduplicate
mngDF = mongdbDf.dropDuplicates(['customer_name'])  # deduplicate
# Append the deduplicated rows to the output collection
mngDF.write.format("com.mongodb.spark.sql.DefaultSource").mode("append").option("spark.mongodb.output.uri", uri).save()

time_inv = time.time() - start_time
print("************************************")
print(time_inv)
print('after delete the duplicated:')
print(mngDF.count())
mngDF.show()