Scala JDBC remote access to a Hive data warehouse
Requirement:
Use a small amount of Scala code to connect to Hive remotely, query data from a Hive table, and dump the data to a local file. In addition, after the data has been queried, the ResultSet can be converted into an RDD or DataFrame so that Spark operators can be applied to it (a sketch of that conversion follows the main code in Step 3).
Step 1: On the server, start the required service (HiveServer2). The remote connection port defaults to 10000.
hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10000
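To confirm that HiveServer2 is accepting connections before writing any code, a quick test with the Beeline client can be run (the hostname tdxy-bigdata-04 and database p2p below are the ones used later in this post; adjust them to your own environment):

beeline -u jdbc:hive2://tdxy-bigdata-04:10000/p2p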
Step 2: Create a Maven project and add the dependencies to pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.cyy.sparkSql</groupId>
    <artifactId>sparkSqlTest</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <java.version>1.8</java.version>
        <spark.version>2.1.0</spark.version>
    </properties>

    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.1.0</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.1.0</version>
        </dependency>
        <!-- MySQL JDBC driver -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.38</version>
        </dependency>
        <!-- Spark-Hive integration -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>2.1.0</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.hive/hive-jdbc -->
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>1.1.0</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.json/json (row-to-JSON serialization) -->
        <dependency>
            <groupId>org.json</groupId>
            <artifactId>json</artifactId>
            <version>20160810</version>
        </dependency>
    </dependencies>
</project>
Step 3: Code implementation:
import java.io.{File, PrintWriter}
import java.sql.{Connection, DriverManager, ResultSet, Statement}

import org.json.{JSONArray, JSONObject}

object sparkSqlTest {

  def main(args: Array[String]): Unit = {
    // Register the Hive JDBC driver
    val driverName: String = "org.apache.hive.jdbc.HiveDriver"
    try {
      Class.forName(driverName)
    } catch {
      case e: ClassNotFoundException => println("Missing driver class: " + e)
    }

    // Connect to HiveServer2 on the default port 10000, database p2p
    val con: Connection = DriverManager.getConnection("jdbc:hive2://tdxy-bigdata-04:10000/p2p")
    val stmt: Statement = con.createStatement()

    // List the tables in the database
    val res: ResultSet = stmt.executeQuery("show tables")
    println("------" + res)
    while (res.next()) {
      println(res.getString(1))
    }

    // Query the table and dump every row to a local file as a JSON object
    val jsonArray: JSONArray = new JSONArray()
    val rs: ResultSet = stmt.executeQuery("select * from p2p_order_info_user")
    println(rs.getMetaData.getColumnCount)
    println(rs.getMetaData.getColumnName(2))

    val writer = new PrintWriter(new File("src/" + "order_table.txt"))
    writer.print("[")
    while (rs.next()) {
      val jsonObject: JSONObject = new JSONObject()
      for (i <- 1 to rs.getMetaData.getColumnCount) {
        // Hive returns column names as "table.column"; keep only the column part
        val columnName = rs.getMetaData.getColumnName(i).split("\\.")(1)
        if (rs.getObject(i) != null) {
          println(rs.getObject(i))
          jsonObject.put(columnName, rs.getString(i))
        } else {
          jsonObject.put(columnName, "null")
        }
      }
      writer.println(jsonObject + ",")
      println("Row as JSON: " + jsonObject.toString)
      jsonArray.put(jsonObject)
    }
    writer.println("]")
    writer.close()

    // Release JDBC resources
    stmt.close()
    con.close()
  }
}
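The requirement above also mentions turning the query result into an RDD or DataFrame for further Spark operator processing. Below is a minimal sketch of that conversion, not the definitive implementation: it pulls the ResultSet rows into local memory and parallelizes them, it assumes the first two columns of p2p_order_info_user can be read as strings, and the column names id and user_name are hypothetical placeholders.

import java.sql.{DriverManager, ResultSet}

import org.apache.spark.sql.SparkSession

import scala.collection.mutable.ArrayBuffer

object HiveJdbcToDataFrame {

  def main(args: Array[String]): Unit = {
    // Same JDBC connection as in the main example
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val con = DriverManager.getConnection("jdbc:hive2://tdxy-bigdata-04:10000/p2p")
    val stmt = con.createStatement()
    val rs: ResultSet = stmt.executeQuery("select * from p2p_order_info_user")

    // Collect the rows into local memory first (only suitable for small result sets)
    val rows = ArrayBuffer[(String, String)]()
    while (rs.next()) {
      rows += ((rs.getString(1), rs.getString(2)))
    }
    stmt.close()
    con.close()

    // Parallelize the collected rows into an RDD, then convert to a DataFrame
    val spark = SparkSession.builder()
      .appName("HiveJdbcToDataFrame")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val rdd = spark.sparkContext.parallelize(rows)
    val df = rdd.toDF("id", "user_name") // hypothetical column names
    df.show()

    spark.stop()
  }
}

Because the rows are collected on the driver before being parallelized, this approach is only reasonable for small tables; for large tables it is better to let Spark read from Hive directly (for example with spark-hive and enableHiveSupport).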
Run result: