Processing JSON in Scala (arrays nested inside arrays, values of differing data types, and map keys that may not exist)
阿新 • • 發佈:2020-12-13
技術標籤:scala-spark
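Purpose:
Handle badly irregular JSON in which a key may be missing, the values under the same key may have different data types, and similar problems occur.
Topics covered: processing nested JSON, processing irregular JSON, and pattern matching.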
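The validate-plus-pattern-matching idea can be sketched on a single field. This is a minimal sketch, assuming play-json is on the classpath; the object `ValidateSketch` and the helper `stringOrEmpty` are hypothetical names used only for illustration.

```scala
import play.api.libs.json._

object ValidateSketch {
  // Hypothetical helper: returns the string stored under `key`, or ""
  // when the key is missing, the value is not a JSON string, or `js`
  // is not a JSON object at all.
  def stringOrEmpty(js: JsValue, key: String): String =
    (js \ key).validate[String] match {
      case JsSuccess(value, _) => value
      case _: JsError          => ""
    }

  def main(args: Array[String]): Unit = {
    val present: JsValue     = Json.parse("""{"name": "English"}""")
    val missingKey: JsValue  = Json.parse("""{"other": "x"}""")
    val wrongType: JsValue   = Json.parse("""{"name": 42}""")
    val notAnObject: JsValue = JsString("just a string")

    // Only the first value yields a non-empty result.
    Seq(present, missingKey, wrongType, notAnObject)
      .foreach(js => println(s"[${stringOrEmpty(js, "name")}]"))
  }
}
```

All three irregular cases collapse into the same `JsError` branch, which is why one fallback value is enough to cover them.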
Reference documentation (play-json)
pom
<!-- https://mvnrepository.com/artifact/com.typesafe.play/play-json -->
<dependency>
    <groupId>com.typesafe.play</groupId>
    <artifactId>play-json_${scala.binary.version}</artifactId>
    <version>2.7.4</version>
</dependency>
Example
package JsonParse

import play.api.libs.json._

import scala.collection.mutable.ListBuffer

// Uses the play-json module. Handy documentation recommended on Stack Overflow:
// https://www.playframework.com/documentation/2.8.x/ScalaJson
object JsonTest {
  /*
   Goal: collect the values of name, is_active and completeness as 3-tuples,
         filling any missing field with an empty string.
   Note: the raw data is irregular. Arrays are nested (each element of the
         root array holds a languages array), and value types differ
         (elements of languages are sometimes objects, sometimes strings).
   Core idea: combine match with validate to check both the data type and
         whether the key exists. Anything that does not match becomes an
         empty array or empty string, and a final filter drops the rows
         produced by those empty values.
  */
  def main(args: Array[String]): Unit = {
    val jsonString =
      """
      {
        "root": [
          {
            "languages": [
              { "name": "English", "is_active": "true", "completeness": "asdf" },
              { "aa": "a distractor map" },
              { "name": "Latin", "is_active": "asdf", "completeness": "232" },
              { "name": "Latin", "is_active": "0009" }
            ]
          },
          {
            "languages": [
              { "name": "English1", "is_active": "true1", "completeness": "asdf1" },
              { "name": "Latin1", "is_active": "asdf1", "completeness": "2321" },
              "a distractor string",
              { "name": "Latin1", "is_active": "00091" }
            ]
          },
          { "notLanguage": "some maps have no languages key" }
        ]
      }
      """

    val listb: ListBuffer[(String, String, String)] = ListBuffer.empty
    val json = Json.parse(jsonString)
    val list1 = (json \ "root").as[Seq[JsValue]]

    list1.foreach { root2list =>
      // Fall back to an empty JsArray when "languages" is missing or is not an array
      val list0 = (root2list \ "languages").validate[JsArray] match {
        case JsSuccess(v, _) => v
        case _               => JsArray.empty
      }
      // Convert the JsArray to a Seq
      val list = list0.as[Seq[JsValue]]

      // For each field: take the string value, or "" if the key is absent
      // or the value (or the element itself) has the wrong type
      val names = list.map { x =>
        (x \ "name").validate[String] match {
          case JsSuccess(v, _) => v
          case _               => ""
        }
      }
      val isActives = list.map { x =>
        (x \ "is_active").validate[String] match {
          case JsSuccess(v, _) => v
          case _               => ""
        }
      }
      val completeness = list.map { x =>
        (x \ "completeness").validate[String] match {
          case JsSuccess(v, _) => v
          case _               => ""
        }
      }

      val res = for (idx <- 0 until list.length)
        yield (names(idx), isActives(idx), completeness(idx))
      listb ++= res
    }

    // Drop the rows whose name is empty (distractor elements)
    println(listb.filter(!_._1.equals("")))
  }
}
Output
ListBuffer((English,true,asdf), (Latin,asdf,232), (Latin,0009,), (English1,true1,asdf1), (Latin1,asdf1,2321), (Latin1,00091,))
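The same extraction can also be written more compactly with `asOpt`, which returns `None` in the cases where `validate` would yield a `JsError`. This is only a sketch of an alternative, not the method used in the example above; the object `CompactLanguages` and the helper `extract` are hypothetical names. Unlike the original, it also tolerates a missing or non-array `root`.

```scala
import play.api.libs.json._

object CompactLanguages {
  // Hypothetical helper: returns (name, is_active, completeness) triples,
  // skipping every element that has no non-empty string "name".
  def extract(json: JsValue): Seq[(String, String, String)] = {
    // "" when the key is missing or its value is not a JSON string
    def str(js: JsValue, key: String): String =
      (js \ key).asOpt[String].getOrElse("")

    for {
      entry <- (json \ "root").asOpt[Seq[JsValue]].getOrElse(Seq.empty)
      lang  <- (entry \ "languages").asOpt[Seq[JsValue]].getOrElse(Seq.empty)
      name  <- (lang \ "name").asOpt[String].toList
      if name.nonEmpty
    } yield (name, str(lang, "is_active"), str(lang, "completeness"))
  }

  def main(args: Array[String]): Unit = {
    val sample = Json.parse(
      """{"root":[
        |  {"languages":[
        |    {"name":"English","is_active":"true","completeness":"asdf"},
        |    {"aa":"distractor"},
        |    "distractor string",
        |    {"name":"Latin","is_active":"0009"}
        |  ]},
        |  {"notLanguage":"no languages key here"}
        |]}""".stripMargin)
    extract(sample).foreach(println)
  }
}
```

The trade-off is that `asOpt` discards the error details that `validate` keeps in `JsError`, so `validate` remains the better choice when the reason for a failure needs to be logged.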