Writing an Analysis Module with Spark, HBase, and GeoMesa
阿新 • Published: 2018-12-20
1. Deployment
2. Writing the analysis module
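Data analysis features such as overlay analysis and buffer analysis all depend on the GeoTools and OpenGIS APIs. These are hard to download through Maven, and I hit a number of pitfalls along the way. At first I copied in all of GeoTools, which made its logging classes conflict with Spark's logging classes; removing the unneeded APIs fixed that. This post mainly lists the jars that could not be downloaded, together with the full set of Maven dependencies.
Two reminders: HBase needs the MapReduce dependencies, so the job cannot be debugged locally; and the spatial data types are custom types, so, as the official documentation describes, a serializer class has to be configured: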
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
conf.set("spark.kryo.registrator", classOf[GeoMesaSparkKryoRegistrator].getName)
<!-- pom.xml fragment: the opening project element and the project coordinates are not shown -->
<properties>
  <gt.version>18.0</gt.version>
  <hbase.version>1.2.6</hbase.version>
</properties>
<dependencies>
  <dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-spark-jts_2.11</artifactId>
    <version>2.0.0</version>
  </dependency>
  <dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-hbase-spark_2.11</artifactId>
    <version>2.0.0</version>
  </dependency>
  <dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-spark-core_2.11</artifactId>
    <version>2.0.0</version>
  </dependency>
  <dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-spark-converter_2.11</artifactId>
    <version>2.0.0</version>
  </dependency>
  <dependency>
    <groupId>org.locationtech.geomesa</groupId>
    <artifactId>geomesa-spark-geotools_2.11</artifactId>
    <version>1.3.0</version>
  </dependency>
  <dependency>
    <groupId>org.locationtech.spatial4j</groupId>
    <artifactId>spatial4j</artifactId>
    <version>0.6</version>
  </dependency>
  <dependency>
    <groupId>com.vividsolutions</groupId>
    <artifactId>jts</artifactId>
    <version>1.13</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.2.6</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-common</artifactId>
    <version>1.2.6</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>${hbase.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-annotations</artifactId>
    <version>${hbase.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-protocol</artifactId>
    <version>${hbase.version}</version>
  </dependency>
</dependencies>
<build>
  <finalName>yyy</finalName>
  <plugins>
    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <!-- Optional: with this section the build produces an executable jar directly -->
        <archive>
          <manifest>
            <mainClass>uu</mainClass>
          </manifest>
        </archive>
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
    </plugin>
  </plugins>
</build>
</project>
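This configuration worked for me. Along the way I ran into various problems that had no answers online, so be prepared to work through them patiently.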
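The buffer and overlay analysis this section prepares for can be tried with JTS alone, before Spark or HBase are involved. The snippet below is a minimal sketch, not the author's module. It assumes the `com.vividsolutions:jts:1.13` dependency from the POM is on the classpath. The object name `BufferOverlaySketch`, its helper names, and the 2-degree distance are hypothetical and chosen only for illustration. The two points come from the gdelt-quickstart sample output.

```scala
import com.vividsolutions.jts.geom.Geometry
import com.vividsolutions.jts.io.WKTReader

// Hypothetical helper object, for illustration only.
object BufferOverlaySketch {

  // Buffer analysis: the region within `distance` of the geometry.
  // The distance is in the units of the coordinates (degrees here).
  def bufferOf(wkt: String, distance: Double): Geometry =
    new WKTReader().read(wkt).buffer(distance)

  // Overlay analysis: the part shared by two geometries.
  def overlay(a: Geometry, b: Geometry): Geometry = a.intersection(b)

  def main(args: Array[String]): Unit = {
    val first = bufferOf("POINT (10 53.55)", 2.0)
    val second = bufferOf("POINT (10 56)", 2.0)
    val shared = overlay(first, second)
    println(s"buffers intersect: ${first.intersects(second)}")
    println(s"shared region is empty: ${shared.isEmpty}")
  }
}
```

Buffering in degrees is only a rough stand-in for a real distance. For metric buffers the data would normally be reprojected first, which is outside the scope of this sketch.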
3. Example
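import org.apache.hadoop.conf.Configuration
import org.apache.spark.{SparkConf, SparkContext}
import org.geotools.data.Query
import org.locationtech.geomesa.spark.{GeoMesaSpark, GeoMesaSparkKryoRegistrator}

val conf = new SparkConf().setMaster("local[*]").setAppName("testSpark")
// Spatial types are custom types, so register GeoMesa's Kryo serializers
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
conf.set("spark.kryo.registrator", classOf[GeoMesaSparkKryoRegistrator].getName)
val sc = SparkContext.getOrCreate(conf)
// Connection parameters for the GeoMesa HBase data store
val params = Map(
  "hbase.zookeepers" -> "xxx,xxx,xxx",
  "hbase.catalog" -> "geomesa_hbase",
  "geomesa.security.auths" -> "USER,ADMIN")
// Select every feature of the gdelt-quickstart feature type
val query = new Query("gdelt-quickstart")
val rdd = GeoMesaSpark(params).rdd(new Configuration(), sc, params, query)
rdd.collect().foreach(println)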
Partial output from a run on the cluster:
ScalaSimpleFeature:719025549:719025549|GERMANY|DEU|POLICE||020|2|1|2|4|Hamburg, Hamburg, Germany|GM|Mon Jan 01 08:00:00 CST 2018|POINT (10 53.55)
ScalaSimpleFeature:719025586:719025586|DENMARK|DNK|CHRISTIANITY||020|10|1|10|1|Denmark|DA|Mon Jan 01 08:00:00 CST 2018|POINT (10 56)
ScalaSimpleFeature:719025545:719025545|BERLIN|DEU|||051|30|3|30|4|Berlin, Berlin, Germany|GM|Mon Jan 01 08:00:00 CST 2018|POINT (13.4 52.5167)
ScalaSimpleFeature:719025544:719025544|GERMANY|DEU|||010|1|1|1|4|Frankfurt, Brandenburg, Germany|GM|Mon Jan 01 08:00:00 CST 2018|POINT (14.55 52.35)
ScalaSimpleFeature:719026612:719026612|POLISH|POL|||046|8|1|8|1|Poland|PL|Mon Jan 01 08:00:00 CST 2018|POINT (20 52)
ScalaSimpleFeature:719026615:719026615|WARSAW|POL|ABBOT||043|6|1|6|4|Warsaw, (PL67), Poland|PL|Mon Jan 01 08:00:00 CST 2018|POINT (21 52.25)
ScalaSimpleFeature:719025664:719025664|UNITED KINGDOM|GBR|PRINCE||013|1|1|1|4|Frederiksborg, Hovedstaden, Denmark|DA|Mon Jan 01 08:00:00 CST 2018|POINT (12.3004 55.9345)
ScalaSimpleFeature:719025665:719025665|UNITED KINGDOM|GBR|PRINCE||013|1|1|1|4|Frederiksborg, Hovedstaden, Denmark|DA|Mon Jan 01 08:00:00 CST 2018|POINT (12.3004 55.9345)
ScalaSimpleFeature:719026743:719026743|ROMANIA||CABINET||010|4|2|4|1|Romania|RO|Mon Jan 01 08:00:00 CST 2018|POINT (25 46)
ScalaSimpleFeature:719025907:719025907|INTERIOR MINIST||MILITANT||190|2|1|2|4|Moscow, Moskva, Russia|RS|Mon Jan 01 08:00:00 CST 2018|POINT (37.6156 55.7522)
ScalaSimpleFeature:719026746:719026746|RUSSIAN|RUS|||015|6|1|6|4|Moscow, Moskva, Russia|RS|Mon Jan 01 08:00:00 CST 2018|POINT (37.6156 55.7522)
ScalaSimpleFeature:719026840:719026840|MILITANT||INTERIOR MINIST||190|4|1|4|4|Moscow, Moskva, Russia|RS|Mon Jan 01 08:00:00 CST 2018|POINT (37.6156 55.7522)
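To show how printed features like these could feed a simple buffer-style check, here is a sketch in plain Scala with no GeoMesa, GeoTools, or Spark calls. It parses the trailing `POINT (lon lat)` field of each line and keeps the events within a given great-circle distance of a centre point. Everything in it is an assumption made for illustration: the `Event` case class, the helper names, the meaning given to the second field, the 300 km radius, and a spherical Earth of radius 6371 km. In a real job the geometry would be read from the `SimpleFeature` itself rather than from its printed form.

```scala
// Hypothetical helper object, for illustration only.
object BufferFilterSketch {

  // id: last part of the "ScalaSimpleFeature:<id>:<id>" prefix;
  // actor: second pipe-delimited field (assumed meaning).
  final case class Event(id: String, actor: String, lon: Double, lat: Double)

  private val PointPattern = """POINT \((-?[\d.]+) (-?[\d.]+)\)""".r

  // Parse one printed line; None when the line has no POINT field.
  def parse(line: String): Option[Event] = {
    val fields = line.split('|')
    if (fields.length < 2) None
    else PointPattern.findFirstMatchIn(fields.last).map { m =>
      Event(fields.head.split(':').last, fields(1),
        m.group(1).toDouble, m.group(2).toDouble)
    }
  }

  // Great-circle distance in kilometres (haversine formula).
  def haversineKm(lon1: Double, lat1: Double, lon2: Double, lat2: Double): Double = {
    val earthRadiusKm = 6371.0
    val dLat = math.toRadians(lat2 - lat1)
    val dLon = math.toRadians(lon2 - lon1)
    val a = math.pow(math.sin(dLat / 2), 2) +
      math.cos(math.toRadians(lat1)) * math.cos(math.toRadians(lat2)) *
        math.pow(math.sin(dLon / 2), 2)
    2 * earthRadiusKm * math.asin(math.sqrt(a))
  }

  // Keep the events that lie inside the circular buffer around the centre.
  def withinBuffer(events: Seq[Event], lon: Double, lat: Double, radiusKm: Double): Seq[Event] =
    events.filter(e => haversineKm(e.lon, e.lat, lon, lat) <= radiusKm)

  def main(args: Array[String]): Unit = {
    val lines = Seq(
      "ScalaSimpleFeature:719025549:719025549|GERMANY|DEU|POLICE||020|2|1|2|4|Hamburg, Hamburg, Germany|GM|Mon Jan 01 08:00:00 CST 2018|POINT (10 53.55)",
      "ScalaSimpleFeature:719025545:719025545|BERLIN|DEU|||051|30|3|30|4|Berlin, Berlin, Germany|GM|Mon Jan 01 08:00:00 CST 2018|POINT (13.4 52.5167)",
      "ScalaSimpleFeature:719026746:719026746|RUSSIAN|RUS|||015|6|1|6|4|Moscow, Moskva, Russia|RS|Mon Jan 01 08:00:00 CST 2018|POINT (37.6156 55.7522)")
    val events = lines.flatMap(parse)
    // Events within 300 km of the Berlin point
    withinBuffer(events, 13.4, 52.5167, 300.0).foreach(println)
  }
}
```

On an RDD the same predicate could be passed to `filter`, so the check would run on the executors instead of on collected results.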