ErikHoel edited this page Mar 22, 2013 · 3 revisions

The Hive Spatial component of this spatial framework adds geometric user-defined functions (UDFs) to Hive. These UDFs are built on top of the Esri Geometry API and are modeled on the OGC-compliant ST_Geometry type. For example, the following query counts the earthquakes that fall within each county:

SELECT counties.name, count(*) cnt FROM counties
JOIN earthquakes
WHERE ST_Contains(counties.boundaryshape, ST_Point(earthquakes.longitude, earthquakes.latitude))
GROUP BY counties.name
ORDER BY cnt desc;
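
Before a query like the one above can run, the UDFs it calls have to be registered in the Hive session. The following is a minimal sketch, not a definitive setup: the jar paths and file names are placeholders, and the UDF class names assume the functions live in the same `com.esri.hadoop.hive` package as the SerDe on this page. Check both against your installation.

```sql
-- Placeholder jar paths (assumptions); point these at the jars in your installation.
ADD JAR /path/to/esri-geometry-api.jar;
ADD JAR /path/to/spatial-sdk-hadoop.jar;

-- Register the two UDFs used by the county/earthquake query.
-- Class names are assumed to follow the com.esri.hadoop.hive package.
CREATE TEMPORARY FUNCTION ST_Point AS 'com.esri.hadoop.hive.ST_Point';
CREATE TEMPORARY FUNCTION ST_Contains AS 'com.esri.hadoop.hive.ST_Contains';
```

Temporary functions last only for the current session, so these statements would typically go at the top of the script that runs the query.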

Additionally, a JSON SerDe (serializer/deserializer) allows ArcGIS-produced JSON to be mapped to rows and columns in a Hive table definition:

CREATE EXTERNAL TABLE IF NOT EXISTS counties (Name string, BoundaryShape binary)
ROW FORMAT SERDE 'com.esri.hadoop.hive.serde.JsonSerde'
STORED AS INPUTFORMAT 'com.esri.json.hadoop.EnclosedJsonInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
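
Once the table is defined and points at data, a quick way to check that the SerDe is parsing the JSON is to read a few rows back and render the geometry as text. This is a sketch under assumptions: it presumes an `ST_AsText` UDF is available in the `com.esri.hadoop.hive` package and that the table already has ArcGIS JSON at its storage location (the definition above does not set one).

```sql
-- Assumed UDF class name; verify against the jars you have loaded.
CREATE TEMPORARY FUNCTION ST_AsText AS 'com.esri.hadoop.hive.ST_AsText';

-- Read a few counties and show each boundary as well-known text.
SELECT name, ST_AsText(boundaryshape)
FROM counties
LIMIT 5;
```

Hive column names are case-insensitive, so `name` and `boundaryshape` here refer to the `Name` and `BoundaryShape` columns declared in the table definition.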