Customizing Hive Implementation

Running AMP requires a configuration file (.ini) to be passed to the main Python file.

The configuration file has several parameters that define which data is used and how the aggregation is performed. An example can be found in config/ais.ini.
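As a rough illustration of the file's shape, the sketch below parses an .ini string with Python's standard-library configparser. The section name [amp], the table and column names, and every value are hypothetical placeholders, not the contents of config/ais.ini, and AMP's own loader may read the file differently.

```python
import configparser

# Hypothetical example: the [amp] section name and all values are
# placeholders for illustration, not taken from config/ais.ini.
SAMPLE_INI = """
[amp]
table_name = my_tracks
table_schema_id = track_id
table_schema_dt = dt
table_schema_lat = latitude
table_schema_lon = longitude
time_filter = 3600
distance_filter = 100
lower_left_lat = -90
lower_left_lon = -180
upper_right_lat = 90
upper_right_lon = 180
trip_name = my_trip
resolution_lat = .1
resolution_lon = .1
temporal_split = day
"""


def load_settings(text):
    """Parse the .ini text and return the (assumed) [amp] section as a dict of strings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["amp"])


settings = load_settings(SAMPLE_INI)
# configparser returns strings, so numeric parameters need explicit conversion.
print(settings["table_name"], float(settings["resolution_lat"]))
```

All values come back as strings, so numeric parameters such as time_filter or resolution_lat have to be converted before use.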

table_name - The name of the Hive table that contains your data.

table_schema_id - The column of your Hive table that contains the track id or user id that identifies a track.

table_schema_dt - The column of your Hive table that contains the timestamp to be used, in the format YYYY-mm-dd HH:MM:SS.

table_schema_lat - The column of your Hive table that contains latitude.

table_schema_lon - The column of your Hive table that contains longitude.

time_filter - The maximum number of seconds allowed between consecutive points on a track. Any segment with more time between its points is removed.

distance_filter - The maximum distance, in KM, allowed between consecutive points on a track. Any segment with more distance between its points is removed.
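The two filters can be pictured with a small Python sketch. It assumes a segment is a pair of consecutive points on one track and that distance is the great-circle (haversine) distance; AMP's actual distance calculation is not stated here, and the function names are hypothetical.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumption of this sketch
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # matches YYYY-mm-dd HH:MM:SS


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in KM between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def keep_segment(start, end, time_filter, distance_filter):
    """Return True if the segment passes both filters.

    start and end are (timestamp_string, lat, lon) tuples for two
    consecutive points on the same track.
    """
    elapsed = (
        datetime.strptime(end[0], TIMESTAMP_FORMAT)
        - datetime.strptime(start[0], TIMESTAMP_FORMAT)
    ).total_seconds()
    distance = haversine_km(start[1], start[2], end[1], end[2])
    return elapsed <= time_filter and distance <= distance_filter


a = ("2020-01-01 00:00:00", 0.0, 0.0)
b = ("2020-01-01 00:10:00", 0.0, 0.1)
print(keep_segment(a, b, time_filter=3600, distance_filter=100))
```

A segment is kept only when it satisfies both limits, so tightening either filter can only remove more segments.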

lower_left_lat - Latitude of the lower left corner of the bounding box used to contain the data.

lower_left_lon - Longitude of the lower left corner of the bounding box used to contain the data.

upper_right_lat - Latitude of the upper right corner of the bounding box used to contain the data.

upper_right_lon - Longitude of the upper right corner of the bounding box used to contain the data.
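A minimal Python sketch of how the four corner values could restrict the data follows. It assumes the box edges are inclusive and that the box does not cross the antimeridian; neither detail is confirmed for AMP, and the function name is hypothetical.

```python
def in_bounding_box(lat, lon, lower_left_lat, lower_left_lon,
                    upper_right_lat, upper_right_lon):
    """Return True if (lat, lon) lies inside the box, edges included."""
    return (lower_left_lat <= lat <= upper_right_lat
            and lower_left_lon <= lon <= upper_right_lon)


# Points inside the box are kept; points outside are dropped.
print(in_bounding_box(10.0, 20.0, 0.0, 0.0, 30.0, 40.0))
print(in_bounding_box(-1.0, 20.0, 0.0, 0.0, 30.0, 40.0))
```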

trip_name - A label for the aggregated data. Used in naming Hive tables.

resolution_lat - The height of bins, in units of approximately 100 KM. This must be a power of 10 (e.g. 1 ~= 100KM, .1 ~= 10KM, .01 ~= 1KM).

resolution_lon - The width of bins, in units of approximately 100 KM. This must be a power of 10 (e.g. 1 ~= 100KM, .1 ~= 10KM, .01 ~= 1KM).
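To make the effect of the resolution concrete, here is a hedged Python sketch that assigns a point to a bin by dividing each coordinate by its resolution and rounding down. This is one common way to grid coordinates and is an assumption; the binning AMP actually performs in Hive may differ, and bin_index is a hypothetical name.

```python
from math import floor


def bin_index(lat, lon, resolution_lat, resolution_lon):
    """Return integer (row, column) bin indices for a point.

    Each index counts how many whole bins lie between zero and the
    coordinate, rounding toward negative infinity.
    """
    return (floor(lat / resolution_lat), floor(lon / resolution_lon))


# A finer resolution yields more, smaller bins for the same point.
print(bin_index(36.87, -76.33, 1, 1))
print(bin_index(36.87, -76.33, .1, .1))
```

Returning integer indices rather than the bin's corner coordinates avoids floating-point noise when the index is multiplied back by the resolution.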

temporal_split - Used to further bin data by discrete time intervals. Valid values are "minute", "hour", "day", "month", "year", and "all", which ignores timestamps when binning.
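One way to picture temporal_split is as truncating each timestamp to the chosen unit, so that all points sharing the truncated value fall in the same temporal bin. The Python sketch below illustrates that idea; it is an assumption about the behavior, not AMP's Hive query, and temporal_bucket is a hypothetical name.

```python
from datetime import datetime

# Output format for each valid temporal_split value except "all".
_BUCKET_FORMATS = {
    "minute": "%Y-%m-%d %H:%M",
    "hour": "%Y-%m-%d %H",
    "day": "%Y-%m-%d",
    "month": "%Y-%m",
    "year": "%Y",
}


def temporal_bucket(timestamp, temporal_split):
    """Truncate a YYYY-mm-dd HH:MM:SS timestamp to the requested unit."""
    if temporal_split == "all":
        # Timestamps are ignored: every point shares a single bucket.
        return "all"
    if temporal_split not in _BUCKET_FORMATS:
        raise ValueError("unsupported temporal_split: %r" % temporal_split)
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return parsed.strftime(_BUCKET_FORMATS[temporal_split])


print(temporal_bucket("2014-03-07 15:42:09", "hour"))
```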
