Serialize finite data columns #189
Comments
Created a new GeoJSON dataset and uploaded it to the PostGIS database with the top 70 car makes as keyed values and everything else as "MISC". Replaced table references with the new dataset. Code is now needed to swap the values on the front end. If all goes well, I will key all the other values, including the column names. Notes: Used a 20% sampling of the data. While trying to upload, I ran out of memory on my local computer. Had to create an EC2 instance, set up the environment, download and clean the data, make sure the PostGIS database security settings were open to receiving data from the new EC2 instance, and finally upload via psycopg2. There was a slight problem with the dotenv and psycopg2 dependencies. |
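For readers following along, here is a minimal sketch of the keying idea described above: the most frequent values get small integer keys and everything else collapses into a single "MISC" key. The function names (`build_key_map`, `encode`, `decode`) and the sample makes are hypothetical and are not taken from the project's code.

```python
from collections import Counter


def build_key_map(values, top_n=70):
    """Assign integer keys 0..k-1 to the top_n most frequent values.

    Returns (key_map, misc_key), where misc_key is the shared key
    used for every value outside the top_n.
    """
    counts = Counter(values)
    top = [value for value, _ in counts.most_common(top_n)]
    key_map = {value: key for key, value in enumerate(top)}
    misc_key = len(top)
    return key_map, misc_key


def encode(values, key_map, misc_key):
    """Replace each raw value with its integer key."""
    return [key_map.get(value, misc_key) for value in values]


def decode(keys, key_map, misc_key, misc_label="MISC"):
    """Swap integer keys back to labels (what the front end would do)."""
    lookup = {key: value for value, key in key_map.items()}
    return [lookup.get(key, misc_label) for key in keys]


# Hypothetical sample data; top_n=2 keeps the example small.
makes = ["TOYT", "HOND", "TOYT", "FORD", "TOYT", "HOND", "BMW"]
key_map, misc_key = build_key_map(makes, top_n=2)
encoded = encode(makes, key_map, misc_key)
decoded = decode(encoded, key_map, misc_key)
```

The front end only needs the small key-to-label lookup once, so each row can carry an integer instead of a repeated string.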
Matthew was able to integrate serialization of car makes. Will serialize other data columns. |
Update: will serialize body type. Need to find out if day of week can be pulled out of time/date data format in JS. |
@gregpawin This issue has not had an update since June 7, 2021. If you are no longer working on this issue please let us know.
|
Progress: Serialized the car makes, which has been implemented. Will serialize the other categorical info. |
Progress: Added more code to introduce serialization. Will continue to work on it. |
Progress: Analyzing other data columns to see which labels to keep. |
Progress: Made changes to make_serial_data.py. Removed location, violation_code. |
Progress: Did some data exploration. Will keep state_plate, body_style, color, and fine_amount as raw text data. Will serialize violation codes and possibly remove lat/lon. |
Progress: Added regex rules to clean up violation codes. Will take the top 20-30 and serialize them. |
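The actual regex rules live in make_serial_data.py and are not shown in this thread. As a rough sketch of the approach only, the snippet below normalizes codes with two assumed rules (drop whitespace, drop stray punctuation) and then picks the most frequent cleaned codes. The rules, the function names, and the sample strings are all assumptions for illustration.

```python
import re
from collections import Counter


def clean_violation_code(raw):
    """Normalize one raw code string (assumed rules, not the project's)."""
    code = raw.strip().upper()
    code = re.sub(r"\s+", "", code)           # "80.69 B" -> "80.69B"
    code = re.sub(r"[^0-9A-Z.()]", "", code)  # drop stray punctuation
    return code


def top_codes(raw_codes, n=20):
    """Return the n most frequent cleaned codes, skipping blank entries."""
    counts = Counter(
        clean_violation_code(code) for code in raw_codes if code and code.strip()
    )
    return [code for code, _ in counts.most_common(n)]


# Made-up sample strings showing variants of the same code.
sample = ["80.69B", "80.69 b", "88.13B+", " "]
most_common = top_codes(sample, n=1)
```

Cleaning before counting matters here: variants of one code would otherwise split its count and could push it out of the top 20-30.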
Progress: Fixed bugs with code that cleans violation_codes af016a1 |
Fixed bugs in data cleaning code and added more serial data columns https://github.com/hackforla/lucky-parking/tree/e6a750db47fcbe8e61d0f47e40e8910785f48d17. Will hand it off to devs to test. |
New data format:
|
Overview
The current data transfer is very inefficient. In addition to changes on the frontend, there are several backend efficiencies we can take advantage of, including: keying the categorical data, extracting the coordinates from the geometry, making SQL queries for only the pertinent data, and extracting day of week/hour/etc. from the datetime.
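As an illustration of two of the items above (extracting coordinates from the geometry and pulling day of week/hour out of the datetime), here is a minimal sketch that slims a GeoJSON Point feature down to a flat record. The `datetime` property name, the ISO 8601 timestamp format, and the `slim_feature` helper are assumptions; the real dataset's schema may differ.

```python
from datetime import datetime


def slim_feature(feature):
    """Flatten a GeoJSON Point feature into a compact record.

    Assumes properties["datetime"] holds an ISO 8601 string
    (a hypothetical field name).
    """
    lon, lat = feature["geometry"]["coordinates"]  # GeoJSON order is [lon, lat]
    issued = datetime.fromisoformat(feature["properties"]["datetime"])
    return {
        "lon": lon,
        "lat": lat,
        "weekday": issued.weekday(),  # Monday is 0, Sunday is 6
        "hour": issued.hour,
    }


# Hypothetical feature for demonstration.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-118.2437, 34.0522]},
    "properties": {"datetime": "2021-06-07T14:30:00"},
}
record = slim_feature(feature)
```

Precomputing `weekday` and `hour` on the backend would also avoid parsing the datetime in JS on every record.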
Action items
Resources/Instructions
Confer with Greg