add censusfips metadata #4006
Going to approve so you can merge this without re-review but the working partitions need to get fixed!
src/pudl/metadata/sources.py
Outdated
"Reference files for Federal Information Processing Series (FIPS) Geographic Codes. "
"These FIPS Codes are a subset of a broader Population Estimates dataset."
),
"working_partitions": {"years": sorted(set(range(2011, 2024)))},
- "working_partitions": {"years": sorted(set(range(2011, 2024)))},
+ "working_partitions": {"years": sorted({2000, *range(2011, 2024)})},
This is a good flag! But since for now we only plan to use the single most recent year, I think it should only be 2023.
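For reference, a minimal sketch of the year lists discussed in this thread, in plain Python rather than anything PUDL-specific. The variable names are illustrative assumptions and do not come from the codebase. Note that `set()` accepts a single iterable, so a standalone year such as 2000 has to be combined with the range instead of being passed as a second argument.

```python
# Illustrative only: these variable names are not from the PUDL codebase.

# Every year from 2011 through 2023 (range() excludes its stop value).
all_years = sorted(range(2011, 2024))

# The same years plus the standalone year 2000.
years_with_2000 = sorted({2000, *range(2011, 2024)})

# Only the most recent vintage, as proposed for now.
latest_only = [2023]

working_partitions = {"years": latest_only}
```

Whichever list is chosen, it should match the partitions that actually exist in the published archive.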
Do we anticipate updating our fipsification to deal with the time variability of these mappings? There'll definitely be inconsistencies in any individual vintage if we apply it across all years of data.
The plan for this issue right now is to effectively replicate addfips, which is to say use one vintage. This vintage will be much more recent (2023, versus 2015 for the most recent one in addfips). With this overall setup it should be pretty easy to add the additional vintages as a next step.
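To make the single-vintage tradeoff concrete, here is a rough sketch of a lookup that defaults to the newest vintage. Everything in it is hypothetical: `FIPS_BY_VINTAGE` and `lookup_fips` are made-up names, and the dictionary is a tiny in-memory stand-in for the archived reference files, not how PUDL or addfips stores them. The entries reflect Bedford city, VA (51515) being absorbed into Bedford County (51019) in 2013.

```python
# Hypothetical stand-in for archived FIPS reference files:
# vintage year -> {(state abbreviation, county name): five-digit FIPS code}.
FIPS_BY_VINTAGE = {
    2011: {
        ("VA", "Bedford County"): "51019",
        ("VA", "Bedford city"): "51515",
    },
    2023: {
        ("VA", "Bedford County"): "51019",
    },
}


def lookup_fips(state, county, vintage=None):
    """Return the FIPS code for a county, or None if the vintage lacks it.

    When no vintage is given, the most recent one available is used,
    mirroring the single-vintage approach described above.
    """
    if vintage is None:
        vintage = max(FIPS_BY_VINTAGE)
    return FIPS_BY_VINTAGE[vintage].get((state, county))
```

With only the latest vintage, `lookup_fips("VA", "Bedford city")` finds nothing, while passing `vintage=2011` returns the old code. Older records that name a since-dissolved entity are where a single vintage will show gaps.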
Adding the extractor in here is going to create a horrible loop: you can't merge the archiver PR without having the metadata in PUDL main, and you can't extract the data without having a production archive published, or else zenodo-cache-sync will fail. I suppose you could work around this by pinning to your PUDL with the metadata locally, but it'll make it hard for anyone else to reproduce this. So I'd propose splitting the extraction into a separate PR and getting the metadata piece merged in first (which I think is pretty much good to go?).
Yes, good point! I will remove all the extractor stuff and make a new PR.
Overview
Working on #3884.
What problem does this address?
- Just add the FIPS codes metadata.
Out of scope (for the other PR, #4019):
- pudl.helpers.add_fips_ids should work like it does right now, but with our newer archived data and without the error found in "Incorrect county FIPS code for Bedford, VA" #3531
Documentation
Make sure to update relevant aspects of the documentation.
Tasks
Testing
How did you make sure this worked? How can a reviewer verify this?
To-do list