Add Social Vulnerability Index (SVI) subpackage #169

Merged
merged 50 commits into NOAA-OWP:main from svi_subpackage on May 24, 2022

Conversation

aaraney
Member

@aaraney aaraney commented Jan 28, 2022

This PR adds a client for programmatically accessing the Centers for Disease Control and Prevention's (CDC) Social Vulnerability Index (SVI).

"Social vulnerability refers to the potential negative effects on communities caused by external stresses on human health. Such stresses include natural or human-caused disasters, or disease outbreaks. Reducing social vulnerability can decrease both human suffering and economic loss." [source]

The SVI has been released 5 times (2000, 2010, 2014, 2016, and 2018) and calculates a relative percentile ranking in four theme categories, plus an overall ranking, for a given geographic context and geographic scale. The themes are:

  • Socioeconomic
  • Household Composition & Disability
  • Minority Status & Language
  • Housing Type & Transportation

Rankings are calculated relative to a geographic context: a single state or all states (the United States). For example, a ranking calculated for some location at the United States geographic context is relative to all other ranked locations in the United States. Similarly, SVI rankings are calculated at two geographic scales, census tract and county, meaning each ranking corresponds to either a census tract or a county. For completeness, if you were to retrieve the 2018 SVI at the census tract scale and the state context for Alabama, you would receive 1,180 records (the number of census tracts in AL in the 2010 census), where each percentile ranking is calculated relative to the other census tracts in Alabama. The tool released in this PR only supports querying for rankings calculated at the United States geographic context. Future work will add support for retrieving rankings at the state geographic context.

Documentation for each yearly release of the SVI is located below:

Example

from hydrotools.svi_client import SVIClient

client = SVIClient()
df = client.get(
    location="AL", # state / nation name (e.g. "alabama" or "United States") also accepted; case insensitive
    geographic_scale="census_tract", # "census_tract" or "county"
    year="2018", # 2000, 2010, 2014, 2016, or 2018
    geographic_context="national" # only "national" supported. "state" will be supported in the future
    )
print(df)
       state_name state_abbreviation  ... svi_edition                                           geometry
0         alabama                 al  ...        2018  POLYGON ((-87.21230 32.83583, -87.20970 32.835...
1         alabama                 al  ...        2018  POLYGON ((-86.45640 31.65556, -86.44864 31.655...
...           ...                ...  ...         ...                                                ...
29498     alabama                 al  ...        2018  POLYGON ((-85.99487 31.84424, -85.99381 31.844...
29499     alabama                 al  ...        2018  POLYGON ((-86.19941 31.80787, -86.19809 31.808...

Additions

  • adds a client for programmatically accessing the Centers for Disease Control and Prevention's (CDC) Social Vulnerability Index (SVI)

Testing

  1. Integration tests are included that test all valid SVI queries currently supported by the tool.
  2. Some unit tests are included for utility functions.

Todos

  • Future work will add support for retrieving rankings at the state geographic context.

Checklist

  • PR has an informative and human-readable title
  • PR is well outlined and documented. See #12 for an example
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows project standards (see CONTRIBUTING.md)
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output) using numpy docstring formatting
  • Placeholder code is flagged / future todos are captured in comments
  • Reviewers requested with the Reviewers tool ➡️

@aaraney
Member Author

aaraney commented Jan 28, 2022

@jarq6c

python/svi_client/setup.cfg (review thread — outdated, resolved)
@jarq6c
Collaborator

jarq6c commented Feb 3, 2022

If the data source is inherently spatial, would it make sense for get to return a geopandas.GeoDataFrame? The canonical standard already includes a geometry column for this purpose.

@aaraney
Member Author

aaraney commented Feb 3, 2022

If the data source is inherently spatial, would it make sense for get to return a geopandas.GeoDataFrame? The canonical standard already includes a geometry column for this purpose.

Thanks for asking! I was planning on bringing this up as I got further along with this PR. My thinking is, as a user I want the ability to have the geospatial data and not to have the geospatial data. The geospatial data column will contribute more to memory usage and I would like the ability to limit memory consumption if my problem does not require geospatial data.

However, as I read back over what I just wrote and consider the meaning of "canonical," I think I am persuading myself towards your view point. We have a canonical format and I think it makes sense to include all fields from the canonical format when a dataset has those fields. Do you share that view? And, what are your thoughts on having a method that returns data that includes the geospatial data and one that excludes the geospatial data?
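For illustration, both behaviors could live in a single return type: a caller who does not need the spatial data can simply drop the geometry column and reclaim the memory it holds. A minimal sketch (the frame below is hypothetical stand-in data, not real client output, which would be a geopandas.GeoDataFrame with true vector geometries):

```python
import pandas as pd

# Hypothetical frame standing in for a client.get() result; the geometry
# column holds placeholder strings where real geometries would be.
gdf = pd.DataFrame({
    "state_name": ["alabama", "alabama"],
    "rank": [0.5, 0.7],
    "geometry": ["POLYGON ((...))", "POLYGON ((...))"],
})

# Users who do not need spatial data can drop the geometry column,
# trading it away for a smaller memory footprint.
df = gdf.drop(columns="geometry")
```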

@jarq6c
Collaborator

jarq6c commented Feb 3, 2022

If the data source is inherently spatial, would it make sense for get to return a geopandas.GeoDataFrame? The canonical standard already includes a geometry column for this purpose.

Thanks for asking! I was planning on bringing this up as I got further along with this PR. My thinking is, as a user I want the ability to have the geospatial data and not to have the geospatial data. The geospatial data column will contribute more to memory usage and I would like the ability to limit memory consumption if my problem does not require geospatial data.

However, as I read back over what I just wrote and consider the meaning of "canonical," I think I am persuading myself towards your view point. We have a canonical format and I think it makes sense to include all fields from the canonical format when a dataset has those fields. Do you share that view? And, what are your thoughts on having a method that returns data that includes the geospatial data and one that excludes the geospatial data?

Emphasis on inherently spatial. If the original data source is a spatial format like a Shapefile, GeoJSON, or GeoCSV then I think we'd want to include a geometry column and return the data as a GeoDataFrame. There are other data sources like NWM NetCDF or NWIS JSON/WaterML that include spatial data but aren't necessarily inherently spatial data formats (i.e. lacking in explicit vector geometry). These formats might return latitude and longitude columns in a pandas.DataFrame instead.

@aaraney
Member Author

aaraney commented Feb 3, 2022

If the data source is inherently spatial, would it make sense for get to return a geopandas.GeoDataFrame? The canonical standard already includes a geometry column for this purpose.

Thanks for asking! I was planning on bringing this up as I got further along with this PR. My thinking is, as a user I want the ability to have the geospatial data and not to have the geospatial data. The geospatial data column will contribute more to memory usage and I would like the ability to limit memory consumption if my problem does not require geospatial data.
However, as I read back over what I just wrote and consider the meaning of "canonical," I think I am persuading myself towards your view point. We have a canonical format and I think it makes sense to include all fields from the canonical format when a dataset has those fields. Do you share that view? And, what are your thoughts on having a method that returns data that includes the geospatial data and one that excludes the geospatial data?

Emphasis on inherently spatial. If the original data source is a spatial format like a Shapefile, GeoJSON, or GeoCSV then I think we'd want to include a geometry column and return the data as a GeoDataFrame. There are other data sources like NWM NetCDF or NWIS JSON/WaterML that include spatial data but aren't necessarily inherently spatial data formats (i.e. lacking in explicit vector geometry). These formats might return latitude and longitude columns in a pandas.DataFrame instead.

For this work, I agree with you and I'll make the change to only return geopandas.GeoDataFrames from the get method. However, the design decision to let a data source's format dictate our output format is not something I'm overly keen about. I don't think that's exactly what you meant, but I just wanted to say it more plainly. My opinion is that we try to ensure a common interface for a data source, even if that data source's source format and composition changes.

@jarq6c
Collaborator

jarq6c commented Feb 3, 2022

I think we have some leeway since objects like geopandas.GeoDataFrame and dask.dataframe.DataFrame inherit from pandas.DataFrame and can be coerced into the canonical format. If we wanted to adhere to a strictly pandas-based standard, you could include a switch parameter as you suggested.

For example, the nwm_client_new get method includes an analogous boolean switch called compute that returns a pandas.DataFrame if True (default behavior) or a dask.dataframe.DataFrame if False.
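The switch pattern described here could be sketched as follows; every name in this example is illustrative, not the actual svi_client or nwm_client_new API:

```python
import pandas as pd

def get(location: str, as_geodataframe: bool = True) -> pd.DataFrame:
    """Sketch of a boolean output switch (hypothetical names throughout)."""
    data = pd.DataFrame({
        "state_name": [location.lower()],
        "rank": [0.5],
        "geometry": ["POLYGON ((...))"],  # placeholder for real geometry
    })
    if as_geodataframe:
        # a real client would wrap this in a geopandas.GeoDataFrame here
        return data
    # opt out of the spatial column for a smaller memory footprint
    return data.drop(columns="geometry")
```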

@jarq6c
Collaborator

jarq6c commented Feb 11, 2022

CDC_Social_Vulnerability_Index_2018 (FeatureServer)
https://services3.arcgis.com/ZvidGQkLaDJxRSJ2/arcgis/rest/services/CDC_Social_Vulnerability_Index_2018/FeatureServer

@aaraney
Member Author

aaraney commented Mar 20, 2022

Looking back at this work, I think it would be helpful for me moving forward if we could decide what SVI fields our "flagship" api will support. I would also like to support a "raw" svi api for users who want to use our tools to just get SVI data in a dataframe format. This comment should provide background and enough information to get us started. I'll include my thoughts / opinions in a separate comment below.

Background

To, hopefully, start the conversation: the SVI has been released 5 times (2000, 2010, 2014, 2016, and 2018). The SVI calculates a relative percentile ranking in four theme categories for a given geographic extent:

  1. Socioeconomic
  2. Household Composition & Disability
  3. Minority Status & Language
  4. Housing Type & Transportation

The values used to calculate the percentile ranking for each of the four themes are summed, for each record, to calculate an overall percentile ranking.

Rankings are calculated relative to a given state or the entire United States. For all editions of the SVI, rankings are computed at the census tract spatial scale. Meaning, if you were to retrieve the 2018 SVI at the census tract scale with state coverage for Alabama, you would receive 1,180 records (the number of census tracts in AL in the 2010 census), where each percentile ranking is calculated relative to the other census tracts in Alabama. From 2014 onward, the SVI is also offered at the county scale in both the state and U.S. coverage products. The state coverage products allow intra-state comparison and the U.S. coverage products allow national comparison at the census tract or county scales (2014 onward).

Luckily, the facets used to calculate each theme are included in all datasets. A facet is one of the fields contributing to the calculation of a theme's value (e.g. per capita income). In the 2018 edition, there are 124 column headers. This number has fluctuated (mainly increased) with new releases of the SVI:

Number of columns for each release:

  • 2000: 79
  • 2010: 107
  • 2014: 127
  • 2016: 125
  • 2018: 124

Question

The main question I would like to address through discussion is: what fields should be included in the canonical output for the flagship svi client get method? It boils down to which location metadata fields and SVI fields we think should be included.

Additional Information

Below are links for SVI documentation broken down by year:

"2000": "https://www.atsdr.cdc.gov/placeandhealth/svi/documentation/pdf/SVI2000Documentation-H.pdf"
"2010": "https://www.atsdr.cdc.gov/placeandhealth/svi/documentation/pdf/SVI-2010-Documentation-H.pdf"
"2014": "https://www.atsdr.cdc.gov/placeandhealth/svi/documentation/pdf/SVI2014Documentation_01192022.pdf"
"2016": "https://www.atsdr.cdc.gov/placeandhealth/svi/documentation/pdf/SVI2016Documentation_01192022.pdf"
"2018": "https://www.atsdr.cdc.gov/placeandhealth/svi/documentation/pdf/SVI2018Documentation_01192022_1.pdf"

@aaraney
Member Author

aaraney commented Mar 20, 2022

My thought is that this flagship get api should include a limited subset of fields from the SVI. If a user wants all the raw fields, they can use a get_all_fields method (I am very open to a more intuitive method name). For starters, IMO the included fields should be <name: description>:

state_fips: State FIPS code
state_name: State name
county_name: County name
fips: Census tract or county fips code
svi_edition: year corresponding to the SVI release (this assumes two SVIs will not be released in a given year in the future)
geometry: County or census tract simple features geometry
rank_theme_1: Socioeconomic
rank_theme_2: Household Composition / Disability
rank_theme_3: Minority Status / Language
rank_theme_4: Housing Type / Transportation
rank_svi: aggregated overall percentile ranking
value_theme_1: Socioeconomic
value_theme_2: Household Composition / Disability
value_theme_3: Minority Status / Language
value_theme_4: Housing Type / Transportation
value_svi: aggregated overall value; sum of values from themes 1, 2, 3, 4.

As always, this is a conversation and I hope we can find what is best for hydrotools users together! I would really value some opinions on this topic!

Pinging @jarq6c.

@jarq6c
Collaborator

jarq6c commented Mar 21, 2022

That synopsis is extremely useful! I may share that widely. Thanks a lot.

I have no objections to this initial subset for get. I think the only thing that irks me is the wide-format. Each dataframe row is a unique county or census tract, so we end up with really wide dataframes. In the future, we might consider a long_format parameter to generate dataframes more similar to the canonical format.

Something like this:

  state_fips state_name county_name  fips svi_edition geometry    value_theme value
0       0000         ZZ         foo  1111        2018  POLYGON  Socioeconomic  99.9
1       4444         XX         bar  2222        2018  POLYGON            All  26.3
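Such a long format could be produced from the wide ranking columns with pandas.melt; a sketch with illustrative column names (not the client's actual schema):

```python
import pandas as pd

# hypothetical wide-format frame: one ranking column per theme
wide = pd.DataFrame({
    "fips": ["1111", "2222"],
    "rank_theme_1": [0.10, 0.20],  # Socioeconomic
    "rank_svi": [0.99, 0.26],      # overall SVI
})

# melt stacks the theme columns, yielding one row per location per theme
long = wide.melt(id_vars=["fips"], var_name="theme", value_name="rank")
```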

As for additional metadata, you might consider:

  1. get returning a wide dataframe based on some parameter
  2. get returning multiple dataframes based on some parameter
  3. punting retrieval of metadata to a separate method with an interface similar to get (i.e. get_metadata)

Thanks for putting this together!

@aaraney
Member Author

aaraney commented Mar 21, 2022

That synopsis is extremely useful! I may share that widely. Thanks a lot.

Great! Please do!

Right, I see that we share the same irk. There are just so many columns. I like your proposed solution. I think long term get should return a long-formatted dataframe. Alternatively, the shape of this data makes it a good candidate for pandas MultiIndexes; however, I am hesitant. I've not seen many libraries that return multi-index dataframes "in the wild" as they are a little niche. If you disagree, please interject.

Moving forward, I think we are on the same page about next steps for this PR. I will work on get returning a wide-formatted dataframe that follows the column schema I listed above. I am still learning details about the SVI, so I think it makes sense to stay in a wide format during this "discovery" phase. Then, as the PR moves forward and hopefully after a first review, we can revisit and formalize the long format we want to implement, and release 0.0.1 with the long format and any other relevant get_x_metadata methods. Thoughts?

@jarq6c
Collaborator

jarq6c commented Mar 21, 2022

I use MultiIndexes frequently for intermediate processing, but often have to break out of MultiIndexes due to compatibility issues, mostly when attempting to write these dataframes to disk. The only methods I know of that return MultiIndex dataframes by default are groupby methods.
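For illustration, a toy example of the groupby behavior described, plus the reset_index call typically used to break out of the MultiIndex before writing to disk:

```python
import pandas as pd

df = pd.DataFrame({
    "state": ["AL", "AL", "WY"],
    "theme": ["svi", "svi", "svi"],
    "value": [1.0, 2.0, 3.0],
})

# grouping on two keys returns a result indexed by a MultiIndex
grouped = df.groupby(["state", "theme"]).sum()

# reset_index flattens the MultiIndex back into ordinary columns, which
# sidesteps most serialization compatibility issues
flat = grouped.reset_index()
```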

I agree, I think sticking to a more data-source-native wide format is the best way to go in the short term.

@hellkite500
Member

hellkite500 commented Mar 21, 2022

Would it make sense to extend the get api using some kwargs? For example

#get the basic data (default fields, "wide format")
basic_data = client.get(arg1, arg2)
#get the basic data (default fields, "long format")
long_data = client.get(arg1, arg2, long_form=True)
all_fields = client.available_fields(year)
#get all data
all_data = client.get(arg1, arg2, fields=all_fields)
#get a single field
single_data = client.get(arg1, arg2, fields=all_fields[0])
#get data with multi index
data = client.get(arg1, arg2, multi_index=True)

The backend can be organized around some common functions/transformations, and the get api always returns data, but takes into account some of the ways a user may want to use it.

@jarq6c
Collaborator

jarq6c commented Mar 23, 2022

Would it make sense to extend the get api using some kwargs? For example

#get the basic data (default fields, "wide format")
basic_data = client.get(arg1, arg2)
#get the basic data (default fields, "long format")
long_data = client.get(arg1, arg2, long_form=True)
all_fields = client.available_fields(year)
#get all data
all_data = client.get(arg1, arg2, fields=all_fields)
#get a single field
single_data = client.get(arg1, arg2, fields=all_fields[0])
#get data with multi index
data = client.get(arg1, arg2, multi_index=True)

The backend can be organized around some common functions/transformations, and the get api always returns data, but takes into account some of the ways a user may want to use it.

Yeah, this makes sense to me.

aaraney added 5 commits April 10, 2022 22:02
type encapsulates hydrotools canonical svi fields. in the future this
should be separated to a hydrotools canonical module once a provider
class has been created.
…uture svi providers will specify a mapping from a location abbreviation to their naming needs
@aaraney
Member Author

aaraney commented Apr 11, 2022

Would it make sense to extend the get api using some kwargs? For example

#get the basic data (default fields, "wide format")
basic_data = client.get(arg1, arg2)
#get the basic data (default fields, "long format")
long_data = client.get(arg1, arg2, long_form=True)
all_fields = client.available_fields(year)
#get all data
all_data = client.get(arg1, arg2, fields=all_fields)
#get a single field
single_data = client.get(arg1, arg2, fields=all_fields[0])
#get data with multi index
data = client.get(arg1, arg2, multi_index=True)

The backend can be organized around some common functions/transformations, and the get api always returns data, but takes into account some of the ways a user may want to use it.

I am not keen to use kwargs to change the output format in the get method. It would differ from other hydrotools' get methods convention. I would rather just use a long format by default and provide an example for going from the long format back to a wide format in documentation.
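The documentation example mentioned here might look like this sketch, going from a long format back to wide with pandas.pivot (column names assumed from the schema discussed earlier in the thread):

```python
import pandas as pd

# hypothetical long-format client output: one row per location per theme
long = pd.DataFrame({
    "fips": ["1111", "1111", "2222", "2222"],
    "theme": ["socioeconomic", "svi", "socioeconomic", "svi"],
    "rank": [0.10, 0.99, 0.20, 0.26],
})

# pivot recovers a wide frame: one row per location, one column per theme
wide = long.pivot(index="fips", columns="theme", values="rank").reset_index()
```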


valid_years = typing.get_args(Year)
if year_str not in valid_years:
    valid_years = sorted(valid_years)
Collaborator

This doesn't work because valid_years contains str and int types. See below:

from hydrotools.svi_client import SVIClient

client = SVIClient()

print(client.svi_documentation_url(2020))
Traceback (most recent call last):
  File "main.py", line 7, in <module>
    print(client.svi_documentation_url(2020))
  File "/home/jregina/Projects/hydrotools/python/svi_client/src/hydrotools/svi_client/clients.py", line 199, in svi_documentation_url
    year = utilities.validate_year(year)
  File "/home/jregina/Projects/hydrotools/python/svi_client/src/hydrotools/svi_client/types/utilities.py", line 46, in validate_year
    valid_years = sorted(valid_years)
TypeError: '<' not supported between instances of 'int' and 'str'
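One possible fix is to sort with a string key so mixed int and str members compare cleanly. A sketch (the Year alias below stands in for the package's actual Literal type):

```python
import typing

# stand-in for the package's Year Literal, which mixes int and str members
Year = typing.Literal[2000, 2010, "2000", "2010"]

valid_years = typing.get_args(Year)

# key=str avoids the "'<' not supported between 'int' and 'str'" TypeError
sorted_years = sorted(valid_years, key=str)
```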

@jarq6c
Collaborator

jarq6c commented May 3, 2022

You might consider using categories for some values. Categories reduced the memory footprint from 15.0 MB to 1.0 MB in this example.

from hydrotools.svi_client import SVIClient

client = SVIClient()
gdf = client.get("AL", "census_tract", "2018")

print("BEFORE CATEGORIZATION")
print(gdf.info(memory_usage="deep"))

str_cols = gdf.select_dtypes(include=object).columns
gdf[str_cols] = gdf[str_cols].astype("category")

print("AFTER CATEGORIZATION")
print(gdf.info(memory_usage="deep"))
BEFORE CATEGORIZATION
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 29500 entries, 0 to 29499
Data columns (total 11 columns):
 #   Column              Non-Null Count  Dtype   
---  ------              --------------  -----   
 0   state_name          29500 non-null  object  
 1   state_abbreviation  29500 non-null  object  
 2   county_name         29500 non-null  object  
 3   state_fips          29500 non-null  object  
 4   county_fips         29500 non-null  object  
 5   fips                29500 non-null  object  
 6   theme               29500 non-null  object  
 7   rank                29500 non-null  float64 
 8   value               29500 non-null  float64 
 9   svi_edition         29500 non-null  object  
 10  geometry            29500 non-null  geometry
dtypes: float64(2), geometry(1), object(8)
memory usage: 15.0 MB
None

AFTER CATEGORIZATION
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 29500 entries, 0 to 29499
Data columns (total 11 columns):
 #   Column              Non-Null Count  Dtype   
---  ------              --------------  -----   
 0   state_name          29500 non-null  category
 1   state_abbreviation  29500 non-null  category
 2   county_name         29500 non-null  category
 3   state_fips          29500 non-null  category
 4   county_fips         29500 non-null  category
 5   fips                29500 non-null  category
 6   theme               29500 non-null  category
 7   rank                29500 non-null  float64 
 8   value               29500 non-null  float64 
 9   svi_edition         29500 non-null  category
 10  geometry            29500 non-null  geometry
dtypes: category(8), float64(2), geometry(1)
memory usage: 1.0 MB
None

# lowercase and strip all leading and trailing white spaces from str columns for consistent
# output and quality control
df_dtypes = df.dtypes
str_cols = df_dtypes[df_dtypes == "object"].index
Collaborator

This might be a dumb question, but does this always select only string columns?

Member Author

I don't think it was a dumb question. I don't think it will always return a string column. I think datetime columns will also be included. Because of this comment, I changed this check to use pd.DataFrame.select_dtypes as you showed above.
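A quick check of the behavior in question: string columns carry object dtype, while parsed datetimes carry datetime64, so select_dtypes(include=object) leaves the latter out. Unparsed Python datetime objects stored with object dtype would, however, still be selected.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["foo", "bar"],                                # str -> object
    "when": pd.to_datetime(["2018-01-01", "2018-06-01"]),  # datetime64[ns]
    "value": [1.0, 2.0],                                   # float64
})

# only true object-dtype columns survive; parsed datetime64 columns do not
str_cols = df.select_dtypes(include=object).columns
```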

@jarq6c
Collaborator

jarq6c commented May 3, 2022

I'm having some trouble understanding the output, even after looking at the documentation. Here's an example:

from hydrotools.svi_client import SVIClient

client = SVIClient()
gdf = client.get(
    location="WY",
    geographic_scale="county",
    year="2016",
    geographic_context="national"
    )

gdf.to_file("wyoming_svi_county.geojson", driver="GeoJSON")

print(client.svi_documentation_url(2016))

Once I've got wyoming_svi_county.geojson, I load it up into QGIS and view the attribute table:
[screenshot: QGIS attribute table sample output]

Notice there are five entries for Albany County for the svi theme. The rank is the same, but the values are different. This pattern repeats for the other themes. Is this expected output?

Collaborator

@jarq6c jarq6c left a comment

I like the way this package is structured. The only real "bug" is that the intended error is never raised for invalid years. There could be some memory optimizations and we'll want to check that the current output is what's expected (and possibly document why it's like that).

Overall, great work!

@aaraney
Member Author

aaraney commented May 20, 2022

I'm having some trouble understanding the output, even after looking at the documentation. Here's an example:

from hydrotools.svi_client import SVIClient

client = SVIClient()
gdf = client.get(
    location="WY",
    geographic_scale="county",
    year="2016",
    geographic_context="national"
    )

gdf.to_file("wyoming_svi_county.geojson", driver="GeoJSON")

print(client.svi_documentation_url(2016))

Once I've got wyoming_svi_county.geojson I load it up into QGIS and view the attribute table [screenshot]

Notice there are five entries for albany county for the svi theme. The rank is the same, but the values are different. This pattern repeats for the other themes. Is this expected output?

Thanks for finding this issue, @jarq6c! The problem was how I was melting the dataframes. I've pushed a change that resolves the issue you brought to light. Thanks!

@aaraney
Member Author

aaraney commented May 20, 2022

You might consider using categories for some values. Categories reduced the memory footprint from 15.0 MB to 1.0 MB in this example.
...

Awesome! Thanks for suggesting this! I've pushed a change that now casts string columns to categories!

@aaraney
Member Author

aaraney commented May 20, 2022

It looks like pip removed --use-feature=in-tree-build in 22.1.

Usage:   
  /opt/hostedtoolcache/Python/3.9.12/x64/bin/python3 -m pip install [options] <requirement specifier> [package-index-options] ...
  /opt/hostedtoolcache/Python/3.9.12/x64/bin/python3 -m pip install [options] -r <requirements file> [package-index-options] ...
  /opt/hostedtoolcache/Python/3.9.12/x64/bin/python3 -m pip install [options] [-e] <vcs project url> ...
  /opt/hostedtoolcache/Python/3.9.12/x64/bin/python3 -m pip install [options] [-e] <local project path> ...
  /opt/hostedtoolcache/Python/3.9.12/x64/bin/python3 -m pip install [options] <archive url/path> ...

option --use-feature: invalid choice: 'in-tree-build' (choose from '2020-resolver', 'fast-deps')

I removed this option when installing subpackages in all our github actions.

@aaraney
Member Author

aaraney commented May 20, 2022

Also, it seems like pytest does not like it when two test files have the same name, even if they are in separate directories. See this GH Actions log.

@aaraney
Member Author

aaraney commented May 20, 2022

Alright, so with the additions I made today, this is pretty much ready to be merged. I just need to add documentation in the form of a readme and add a description to the PR. However, the tool is now functional! I'll get that done tomorrow! Have a great weekend.

@aaraney
Member Author

aaraney commented May 24, 2022

@jarq6c, just updated the readme and PR description. Once you and others are satisfied with the code, we should be ready to merge and release!

@jarq6c jarq6c self-requested a review May 24, 2022 19:28
Collaborator

@jarq6c jarq6c left a comment

Looks good to me! Passed all tests and output now makes sense. The input validation also appears to be working.

The only gotcha I ran into (which is my fault) is that geopandas.GeoDataFrame.to_file is not compatible with CategoricalDtype. This can be resolved by converting categories to strings before writing the file.

from hydrotools.svi_client import SVIClient

client = SVIClient()
gdf = client.get(
    location="WY",
    geographic_scale="county",
    year="2000",
    geographic_context="national"
    )

cols = gdf.select_dtypes("category")
for c in cols:
    gdf.loc[:, c] = gdf[c].astype(str)

gdf.to_file("wyoming_svi_county.geojson", driver="GeoJSON")

@jarq6c
Collaborator

jarq6c commented May 24, 2022

We'll pin a release before and after merging this in. I'll pin a release after #191 is merged.

@aaraney
Member Author

aaraney commented May 24, 2022

Great find regarding to_file's incompatibility. Do you think it merits mentioning that in documentation somewhere, @jarq6c?

@jarq6c
Collaborator

jarq6c commented May 24, 2022

Great find regarding to_file's incompatibility. Do you think it merits mentioning that in documentation somewhere, @jarq6c?

Might be a good idea to at least add to the README.md example.

@jarq6c jarq6c merged commit f9e0638 into NOAA-OWP:main May 24, 2022
@aaraney
Member Author

aaraney commented May 26, 2022

Live on pypi.

@jarq6c
Collaborator

jarq6c commented May 26, 2022

Live on pypi.

Thanks! I was going to take this chance to look over the other packages before pinning a HydroTools v3.0.0 release with the svi_client. If you've got any pet issues you want resolved before the next major release feel free to mention them.

@aaraney
Member Author

aaraney commented May 26, 2022

Live on pypi.

Thanks! I was going to take this chance to look over the other packages before pinning a HydroTools v3.0.0 release with the svi_client. If you've got any pet issues you want resolved before the next major release feel free to mention them.

Sorry if I jumped the gun. I should have communicated with you before pushing to pypi.

As for pet issues, I would like to update the svi_client's readme as you suggested above. The other changes I would like to include are a little larger in scope, so maybe they can be added in a future minor package release.

@aaraney aaraney deleted the svi_subpackage branch May 19, 2023 01:24