Selecting a subset of fields while reading (for speed up) #429

Closed
oztalha opened this issue Mar 7, 2017 · 4 comments
@oztalha

oztalha commented Mar 7, 2017

This is a feature request, ported from a Stack Overflow question: only read specific attribute columns of a shapefile with [...] Fiona, to speed up reading from a shapefile.

@oztalha changed the title from "supporting selecting subset of fields (to speed up reading from a shapefile)" to "Selecting a subset of fields while reading (for speed up)" on Mar 7, 2017
@snorfalorpagus
Member

This could be supported using the SetIgnoredFields functionality in OGR. Its argument is a blacklist of fields to ignore. A whitelist (more akin to usecols in pandas) could be supported by taking the difference from the list of all the fields. Additionally, it's possible to ignore the special geometry field. I've had a situation before where I wasn't really interested in the geometry, just the attributes.

http://www.gdal.org/classOGRLayer.html#a5e0c3427f64249d1c35cefb487546b10
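
A minimal sketch of the "difference from a list of all the fields" idea, in plain Python so it runs without GDAL installed. The helper name `fields_to_ignore` and the sample field names are hypothetical, not part of Fiona or OGR; the returned list is the kind of value that could then be handed to OGR's SetIgnoredFields.

```python
def fields_to_ignore(all_fields, wanted):
    """Turn a whitelist into the blacklist that OGR expects.

    `all_fields` is every attribute field in the layer; `wanted` is the
    subset the caller actually wants to read. Both are assumed inputs.
    """
    wanted = set(wanted)
    # Keep the layer's own field order in the result.
    return [name for name in all_fields if name not in wanted]


# Hypothetical layer schema, for illustration only.
layer_fields = ["name", "population", "area", "notes"]
ignored = fields_to_ignore(layer_fields, ["name", "population"])
print(ignored)
```

With the GDAL Python bindings, a list like `ignored` would be passed as `layer.SetIgnoredFields(ignored)`; according to the OGR documentation, the special name `OGR_GEOMETRY` can be included in that list to skip the geometry as well.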

@sgillies
Member

sgillies commented Mar 10, 2017

@oztalha @snorfalorpagus It looks like we have a choice between specifying fields to get or fields to skip. A big advantage of the latter seems to be that it can easily be a no-op when the field doesn't exist, whereas there's a decision to be made if I ask for field "foo" and it's not in the file: pass or raise an exception?

I'm tentatively putting this down as a new feature for 1.8.
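
The trade-off described above can be sketched as two small helpers. This is only an illustration under assumed names (`resolve_ignore`, `resolve_include`, and the `strict` flag are hypothetical), not how Fiona behaves: a skip list can quietly drop names that are not in the file, while a keep list has to pick a policy for a missing name such as "foo".

```python
def resolve_ignore(all_fields, ignore):
    """Skip-list semantics: unknown names are a no-op."""
    present = set(all_fields)
    return [name for name in ignore if name in present]


def resolve_include(all_fields, include, strict=True):
    """Keep-list semantics: return the fields to skip.

    A requested name that is not in the file either raises (strict=True)
    or is passed over silently (strict=False).
    """
    present = set(all_fields)
    missing = [name for name in include if name not in present]
    if missing and strict:
        raise KeyError(f"fields not found: {missing}")
    keep = set(include)
    return [name for name in all_fields if name not in keep]


fields = ["name", "population", "area"]
print(resolve_ignore(fields, ["foo", "area"]))                   # "foo" is dropped
print(resolve_include(fields, ["name", "foo"], strict=False))    # "foo" is passed over
```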

@sgillies added this to the 1.8.0 milestone on Mar 10, 2017
@snorfalorpagus
Member

I agree that blacklisting would be easier to implement and is actually the interface that OGR provides. I think I'd find whitelisting more useful. Can we do both? Or is that confusing?
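
One way to offer both without much confusion is to accept either argument but reject the combination. The sketch below is a hypothetical interface (the function `select_fields` and its `include` and `ignore` parameters are made up for illustration), not the design that was eventually merged.

```python
def select_fields(all_fields, include=None, ignore=None):
    """Return the fields that would be read, given one of two options."""
    if include is not None and ignore is not None:
        # Supporting both at once raises the question of which one wins,
        # so this sketch simply refuses the combination.
        raise ValueError("pass either include or ignore, not both")
    if include is not None:
        keep = set(include)
        return [name for name in all_fields if name in keep]
    skip = set(ignore or ())
    return [name for name in all_fields if name not in skip]


fields = ["name", "population", "area"]
print(select_fields(fields, include=["name"]))
print(select_fields(fields, ignore=["area"]))
```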

@snorfalorpagus
Member

This was implemented in #488.

#469 is related.
