
Deserialize Optimizations #82

Merged · 2 commits · Nov 29, 2017

Conversation

@jwass (Contributor) commented Nov 27, 2017

Hello, this PR has a couple of parsing optimizations in two commits:

  • When creating `Vec`s for GeoJSON objects, initialize with `Vec::with_capacity(...)`, since we know what size to expect.
  • Avoid a couple of critical `clone()` calls.

Currently, when parsing from a string, the entire JSON value is cloned after it is parsed, and each Feature is then cloned again when parsing a FeatureCollection. I believe this is because serde_json's `as_array()` and `as_object()` functions return borrowed references. I avoid those calls by doing the `match` directly (by adding the `_owned` macros), which allows moving those objects without cloning.

I tested this with a release build on a ~330 MB GeoJSON file (all Polygons): current master parses it in about 18 seconds, while with these changes it takes under 10 s. Parsing with serde_json alone takes about 8.2 s.

I'm pretty new to Rust, so feel free to let me know if these changes are a bad idea, or if you have any other suggestions.
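The borrowed-vs-owned distinction described above can be sketched with a toy value type. The enum below is a hypothetical stand-in for `serde_json::Value` (only `as_array` mirrors the real crate's API; everything else is illustrative), showing why a borrowing accessor forces a `clone()` while matching on the owned value lets the `Vec` be moved out for free:

```rust
// Minimal stand-in for serde_json::Value (hypothetical, for illustration).
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Number(f64),
    Array(Vec<Value>),
}

impl Value {
    // Borrowing accessor, like serde_json's as_array(): it returns a
    // reference, so taking ownership of the elements requires cloning.
    fn as_array(&self) -> Option<&Vec<Value>> {
        match self {
            Value::Array(v) => Some(v),
            _ => None,
        }
    }
}

// Clone-heavy path: borrow the array, then copy every element.
fn elements_cloning(value: &Value) -> Vec<Value> {
    let arr = value.as_array().expect("expected an array");
    let mut out = Vec::with_capacity(arr.len()); // pre-size: length is known
    for v in arr {
        out.push(v.clone()); // extra copy of each element
    }
    out
}

// Owned path: matching on the value by move takes the Vec without
// cloning any element.
fn elements_owned(value: Value) -> Vec<Value> {
    match value {
        Value::Array(v) => v, // moved out, zero copies
        _ => panic!("expected an array"),
    }
}

fn main() {
    let v = Value::Array(vec![Value::Number(1.0), Value::Number(2.0)]);
    let cloned = elements_cloning(&v);
    let moved = elements_owned(v); // consumes v
    assert_eq!(cloned, moved);
}
```

For a multi-hundred-megabyte document, the cloning path duplicates every nested array and object once per level it is extracted from, which is consistent with the roughly 2x slowdown reported above.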

The two commits:

  • Call `Vec::with_capacity(...)` when we know what the expected length will be.
  • When parsing FeatureCollection and Feature JSON objects, avoid a `.clone()` call.
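The first commit's effect can be sketched in a few lines (the element type and count here are illustrative, not from the PR): pre-sizing with `with_capacity` allocates once up front instead of repeatedly reallocating and copying as the `Vec` grows.

```rust
fn main() {
    let n = 1_000;
    // Pre-size: one allocation, since the final length is known in advance.
    let mut coords: Vec<(f64, f64)> = Vec::with_capacity(n);
    for i in 0..n {
        coords.push((i as f64, i as f64)); // never triggers a reallocation
    }
    assert_eq!(coords.len(), n);
    assert!(coords.capacity() >= n);
}
```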
@KodrAus (Contributor) commented Nov 28, 2017

Thanks @jwass!

@KodrAus (Contributor) commented Nov 28, 2017

bors r+

bors bot commented Nov 28, 2017

🔒 Permission denied

(Existing reviewers can add KodrAus as a reviewer.)

@KodrAus (Contributor) commented Nov 28, 2017

@frewsxcv looks like bors says no :(

@jwass Thanks again for doing this! Historically we've had two serialisation frameworks to support, so optimisation for any particular one took a backseat. That's not the case anymore so we can freely make the serde implementation as fast as we want. There are probably lots more opportunities to speed things up so if you find any more then PRs would be really appreciated!

@urschrei (Member) commented

bors r+

bors bot commented Nov 29, 2017

🔒 Permission denied

(Existing reviewers can add urschrei as a reviewer.)

@frewsxcv (Member) commented

@KodrAus @urschrei just made you both reviewers so you should have permissions to use bors in the future

bors r+

bors bot added a commit that referenced this pull request Nov 29, 2017
82: Deserialize Optimizations r=frewsxcv a=jwass

(commit message repeats the PR description above)
bors bot commented Nov 29, 2017

Build succeeded

bors bot merged commit b4ac405 into georust:master on Nov 29, 2017.
@jwass deleted the opt branch on Nov 29, 2017.