
What is the expected performance overhead of using nom_locate? #89

Open
jcornaz opened this issue May 13, 2023 · 2 comments

jcornaz commented May 13, 2023

Hi,

I wrote a parser using nom which was capable of parsing ~100 Mb/s of beancount syntax on my machine.

After introducing nom_locate by changing all input types from &str to LocatedSpan<&str> (you can look at the diff),
the parser's throughput was halved, parsing only ~50 Mb/s of beancount syntax on the same machine.
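To illustrate the kind of change involved, here is a made-up minimal example (not the actual beancount parser; it assumes nom 7 and nom_locate 4). The same toy parser is written first over &str and then over LocatedSpan<&str>; the parser body is identical, only the input type changes:

```rust
use nom::{bytes::complete::tag, character::complete::alpha1, sequence::preceded, IResult};
use nom_locate::LocatedSpan;

type Span<'a> = LocatedSpan<&'a str>;

// Before: plain &str input.
fn keyword_str(input: &str) -> IResult<&str, &str> {
    preceded(tag("#"), alpha1)(input)
}

// After: LocatedSpan<&str> input. The parser body is unchanged, but every
// time input is consumed the span also updates its offset/line bookkeeping.
fn keyword_span(input: Span) -> IResult<Span, Span> {
    preceded(tag("#"), alpha1)(input)
}

fn main() {
    let (_, word) = keyword_str("#assets rest").unwrap();
    let (_, span) = keyword_span(Span::new("#assets rest")).unwrap();
    // The located version additionally knows where the match starts.
    println!("{} at line {}, column {}", span.fragment(), span.location_line(), span.get_column());
    assert_eq!(word, *span.fragment());
}
```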

It is no biggie. I was of course expecting a performance cost (there is more work being done), and the performance is still largely acceptable (and still faster than pest). But I was nevertheless surprised by the magnitude of the hit: I did not expect nom_locate to take half of the parsing time.

So, I am just asking, is what I observe expected? Have you observed a similar performance hit in your usages of nom_locate? Could it be that I am misusing nom_locate?

Again, this is not an issue. But as the "discussions" are disabled, I don't know how else to open a discussion on the subject.

jcornaz changed the title from "performance overhead of using `nom_locate`" to "What is the expected performance overhead of using nom_locate?" on May 13, 2023
progval commented Jul 8, 2023

Thanks for the benchmark. nom_locate's overhead seems to be overwhelmingly in this function:

https://github.com/fflorent/nom_locate/blame/c61618312d96a51cd7b957831b03dfbbcc5f58c7/src/lib.rs#L665-L691

with roughly a quarter of that in memchr and the rest in the function itself (callgrind doesn't seem able to be more specific than this at a high optimization level).
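As far as I understand, the cost comes from the fact that every time a parser consumes a prefix of the input, the span scans those consumed bytes for newlines so its line counter stays correct. A simplified sketch of that bookkeeping (not the crate's actual code; names are made up):

```rust
use memchr::memchr_iter;

// Simplified stand-in for a located span's internal state.
struct SpanState<'a> {
    fragment: &'a str,
    offset: usize,
    line: u32,
}

impl<'a> SpanState<'a> {
    // Advance past `consumed_len` bytes, as every slice/take must do.
    fn advance(&self, consumed_len: usize) -> SpanState<'a> {
        let consumed = &self.fragment[..consumed_len];
        // Counting newlines in the consumed prefix is where memchr shows up
        // in the profile; it runs on every successful parse step.
        let newlines = memchr_iter(b'\n', consumed.as_bytes()).count() as u32;
        SpanState {
            fragment: &self.fragment[consumed_len..],
            offset: self.offset + consumed_len,
            line: self.line + newlines,
        }
    }
}

fn main() {
    let s = SpanState { fragment: "foo\nbar", offset: 0, line: 1 };
    let s = s.advance(4); // consume "foo\n"
    assert_eq!((s.offset, s.line, s.fragment), (4, 2, "bar"));
}
```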

RUSTFLAGS="-C target-cpu=native" and memchr's libc feature don't seem to provide any speedup on my Alder Lake, so I'm afraid that's it.

I'd love to hear feedback from other users of the crate, though.

progval pinned this issue on Aug 13, 2023

JulesGuesnon commented Nov 6, 2023

Hey! Quick feedback: I've been using this lib recently to implement a JSON parser, and overall it was working well until I tried to parse canada.json. The file is quite small (2.5 MB), but the first time I tried to parse it, it took ~150 s. Considering that citm_catalog.json is 1.65 MB and parsed in ~30 ms, there was clearly a big issue. I did some profiling and updated my parsing implementation, which brought the parse time down to ~29 s; better, but still way too slow. I ended up writing my own implementation of the input type, which brought the parse time down to 40 ms.

My implementation is quite different from nom_locate's, and I wrote it specifically for my need to have line and column numbers, so I don't think it matches the goals of this crate 100%, but I'm sharing it in case it helps: https://github.com/JulesGuesnon/spanned-json-parser/blob/c5f6ade651a7f47c3fe08802c510e2d23286f10e/src/input.rs#L264-L300
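I haven't studied the linked input.rs in detail, but one general way to avoid paying the line-counting cost on every slice is to carry only a byte offset while parsing and recover line/column lazily, by scanning the original input only when a position is actually reported. A rough sketch of that idea (made-up function name, byte-based columns for simplicity):

```rust
use memchr::{memchr_iter, memrchr};

// Compute a 1-based (line, column) for a byte offset into `input`.
fn line_and_column(input: &str, offset: usize) -> (usize, usize) {
    let before = &input.as_bytes()[..offset];
    // Line = number of '\n' seen before the offset, plus one.
    let line = memchr_iter(b'\n', before).count() + 1;
    // Column = distance from just after the last '\n' (or from the start).
    let line_start = memrchr(b'\n', before).map(|i| i + 1).unwrap_or(0);
    (line, offset - line_start + 1)
}

fn main() {
    let src = "{\n  \"a\": 1\n}";
    assert_eq!(line_and_column(src, 0), (1, 1));
    assert_eq!(line_and_column(src, 4), (2, 3)); // the opening quote of "a"
}
```

The trade-off is that each lookup is linear in the prefix before the offset, so this only wins when positions are needed rarely (errors, final spans) rather than on every token.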
