perf: parser speed (lexer rework) #474

Merged
merged 13 commits into from
Jan 24, 2024

Conversation

lukecotter
Contributor

@lukecotter lukecotter commented Jan 15, 2024

Description

Parser performance improvement from v1.12.1

Testing shows between 1.5 and 2.7 times faster performance.

[Images: Screenshot 2024-01-24 at 19 45 53, Screenshot 2024-01-24 at 19 52 47]

This is mainly down to:

  • using a generator to create tokens on demand and looping over the lines only once. This keeps memory low and reduces GC overhead, which is expensive.
  • caching namespaces to avoid re-parsing namespace strings, which is expensive.
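
A minimal sketch of the namespace caching idea, not the code from this PR. `parseNamespace`, `cachedNamespace`, and the "text before the first dot" rule are hypothetical stand-ins for whatever the real parser does; the point is only that a `Map` lookup replaces repeated parsing of the same string.

```typescript
// Hypothetical sketch: memoize a costly namespace lookup with a Map.
const namespaceCache = new Map<string, string>();
let parseCalls = 0; // counts cache misses, for illustration only

// Stand-in for the expensive parse. Toy rule (an assumption):
// the namespace is the text before the first ".", else "default".
function parseNamespace(signature: string): string {
  parseCalls++;
  const dot = signature.indexOf(".");
  return dot === -1 ? "default" : signature.slice(0, dot);
}

// Returns the cached namespace, parsing only on the first sighting.
function cachedNamespace(signature: string): string {
  let ns = namespaceCache.get(signature);
  if (ns === undefined) {
    ns = parseNamespace(signature);
    namespaceCache.set(signature, ns);
  }
  return ns;
}
```

Debug logs tend to repeat the same signatures many times, so under that assumption most lookups become a single hash lookup after the first parse.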

Type of change (check all applicable)

  • 🐛 Bug fix
  • ✨ New feature
  • ♻️ Refactor
  • ⚡ Performance Improvement
  • 📝 Documentation
  • 🔧 Chore
  • 💥 Breaking change

[optional] Any images / gifs / video

Related Tickets & Documents

Related Issue #475
fixes #
resolves #
closes #

Added tests?

  • 👍 yes
  • 🙅 no, not needed
  • 🙋 no, I need help

Added to documentation?

  • 🔖 README.md
  • 🔖 CHANGELOG.md
  • 📖 help site
  • 🙅 not needed

[optional] Are there any post-deployment tasks we need to perform?

In testing, parsing a large 70 MB log went from ~2s to 1.3s: about 35% less time (~1.5 times faster).

- Improved performance by reducing log line creation and parsing to a single loop (generator function).
- Instead of holding all the line strings in memory upfront (one giant array), each line is read only when it is needed.
- The line iterator now keeps only one line in memory at a time.
- Avoided high GC cost by manually splitting the string on the newline character instead of using a string split.
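
The points above can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation merged here: `iterateLines`, `tokenize`, and the `Token` shape are hypothetical names, and the `\r` handling is an added assumption. It shows a generator that finds each newline with `indexOf` and yields one line at a time, so that lexing happens in the same pass.

```typescript
// Yields one line at a time by scanning for "\n" with indexOf,
// instead of building a full array with log.split("\n").
function* iterateLines(log: string): Generator<string> {
  let start = 0;
  while (start < log.length) {
    let end = log.indexOf("\n", start);
    if (end === -1) end = log.length;
    // Drop a trailing "\r" so CRLF input behaves the same (assumption).
    const stop = end > start && log.charCodeAt(end - 1) === 13 ? end - 1 : end;
    yield log.slice(start, stop);
    start = end + 1;
  }
}

// Hypothetical token shape; a real log line would carry more fields.
interface Token {
  lineNumber: number;
  text: string;
}

// Single pass: each line is turned into a token as soon as it is read.
function* tokenize(log: string): Generator<Token> {
  let lineNumber = 0;
  for (const text of iterateLines(log)) {
    lineNumber++;
    if (text.length > 0) yield { lineNumber, text }; // skip blank lines
  }
}
```

A consumer would pull tokens with `for (const token of tokenize(log)) { ... }`, so only the current line is alive at any moment and the intermediate array of every line is never allocated.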
@lukecotter lukecotter marked this pull request as draft January 15, 2024 17:51
@lukecotter
Contributor Author

Need to merge namespace work first
#473

@lcottercertinia lcottercertinia changed the title from "perf: parser speed" to "perf: parser speed (lexer rework)" Jan 16, 2024
@lcottercertinia lcottercertinia marked this pull request as ready for review January 24, 2024 22:20
@lcottercertinia lcottercertinia merged commit d18da63 into certinia:main Jan 24, 2024
3 checks passed
@lukecotter lukecotter deleted the perf-parser-speed branch February 2, 2024 17:47
2 participants