
[PERF]: Improved server-side serialization with orjson + async #1695

Conversation

@tazarov (Contributor) commented Feb 3, 2024

Future work: add orjson serialization on the client side

Description of changes

Summarize the changes made by this PR.

  • Improvements & Bug fixes
    • Added orjson serialization, improving serialization performance, especially on large batches (100+ docs, ~2x faster; tested with Locust); a minimal sketch follows this list
    • Added async body serialization, further improving performance (tested with Locust)
    • ⚠️ Fixed an issue with SQLite's lack of case_sensitive_like for FastAPI server thread-pool connections
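As a rough illustration of the first two bullets (an assumption about the shape of the change, not the exact wiring used by this PR), the sketch below serves responses through FastAPI's orjson-backed response class and decodes request bodies with orjson. The `/add` route and payload keys are made up for the example, and the SQLite case_sensitive_like fix is a separate change not shown here.

```python
# Minimal sketch, assuming FastAPI and orjson are installed.
# The /add route and payload shape are illustrative, not Chroma's actual API surface.
import orjson
from fastapi import FastAPI, Request
from fastapi.responses import ORJSONResponse  # serializes responses with orjson.dumps

app = FastAPI(default_response_class=ORJSONResponse)

@app.post("/add")
async def add(request: Request):
    # Read the raw body asynchronously, then decode with orjson; on large
    # batches (100+ docs) this is where most of the serialization win shows up.
    payload = orjson.loads(await request.body())
    return {"received": len(payload.get("documents", []))}
```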

Test plan

How are these changes tested?

  • Tests pass locally with pytest for python, yarn test for js
  • Locust performance tests

For performance tests, 5 concurrent clients were querying, and 1 was adding pre-computed OpenAI (ada-002) embeddings (runtime ~1m).

Batch size: 100 (1000 also attached in a zip)
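For context, a hedged sketch of what such a Locust setup could look like is shown below. The endpoint paths, payload shapes, and 1536-dimensional random vectors (ada-002's dimensionality) are placeholders, not the exact locustfile behind the attached results.

```python
# Hedged sketch of the test shape described above (not the exact locustfile used):
# 5 querying users for every 1 writer adding pre-computed embeddings in batches.
import random
from locust import HttpUser, task, between

BATCH_SIZE = 100          # the attached runs used 100 and 1000
DIM = 1536                # ada-002 embedding dimensionality

def random_embedding():
    return [random.random() for _ in range(DIM)]

class QueryUser(HttpUser):
    weight = 5            # 5 concurrent querying clients
    wait_time = between(0.1, 0.5)

    @task
    def query(self):
        # Placeholder endpoint/payload, not Chroma's real API route.
        self.client.post("/query", json={"query_embeddings": [random_embedding()], "n_results": 10})

class AddUser(HttpUser):
    weight = 1            # 1 client adding pre-computed embeddings
    wait_time = between(0.1, 0.5)

    @task
    def add(self):
        batch = {
            "embeddings": [random_embedding() for _ in range(BATCH_SIZE)],
            "documents": ["doc"] * BATCH_SIZE,
        }
        self.client.post("/add", json=batch)
```

A roughly one-minute run against a local server could be started with something like `locust -f locustfile.py --headless -u 6 -r 6 --run-time 1m --host http://localhost:8000`.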

Tests with existing main codebase:

[Screenshot 2024-02-03 at 17 52 39]

Tests with orjson + async:

[Screenshot 2024-02-03 at 17 53 17]

1m-100batch-5concur.zip
1m-1000batch-5concur.zip

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs repository?

Refs

github-actions bot commented Feb 3, 2024

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (readability, modularity, intuitiveness)?

@tazarov (Contributor, Author) commented Feb 9, 2024

@HammadB, can you rerun failed tests?

@HammadB (Collaborator) commented Feb 9, 2024

@tazarov yep - reran!

@tazarov tazarov force-pushed the feature/orjson-parser-async-endpoints branch from 8314c50 to ae4733c on February 22, 2024 at 11:16
- Added orjson serialization, improving serialization performance, especially on large batches (100+ docs, ~2x faster; tested with Locust)
- Added async body serialization, further improving performance (tested with Locust)
- Added async handling with AnyIO for the more impactful server queries, further reducing response times for concurrent requests

Future work: add orjson serialization on the client side
- Removed commented-out code
- Added clarification for to_thread usage
- Configurable via `chroma_server_thread_pool_size` setting (see the sketch below)
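The commit notes above mention to_thread offloading with a pool size exposed as `chroma_server_thread_pool_size`. Below is a minimal sketch of that pattern with AnyIO; the setting name comes from the commit message, while the wrapper function and the constant are illustrative stand-ins.

```python
# Sketch of bounding to_thread offloading with an AnyIO CapacityLimiter, assuming
# a chroma_server_thread_pool_size-style setting; everything except the AnyIO
# APIs themselves is an illustrative name.
from anyio import CapacityLimiter, to_thread

chroma_server_thread_pool_size = 40            # stand-in for the real setting value
_limiter = CapacityLimiter(chroma_server_thread_pool_size)

async def run_sync_bounded(func, *args):
    # Run a blocking function on a worker thread, capped by the limiter so the
    # server cannot spawn an unbounded number of threads under load.
    return await to_thread.run_sync(func, *args, limiter=_limiter)
```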
@tazarov tazarov force-pushed the feature/orjson-parser-async-endpoints branch from d3b442b to d844c2f on February 29, 2024 at 09:54
@tazarov (Contributor, Author) commented Feb 29, 2024

@HammadB, rebased and removed the to_thread offloading.

@tazarov (Contributor, Author) left a comment

@HammadB, I think this is ready now.

@tazarov tazarov force-pushed the feature/orjson-parser-async-endpoints branch from fd0b2a0 to d844c2f on March 6, 2024 at 19:38
@tazarov (Contributor, Author) left a comment

@HammadB, I've started working through our discussion from yesterday, and I think I'm making some assumptions about Auth (and other middlewares, e.g. rate limiting) that will not work well as they are right now. I think both rate limiting and authz need to be made async, which they are not at the moment.

The bottom line is that I can offload the async body reads to a thread pool (required since I want to keep the methods sync for now) and then continue with the sync processing as it is today, until we have all the parts ready to make the FastAPI methods async.
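One way to read that split (an assumption about the approach, not the PR's final code) is sketched below: await the body on the event loop, decode it with orjson, and hand the parsed payload to the existing synchronous handler on a worker thread, so auth, rate limiting, and telemetry can stay sync for now. `process_add` is a hypothetical stand-in for the current sync path.

```python
# Sketch only: process_add stands in for the existing synchronous handler.
import orjson
from fastapi import FastAPI, Request
from starlette.concurrency import run_in_threadpool

app = FastAPI()

def process_add(payload: dict) -> dict:
    # Placeholder for the current sync processing path (validation, SQLite, etc.).
    return {"received": len(payload.get("documents", []))}

@app.post("/add")
async def add(request: Request):
    payload = orjson.loads(await request.body())        # async read + fast decode
    return await run_in_threadpool(process_add, payload)  # processing stays sync
```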

…f the main loop

TODO: Telemetry/Authz has the potential to negatively impact things - needs async refactor
@tazarov (Contributor, Author) commented Mar 8, 2024

@HammadB, I think this is good to go, but we need to have a quick follow-up about the following (a rough sketch of the direction is included after this list):

  • async telemetry
  • async auth
  • async rate limiting, etc.
  • async anything IO-bound that runs in the main event loop
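As a hedged sketch of that direction, the snippet below shows an awaitable auth check wired in as a FastAPI dependency, so an IO-bound verification call does not block the event loop. `AsyncAuthProvider`, `verify`, `require_auth`, and the `/collections` route are illustrative names, not Chroma's actual interfaces.

```python
# Illustrative sketch: make IO-bound checks awaitable instead of blocking the loop.
from fastapi import Depends, FastAPI, HTTPException, Request

app = FastAPI()

class AsyncAuthProvider:
    async def verify(self, token: str) -> bool:
        # In a real provider this would be an awaitable call to a token store or
        # external identity service rather than a blocking HTTP/DB call.
        return token == "secret"  # stand-in logic

auth = AsyncAuthProvider()

async def require_auth(request: Request) -> None:
    token = request.headers.get("Authorization", "")
    if not await auth.verify(token):
        raise HTTPException(status_code=401)

@app.get("/collections", dependencies=[Depends(require_auth)])
async def list_collections():
    return []
```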

@HammadB (Collaborator) commented Mar 20, 2024

Can we stack those changes on top so we can merge it all down coherently?

@tazarov (Contributor, Author) commented Mar 28, 2024

@HammadB, closing this in favor of the #1938 stack.

@tazarov tazarov closed this Mar 28, 2024
beggers added a commit that referenced this pull request Apr 4, 2024
This is PR #1695 migrated to the Chroma repo for stacking the outstanding changes.

## Description of changes

*Summarize the changes made by this PR.*
- Improvements & Bug fixes
  - Added orjson serialization, improving serialization performance, especially on large batches (100+ docs, ~2x faster; tested with Locust)
  - Added async body serialization, further improving performance (tested with Locust)
  - ⚠️ Fixed an issue with SQLite's lack of case_sensitive_like for FastAPI server thread-pool connections

## Test plan
*How are these changes tested?*

- [x] Tests pass locally with `pytest` for python, `yarn test` for js
- [x] Locust performance tests

For performance tests, **5** concurrent clients were querying, and **1** was adding pre-computed OpenAI (ada-002) embeddings (runtime **~1m**).

**Batch size**: 100 (1000 also attached in a zip)


Tests with existing `main` codebase:

![Screenshot 2024-02-03 at 17 52 39](https://github.com/chroma-core/chroma/assets/1157440/5d87b4d5-4dae-48fe-908c-7c09db2a5abc)


Tests with orjson + async:

![Screenshot 2024-02-03 at 17 53 17](https://github.com/chroma-core/chroma/assets/1157440/d9818fdd-11c3-45c9-81dd-8baecbb638cf)


[1m-100batch-5concur.zip](https://github.com/chroma-core/chroma/files/14152062/1m-100batch-5concur.zip)

[1m-1000batch-5concur.zip](https://github.com/chroma-core/chroma/files/14152063/1m-1000batch-5concur.zip)


## Documentation Changes
*Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*


## Refs

- https://showmax.engineering/articles/json-python-libraries-overview
- https://github.com/ijl/orjson
- https://catnotfoundnear.github.io/finding-the-fastest-python-json-library-on-all-python-versions-8-compared.html

---------

Co-authored-by: Ben Eggers <[email protected]>