[BUG] Make sure Client parameters are strings #1577
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving.
Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
@beggers should we force port to be an |
chromadb/__init__.py
Outdated
@@ -255,3 +282,9 @@ def AdminClient(settings: Settings = Settings()) -> AdminAPI:
    """
    return AdminClientCreator(settings=settings)


def _stringify_headers(headers: Optional[Dict[str, str]]) -> None:
I feel like we can be more suggestive with the type-hinting here. If `headers` is already `Dict[str, str]` then we would not need to cast. More explicitly,
- def _stringify_headers(headers: Optional[Dict[str, str]]) -> None:
+ def _stringify_headers(headers: Optional[Dict[str, str | int]]) -> None:
I would have suggested a `SupportsStr` type but I think every(?) object in python has a `__str__` method? See related exchange https://www.mail-archive.com/[email protected]/msg24545.html
@jeffchuber @rancomp for the reasons stated above we probably want headers to be |
@beggers my understanding of type-hints in python is similar to yours and I agree with you that PS, because practically anything can be stringified, I want to bring up the option to directly validate the values types through PSS, I thought about it. Do we want to mutate
Returning new headers makes sense, done. I still think the type annotation should be |
@beggers, what do you think of replacing the host and port altogether in favor of a URI-based endpoint? Here are some wins:
I know this does not fit in this PR, but we can work in this direction in parallel. Regarding the headers, all three versions of HTTP:
State that headers are US-ASCII chars so it does make sense to have them as Dict[str,str] without explicitly enforcing the type check as the underlying libs usually do this for us (or throw an appropriate error) |
I would prefer we keep our current separate |
chromadb/__init__.py
Outdated
    host = str(host)
    port = int(port)
    ssl = bool(ssl)
    headers = _stringify_headers(headers)
should we insist all headers are strings?
We do this with the type annotations.
However type annotations are statically checked only when you feel like it and don't actually enforce anything at runtime -- this PR exists because a user was hitting a subtle bug when they passed in an `int` for `port` instead of a `string`.
Our options here are to stringify headers or simply pass them as they're passed to us. @tazarov requested below that we pass them as-is and I don't feel strongly about it so that's what we do now. Removed the `_stringify_headers` stuff.
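To illustrate the point about annotations not being enforced at runtime, here is a minimal sketch. The `connect` function is a hypothetical stand-in, not chromadb's `HttpClient`; it only reports which type actually arrived.

```python
from typing import get_type_hints


def connect(host: str = "localhost", port: str = "8000") -> str:
    # Hypothetical stand-in for a client factory (assumption, not chromadb code).
    # The annotation says `str`, but nothing checks it when the call happens.
    return f"{type(port).__name__}:{port}"


# The hint is only metadata: tools can read it, the interpreter ignores it.
assert get_type_hints(connect)["port"] is str

print(connect(port="8000"))  # the annotated type
print(connect(port=8000))    # an int is accepted without any error
```

A static checker such as mypy would flag the second call, but only if the caller chooses to run one.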
    return ClientCreator(tenant=tenant, database=database, settings=settings)


def HttpClient(
    host: str = "localhost",
-    port: str = "8000",
+    port: int = 8000,
what will happen if the user currently is passing a string? just a type error, right?
Users can pass whatever they want -- these are just type hints which users can choose to statically check or not. If a user is currently passing a string we'll turn it into an int with the `port = int(port)` line below. This will not break anyone.
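A minimal sketch of the coercion described here, assuming a hypothetical helper named `normalize_params` that mirrors the `str(host)`, `int(port)`, and `bool(ssl)` casts from the diff. It is not the actual body of `HttpClient`.

```python
from typing import Tuple, Union


def normalize_params(
    host: str = "localhost",
    port: Union[int, str] = 8000,
    ssl: bool = False,
) -> Tuple[str, int, bool]:
    # Hypothetical helper (assumption): apply the same casts as the diff so
    # that callers passing "8000" and callers passing 8000 end up identical.
    return str(host), int(port), bool(ssl)


# Both the old string form and the new int form normalize to the same value.
assert normalize_params(port="8000") == normalize_params(port=8000)
print(normalize_params(port="8000"))
```

One caveat with this style of cast: `bool()` on any non-empty string is `True`, so a caller passing `ssl="false"` would end up with SSL enabled.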
chromadb/__init__.py
Outdated
    """

    # https://github.com/chroma-core/chroma/issues/1573
Do we need to link to this issue in 4 separate places? It just feels a little messy; I'd rather add a one-liner with the rationale.
Changed these all to a comment, lmk what you think. Happy to change.
chromadb/__init__.py
Outdated
# Despite type hints, users may pass in non-string values for headers.
def _stringify_headers(headers: Optional[Dict[str, str]]) -> Optional[Dict[str, str]]:
I don't think we should be converting headers to strings here. Instead, we should let the requests library error out on headers that are not strings; see psf/requests#3491 (comment) and https://github.com/psf/requests/blob/72eccc8dd8b7c272e520f22b0256386c80864e94/src/requests/utils.py#L1040
Sounds good, removed _stringify_headers
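For comparison with the removed `_stringify_headers`, here is a minimal sketch of what rejecting non-string header values, rather than converting them, could look like. `check_headers` is a hypothetical helper written for illustration; it is neither code from this PR nor the requests implementation, which performs its own check.

```python
from typing import Dict, Optional


def check_headers(headers: Optional[Dict[str, str]]) -> Optional[Dict[str, str]]:
    # Hypothetical helper (assumption): fail loudly on non-string values
    # instead of silently converting them with str().
    if headers is None:
        return None
    for name, value in headers.items():
        if not isinstance(value, str):
            raise TypeError(
                f"Header {name!r} must be a str, got {type(value).__name__}"
            )
    return headers


print(check_headers({"X-Token": "abc"}))
try:
    check_headers({"X-Retries": 3})  # type: ignore[dict-item]
except TypeError as exc:
    print(exc)
```

Passing headers through untouched, as the PR ends up doing, leaves this kind of check to the underlying HTTP library.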
I have to disagree with it not bringing any value. Consider having only host and port; how would one add a path or query param in the URL (adding more params to the HttpClient would make it messier)? |
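As a rough sketch of the URI-based idea raised here, the standard library's `urllib.parse.urlsplit` can recover host, port, scheme, path, and query from a single string. `Endpoint` and `parse_endpoint` are hypothetical names, and the fallback ports are the standard HTTP/HTTPS ports, not chromadb's default.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class Endpoint(NamedTuple):
    host: str
    port: int
    ssl: bool
    path: str
    query: str


def parse_endpoint(url: str) -> Endpoint:
    # Hypothetical helper (assumption): derive the separate client
    # parameters from one URL instead of taking host and port individually.
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or parts.hostname is None:
        raise ValueError(f"Unsupported endpoint URL: {url!r}")
    ssl = parts.scheme == "https"
    # Fall back to the scheme's standard port when none is given.
    port = parts.port if parts.port is not None else (443 if ssl else 80)
    return Endpoint(parts.hostname, port, ssl, parts.path, parts.query)


print(parse_endpoint("https://example.com:8443/api/v1?tenant=demo"))
```

With this shape, a path prefix or query parameter travels in the URL itself rather than needing a new keyword argument.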
Sure, seems reasonable. I would be very happy if we supported a I would also be fine if we added an |
@@ -165,6 +173,13 @@ def HttpClient(
    if settings is None:
        settings = Settings()

    # Make sure parameters are the correct types -- users can pass anything.
should the int conversions have a try/catch in case of failure?
I think we're fine erroring if someone passes in a port that's not an int (same applies to other types). All we would do is throw another error.
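To make the failure mode concrete, the sketch below shows what a bare `int()` cast already does with bad input. `describe_port_cast` is a hypothetical helper used only to report the outcome.

```python
from typing import Any


def describe_port_cast(value: Any) -> str:
    # Hypothetical helper (assumption): report the result of int(value),
    # or the name of the exception that the cast raises.
    try:
        return f"ok: {int(value)}"
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


print(describe_port_cast("8000"))   # a numeric string converts cleanly
print(describe_port_cast("eight"))  # a non-numeric string raises ValueError
print(describe_port_cast(None))     # None raises TypeError
```

Since the built-in cast already raises a descriptive exception, a surrounding try/except would mostly re-raise the same information.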
## Description of changes

*Summarize the changes made by this PR.*

- Improvements & Bug fixes
  - Stringify all parameters to `Client`s which are meant to be strings. At present some parameters -- `port` in particular -- can be reasonably passed as integers which causes weird and unexpected behavior.
  - Fixes chroma-core#1573

## Test plan

*How are these changes tested?*

- [ ] Tests pass locally with `pytest` for python, `yarn test` for js

## Documentation Changes

*Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*