
Fix to avoid overflow and get rid of model_max_length #319

Merged
I8dNLo merged 9 commits into main from 208-max-length on Aug 14, 2024
Conversation

@I8dNLo (Contributor) commented on Aug 6, 2024

Got rid of the hard-coded max_length=512 parameter and replaced it with maxsize to handle such situations; check out model_max_length in the tokenizer config.
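
A minimal sketch of what the described change could look like, assuming "maxsize" refers to sys.maxsize; resolve_max_length is a hypothetical helper for illustration, not fastembed's actual function:

```python
# Sketch only: fall back to an effectively unbounded value instead of a
# hard-coded 512 when the tokenizer config provides no usable limit.
import json
import sys

def resolve_max_length(tokenizer_config_path: str) -> int:
    """Hypothetical helper: read model_max_length, defaulting to sys.maxsize."""
    with open(tokenizer_config_path) as f:
        config = json.load(f)
    return int(config.get("model_max_length", sys.maxsize))
```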

@I8dNLo requested review from generall and joein on August 6, 2024 at 12:13
@joein added the bug (Something isn't working) label on Aug 6, 2024
@joein (Member) left a comment


This does not fix the issue with the mentioned model

The correct value is actually under another field:
"max_length": 512

Also setting a parameter this high might be dangerous (e.g. if we ever set padding to max_length)

I would rather check both fields, take the minimum one, and if both are absent, set some default value (see the sketch at the end of this comment).
But tbh, a default value also does not seem like the greatest idea; I think we should just be more careful with the models we are adding.

(traceback regarding "does not fix the issue")

2024-08-06 19:45:13.789327 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Add node. Name:'/embeddings/Add_1' Status Message: /Users/runner/work/1/s/onnxruntime/core/providers/cpu/math/element_wise_ops.h:560 void onnxruntime::BroadcastIterator::Append(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 512 by 10002

Update:
Some models' tokenizer configs have different values for max_length and model_max_length.
In this particular example it is possible to use the model with RPE as written in the card.
We do not support such modifications, but we should investigate the existing models to choose the right way to fix the issue.
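
A rough sketch of the "check both fields, take the minimum, fall back to a default" idea from this review; pick_max_length, the plausibility cutoff, and the default value are assumptions for illustration, not fastembed code:

```python
# Sketch of the proposed approach: consider both "max_length" and
# "model_max_length", take the smaller sane value, and fall back to a
# conservative default only if neither is usable.
import json

DEFAULT_MAX_LENGTH = 512  # assumed fallback; the thread questions whether any default is wise

def pick_max_length(tokenizer_config_path: str) -> int:
    with open(tokenizer_config_path) as f:
        config = json.load(f)
    candidates = []
    for key in ("max_length", "model_max_length"):
        value = config.get(key)
        # Some configs use a huge sentinel value to mean "unbounded";
        # treat anything implausibly large as absent.
        if isinstance(value, int) and 0 < value < 1_000_000:
            candidates.append(value)
    return min(candidates) if candidates else DEFAULT_MAX_LENGTH
```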

@I8dNLo requested a review from joein on August 13, 2024 at 15:25
Review thread on fastembed/common/preprocessor_utils.py (outdated, resolved)
@@ -148,7 +148,7 @@ def decompress_to_cache(cls, targz_path: str, cache_dir: str):
# Open the tar.gz file
with tarfile.open(targz_path, "r:gz") as tar:
# Extract all files into the cache directory
-        tar.extractall(path=cache_dir)
+        tar.extractall(path=cache_dir, filter="fully_trusted")
A Member left a comment

Suggested change
-        tar.extractall(path=cache_dir, filter="fully_trusted")
+        tar.extractall(path=cache_dir)
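
For context on the suggestion above, a hedged sketch of tarfile's extraction filters: the filter argument exists in Python 3.12+ (and in security backports of earlier versions); "fully_trusted" keeps the legacy extract-everything behaviour, while "data" rejects dangerous members such as absolute paths or links escaping the destination. The archive path and destination below are placeholders:

```python
# Sketch only, assuming a Python version that supports the extraction filter.
import tarfile

with tarfile.open("archive.tar.gz", "r:gz") as tar:
    tar.extractall(path="cache_dir", filter="data")
```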

@I8dNLo merged commit 62607c2 into main on Aug 14, 2024
17 checks passed
@I8dNLo deleted the 208-max-length branch on August 14, 2024 at 12:59
Labels: bug (Something isn't working)
Projects: None yet
2 participants