Combining CLIPA-v2 and SigLIP (both big_vision based) models #660

Merged
merged 24 commits into from
Oct 20, 2023
Commits
64f4644
merge changes for clipa inference
zw615 Jul 26, 2023
546e8ae
update get_tokenizer to pass CI test; replace gelu_appoximate with ac…
zw615 Oct 5, 2023
8450c95
Merge remote-tracking branch 'zwclipa/clipa_torch_inference' into sig…
rwightman Oct 6, 2023
e3c2ea2
Temporary, cannot have a force tf dependency
rwightman Oct 6, 2023
0316911
Supporting SigLIP and CLIPA-v2 models (both sourced from big_vision j…
rwightman Oct 11, 2023
9d8385e
Merge remote-tracking branch 'origin/main' into siglip_clipa_models
rwightman Oct 11, 2023
39ba303
Fix some test failures, remove old v1 CLIPA configs, add add 336 H14 …
rwightman Oct 11, 2023
0724aab
Fix torchscript
rwightman Oct 11, 2023
f04eee8
Fix CoCa expand typo, force final LN after attentional pool
rwightman Oct 12, 2023
2f568cd
Merge branch 'main' into siglip_clipa_models
rwightman Oct 12, 2023
2c396d2
Used wrong default clean fn in SimpleTokenizer, put lower case back
rwightman Oct 12, 2023
e14f34b
Attempt to fix xlm roberta test w/ pretrained hf weight difference
rwightman Oct 12, 2023
3637f9d
SigLIP weights working. More changes to support differing image prepr…
rwightman Oct 17, 2023
72196f1
A typo and unused import
rwightman Oct 17, 2023
948d9e1
Merge remote-tracking branch 'origin/main' into siglip_clipa_models
rwightman Oct 17, 2023
c29cc9c
Fix two small issues, add hf_tokenizer_name to SigLIP models for non …
rwightman Oct 17, 2023
72b75bd
CLIPA reference temppory rwightman/ models for testing
rwightman Oct 17, 2023
b086ddb
Rename profile->profiler to avoid python naming conflict
rwightman Oct 18, 2023
05e9864
More tokenizer rework, add context_len as class attr set in factory, …
rwightman Oct 18, 2023
07f2c16
fix ViT-SO400M-14-SigLIP name
gabrielilharco Oct 19, 2023
d7542e4
Fix CoCa pool LN, improve clarity of ViT pooling logic
rwightman Oct 19, 2023
85f19b8
Exclude first/last tokens from tokens output of text models, should m…
rwightman Oct 19, 2023
a9d8d58
Add eval results for CLIPA + SigLIP models
gabrielilharco Oct 20, 2023
95ae868
Fixup bigG CLIPA config, 83.03 top-1 IN-1k
rwightman Oct 20, 2023
Used wrong default clean fn in SimpleTokenizer, put lower case back
rwightman committed Oct 12, 2023
commit 2c396d25c5879b75e0ebcf995c0fcf31c731c99a
2 changes: 1 addition & 1 deletion src/open_clip/tokenizer.py
@@ -143,7 +143,7 @@ def __init__(
         if canonicalize:
             self.clean_fn = _canonicalize_basic_clean
         else:
-            self.clean_fn = _whitespace_basic_clean
+            self.clean_fn = _lower_whitespace_basic_clean
         self.vocab_size = len(self.encoder)
         self.all_special_ids = [self.encoder[t] for t in special_tokens]
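
To illustrate what this one-line change affects, here is a minimal sketch of the difference between a whitespace-only clean function and one that also lowercases. The two functions below are hypothetical stand-ins written for this example, using only the standard library; they are not the actual `_whitespace_basic_clean` or `_lower_whitespace_basic_clean` from `src/open_clip/tokenizer.py`, whose bodies are not shown in this diff.

```python
import html


def whitespace_clean_sketch(text: str) -> str:
    # Hypothetical stand-in: unescape HTML entities and collapse runs of
    # whitespace, but leave letter case untouched.
    text = html.unescape(text)
    return " ".join(text.split())


def lower_whitespace_clean_sketch(text: str) -> str:
    # Hypothetical stand-in: same cleaning as above, then lowercase.
    return whitespace_clean_sketch(text).lower()


caption = "  A Photo of a   DOG &amp; a Cat "

print(whitespace_clean_sketch(caption))        # A Photo of a DOG & a Cat
print(lower_whitespace_clean_sketch(caption))  # a photo of a dog & a cat
```

With a case-preserving default, "DOG" and "dog" would reach the tokenizer as different strings. The commit message ("put lower case back") suggests lowercasing was the intended default for the non-canonicalize path.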