Commit
* fix: fix wrong PIL import
* feat: add cast for better typing
* feat: clean `CustomCollator` (mostly style edits)
* style: clean colpali_processing_utils and add better typing
* feat: factorize the ColPali processing utils in CustomCollator
* feat: factorize the ColIdefics processing utils in CustomCollator
* feat: restructure the `models` module
* feat: big refacto of the collator classes
* style: tweak bi-encoder losses
* feat: add ColPaliConfig
* doc: tweaks
* build: remove all `import *`
* feat: deprecate `TextRetrieverCollator`
* feat: remove redundant `tokenizer` attribute from `BaseVisualRetrieverProcessor`
* fix: address Manu's comments
* fix: fix typos in `ColIdefics2Processor`
* fix: fix HardNegCollator + style tweaks
* doc: tweak
* feat: deprecate HardNegDocmatixCollator
* feat: revert removing abstract attribute `tokenizer` from BaseVisualRetrieverProcessor
* doc: fix typos
* feat: update `__init__.py` files
* feat: fix typing for `ColPaliProcessor.from_pretrained`
* feat: add better typing and remove prints from CustomEvaluator
* feat: rename CustomEvaluator to CustomRetrievalEvaluator
* feat: tweak `get_torch_device`
* feat: turn `main_input_name` into ClassVar in ColPali
* feat: better `from_pretrained` methods
* feat: use PaliGemma tokenizer in `process_queries`
* feat: modify the processor classes
* feat: deprecate ColPaliConfig
* feat: rename ColPaliProcessor init arg
* feat: better `CustomRetrievalEvaluator`
* feat: move `CustomRetrievalEvaluator` in `evaluation` module
* feat: add input length guardrail in `CustomRetrievalEvaluator`
* feat: add tests for ColPali
* feat: add `hf_token` arg to `ColPaliProcessor`
* Revert "feat: use PaliGemma tokenizer in `process_queries`"
  This reverts commit 7ec95cb.
* feat: reduce mock images' size
* build: remove `.vscode/`
* feat: revert `embedding_dim` attribute to `dim` in ColPali
* feat: put all model directories in 1st level of `models` module
* build: update module path for models in config files
* feat: sort models module by vlm backbone
* fix: fix imports in tests
* feat: rename all Idefics* classes to Idefics2*
* feat: add missing processors for Bi* models
* untested: processor is inherited directly
* feat: inherit processor directly in ColIdefics2Processor
* doc: update docstrings in processor classes
* build: loosen dev deps
* fix: add missing casts in processor tests
* feat: restructure test file structure
* fix: fix wrong init in Bi* processors
* rename
* fix: add texts query to list
* fix: ruff
* feat: remove unused __future__ imports
* build: move pytest config to pyproject
* feat: add logging in `get_torch_device`
* feat: set default device to cpu in `test_retrieval_evaluator.py`
* build: add "Ruff" and "Test" CI pipelines
* build: add missing `pillow` dep
* build: update ruff config in pyproject
* build: move `mteb` to compulsory deps + format pyproject
* build: tweak project details in pyproject
* build: remove black and use ruff formatter instead
* build: add missing HF_TOKEN secret in test CI
* feat: remove all `|` for python 3.9 compatibility
* feat: tweak ColPaliProcessor test
* feat: add test for ColPali collator
* build: remove `.python-version`
* fix: fix typo in `compute_hardnegs.py`
* build: unfreeze the numpy dep and make it compulsory
* feat: deprecate `mteb` metrics and remove `mteb` dep
* feat: tweak `CustomRetrievalEvaluator.evaluate`
* feat: rename `CustomRetrievalEvaluator` to `RetrievalScorer` + tweaks
* feat: add `CustomRetrievalEvaluator` as a `mteb` wrapper + update `ColModelTraining`
* chore: update CHANGELOG
* Add scorer in processor (#46)
* add: scorer in processor
* fix: lint
* fix: tests
* fix: bugs
* fix: tests pass
* fix: lint
* fix: tony's coms
* style: lint
* fix: fix wrong typing in processor classes
* fix: fix wrong `score` method override in processors

---------

Co-authored-by: ManuelFay <[email protected]>
Co-authored-by: Manuel Faysse <[email protected]>