Release v0.2.0 · illuin-tech/colpali

[0.2.0]

Large refactoring to address several issues and add features. This release is not backward compatible with previous versions.
The models trained under this version will exhibit degraded performance if used with the previous version of the code and vice versa.

Branch

Added

Added multiple options for training with hard negatives, which leads to better model performance.
Added options for restarting training from a checkpoint.

Changed

Optionally load ColPali models from pre-initialized backbones of the same shape to remove any stochastic initialization when loading adapters. This fixes #11 and #17.
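The effect of this change can be illustrated with a small pure-Python sketch. It is not the library's code: `init_projection`, `load_fresh`, `make_preinitialized_backbone` and `load_from_backbone` are hypothetical names, and the "projection layer" is just a list of lists. The assumption is that some weights are not covered by the adapter and would otherwise be drawn at random on every load.

```python
import random


def init_projection(dim_in, dim_out, rng):
    """Draw a small random weight matrix, as a freshly created layer would."""
    return [[rng.gauss(0.0, 0.02) for _ in range(dim_in)] for _ in range(dim_out)]


def load_fresh(dim_in, dim_out):
    """Hypothetical old behaviour: the layer is re-drawn at random on each load."""
    return init_projection(dim_in, dim_out, random.Random())


def make_preinitialized_backbone(dim_in, dim_out, seed=0):
    """Hypothetical new behaviour: initialize once and store the result."""
    return {"projection": init_projection(dim_in, dim_out, random.Random(seed))}


def load_from_backbone(backbone):
    """Copy the stored weights; nothing random happens at load time."""
    return [row[:] for row in backbone["projection"]]


backbone = make_preinitialized_backbone(dim_in=4, dim_out=2)
first = load_from_backbone(backbone)
second = load_from_backbone(backbone)
print(first == second)                        # identical weights on every load
print(load_fresh(4, 2) == load_fresh(4, 2))   # independent random draws
```

An adapter trained on top of one random draw only behaves as trained if the same draw is present at inference time, which is what loading from a stored backbone guarantees in this sketch.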

Fixed

Set padding side to right in the tokenizer to fix a misalignment issue between different query lengths in the same batch. Fixes #12.
Add 10 extra pad tokens to the query by default to act as reasoning buffers. This enables the above fix to be made without degrading performance and cleans up the old technique of using tokens.
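The two fixes above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the token ids, `PAD_ID`, `add_query_buffer` and `pad_batch` are hypothetical, and the buffer tokens are assumed to be part of the query itself (attention mask 1), unlike batch padding (attention mask 0).

```python
PAD_ID = 0            # hypothetical pad token id
N_BUFFER_TOKENS = 10  # number of extra pad tokens stated in the release notes


def add_query_buffer(token_ids, n_buffer=N_BUFFER_TOKENS, pad_id=PAD_ID):
    """Append buffer pad tokens to a single query (assumed to be attended to)."""
    return list(token_ids) + [pad_id] * n_buffer


def pad_batch(queries, side="right", pad_id=PAD_ID):
    """Pad all queries to the longest one and build the attention masks."""
    max_len = max(len(q) for q in queries)
    padded, masks = [], []
    for q in queries:
        fill = [pad_id] * (max_len - len(q))
        if side == "right":
            padded.append(list(q) + fill)
            masks.append([1] * len(q) + [0] * len(fill))
        else:
            padded.append(fill + list(q))
            masks.append([0] * len(fill) + [1] * len(q))
    return padded, masks


short = add_query_buffer([5, 6])
long = add_query_buffer([5, 6, 7, 8])

right, right_mask = pad_batch([short, long], side="right")
left, left_mask = pad_batch([short, long], side="left")

# With right padding the short query starts at index 0 regardless of its batch mates.
print(right[0][:4])
# With left padding the same tokens are shifted by the length difference.
print(left[0][:4])
```

With right padding, token `i` of every query sits at index `i` of its row, so per-position outputs of queries with different lengths line up within a batch.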
