v0.3.0: Extensive package refactor

@tonywu71 tonywu71 released this 10 Sep 15:29
· 54 commits to main since this release
f484161

Description

✨ This release is an extensive refactor of the package that makes ColPali more modular and easier to use.

🚨 It is NOT backward-compatible with previous versions.

Features

Added

  • Restructure the utils module
  • Restructure the model training code
  • Add custom Processor classes to easily process images and/or queries
  • Enable module-level imports
  • Add scoring to processor
  • Add CustomRetrievalEvaluator
  • Add missing typing
  • Add tests for model, processor, scorer, and collator
  • Lint Changelog
  • Add missing docstrings
  • Add "Ruff" and "Test" CI pipelines
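
To illustrate what processor-level scoring means for a multi-vector retriever such as ColPali, here is a minimal sketch of late-interaction (MaxSim) scoring in plain Python. It is not the colpali-engine implementation: the function names `maxsim_score` and `score_all` are hypothetical, and embeddings are assumed to be lists of equal-length float vectors, one per token or image patch.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]) -> float:
    """Late-interaction score: for each query vector, take its best match
    among the document vectors, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


def score_all(
    queries: Sequence[Sequence[Vector]],
    docs: Sequence[Sequence[Vector]],
) -> List[List[float]]:
    """Return a (num_queries x num_docs) score matrix (hypothetical helper)."""
    return [[maxsim_score(q, d) for d in docs] for q in queries]


# Toy data: one query with two token vectors, two documents.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # matches both query vectors exactly
doc_b = [[0.5, 0.5]]              # a single, partially matching vector

scores = score_all([query], [doc_a, doc_b])
best_doc_index = max(range(len(scores[0])), key=lambda i: scores[0][i])
```

In this toy setup `doc_a` ranks above `doc_b` for the query. Refer to the package's own processor classes for the actual scoring entry point and tensor shapes.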

Changed

  • Restructure all modules to closely follow the transformers architecture
  • Greatly simplify the collator implementation to make it model-agnostic
  • ColPaliProcessor's process_queries no longer requires a mock image input
  • Clean pyproject.toml
  • Loosen the required dependencies
  • Replace black with the ruff linter

Removed

  • Remove interpretability and eval_manager modules
  • Remove unused utils
  • Remove TextRetrieverCollator
  • Remove HardNegDocmatixCollator

Fixed

  • Fix wrong PIL import
  • Fix dependency issues

Full Changelog: v0.2.2...v0.3.0