feat: OCR-iser PDF #89

Open
ArtixJP opened this issue Nov 28, 2024 · 3 comments

ArtixJP commented Nov 28, 2024

Benchmark OCR solutions (with a view to eventually integrating one into Albert-API).

ArtixJP assigned ArtixJP and Jlutz75 and unassigned ArtixJP on Nov 28, 2024
leoguillaume self-assigned this on Dec 2, 2024
leoguillaume commented

@Jlutz75 what do you think of https://github.com/DS4SD/docling ?

leoguillaume commented

@ArtixJP there is also this solution from Nvidia, but it requires 2 A100 GPUs: https://github.com/NVIDIA/nv-ingest. What do you think?

Jlutz75 commented Dec 2, 2024

@leoguillaume I find that it tends to degrade the information and skip things... I am testing https://github.com/adithya-s-k/omniparse today.

In the meantime, I have added support for reading "normal" PDFs that do not need OCR, and I am testing it on impôts.gouv.

leoguillaume changed the title from "OCR-iser PDF" to "feat: OCR-iser PDF" on Jan 20, 2025