Better tagger models (more heavyweight) #623

dgarijo · 2024-01-25T11:31:46Z

Once the corpus is improved a bit, we should move towards language models. Training a BERT model to tag the text will probably give better results than the current classifiers. The model would be stored on Hugging Face and downloaded locally.

This would replace the current binary classifiers (at least the text taggers).
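A rough sketch of what loading such a tagger could look like, assuming the `transformers` library and its `pipeline("text-classification", ...)` helper. The model id `example-org/bert-text-tagger` and the tag names are hypothetical placeholders, since no model has been trained or published yet.

```python
from typing import Dict, List

# Hypothetical model id: no such tagger has been published yet.
MODEL_ID = "example-org/bert-text-tagger"


def load_tagger(model_id: str = MODEL_ID):
    """Download the model from the Hugging Face Hub (cached locally after the first call)."""
    # Imported lazily so the rest of the module works without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def select_tags(scores: List[Dict], threshold: float = 0.5) -> List[str]:
    """Keep every label whose score reaches the threshold, sorted alphabetically."""
    return sorted(item["label"] for item in scores if item["score"] >= threshold)
```

With a trained model in place, usage would be along the lines of `select_tags(load_tagger()("To install, run pip install ...", top_k=None))`. In recent `transformers` versions, passing `top_k=None` for a single string returns one `{"label", "score"}` dict per class, which is the shape `select_tags` assumes.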

dgarijo · 2024-02-29T10:24:24Z

Building some taggers with Ollama may be a good idea. However, they are heavy to run for now. Maybe not worth it?

dgarijo · 2024-04-23T16:29:49Z

Especially with things like Llama 3 out, these models seem very promising for addressing some of these problems. See https://github.com/ollama/ollama, where the Llama 3 8B model is listed at 4.7 GB. It will probably be way slower though!
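For reference, a minimal sketch of what an Ollama-based tagger could look like, assuming a local Ollama server on its default port with the `llama3` model already pulled. The label set and the prompt wording are hypothetical, chosen only for illustration; the request uses Ollama's `/api/generate` endpoint with `stream` set to false.

```python
import json
import urllib.request

# Default local Ollama endpoint; assumes `ollama pull llama3` has been run.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Hypothetical label set, for illustration only.
LABELS = ["installation", "usage", "citation", "other"]


def build_payload(text, model="llama3"):
    """Build the JSON body for a single non-streaming generation request."""
    prompt = (
        "Classify the following text into exactly one of these categories: "
        + ", ".join(LABELS)
        + ". Answer with the category name only.\n\n"
        + text
    )
    return {"model": model, "prompt": prompt, "stream": False}


def parse_label(reply):
    """Map the model's free-text reply to a known label, falling back to 'other'."""
    cleaned = reply.strip().lower()
    for label in LABELS:
        if label in cleaned:
            return label
    return "other"


def tag_text(text, model="llama3"):
    """Send the text to the local Ollama server and return the predicted label."""
    data = json.dumps(build_payload(text, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # With stream=False, the generated text is in the "response" field.
    return parse_label(body["response"])
```

Each call to `tag_text` runs a full generation on the 8B model, so this is expected to be much slower per excerpt than the current classifiers.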

dgarijo added the enhancement label (New feature or request) on Jan 25, 2024
