ToDo.md

  1. Extract data from PDF
    • Convert PDF to image (v)
    • Get text from image using Tesseract OCR (v)
    • Determine whether it is an invoice using ML
    • Process the text with ML to see whether it is an invoice
    • Extract total amount, invoice number, and date
  2. Generate invoices of 10 different types
    • Get 10 types of invoice templates from Google
    • Get fictitious data to put on the invoice (company name, invoice amount, etc.)
    • Generate 1,000 of each invoice type
  3. Create ML model to sort invoices from non-invoices
    • Learn how to sort images using ML
    • Sort images using ML
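
Rough sketch for the "extract total amount, invoice number and date" item in step 1. It assumes the OCR text is already available as a plain string. The regular expressions and the `extract_fields` helper are hypothetical starting points, not tuned against real invoices, and will need adjusting for each template.

```python
import re
from typing import Dict, Optional, Pattern

# Hypothetical patterns: real invoice layouts vary, so treat these as a first guess.
# \b keeps "Subtotal" from matching as "total".
TOTAL_RE = re.compile(
    r"\btotal(?:\s+amount)?(?:\s+due)?\s*[:\-]?\s*[$€£]?\s*([\d.,]*\d)",
    re.IGNORECASE,
)
NUMBER_RE = re.compile(
    r"invoice\s*(?:no\.?|number|#)\s*[:\-]?\s*([A-Za-z0-9\-]+)",
    re.IGNORECASE,
)
# Matches ISO dates (2024-03-15) or day/month/year with "/" or "." separators.
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2}|\d{1,2}[/.]\d{1,2}[/.]\d{2,4})\b")


def first_group(pattern: Pattern[str], text: str) -> Optional[str]:
    """Return the first captured group of the first match, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


def extract_fields(text: str) -> Dict[str, Optional[str]]:
    """Pull the three fields out of OCR text; missing fields come back as None."""
    return {
        "number": first_group(NUMBER_RE, text),
        "date": first_group(DATE_RE, text),
        "total": first_group(TOTAL_RE, text),
    }


sample = (
    "ACME Ltd\n"
    "Invoice No: INV-2024-001\n"
    "Date: 2024-03-15\n"
    "Subtotal: 100.00\n"
    "Total: $121.00\n"
)
fields = extract_fields(sample)
```

Values are returned as strings on purpose, since OCR output often needs cleanup (for example "," versus "." as decimal separator) before it can be parsed into numbers or dates.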
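
Possible starting point for the fictitious invoice data in step 2, using only the standard library. The `FakeInvoice` fields, the company names, the value ranges and the `make_invoices` helper are all made up for illustration. Rendering the records onto the templates is not covered here.

```python
import random
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Made-up company names, for illustration only.
COMPANIES = ["Acme Ltd", "Globex BV", "Initech GmbH", "Umbrella SA"]


@dataclass
class FakeInvoice:
    company: str
    number: str
    issued: date
    total: float


def make_invoices(count: int, seed: int = 0) -> List[FakeInvoice]:
    """Generate `count` fictitious invoice records, reproducible per seed."""
    rng = random.Random(seed)
    start = date(2024, 1, 1)
    invoices = []
    for i in range(count):
        invoices.append(
            FakeInvoice(
                company=rng.choice(COMPANIES),
                number=f"INV-{i + 1:05d}",  # sequential, zero-padded
                issued=start + timedelta(days=rng.randrange(365)),
                total=round(rng.uniform(10, 5000), 2),
            )
        )
    return invoices


# 1,000 records for one template type; repeat with another seed per template.
batch = make_invoices(1000, seed=1)
```

Seeding the generator keeps each batch reproducible, which helps when the same data has to be regenerated for training later.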
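
One way the "is it an invoice" check could start out: a tiny bag-of-words naive Bayes classifier over the OCR text, written in plain Python. This is an assumption about the approach, not a decision made in this list. It works on text rather than images, and the four training sentences are invented purely to show the mechanics.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Optional, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = {}  # label -> word frequencies
        self.doc_counts: Counter = Counter()       # label -> number of documents
        self.vocab: Set[str] = set()

    def fit(self, texts: List[str], labels: List[str]) -> "NaiveBayes":
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for token in tokenize(text):
                counts[token] += 1
                self.vocab.add(token)
        return self

    def predict(self, text: str) -> Optional[str]:
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy examples; real training would use the generated invoices.
texts = [
    "Invoice number 123 total amount due 500",
    "Invoice date total VAT payment terms",
    "Dear team the meeting is moved to Friday",
    "Thanks for the lunch see you next week",
]
labels = ["invoice", "invoice", "other", "other"]
model = NaiveBayes().fit(texts, labels)
```

Calling `model.predict(ocr_text)` returns the more likely label for a new document. With real data, a library model trained on the generated invoices would probably replace this.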