feat(loader): implement markdown parsing in MathpixPDFReader #498

Add functionality to properly handle PDF content:

- Add parse_markdown_text_to_tables method to separate tables and text
- Fix load_data implementation to properly process documents
- Fix lazy_load_data method
- Improve document metadata handling for tables and text sections

The loader now correctly processes PDFs through Mathpix API and converts content to proper Document objects.
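For readers who want a feel for the table/text separation named in this commit, here is a minimal sketch. It is not the PR's `parse_markdown_text_to_tables` implementation: the function name `split_tables_and_text` is hypothetical, and it assumes tables arrive as runs of consecutive lines starting with `|` (pipe-style markdown). The actual Mathpix output format may differ.

```python
from typing import List, Tuple


def split_tables_and_text(markdown: str) -> Tuple[List[str], List[str]]:
    """Split markdown into (tables, texts), preserving order within each list.

    Assumption: a table is a run of consecutive lines that start with "|".
    """
    tables: List[str] = []
    texts: List[str] = []
    block: List[str] = []
    in_table = False

    def flush() -> None:
        # Store the accumulated block in the list matching its kind.
        content = "\n".join(block).strip()
        if content:
            (tables if in_table else texts).append(content)
        block.clear()

    for line in markdown.splitlines():
        is_table_line = line.lstrip().startswith("|")
        if is_table_line != in_table:
            # Kind changed: close the current block before switching.
            flush()
            in_table = is_table_line
        block.append(line)
    flush()
    return tables, texts
```

Each returned string could then be wrapped in its own document object, with a metadata field marking it as a table or a text section, which is roughly what the metadata bullet above describes.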

…ation

Remove early returns using super() in load_data and lazy_load_data methods that were preventing the actual implementation from being executed. This fixes the "not implemented" error while maintaining the full PDF reader functionality.
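The failure mode described in this commit can be illustrated with a small sketch. All class names here (`BaseReader`, `BrokenReader`, `FixedReader`) are hypothetical stand-ins, and the base class is assumed to raise `NotImplementedError` from its placeholder `load_data`; the real reader classes in the PR are not reproduced.

```python
from typing import List


class BaseReader:
    """Hypothetical base class whose load_data is only a placeholder."""

    def load_data(self, path: str) -> List[str]:
        raise NotImplementedError("load_data not implemented")


class BrokenReader(BaseReader):
    def load_data(self, path: str) -> List[str]:
        # Early return delegates to the placeholder, so this always raises.
        return super().load_data(path)
        return [f"parsed:{path}"]  # unreachable: the real logic never runs


class FixedReader(BaseReader):
    def load_data(self, path: str) -> List[str]:
        # With the early return removed, the subclass logic executes.
        return [f"parsed:{path}"]
```

Calling `BrokenReader().load_data("a.pdf")` raises `NotImplementedError`, while `FixedReader().load_data("a.pdf")` reaches the subclass body.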

Commits on Nov 15, 2024

Commits on Nov 17, 2024
