feat(loader): implement markdown parsing in MathpixPDFReader #498

Open: wants to merge 2 commits into main

Commits on Nov 15, 2024

  1. ✨ feat(loader): implement markdown parsing in MathpixPDFReader

    Add functionality to properly handle PDF content:
    - Add parse_markdown_text_to_tables method to separate tables and text
    - Fix load_data implementation to properly process documents
    - Fix lazy_load_data method
    - Improve document metadata handling for tables and text sections
    
    The loader now correctly processes PDFs through the Mathpix API and converts content to proper Document objects.
    eliasjudin committed Nov 15, 2024
    0e25e9d
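
As an illustration of the table/text separation this commit describes, here is a minimal sketch. It is not the PR's actual `parse_markdown_text_to_tables` code: the function name `split_tables_and_text` is hypothetical, and it assumes tables arrive in the Mathpix output as `\begin{table} ... \end{table}` blocks, which may not match the real response format.

```python
import re

# Assumption: each table is wrapped in \begin{table} ... \end{table}.
# The real Mathpix output may delimit tables differently.
TABLE_PATTERN = re.compile(r"\\begin\{table\}.*?\\end\{table\}", re.DOTALL)


def split_tables_and_text(content: str) -> tuple[list[str], list[str]]:
    """Split converted PDF text into (tables, texts), preserving order in each list."""
    tables: list[str] = []
    texts: list[str] = []
    cursor = 0
    for match in TABLE_PATTERN.finditer(content):
        # Text between the previous table (or the start) and this table.
        before = content[cursor:match.start()].strip()
        if before:
            texts.append(before)
        tables.append(match.group(0))
        cursor = match.end()
    # Whatever follows the last table is plain text as well.
    tail = content[cursor:].strip()
    if tail:
        texts.append(tail)
    return tables, texts


sample = (
    "Intro paragraph.\n"
    "\\begin{table}\n| a | b |\n\\end{table}\n"
    "Closing paragraph."
)
tables, texts = split_tables_and_text(sample)
```

Each returned string could then be wrapped in its own document object, with metadata marking it as a table or a text section, which is the kind of metadata handling the commit message mentions.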

Commits on Nov 17, 2024

  1. fix(loader): remove super() calls blocking MathpixPDFReader implementation
    
    Remove early returns using super() in the load_data and lazy_load_data methods that were preventing the actual implementation from being executed. This fixes the "not implemented" error while maintaining the full PDF reader functionality.
    eliasjudin committed Nov 17, 2024
    307c5ae
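
The failure mode this commit fixes can be reproduced in a few lines. The sketch below uses hypothetical stand-in classes (`BaseReader`, `BlockedReader`, `FixedReader`), not the project's real ones, and assumes the base class raises `NotImplementedError` from `load_data`, as the "not implemented" error in the commit message suggests.

```python
class BaseReader:
    """Hypothetical base class whose load_data is only a placeholder."""

    def load_data(self, path: str) -> list[str]:
        raise NotImplementedError("not implemented")


class BlockedReader(BaseReader):
    """Before the fix: the early super() return hides the real logic."""

    def load_data(self, path: str) -> list[str]:
        return super().load_data(path)  # raises, so nothing below ever runs
        return [f"parsed:{path}"]  # unreachable implementation


class FixedReader(BaseReader):
    """After the fix: the early return is gone, so the implementation runs."""

    def load_data(self, path: str) -> list[str]:
        return [f"parsed:{path}"]
```

Calling `BlockedReader().load_data("a.pdf")` raises `NotImplementedError`, while `FixedReader().load_data("a.pdf")` reaches the subclass logic. The same pattern would apply to `lazy_load_data`.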