This repository contains several Jupyter notebooks for use with the Cloud Document AI Platform. Use the general notebooks to process any form type, or the specialized notebooks for solutions such as Procurement DocAI or Lending DocAI. These notebooks help you get started with extracting data from your documents, whether you're bringing your own form types or using one of our specialized parsers for invoices, receipts, tax forms, and more.
You must have your own GCP project with billing enabled and have working knowledge of the following products:
- Google Cloud Storage
- Google Document AI Concepts and Processors
- (Optional) AI Platform Notebooks
- Set up your GCP project for Document AI following the Setup Guide.
- Enable the 'Document AI API' in your project in the Document AI Platform.
- Create an AI Platform Notebooks instance with Python 3 using the default configuration, or use an existing one.
- In the notebook, go to Git > Clone a Repository and paste the repository URL.
- Install the required libraries in the notebook terminal:
python -m pip install -r requirements.txt
Please note that these samples also work with Colab and Jupyter notebooks. However, additional authentication is required for service accounts.
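One common way to authenticate outside AI Platform Notebooks is to point Application Default Credentials at a service account key file through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. The sketch below is a minimal illustration under that assumption; the helper name `use_service_account_key` and the key path are hypothetical and not part of this repository.

```python
import os
from pathlib import Path


def use_service_account_key(key_path: str) -> str:
    """Point Application Default Credentials at a service account key file.

    Hypothetical helper for illustration; run it before creating any
    Google Cloud client in the notebook.
    """
    path = Path(key_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Service account key not found: {path}")
    # Google Cloud client libraries read this variable when they look up
    # default credentials.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(path)
    return str(path)


# Example (the path is a placeholder, replace it with your own key file):
# use_service_account_key("~/keys/my-service-account.json")
```

Keep the key file out of version control. On Colab, interactive user authentication may be a simpler option than a key file.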
- Identify which form type or utility you would like to run through a processor.
- Create your processor using the instructions.
- Copy your processor ID.
- Update the PROJECT_ID, LOCATION and PROCESSOR_ID variables in the notebook.
PROJECT_ID = "YOUR_PROJECT_ID_HERE"
LOCATION = "LOCATION" # Format is 'us' or 'eu'
PROCESSOR_ID = "PROCESSOR_ID" # Create processor in Cloud Console
Please note, the location must match the one assigned to the processor.
- Run the notebook.
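To show how the three variables fit together, here is a minimal sketch that checks they no longer hold the placeholder values and then builds the processor resource name and the regional API endpoint that Document AI requests use. The helper names (`check_config`, `processor_name`, `api_endpoint`) and the sample values are assumptions for illustration; the notebooks may assemble these strings differently.

```python
# Placeholder values as they appear in the notebook template.
PLACEHOLDERS = {"YOUR_PROJECT_ID_HERE", "LOCATION", "PROCESSOR_ID"}


def check_config(project_id: str, location: str, processor_id: str) -> None:
    """Raise ValueError if any setting is empty or still a placeholder."""
    settings = (
        ("PROJECT_ID", project_id),
        ("LOCATION", location),
        ("PROCESSOR_ID", processor_id),
    )
    for name, value in settings:
        if not value or value in PLACEHOLDERS:
            raise ValueError(f"{name} is not set: {value!r}")


def processor_name(project_id: str, location: str, processor_id: str) -> str:
    """Build the full resource name that identifies a processor."""
    check_config(project_id, location, processor_id)
    return f"projects/{project_id}/locations/{location}/processors/{processor_id}"


def api_endpoint(location: str) -> str:
    """Build the regional endpoint; it must match the processor's location."""
    return f"{location}-documentai.googleapis.com"


# Hypothetical sample values, not real resources.
print(processor_name("my-project", "eu", "abc123"))
# → projects/my-project/locations/eu/processors/abc123
print(api_endpoint("eu"))
# → eu-documentai.googleapis.com
```

With the `google-cloud-documentai` client library, the endpoint is typically supplied through the client options and the resource name through the process request, which is why a location mismatch between the two causes requests to fail.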