This is a proof-of-concept repository for querying PDF files. It allows the user to interrogate a PDF file from the CLI and returns a response in the CLI.
- You will need python 3.11 or greater.
- You will need pipx. Install it using the following commands in your CLI.
> py -3 -m pip install --user pipx
> py -3 -m pipx ensurepath
- You will need poetry. Install it using the following commands in your CLI.
> pipx install poetry
> poetry config virtualenvs.in-project true
- You will need to use your openai API key. For now, set it as an environment variable through your CLI. Loading it from a .env file with python-dotenv does not currently work.
> $env:OPENAI_API_KEY="your-key-here"
Note: All the CLI commands stated assume you are on Windows with PowerShell.
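Because the key is supplied through an environment variable, it can help to check for it early and fail with a readable message. The following is a minimal sketch of such a check, not the repository's actual code; the function name `get_openai_api_key` and its `env` parameter are assumptions made for illustration.

```python
import os


def get_openai_api_key(env=None):
    """Return the OpenAI API key from the environment, or raise a clear error.

    `env` is a hypothetical hook for passing a plain dict (handy in tests);
    by default the real process environment is used.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. In PowerShell, run: "
            '$env:OPENAI_API_KEY="your-key-here"'
        )
    return key
```

Calling `get_openai_api_key()` once at start-up would surface a missing key before any request is attempted.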
Use the following command in your CLI to install the project dependencies.
poetry install
Use the following command in your CLI to activate the virtual environment.
poetry shell
In your CLI do the following.
askpdf --file "<path-to-pdf-file>" --question "Your question here"
# or use abbreviations
askpdf -f "<path-to-pdf-file>" -q "Your question here"
# if you omit the --file parameter then the sample pdf file in this repository
# will be used
askpdf -q "Your question here"
# get help
askpdf -h
askpdf --help
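The options above map naturally onto Python's standard `argparse` module. The sketch below shows one way such a command line could be declared; it is an illustration under assumptions, not the repository's implementation. In particular, `DEFAULT_PDF` and `build_parser` are hypothetical names, and the real location of the sample PDF is not stated here.

```python
import argparse

# Hypothetical placeholder: the actual path of the bundled sample PDF is unknown.
DEFAULT_PDF = "sample.pdf"


def build_parser():
    """Build a parser that accepts --file/-f and --question/-q."""
    parser = argparse.ArgumentParser(
        prog="askpdf",
        description="Ask a question about a PDF file.",
    )
    parser.add_argument(
        "-f", "--file",
        default=DEFAULT_PDF,
        help="path to the PDF file (defaults to the sample PDF)",
    )
    parser.add_argument(
        "-q", "--question",
        required=True,
        help="the question to ask about the PDF",
    )
    return parser


# Example: omitting --file falls back to the default path.
args = build_parser().parse_args(["-q", "Your question here"])
```

`argparse` adds `-h` and `--help` automatically, which matches the two help forms listed above.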
- Fix the issue with the openai API key not being recognised when using python-dotenv along with a .env file.
- Use a local large language model rather than sending requests to openai.
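Until the first item is resolved, one possible stopgap is to read the .env file with the standard library alone. The sketch below assumes a simple `KEY=VALUE` file layout; it does not diagnose why python-dotenv fails in this project, and the function names `parse_env_lines` and `apply_env_file` are hypothetical.

```python
import os


def parse_env_lines(lines):
    """Parse simple KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Remove surrounding whitespace and optional quotes around the value.
        values[name.strip()] = value.strip().strip("\"'")
    return values


def apply_env_file(path=".env", env=None):
    """Copy values from the file into env without overwriting existing keys."""
    env = os.environ if env is None else env
    try:
        with open(path, encoding="utf-8") as handle:
            parsed = parse_env_lines(handle)
    except FileNotFoundError:
        return env  # No .env file: leave the environment untouched.
    for name, value in parsed.items():
        env.setdefault(name, value)
    return env
```

Values already present in the environment take precedence, so a key set through PowerShell would still win over the file.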