Python client to query the Grobid Quantities service API.
For more information about Grobid Quantities, please check the Grobid Quantities Documentation.
The client can be installed using pip:
pip install grobid-quantities-client
The CLI accepts the following parameters:
python -m grobid_quantities.quantities --help

usage: quantities.py [-h] --input INPUT [--output OUTPUT] [--base-url BASE_URL]
                     [--config CONFIG] [--n N] [--force] [--verbose]

Client for the Grobid-quantities service

optional arguments:
  -h, --help           show this help message and exit
  --input INPUT        path to the directory containing PDF files or .txt (for
                       processCitationList only, one reference per line) to process
  --output OUTPUT      path to the directory where to put the results (optional)
  --base-url BASE_URL  Base url of the service
  --config CONFIG      path to the config file, default is ./config.json
  --n N                concurrency for service usage
  --force              force re-processing pdf input files when tei output files
                       already exist
  --verbose            print information about processed files in the console
Initialisation
from grobid_quantities.quantities import QuantitiesAPI

# Replace server_url:port with the address of your Grobid Quantities service
client = QuantitiesAPI(base_url="server_url:port")
client.process_text("I lost two minutes")
client.process_pdf(pdfFile)
client.parse_measures({"from": "10", "to": "20", "unit": "km"})
The response is a tuple whose first element is the status code and whose second element is the response body as a dictionary. Here is an example:
(
    200,
    {
        "runtime": 123,
        "measurements": [
            {
                "type": "value",
                "quantity": {
                    "type": "time",
                    "rawValue": "two",
                    "rawUnit": {
                        "name": "minutes",
                        "type": "time",
                        "system": "non SI",
                        "offsetStart": 11,
                        "offsetEnd": 18
                    },
                    "parsedValue": {
                        "numeric": 2,
                        "structure": {
                            "type": "ALPHABETIC",
                            "formatted": "two"
                        },
                        "parsed": "two"
                    },
                    "normalizedQuantity": 120,
                    "normalizedUnit": {
                        "name": "s",
                        "type": "time",
                        "system": "SI base"
                    },
                    "offsetStart": 7,
                    "offsetEnd": 11
                }
            }
        ]
    }
)
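As a minimal sketch of how such a response tuple might be consumed, the snippet below unpacks the status code and body and extracts the raw and normalised values of each measurement. The helper name summarize_measurements is hypothetical and not part of the client; the sample data is the response shown above, so no running service is needed. It also assumes that only measurements carrying a single "quantity" key are of interest and skips any others.

```python
from typing import Any, Dict, List, Tuple


def summarize_measurements(response: Tuple[int, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical helper: flatten a (status, body) response into short summaries."""
    status, body = response
    if status != 200:
        raise RuntimeError(f"Request failed with status {status}")

    summaries = []
    for measurement in body.get("measurements", []):
        quantity = measurement.get("quantity")
        if quantity is None:
            # Assumption: measurements without a single "quantity" are skipped.
            continue
        summaries.append({
            "raw": f'{quantity["rawValue"]} {quantity["rawUnit"]["name"]}',
            "normalized_value": quantity.get("normalizedQuantity"),
            "normalized_unit": quantity.get("normalizedUnit", {}).get("name"),
        })
    return summaries


# Sample data copied from the example response above.
sample_response = (
    200,
    {
        "runtime": 123,
        "measurements": [
            {
                "type": "value",
                "quantity": {
                    "type": "time",
                    "rawValue": "two",
                    "rawUnit": {"name": "minutes", "type": "time", "system": "non SI",
                                "offsetStart": 11, "offsetEnd": 18},
                    "parsedValue": {"numeric": 2,
                                    "structure": {"type": "ALPHABETIC", "formatted": "two"},
                                    "parsed": "two"},
                    "normalizedQuantity": 120,
                    "normalizedUnit": {"name": "s", "type": "time", "system": "SI base"},
                    "offsetStart": 7,
                    "offsetEnd": 11,
                },
            }
        ],
    },
)

print(summarize_measurements(sample_response))
# [{'raw': 'two minutes', 'normalized_value': 120, 'normalized_unit': 's'}]
```

In real use, the tuple returned by client.process_text(...) would presumably be passed in place of sample_response.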