$import & /fhir/$import

$import is an implementation of the upcoming FHIR Bulk Import API. It is an asynchronous operation that returns a URL for monitoring progress. There are two versions of this operation: /fhir/$import accepts data in FHIR format, and /$import accepts data in Aidbox format.

Resource requirements for all import operations:

| Operation | id | resourceType |
| --- | --- | --- |
| /$import | Required | Not required |
| /fhir/$import | Required | Not required |

{% hint style="warning" %} Keep in mind that, for the sake of performance, $import does not validate inserted resources. Pay attention to the structure of the data you insert and use the correct URL for your data format: use the /fhir prefix for FHIR data. {% endhint %}

{% hint style="info" %} Consider using the Asynchronous validation API to validate data after $import. {% endhint %}

Example

{% tabs %} {% tab title="Request" %}

POST /fhir/$import
Accept: text/yaml
Content-Type: text/yaml

id: synthea
contentEncoding: gzip
inputs:
- resourceType: Encounter
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz
- resourceType: Organization
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Organization.ndjson.gz
- resourceType: Patient
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz

{% endtab %}

{% tab title="Response" %}

status: 200

{% endtab %} {% endtabs %}

Parameters

| Parameter | Description |
| --- | --- |
| id | Identifier of the import |
| contentEncoding | Supports gzip or plain (non-gzipped .ndjson files) |
| inputs | Resources to import |
| update | Update history for updated resources (false by default) |
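
The request and parameters above can also be assembled from a script. The following is a minimal sketch, not an official client: it assumes Aidbox accepts the same body as JSON (the example above uses YAML), the base URL `http://localhost:8888` is hypothetical, and authentication headers are left out. The helper name `build_import_request` is invented for illustration.

```python
import json
import urllib.request


def build_import_request(base_url, import_id, inputs, fhir=True,
                         content_encoding="gzip", update=False):
    """Build (but do not send) a POST request for $import.

    fhir=True targets /fhir/$import (FHIR format);
    fhir=False targets /$import (Aidbox format).
    """
    path = "/fhir/$import" if fhir else "/$import"
    body = {
        "id": import_id,
        "contentEncoding": content_encoding,
        "inputs": inputs,
    }
    if update:
        # Only include the flag when history updates are requested.
        body["update"] = True
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        # Assumption: JSON is accepted in place of the YAML shown above.
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
        method="POST",
    )


request = build_import_request(
    "http://localhost:8888",  # hypothetical local Aidbox URL
    "synthea",
    [{"resourceType": "Patient",
      "url": "https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz"}],
)
# To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the payload easy to inspect before anything is written to the database, which matters because $import does not validate resources.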

You can monitor progress by using the id you provided in the request body.

{% tabs %} {% tab title="Request" %}

GET /BulkImportStatus/synthea

{% endtab %}

{% tab title="Response (Not Finished)" %} Status

200

Body

time:
  start: '2023-05-15T14:45:33.28722+02:00'
type: aidbox
inputs:
  - url: >-
      https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz
    resourceType: Encounter
contentEncoding: gzip
id: >-
  synthea
resourceType: BulkImportStatus
meta:
  lastUpdated: '2023-05-15T12:45:33.278829Z'
  createdAt: '2023-05-15T12:45:33.278829Z'
  versionId: '129363'

{% endtab %}

{% tab title="Response (Finished)" %} Status

200

Body

time:
  end: '2023-05-15T14:45:33.820465+02:00'
  start: '2023-05-15T14:45:33.28722+02:00'
type: aidbox
inputs:
  - ts: '2023-05-15T14:45:33.819425+02:00'
    url: >-
      https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz
    total: 3460
    status: finished
    duration: 530
    resourceType: Encounter
status: finished
contentEncoding: gzip
id: >-
  synthea
resourceType: BulkImportStatus
meta:
  lastUpdated: '2023-05-15T12:45:33.278829Z'
  createdAt: '2023-05-15T12:45:33.278829Z'
  versionId: '129363'

{% endtab %}

{% tab title="Response (Failed)" %} Status

200

Body

time:
  end: '2023-05-15T14:45:33.820465+02:00'
  start: '2023-05-15T14:45:33.28722+02:00'
type: aidbox
inputs:
  - ts: '2023-05-15T14:45:33.819425+02:00'
    url: >-
      https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz
    error: '403: Forbidden'
    status: failed
    resourceType: Encounter
status: finished
contentEncoding: gzip
id: >-
  synthea
resourceType: BulkImportStatus
meta:
  lastUpdated: '2023-05-15T12:45:33.278829Z'
  createdAt: '2023-05-15T12:45:33.278829Z'
  versionId: '129363'

{% endtab %} {% endtabs %}

{% hint style="info" %} If you didn't provide an id in the request body, use the URL from the content-location response header. {% endhint %}

Result

| Parameter | Type | Description |
| --- | --- | --- |
| id | string | Identifier of the import |
| resourceType | string | Type of resource where the progress of the import operation is recorded. Possible value: BulkImportStatus |
| meta | object | |
| meta.createdAt | string | Timestamp at which the resource was created |
| meta.lastUpdated | string | Timestamp at which the resource was last updated |
| meta.versionId | string | Version id of this resource |
| contentEncoding | string | gzip or plain |
| time | object | |
| time.start | string | Timestamp, in ISO format, at which the operation started |
| time.end | string | Timestamp, in ISO format, at which the operation was completed. Only present after the entire import operation has been completed |
| type | string | Data format type to be loaded. Possible values: aidbox, fhir |
| inputs | object[] | |
| inputs[].url | string | URL from which to load resources |
| inputs[].resourceType | string | Resource type to be loaded |
| inputs[].status | string | Load status for each input. Only present after the operation for this input has been completed. Possible values: finished, failed |
| inputs[].total | integer | The number of loaded resources. Only present after the operation for this input has been completed successfully |
| inputs[].ts | string | Timestamp, in ISO format, at which the loading was completed. Only present after the operation for this input has been completed |
| inputs[].duration | integer | Duration of loading in milliseconds. Only present after the operation for this input has been completed successfully |
| status | string | Load status for all inputs. Only present after the entire import operation has been completed. Once completed, this value is always finished, regardless of whether each input is finished or failed. Possible value: finished |
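
Because the top-level status is always finished once the import completes, a client has to inspect each input to find out whether anything failed. Here is a minimal sketch of that check, assuming the BulkImportStatus body has already been parsed into a Python dict; the function name `summarize_bulk_import_status` is hypothetical.

```python
def summarize_bulk_import_status(status):
    """Summarize a parsed BulkImportStatus body.

    The top-level "status" only says the import is over; success or
    failure has to be read from each entry in "inputs".
    """
    finished = status.get("status") == "finished"
    inputs = status.get("inputs", [])
    failed = [i for i in inputs if i.get("status") == "failed"]
    # "total" is only present for inputs that completed successfully.
    loaded = sum(i.get("total", 0) for i in inputs
                 if i.get("status") == "finished")
    return {
        "finished": finished,
        "succeeded": finished and not failed,
        "failed_urls": [i.get("url") for i in failed],
        "errors": [i.get("error") for i in failed],
        "total_loaded": loaded,
    }


# Body modeled on the "Response (Failed)" example above.
failed_body = {
    "status": "finished",
    "inputs": [{
        "url": "https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz",
        "resourceType": "Encounter",
        "status": "failed",
        "error": "403: Forbidden",
    }],
}
summary = summarize_bulk_import_status(failed_body)
```

For this body, `summary["finished"]` is true while `summary["succeeded"]` is false, which is the distinction the top-level status alone does not make.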

Note

For performance reasons, $import does a raw upsert into the resource table without updating history. If you want to store the previous version of resources in history, set update to true.

With this flag, Aidbox will update the history for updated resources. For each resource:

  • if the resource was not present in the database before the import, the import time does not change.
  • if the resource was present in the database and is updated during the import, importing it takes about twice as long because of the additional insert into the _history table.

/v2/$import on top of the Workflow Engine

/v2/$import is an improved version of the $import operation with better reliability and performance. Because it is implemented on top of the workflow engine, the operation continues working after restarts and handles errors correctly. The Task API also allows the operation to accept multiple requests and execute them from a queue, while processing several items from the "inputs" field at the same time (two by default). Users can monitor the status of the operation as described in monitoring.md.

In the future, the ability to list and cancel $import operations will be added, along with detailed progress information for each operation.

Changes in the new $import API:

  1. Executing more than one import with the same id is not possible. Users can omit the `id` field from the request, allowing Aidbox to generate the ID.
  2. The status of the workflow can be accessed with a GET request to /v2/$import/<id> instead of /BulkImportStatus/<id>. The URL for the import status is returned in the content-location header of the $import request.

{% hint style="warning" %} This feature is not available in Multibox {% endhint %}

To start an import, make a POST request to /v2[/fhir]/$import:

{% tabs %} {% tab title="Request" %}

POST /v2/fhir/$import
Accept: text/yaml
Content-Type: text/yaml

id: synthea
contentEncoding: gzip
inputs:
- resourceType: Encounter
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz
- resourceType: Organization
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Organization.ndjson.gz
- resourceType: Patient
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz

{% endtab %}

{% tab title="Response" %} Status

200 OK

Headers

Content-Location: /v2/$import/synthea

{% endtab %} {% endtabs %}
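
The Content-Location header holds a path, so a client needs to combine it with the server address before polling. This is a minimal sketch using the standard library; the base URL is a hypothetical local instance and the helper name `import_status_url` is invented for illustration.

```python
from urllib.parse import urljoin


def import_status_url(base_url, content_location):
    """Resolve the Content-Location header value against the server URL."""
    # Header values may carry surrounding whitespace; strip it first.
    return urljoin(base_url, content_location.strip())


status_url = import_status_url(
    "http://localhost:8888",  # hypothetical local Aidbox URL
    " /v2/$import/synthea",
)
```

Note that `urljoin` treats a header value starting with `/` as relative to the host root, so any path prefix in the base URL is dropped. If your instance is served under a prefix, adjust accordingly.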

Parameters

| Parameter | Description |
| --- | --- |
| id | Identifier of the import. If you don't provide it, the id is auto-generated and you can find it in the Content-Location header of the response |
| contentEncoding | Supports gzip or plain (non-gzipped .ndjson files) |
| inputs (required) | Resources to import. Each input has a url (the URL from which to load resources) and a resourceType (the resource type to be loaded) |
| update | Update history for updated resources (false by default) |
| allowedRetryCount | Maximum number of import retries for each input (2 by default) |

To check the status of the import, make a GET request to /v2/$import/<id>.

Because the operation is built on top of the workflow engine, the statuses and outcomes of the individual files and of the import as a whole are inherited from it (see #task-statuses-and-outcomes).

{% tabs %} {% tab title="Request" %}

GET /v2/$import/<id>

{% endtab %}

{% tab title="Response (In progress)" %} Status

200 OK

Body

type: fhir
inputs:
  - url: >-
      https://storage.googleapis.com/aidbox-public/synthea/100/Organization.ndjson.gz
    resourceType: Organization
    status: in-progress
  - url: >-
      https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz
    resourceType: Encounter
    status: waiting
  - url: https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz
    resourceType: Patient
    status: waiting
contentEncoding: gzip
status: in-progress

{% endtab %}

{% tab title="Response (done - succeeded)" %} Status

200 OK

Body

type: fhir
inputs:
  - url: >-
      https://storage.googleapis.com/aidbox-public/synthea/100/Organization.ndjson.gz
    resourceType: Organization
    status: done
    outcome: succeeded
    result:
      imported-resources: 0
  - url: >-
      https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz
    resourceType: Encounter
    status: done
    outcome: succeeded
    result:
      imported-resources: 3460
  - url: https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz
    resourceType: Patient
    status: done
    outcome: succeeded
    result:
      imported-resources: 124
contentEncoding: gzip
status: done
outcome: succeeded
result:
  message: All input files imported, 3584 new resources loaded
  total-files: 3
  total-imported-resources: 3584

{% endtab %}

{% tab title="Response (done - failed)" %} Status

200 OK

Body

type: fhir
inputs:
  - url: >-
      https://storage.googleapis.com/aidbox-public/synthea/100/Organization.ndjson.gz
    resourceType: Organization
    status: done
    outcome: succeeded
    result:
      imported-resources: 225
  - url: >-
      https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz
    resourceType: Encounter
    status: done
    outcome: failed
    error:
      message: '403: Forbidden'
  - url: >-
      https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz
    resourceType: Patient
    status: done
    outcome: failed
    error:
      message: '403: Forbidden'
contentEncoding: gzip
status: done
outcome: failed
error:
  message: >-
    Import for some files failed with an error: task 'Encounter
    https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz
    failed

{% endtab %} {% endtabs %}
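
The responses above suggest a simple polling loop: repeat the GET until status is done, then read outcome and check each input. Below is a minimal sketch under stated assumptions. `fetch_status` is a hypothetical callable that performs the GET to /v2/$import/<id> and returns the parsed body as a dict; the interval and poll limit are arbitrary.

```python
import time


def wait_for_import(fetch_status, interval_seconds=2.0, max_polls=100):
    """Poll until the import reaches status 'done', then report the outcome.

    fetch_status: hypothetical callable returning the parsed status body.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("status") == "done":
            failed = [i for i in status.get("inputs", [])
                      if i.get("outcome") == "failed"]
            return {"outcome": status.get("outcome"),
                    "failed_inputs": failed}
        time.sleep(interval_seconds)
    raise TimeoutError("import did not reach status 'done'")


# Simulated responses standing in for real HTTP calls.
responses = iter([
    {"status": "in-progress", "inputs": [{"status": "waiting"}]},
    {"status": "done", "outcome": "failed", "inputs": [
        {"resourceType": "Organization", "status": "done",
         "outcome": "succeeded"},
        {"resourceType": "Encounter", "status": "done",
         "outcome": "failed", "error": {"message": "403: Forbidden"}},
    ]},
])
result = wait_for_import(lambda: next(responses), interval_seconds=0)
```

Passing the fetch function in as an argument keeps the loop independent of any particular HTTP client and makes it straightforward to exercise with canned responses, as done here.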

Import local file

Sometimes you want to import a local file into a local Aidbox instance. Possible solutions for local development:

Add a volume to the aidboxone container (not aidboxdb):

volumes:
- ./Encounter.ndjson.gz:/resources/Encounter.ndjson.gz
# url: file:///resources/Encounter.ndjson.gz

Use tunneling, e.g. ngrok:

python3 -m http.server 
ngrok http 8000
# url: https://<...>.ngrok-free.app/Encounter.ndjson.gz
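
If you do not yet have a local file to mount or serve, one can be produced with the standard library. This is a minimal sketch that writes one JSON resource per line and gzips the result, matching contentEncoding: gzip. The helper name `write_ndjson_gz` and the sample Patient resources are hypothetical; each resource carries an id, as the import operations require.

```python
import gzip
import json
import os
import tempfile


def write_ndjson_gz(path, resources):
    """Write resources as gzipped NDJSON: one JSON object per line."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for resource in resources:
            f.write(json.dumps(resource) + "\n")


# Hypothetical minimal resources, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "Patient.ndjson.gz")
write_ndjson_gz(path, [
    {"id": "pt-1", "resourceType": "Patient"},
    {"id": "pt-2", "resourceType": "Patient"},
])
```

Since $import does not validate resources, it is worth checking a generated file before importing it.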