Bulk Uploader Design
HUD files (XML, or CSV in a zip archive) can be uploaded through the HMIS Admin application with the help of the bulk upload microservice.
The bulk upload service is a microservice whose APIs allow a user to upload a HUD file for processing. The documentation for the bulk upload API is available at https://docs.hslynk.com/?urls.primaryName=Bulk%20Upload%20API
The uploaded files are saved in a private secure S3 bucket to be used by the bulk upload worker process.
The bulk upload worker process validates the file and persists the data into the HMIS Postgres database. It also deduplicates the data and tracks validation errors, which can be viewed in the HMIS Admin application. Once the data is completely loaded, it is ready to access via the HMIS APIs. The workflow statuses listed below track the progress of a bulk upload.
Bulk Workflow Upload Status
- S3 = The file is about to be pushed to S3.
- INITIAL = The HUD CSV/XML file is in S3 and is ready for the bulk upload worker.
- INPROGRESS = The worker process is processing client records.
- ENROLLMENT = The worker process is processing enrollment records.
- C_CLIENT = The worker process is processing records for all child elements of client.
- C_EMENT = The worker process is processing records for all child elements of enrollment.
- EXIT = The worker process is processing exit records.
- C_EXIT = The worker process is processing records for all child elements of exit.
- DISAB = The worker process is processing all disabilities records.
- LIVE = The worker process has finished processing the entire file.
- ERROR = The worker process failed because of an invalid file format.
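The statuses above can be modeled as a small enumeration. The following is a minimal sketch, not the service's actual implementation: the names `BulkUploadStatus`, `HAPPY_PATH`, `is_terminal`, and `next_status` are hypothetical, and the assumption that a successful upload passes through the statuses in the order listed is inferred from the list rather than confirmed by the worker's source code.

```python
from enum import Enum


class BulkUploadStatus(Enum):
    """Bulk upload workflow statuses, in the order listed in this document."""
    S3 = "S3"
    INITIAL = "INITIAL"
    INPROGRESS = "INPROGRESS"
    ENROLLMENT = "ENROLLMENT"
    C_CLIENT = "C_CLIENT"
    C_EMENT = "C_EMENT"
    EXIT = "EXIT"
    C_EXIT = "C_EXIT"
    DISAB = "DISAB"
    LIVE = "LIVE"
    ERROR = "ERROR"


# Assumption: a successful upload visits every non-ERROR status in listed order.
HAPPY_PATH = [s for s in BulkUploadStatus if s is not BulkUploadStatus.ERROR]

# LIVE and ERROR are treated as final: no further processing follows them.
TERMINAL = {BulkUploadStatus.LIVE, BulkUploadStatus.ERROR}


def is_terminal(status: BulkUploadStatus) -> bool:
    """Return True when the upload has either finished or failed."""
    return status in TERMINAL


def next_status(status: BulkUploadStatus) -> BulkUploadStatus:
    """Return the status that follows `status` on the assumed happy path."""
    if is_terminal(status):
        raise ValueError(f"{status.name} is a terminal status")
    return HAPPY_PATH[HAPPY_PATH.index(status) + 1]
```

A client polling an upload could, for example, stop as soon as `is_terminal(status)` is true and report an error when the final status is `ERROR`.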
The worker process also makes REST calls to the "Client Dedup Microservice", which uses a locally hosted "OPEN EMPI" application to identify each unique client (homeless person).
The bulk-uploaded data is stored in the HMIS transactional database once the upload is completed and is accessible via the APIs.
A worker process syncs the data from the HMIS transactional database to the Big Data warehouse (HBASE).
The data is available for reporting once it reaches the Big Data warehouse. Typically, this happens within 2 hours after the upload is successfully processed.
- 4.10
- 4.11
- 5.1
- 6.12
- FY2020
Screenshots: Bulk upload screen; Manage bulk upload screen; Statistics on a bulk upload; Errors and Validation screen for a bulk upload.
The HMIS bulk uploader process is designed to be fault-tolerant. Data received from an external source is likely to contain abnormalities, so it needs to be staged and validated before processing.
- If the file is not in the HUD-specific format, it is not processed and the bulk upload is put in the ERROR state.
- Each HMIS table has an Export_ID associated with it, which makes it easy to roll back a bulk upload if errors were encountered.
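To illustrate why a per-table Export_ID makes rollback simple, here is a minimal sketch that deletes every row belonging to one export. It is an illustration under stated assumptions, not the worker's actual code: the table names, the `export_id` column name, and the functions `rollback_export` and `demo` are hypothetical, and an in-memory SQLite database stands in for the HMIS Postgres database so the example is self-contained.

```python
import sqlite3

# Hypothetical table names (the real HMIS schema is not shown here).
# Child tables come first so they are cleared before their parents.
TABLES = ("disabilities", "exit_record", "enrollment", "client")


def rollback_export(conn: sqlite3.Connection, export_id: str) -> dict:
    """Delete all rows tagged with `export_id`; return deleted counts per table."""
    deleted = {}
    with conn:  # one transaction: commits on success, rolls back on error
        for table in TABLES:
            cur = conn.execute(
                f"DELETE FROM {table} WHERE export_id = ?", (export_id,)
            )
            deleted[table] = cur.rowcount
    return deleted


def demo():
    """Load two exports into an in-memory database, then roll one back."""
    conn = sqlite3.connect(":memory:")
    for table in TABLES:
        conn.execute(
            f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, export_id TEXT)"
        )
    conn.execute(
        "INSERT INTO client (export_id) VALUES ('exp-1'), ('exp-1'), ('exp-2')"
    )
    conn.execute("INSERT INTO enrollment (export_id) VALUES ('exp-1')")
    conn.commit()

    deleted = rollback_export(conn, "exp-1")
    remaining = conn.execute("SELECT COUNT(*) FROM client").fetchone()[0]
    return deleted, remaining
```

Because every row carries the identifier of the export that created it, the rollback needs no knowledge of which individual records were loaded; rows from other exports (`exp-2` in the demo) are left untouched.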