This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

[Task]: Add transformation for agency data #106

Closed
chouinar opened this issue Jun 25, 2024 · 1 comment · Fixed by #157
Comments

@chouinar
Collaborator

Summary

See: https://docs.google.com/document/d/1EPZJyqTQruq-BkQoojtrkLqgVl8GrpfsZAIY9T1dEg8/edit#heading=h.9a4loabrp0a3 for details on these tables

Acceptance criteria

No response

@chouinar chouinar added this to the Transformation Work milestone Jun 25, 2024
@chouinar chouinar self-assigned this Jul 8, 2024
chouinar added a commit that referenced this issue Jul 11, 2024
chouinar added a commit that referenced this issue Sep 16, 2024
## Summary
Fixes #106

### Time to review: __10 mins__

## Changes proposed
Add transformations for agency data

## Context for reviewers
Agency data is structured oddly in the existing system: instead of being
in ordinary tables, it's in a `tgroups` table that stores values as
key-value pairs. We want to normalize that into something more workable,
so the transformation needs to work a bit differently than the
transformations of other tables.
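
To illustrate the kind of normalization described above, here is a
minimal sketch that pivots key-value rows into one record per agency.
The key layout (`Agency-<agency code>-<field name>`), the sample rows,
and the function name are assumptions for illustration, not the actual
`tgroups` schema or the code in this change.

```python
from collections import defaultdict

# Hypothetical key layout: "Agency-<agency code>-<field name>".
KEY_PREFIX = "Agency-"


def group_agency_fields(rows):
    """Pivot (key, value) pairs into one dict of fields per agency code."""
    agencies = defaultdict(dict)
    for key, value in rows:
        if not key.startswith(KEY_PREFIX):
            continue  # not agency data, skip it
        # Split from the right so agency codes containing "-" stay intact.
        agency_code, _, field_name = key[len(KEY_PREFIX):].rpartition("-")
        if not agency_code or not field_name:
            continue  # malformed key, nothing to group under
        agencies[agency_code][field_name] = value
    return dict(agencies)


# Made-up sample rows, not real agency data.
rows = [
    ("Agency-HHS-AgencyName", "Example Agency"),
    ("Agency-HHS-AgencyContactCity", "Washington"),
    ("Agency-HHS-ACF-AgencyName", "Example Sub-Agency"),
    ("Other-Setting", "ignored"),
]
grouped = group_agency_fields(rows)
```

Each value in `grouped` then looks like an ordinary row, which can be
mapped onto a normal table.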

For simplicity, I load all of the data for every agency (and later
filter to just what changed), as this removes a lot of weird edge cases
that we would have otherwise needed to consider. Only modified rows
actually get used, but this way we know we are working from the full
set of data.
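
The load-everything-then-filter approach can be sketched roughly as
follows. The row shape, the `is_modified` flag, and filtering at the
agency level are all assumptions made for the example; the real
transformation may track changes differently.

```python
def select_modified_agencies(rows):
    """Build every agency's full record, then keep only changed agencies.

    rows: iterable of (agency_code, field_name, value, is_modified).
    The is_modified flag is a hypothetical stand-in for however changes
    are actually detected.
    """
    all_fields = {}
    modified = set()
    for agency_code, field_name, value, is_modified in rows:
        # Always record the field, even if this particular row is unchanged,
        # so a changed agency still ends up with its complete set of fields.
        all_fields.setdefault(agency_code, {})[field_name] = value
        if is_modified:
            modified.add(agency_code)
    return {code: all_fields[code] for code in sorted(modified)}


# Made-up sample rows: only one field of agency "A" changed.
sample_rows = [
    ("A", "AgencyName", "Agency A", False),
    ("A", "AgencyContactCity", "Springfield", True),
    ("B", "AgencyName", "Agency B", False),
]
to_process = select_modified_agencies(sample_rows)
```

Agency `A` comes back with both of its fields even though only one row
changed, which is the edge case that loading the full data avoids.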

## Additional information
I loaded a snapshot of the prod tgroups table into my local DB and ran
the transform script. In total, it took ~2 seconds to run and didn't
hit any issues.

A set of the relevant metrics:
```
total_records_processed=1152
total_records_deleted=0
total_records_inserted=1152
total_records_updated=0
total_error_count=0
agency.total_records_processed=1152
agency.total_records_inserted=1152
TransformAgency_subtask_duration_sec=2.14
task_duration_sec=2.14
```

As a sanity test, I also loaded in the tgroups data from dev and tried
running it through. While it generally worked, there were 12 agencies
that failed because they were missing the ldapGp and AgencyContactCity
fields. I'm not certain if we want to do anything about that as they all
seemed to be test agencies based on the names.
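
For reference, a check for missing fields like the one described above
could look something like this sketch. Only the two field names come
from the failures mentioned here; the full required set, the function
names, and the sample data are assumptions.

```python
# Only the two fields named above; the real required set is an assumption.
REQUIRED_FIELDS = ("ldapGp", "AgencyContactCity")


def find_missing_fields(agency_fields):
    """Return the required field names that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not agency_fields.get(name)]


def validate_agencies(agencies):
    """Split agencies into valid ones and a dict of code -> missing fields."""
    valid, errors = {}, {}
    for code, fields in agencies.items():
        missing = find_missing_fields(fields)
        if missing:
            errors[code] = missing  # report the failure, keep processing others
        else:
            valid[code] = fields
    return valid, errors


# Made-up sample data, not real agencies.
agencies = {
    "REAL": {"ldapGp": "real-group", "AgencyContactCity": "Washington"},
    "TEST": {"AgencyName": "Test Agency"},
}
valid, errors = validate_agencies(agencies)
```

Collecting failures per agency rather than stopping at the first one
keeps a handful of incomplete test agencies from blocking the rest.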

---------

Co-authored-by: nava-platform-bot <[email protected]>
@acouch
Member

acouch commented Sep 17, 2024

Issue migrated to HHS#2051
