The wrong child works are attached to parents #16

Closed
3 tasks
crisr15 opened this issue Aug 1, 2023 · 2 comments
Labels
maintenance bills to maintenance

Comments


crisr15 commented Aug 1, 2023

Story

https://assaydepot.slack.com/archives/C030UPFFP2A/p1707761167626329

Initially, we discovered that an identifier we thought was unique was not, so child works were being attached to the wrong parent through that identifier. We believe that issue was fixed, but the following records have since been found with the wrong child files.

Hello hello, here's a sample of some records for whom I'm still seeing mismatched children in AMS2:
cpb-aacip-207-33dz0cjb
cpb-aacip-191-010p2p45
cpb-aacip-207-375tb5z2
cpb-aacip-207-009w0vxr
cpb-aacip-191-612ngm9x
cpb-aacip-191-39x0kb6x
If you'd like a larger sample, anything from this link that was contributed from outside New Mexico or Georgia are the same type of mismatches https://americanarchive.org/catalog?f%5Bspecial_collections%5D%5B%5D=new-mexico-public[…]-collection&sort=asset_date+asc&f[access_types][]=digitized

We need to determine whether these are new works, in which case the earlier fix needs to be re-evaluated, or old works, in which case we need to outline a strategy for fixing the old records.

Acceptance Criteria

  • We have determined whether these works were last updated after the fix for the identifiers was in place.
  • If these works are more recent, a ticket for re-evaluating our fix has been created.
  • If these works are from before the fix, a ticket with a plan for fixing the old records has been created.

Implementation Suggestions

Given the local_instantiation_identifier and the importer name (the directory of the XML files), find the XML file in that directory that contains the identifier and copy it to a new directory.
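
A minimal sketch of that lookup-and-copy step, using only the Ruby standard library. The method name `copy_xml_for_identifier` is hypothetical, and the sketch assumes a plain substring match against the file contents is good enough to find the identifier; a real implementation would probably parse the XML and compare the identifier element exactly, since a substring match can also hit longer identifiers that merely contain the one being searched for.

```ruby
require "fileutils"

# Hypothetical helper, not part of the existing codebase.
# Copies every *.xml file in source_dir whose contents mention the
# identifier into target_dir and returns the destination paths.
def copy_xml_for_identifier(identifier, source_dir, target_dir)
  FileUtils.mkdir_p(target_dir)

  # Assumption: a substring match is sufficient to locate the file.
  matches = Dir.glob(File.join(source_dir, "*.xml")).sort.select do |path|
    File.read(path).include?(identifier)
  end

  matches.map do |path|
    destination = File.join(target_dir, File.basename(path))
    FileUtils.cp(path, destination)
    destination
  end
end
```

Returning the copied paths makes it easy to confirm that exactly one file matched before moving on to the deletion step.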

Given a directory of XML files, find all assets matching the asset ID in each file and remove them and any children from the system.
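
One way to sketch the ID-collection half of this step is below. The regular expression is an assumption inferred only from the sample IDs listed in this issue (for example `cpb-aacip-207-33dz0cjb`), and it scans the whole file rather than a specific XML element, so it may need tightening. The `remover` argument is a hypothetical stand-in for whatever actually deletes an asset and its children; it is not the real AssetDestroyer interface.

```ruby
# Assumed ID shape, inferred only from the sample IDs in this issue.
ASSET_ID_PATTERN = /cpb-aacip-\d+-[0-9a-z]+/

# Hypothetical helper: returns the unique, sorted asset IDs mentioned
# anywhere in the *.xml files of the given directory.
def asset_ids_in_directory(dir)
  Dir.glob(File.join(dir, "*.xml")).sort.flat_map do |path|
    File.read(path).scan(ASSET_ID_PATTERN)
  end.uniq.sort
end

# Hypothetical helper: hands each ID to a caller-supplied callable.
# The callable is responsible for removing the asset and its children.
def remove_assets_in_directory(dir, remover)
  asset_ids_in_directory(dir).each { |id| remover.call(id) }
end
```

Passing the deletion logic in as a callable keeps the sketch independent of the real destroyer and allows a dry run, for example `remove_assets_in_directory(dir, ->(id) { puts id })`.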

Given a directory of XML files that has no corresponding assets or children in the system, create a Bulkrax importer that points at that directory and run it as usual.

Code already exists to thoroughly delete an asset and all its children, including Sipity objects (AssetDestroyer). It should be used here, since that step is very important. It is unclear whether that logic has been valkyrized yet; it most likely has not.

Every record is in Postgres. Most records are also in Fedora. If the Postgres record disappears and the Fedora record doesn't, Valkyrie will re-migrate the Fedora data to Postgres. DELETE THE FEDORA VERSION OF RECORDS FIRST.

Update AssetDestroyer to also destroy records and their children from Valkyrie (Postgres). Add #destroy_asset_resource_by_id and invoke it after the Fedora deletion.
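
The ordering requirement can be illustrated with a small sketch. `OrderedAssetRemoval` and both callables are hypothetical and do not reflect the real AssetDestroyer API; the only point is that the Fedora deletion runs first, and that the Postgres deletion is skipped if the Fedora step raises, so a record is never left only in Fedora where Valkyrie could re-migrate it.

```ruby
# Hypothetical wrapper; neither callable is the real AssetDestroyer API.
class OrderedAssetRemoval
  def initialize(destroy_in_fedora:, destroy_in_postgres:)
    @destroy_in_fedora = destroy_in_fedora
    @destroy_in_postgres = destroy_in_postgres
  end

  # Deletes the Fedora copy first. If that raises, the exception
  # propagates and the Postgres copy is left untouched.
  def call(asset_id)
    @destroy_in_fedora.call(asset_id)
    @destroy_in_postgres.call(asset_id)
  end
end
```

In the real change, the second callable would presumably wrap the proposed #destroy_asset_resource_by_id, but that wiring is an assumption.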

Notes

The previous ticket may be on the Devops repo.

crisr15 added this to AMS / GBH Aug 1, 2023
crisr15 converted this from a draft issue Aug 1, 2023
jillpe added this to the Ingestion Cleanup / Monitoring milestone Aug 3, 2023
orangewolf moved this to In Development in AMS / GBH Jan 2, 2024
bkiahstroud self-assigned this Feb 7, 2024
bkiahstroud added the maintenance bills to maintenance label Feb 7, 2024
@mivillesvik

It was mentioned that a list of records known to be affected by this issue would be helpful, so I'm dropping this here! This is only a snapshot and is NOT the entire list of affected records.

Snapshotofrecordsaffectedbymismatchedchildren.txt

bkiahstroud moved this from In Development to Code Review in AMS / GBH Mar 20, 2024
bkiahstroud moved this from Code Review to Deploy to Production in AMS / GBH Mar 23, 2024
bkiahstroud moved this from Deploy to Production to In Development in AMS / GBH Mar 23, 2024
@bkiahstroud

The remainder of this task (deleting and recreating the broken Instantiations) was handed off to Drew during a meeting in May.

github-project-automation bot moved this from In Development to Done in AMS / GBH Aug 13, 2024