Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TDD - approach to ingest bug #134

Merged
merged 4 commits into from
Jun 7, 2023
Merged

TDD - approach to ingest bug #134

merged 4 commits into from
Jun 7, 2023

Conversation

THOR300
Copy link
Contributor

@THOR300 THOR300 commented Jun 7, 2023

Description

When firing a document and event set of csvs at a locally deployed latest version of the backend with the latest production postgres db dump using the cclw ingest end point, the request fails.

This is as the metadata taxonomy values do not have allow_any values in the mapping. Here is an example after loading the latest production db state dump:
"topic":{"allow_blanks":true,"allowed_values":["Mitigation","Adaptation","Loss And Damage","Disaster Risk Management"]}

This is causing the following error during the cclw ingest:

{
   "written_at":"2023-06-05T12:33:22.085Z",
   "written_ts":1685968402085789000,
   "msg":"Unexpected error, validating law & policy CSV on ingest",
   "type":"log",
   "logger":"app.api.api_v1.routers.cclw_ingest",
   "thread":"AnyIO worker thread",
   "level":"ERROR",
   "module":"cclw_ingest",
   "line_no":230,
   "errors":"__init__() missing 1 required positional argument: 'allow_any'",
   "exc_info":"Traceback (most recent call last):\n  File \"/cpr-backend/app/api/api_v1/routers/cclw_ingest.py\", line 220, in ingest_law_policy\n    documents_file_contents, _ = _validate_cclw_csv(\n  File \"/cpr-backend/app/api/api_v1/routers/cclw_ingest.py\", line 382, in _validate_cclw_csv\n    validator = get_document_validator(db, context)\n  File \"/cpr-backend/app/core/ingestion/processor.py\", line 342, in get_document_validator\n    _, taxonomy = get_organisation_taxonomy(db, context.org_id)\n  File \"/cpr-backend/app/core/organisation.py\", line 33, in get_organisation_taxonomy\n    return taxonomy[0], {k: TaxonomyEntry(**v) for k, v in taxonomy[1].items()}\n  File \"/cpr-backend/app/core/organisation.py\", line 33, in <dictcomp>\n    return taxonomy[0], {k: TaxonomyEntry(**v) for k, v in taxonomy[1].items()}\n  File \"pydantic/dataclasses.py\", line 322, in pydantic.dataclasses._add_pydantic_validation_attributes.new_init\n  File \"pydantic/dataclasses.py\", line 294, in pydantic.dataclasses._add_pydantic_validation_attributes.handle_extra_init\nTypeError: __init__() missing 1 required positional argument: 'allow_any'\n",
   "filename":"cclw_ingest.py",
   "correlation_id":"2896fa38-039d-11ee-a1c4-0242ac130004"
}

An example for the UNFCCC can be seen below which does contain the allow_any value as true which explains why that ingest succeeded last week.
"author":{"allow_any":true,"allow_blanks":false,"allowed_values":[]}}

It would appear that during the tests the test database loads taxonomy values with allow_any for cclw. It is not entirely clear when this happens but logging during the tests proves that the the database does contain allow_any values for the cclw taxonomy.

Type of change

This PR updates the json object for the cclw taxonomy to include the allow_any field. From my understanding if we deploy the latest version of the backend and run the make migrations_docker_backend command this should load the updated taxonomy and thus allow the ingest to run successfully.
We then update the populate taxonomy module to identify if there is a difference between the declared taxonomy in code and that in the database and to update accordingly.

Please select the option(s) below that are most relevant:

  • Bug fix
  • New feature
  • Breaking change

How Has This Been Tested?

Extensive unit tests and running of the csv's locally against the endpoint.

Reviewer Checklist

Confirm that allow_any = False as default is correct?

  • The PR represents a single feature (small driveby fixes are also ok)
  • The PR includes tests that are sufficient for the level of risk
  • The code is sufficiently commented, particularly in hard-to-understand areas
  • Any required documentation updates have been made
  • Any TODOs added are captured in future tickets
  • No FIXMEs remain

Copy link
Contributor

@diversemix diversemix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect 👍

@THOR300 THOR300 marked this pull request as ready for review June 7, 2023 10:19
@THOR300 THOR300 merged commit 58a9b8b into main Jun 7, 2023
@THOR300 THOR300 deleted the bugfix/taxonomy-allow-any branch June 7, 2023 10:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants