add linkml validation to unit tests where nmdc-schema database records are made #270

aclum · 2024-10-10T00:27:30Z

See documentation on how to do this here

We discovered this week that, even when using the Python classes, the unit tests can produce records that the runtime would not accept; see #267 for details. LinkML validation is a way to check these records without adding runtime API dependencies to the unit tests.

Alternatives considered: use json:validate endpoint to validate records.

mbthornton-lbl · 2024-11-19T18:17:27Z

Validation against the NMDC schema is implemented in watch_nmdc.py:

import yaml
import linkml.validator
import importlib.resources
from functools import lru_cache

# import the materialized schema - schema version defined in pyproject.toml - and cache it
@lru_cache(maxsize=None)
def _get_nmdc_materialized():
    with importlib.resources.open_text("nmdc_schema", "nmdc_materialized_patterns.yaml") as f:
        return yaml.safe_load(f)

# validation of the nmdc.database before posting to the API
# (excerpt from inside a loop over jobs, hence the `continue`):
job_dict = yaml.safe_load(yaml_dumper.dumps(job_database))
# validate the database object against the schema
validation_report = linkml.validator.validate(
    job_dict, self.nmdc_materialized, "Database"
)
if validation_report.results:
    # log only the first problem, then skip this job
    logger.error(f"Validation error: {validation_report.results[0].message}")
    logger.error(f"job_dict: {job_dict}")
    continue
else:
    logger.info(f"Database object validated for job {job.opid}")

mbthornton-lbl · 2024-11-19T18:19:16Z

Validation in run_import.py - basically the same thing:

# validate the database
logger.info("Validating imported data")
db_dict = yaml.safe_load(yaml_dumper.dumps(db))
validation_report = linkml.validator.validate(db_dict, nmdc_materialized)
if validation_report.results:
    logger.error("Validation Failed")
    # log every validation problem before aborting the import
    for result in validation_report.results:
        logger.error(result.message)
    raise Exception("Validation Failed")
else:
    logger.info("Validation Passed")

mbthornton-lbl · 2024-11-19T18:25:00Z

The same basic logic is used in one unit test:

test_imports.test_gold_mapper_map_workflow_executions
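
Since the same check now appears in watch_nmdc.py, run_import.py, and a unit test, it could be factored into a small helper that any test building database records can call. The following is only a sketch, not the code in the repository: the names `validation_messages`, `assert_valid_database`, and `linkml_database_validator` are hypothetical, and it assumes the report object exposes `.results` with a `.message` on each entry, as in the snippets above. The validator is passed in as a callable so the assertion logic can be exercised without linkml installed.

```python
from typing import Any, Callable, List


def validation_messages(report: Any) -> List[str]:
    """Return the message of every result in a validation report."""
    return [result.message for result in report.results]


def assert_valid_database(db_dict: dict, validate: Callable[[dict], Any]) -> None:
    """Fail the calling test if `validate` reports any problem with `db_dict`."""
    messages = validation_messages(validate(db_dict))
    if messages:
        # Report every problem so a failing test shows the full picture.
        raise AssertionError("Schema validation failed:\n" + "\n".join(messages))


def linkml_database_validator(schema: dict) -> Callable[[dict], Any]:
    """Build a validator bound to `schema` (hypothetical convenience wrapper).

    linkml is imported lazily so the helpers above stay importable without it.
    """
    import linkml.validator  # same call the snippets in this thread use

    return lambda db_dict: linkml.validator.validate(db_dict, schema, "Database")
```

In a test, after building `db` with the nmdc-schema Python classes, the call would look roughly like `assert_valid_database(yaml.safe_load(yaml_dumper.dumps(db)), linkml_database_validator(_get_nmdc_materialized()))`, reusing the cached schema loader shown earlier in the thread.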

aclum mentioned this issue Oct 15, 2024

add tests for validating json records against nmdc-schema #261

Open

mbthornton-lbl self-assigned this Oct 16, 2024

mbthornton-lbl added this to 2024 - Sprint 50 - November 18 - 29, 2024 Nov 19, 2024

mbthornton-lbl moved this to In Review in 2024 - Sprint 50 - November 18 - 29, 2024 Nov 19, 2024

mbthornton-lbl moved this from In Review to In Progress in 2024 - Sprint 50 - November 18 - 29, 2024 Nov 20, 2024

mbthornton-lbl moved this from In Progress to In Review in 2024 - Sprint 50 - November 18 - 29, 2024 Nov 20, 2024

mbthornton-lbl moved this from In Review to Done in 2024 - Sprint 50 - November 18 - 29, 2024 Nov 21, 2024

mbthornton-lbl closed this as completed Nov 21, 2024
