datami-tests

Testing the Datami widget (https://datami.multi.coop/) for editing CSV files, with additional validation from the CLI or through a GitHub Action.

Purpose

Demonstrate and document how we can use Datami and other components to:

  • make it easier to edit a CSV file stored in GitHub
  • constrain how fields are displayed in the Datami widget
  • use GitHub Actions to check that the structure and/or content of the CSV file is valid according to a model.

The Datami approach is to:

  1. rely on GitHub to store the CSV file
  2. offer an HTML widget to view or edit the content of the file, for users who may not want to use GitHub directly
  3. automatically push modifications made through the widget as GitHub pull requests

usage flow

Content of the repository

datami component

Warning

Data validation and data editing (the widget) are configured with different sets of files, or data models. These data models use different syntaxes and have to be kept in sync manually!

Validating the CSV file

Note

This validation is unrelated to Datami or the use of the widget.

The goal here is to validate that the CSV file is consistent with the data model.

We can validate the file from the CLI (local mode) and/or as a GitHub Action.

The validation output is easier to read and analyse in local mode, but the GitHub Action still provides a report file that can be downloaded if validation fails.

Define the model used for validation

The model describing the file is project-list.resources.yaml.

Warning

For multi-valued columns, the allowed values have to be described as a regular expression (a pattern constraint).
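
To make the idea concrete, here is a minimal sketch that applies the pattern shown in the validation output further down in this README to a few candidate cell values. It uses Python's standard re module rather than frictionless itself, and the helper name is_valid_languages is only an illustration.

```python
import re

# Pattern copied from the validation output in this README.
# "\ " matches a literal space, "\|" a literal pipe separator.
LANGUAGES_PATTERN = re.compile(
    r"^(rust|python|docker|other\ tek)?(\|(rust|python|docker|other\ tek))*$"
)


def is_valid_languages(cell: str) -> bool:
    """Return True when every '|'-separated item is an allowed value."""
    return LANGUAGES_PATTERN.match(cell) is not None


for cell in ["python", "python|docker", "other tek|rust", "csharp", "python|csharp"]:
    print(f"{cell!r}: {is_valid_languages(cell)}")
```

Note that, as written, the pattern also accepts an empty cell, because the first group is optional.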

Validating a data file locally (CLI)

Install the frictionless package

pip install "frictionless[excel,json]" --pre

Validate data

Example with a file that contains invalid (unauthorized) values.

Line 3 of the data contains the value csharp, which does not match the pattern of authorized values. The pattern and the allowed values / types are defined in project-list.resources.yaml.

Boaviztapi,https://github.com/Boavizta/,ready,csharp
cd examples/csv/data
frictionless validate project-list.resources.yaml

─────────────────────────────────────────────────────── Dataset ────────────────────────────────────────────────────────
                       dataset
┏━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ name         ┃ type  ┃ path             ┃ status  ┃
┡━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ project-list │ table │ project-list.csv │ INVALID │
└──────────────┴───────┴──────────────────┴─────────┘
──────────────────────────────────────────────────────── Tables ────────────────────────────────────────────────────────
                                                      project-list
┏━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Row ┃ Field ┃ Type             ┃ Message                                                                             ┃
┡━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 3   │ 4     │ constraint-error │ The cell "csharp" in row at position "3" and field "languages" at position "4" does │
│     │       │                  │ not conform to a constraint: constraint "pattern" is "^(rust|python|docker|other\   │
│     │       │                  │ tek)?(\|(rust|python|docker|other\ tek))*$"                                         │
└─────┴───────┴──────────────────┴─────────────────────────────────────────────────────────────────────────────────────┘

After fixing the data file (replacing the csharp value with python|docker), the validation succeeds:

Boaviztapi,https://github.com/Boavizta/,ready,python|docker
frictionless validate project-list.resources.yaml
─────────────────────────────────────────────────────── Dataset ────────────────────────────────────────────────────────
                      dataset
┏━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ name         ┃ type  ┃ path             ┃ status ┃
┡━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ project-list │ table │ project-list.csv │ VALID  │
└──────────────┴───────┴──────────────────┴────────┘
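
The failing and passing runs above can be approximated with the Python standard library alone, which may help when reasoning about what the pattern constraint checks. This is only a sketch and not how frictionless works internally: the header names other than languages and the first data row are hypothetical, and the header is assumed to count as row 1 so that positions mirror the report above.

```python
import csv
import io
import re

# Pattern copied from the validation output in this README.
PATTERN = re.compile(
    r"^(rust|python|docker|other\ tek)?(\|(rust|python|docker|other\ tek))*$"
)


def find_pattern_errors(csv_text, field="languages"):
    """Return (row_position, field_position, value) for each cell violating the pattern."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    field_index = header.index(field)
    errors = []
    # Assumption: the header counts as row 1, so the first data row is at position 2.
    for row_position, row in enumerate(reader, start=2):
        value = row[field_index]
        if PATTERN.match(value) is None:
            errors.append((row_position, field_index + 1, value))
    return errors


# Hypothetical header and first data row; only the Boaviztapi row comes from this README.
SAMPLE = """name,url,status,languages
example-project,https://example.org/,ready,rust
Boaviztapi,https://github.com/Boavizta/,ready,csharp
"""

print(find_pattern_errors(SAMPLE))                                      # [(3, 4, 'csharp')]
print(find_pattern_errors(SAMPLE.replace("csharp", "python|docker")))  # []
```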

Validate a data file in CI (GitHub Action)

See the sample GitHub Action (.github/workflows/validate-sample-data.yml):

jobs:

  # Validate

  validate:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2
      - name: Validate data
        uses: frictionlessdata/repository@v2
        with:
          resources: examples/csv/data/project-list.resources.yaml

Datami widget configuration

The widget is mainly configured using 2 distinct files:

Tip

The model itself makes no distinction between mono-valued and multi-valued fields: both are described as enum, without cardinality. To distinguish them when editing, the widget configuration file uses the subtype tag (singular) for mono-valued fields and tags (plural, with an optional field separator) for multi-valued fields.
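
The difference in cardinality can be pictured with a small sketch. It does not reflect Datami's implementation or its configuration syntax: the function parse_cell, its parameters, and the choice of "|" as default separator are assumptions made for illustration, based on the sample data in this README.

```python
def parse_cell(value: str, subtype: str, separator: str = "|") -> list[str]:
    """Interpret a raw CSV cell according to a hypothetical subtype setting."""
    if subtype == "tag":
        # Mono-valued: the whole cell is a single value.
        return [value] if value else []
    if subtype == "tags":
        # Multi-valued: split on the separator and drop empty items.
        return [item for item in value.split(separator) if item]
    raise ValueError(f"unknown subtype: {subtype}")


print(parse_cell("ready", "tag"))           # ['ready']
print(parse_cell("python|docker", "tags"))  # ['python', 'docker']
```

Whatever separator the widget uses for a tags field also has to be the one the validation pattern expects, which is one reason the two models must be kept in sync.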

References
