Testing the Datami widget (https://datami.multi.coop/) to edit and validate CSV files.
Demonstrate and document how we can use Datami and other components to:
- ease the editing of a CSV file stored on GitHub
- constrain the display of fields in the Datami widget
- investigate how GitHub Actions can ensure that the structure and/or content of the CSV file is valid according to a model.
The approach of the Datami component is to:
- rely on GitHub to store the CSV file
- offer an HTML widget to visualize or edit the content of the file, for users who may not want to use GitHub directly
- automate pushing the modifications made through the widget as GitHub pull requests
The repository is organized as follows:
- examples/csv/data: data files and related resource files to validate in CI
  - data (CSV file): project-list.csv
  - definition (model file) for data validation (CLI or CI): project-list.resources.yaml
- examples/csv/model: model for the CSV data (used by the widget)
  - Table Schema: project-list.frictionless-table-schema.json
- examples/csv/widget: widget and widget configuration examples
  - configuration file for the widget: project-list.fields-custom-properties.json
  - configured widget: project-list-widget.html
- .github/workflows: actions to automate validation
  - example GitHub action that validates the data file on pull request: validate-sample-data.yml
[!INFO] This validation is unrelated to Datami or the use of the widget.
The goal here is to validate that the CSV file is consistent with the data model.
The model describing the file is project-list.resources.yaml.
We can validate the file from the CLI (local mode) and/or as a GitHub action.
The validation output is easier (more direct) to read and analyze in local mode. The GitHub action still provides a report file that can be downloaded if validation fails.
pip install "frictionless[excel,json]" --pre
Example with a file that contains an invalid (unauthorized) value.
Line 3 of the data file contains the value csharp, which does not match the pattern of authorized values. The pattern, allowed values, and types are defined in project-list.resources.yaml.
Boaviztapi,https://github.com/Boavizta/,ready,csharp
cd examples/csv/data
frictionless validate project-list.resources.yaml
─────────────────────────────────────────────────────── Dataset ────────────────────────────────────────────────────────
dataset
┏━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ name ┃ type ┃ path ┃ status ┃
┡━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ project-list │ table │ project-list.csv │ INVALID │
└──────────────┴───────┴──────────────────┴─────────┘
──────────────────────────────────────────────────────── Tables ────────────────────────────────────────────────────────
project-list
┏━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Row ┃ Field ┃ Type ┃ Message ┃
┡━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 3 │ 4 │ constraint-error │ The cell "csharp" in row at position "3" and field "languages" at position "4" does │
│ │ │ │ not conform to a constraint: constraint "pattern" is "^(rust|python|docker|other\ │
│ │ │ │ tek)?(\|(rust|python|docker|other\ tek))*$" │
└─────┴───────┴──────────────────┴─────────────────────────────────────────────────────────────────────────────────────┘
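The pattern quoted in the report can be tried out on its own to see why csharp is rejected. The following is a minimal sketch using only Python's standard `re` module, assuming the pattern is applied to the whole cell; the helper name `is_allowed` is hypothetical and is not part of frictionless.

```python
import re

# Pattern copied from the frictionless report above (field "languages").
LANGUAGES_PATTERN = (
    r"^(rust|python|docker|other\ tek)?"
    r"(\|(rust|python|docker|other\ tek))*$"
)


def is_allowed(cell: str) -> bool:
    """Return True when the whole cell matches the pattern (hypothetical helper)."""
    return re.fullmatch(LANGUAGES_PATTERN, cell) is not None


# Try a few cell values: single values and pipe-separated lists.
for cell in ["csharp", "python|docker", "other tek", "python|csharp"]:
    print(f"{cell!r}: {is_allowed(cell)}")
```

A cell passes only if every pipe-separated item is one of rust, python, docker or "other tek", so a single unknown item such as csharp makes the whole cell invalid.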
After fixing the data file (replacing the csharp value with python|docker):
Boaviztapi,https://github.com/Boavizta/,ready,python|docker
frictionless validate project-list.resources.yaml
─────────────────────────────────────────────────────── Dataset ────────────────────────────────────────────────────────
dataset
┏━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ name ┃ type ┃ path ┃ status ┃
┡━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ project-list │ table │ project-list.csv │ VALID │
└──────────────┴───────┴──────────────────┴────────┘
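The two runs above can be mimicked with a rough standard-library sketch, which may help when reading the row positions in the report. It is not how frictionless is implemented. The header names other than languages, and the first data row, are hypothetical placeholders (the real project-list.csv is not reproduced here), and the sketch assumes the header counts as row 1, which is consistent with the error being reported at row position 3.

```python
import csv
import io
import re

# Pattern copied from the frictionless report for the "languages" field.
PATTERN = re.compile(
    r"^(rust|python|docker|other\ tek)?"
    r"(\|(rust|python|docker|other\ tek))*$"
)

# Hypothetical sample: only the Boaviztapi row and the "languages" field name
# come from the document; the other header names and the first row are made up.
SAMPLE = """name,url,status,languages
Example project,https://example.org/,ready,rust
Boaviztapi,https://github.com/Boavizta/,ready,csharp
"""


def find_pattern_errors(text: str, field: str = "languages"):
    """Return (row position, field, cell) for each cell that fails the pattern."""
    reader = csv.DictReader(io.StringIO(text))
    errors = []
    # Assumption: the header is row 1, so the first data row is row 2.
    for row_position, row in enumerate(reader, start=2):
        cell = row[field]
        if PATTERN.fullmatch(cell) is None:
            errors.append((row_position, field, cell))
    return errors


print(find_pattern_errors(SAMPLE))
print(find_pattern_errors(SAMPLE.replace("csharp", "python|docker")))
```

The first call reports the csharp cell at row position 3; the second call, on the corrected data, returns an empty list.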
See the sample GitHub action (.github/workflows/validate-sample-data.yml):
jobs:
  # Validate
  validate:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2
      - name: Validate data
        uses: frictionlessdata/repository@v2
        with:
          resources: examples/csv/data/project-list.resources.yaml