Document datasets in Statistics Norway
- Open https://jupyter.dapla-staging.ssb.no or another Jupyter Lab environment
- Datadoc comes preinstalled in Statistics Norway environments. Elsewhere, install it with:
pip install ssb-datadoc
- Upload a dataset to your Jupyter server (e.g. https://github.com/statisticsnorway/datadoc/blob/master/klargjorte_data/befolkning/person_testdata_p2021-12-31_p2021-12-31_v1.parquet)
- Run the demo.ipynb notebook
- Datadoc will open in the notebook
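Before running the notebook outside a Statistics Norway environment, it can help to confirm that the package is actually installed. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of Datadoc.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "ssb-datadoc") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        # Same install command as in the steps above.
        print("ssb-datadoc is not installed; run: pip install ssb-datadoc")
    else:
        print(f"ssb-datadoc {version} is installed")
```

Run it with the same Python interpreter that backs your Jupyter kernel, since packages are installed per environment.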
Poetry is used for dependency management, and Poe the Poet runs tasks within Poetry's virtualenv. After cloning this project, first install the necessary dependencies, then run the tests to verify that everything works.
- Python >=3.10
- Poetry, install via
curl -sSL https://install.python-poetry.org | python3 -
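The two prerequisites above can be checked programmatically before installing dependencies. This is a rough sketch, not part of the project; the function name missing_prerequisites and the message wording are made up for illustration.

```python
import shutil
import sys

# Minimum Python version listed in the prerequisites.
MIN_PYTHON = (3, 10)


def missing_prerequisites(python_version, poetry_path):
    """Return a list of human-readable problems; an empty list means all is well."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append(
            "Python >= %d.%d is required, found %d.%d"
            % (MIN_PYTHON + tuple(python_version[:2]))
        )
    if poetry_path is None:
        problems.append("Poetry was not found on PATH")
    return problems


if __name__ == "__main__":
    # shutil.which returns None when the executable is not on PATH.
    for problem in missing_prerequisites(sys.version_info, shutil.which("poetry")):
        print(problem)
```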
Install the dependencies, set up the pre-commit hooks, and run the tests:
poetry install
poetry run pre-commit install
poetry run poe test
To add a runtime dependency:
poetry add <python package name>
To add a development-only dependency:
poetry add --group dev <python package name>
To run the project locally:
poetry run poe datadoc
To run the project locally in Jupyter:
poetry run poe jupyter
A Jupyter instance should open in your browser. Open and run the cells in the .ipynb
file to demo datadoc.
To run the project in Docker, mounting Google Cloud application default credentials from the host:
docker run -p 8050:8050 \
-v "$HOME/.config/gcloud/application_default_credentials.json:/application_default_credentials.json" \
-e GOOGLE_APPLICATION_CREDENTIALS="/application_default_credentials.json" \
datadoc
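If you launch the container from a script, the same command can be assembled as an argument list, which avoids shell quoting problems with the credentials path. This is only a sketch: the function docker_run_args is hypothetical, it assumes an image tagged datadoc already exists locally, and it prints the command rather than running it.

```python
import shlex
from pathlib import Path

# Path where the credentials file is mounted inside the container.
CONTAINER_CREDENTIALS = "/application_default_credentials.json"


def docker_run_args(credentials, image="datadoc", port=8050):
    """Build the argument list for the docker run command shown above."""
    return [
        "docker", "run",
        "-p", f"{port}:{port}",
        "-v", f"{credentials}:{CONTAINER_CREDENTIALS}",
        "-e", f"GOOGLE_APPLICATION_CREDENTIALS={CONTAINER_CREDENTIALS}",
        image,
    ]


if __name__ == "__main__":
    host_credentials = (
        Path.home() / ".config" / "gcloud" / "application_default_credentials.json"
    )
    if not host_credentials.is_file():
        print(f"No credentials file found at {host_credentials}")
    # Print the command for inspection; pass the list to subprocess.run to execute it.
    print(shlex.join(docker_run_args(host_credentials)))
```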
Run the relevant version command on a branch, e.g.
poetry version patch
poetry version minor
Commit with a message like Bump version x.x.x -> y.y.y.
Open and merge a PR.
Use GitHub to tag and release.
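To illustrate what the patch and minor bumps do to a version number, and what the resulting commit message looks like, here is a small sketch. It handles only plain MAJOR.MINOR.PATCH versions and is not how Poetry itself is implemented; the helpers bump and commit_message are hypothetical.

```python
def bump(version, rule):
    """Return the next version for a plain MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if rule == "patch":
        patch += 1
    elif rule == "minor":
        # A minor bump resets the patch number.
        minor, patch = minor + 1, 0
    elif rule == "major":
        # A major bump resets both minor and patch.
        major, minor, patch = major + 1, 0, 0
    else:
        raise ValueError(f"Unsupported bump rule: {rule}")
    return f"{major}.{minor}.{patch}"


def commit_message(old, new):
    """Format the commit message in the style used for version bumps."""
    return f"Bump version {old} -> {new}"


if __name__ == "__main__":
    old = "0.1.3"  # example version, chosen only for illustration
    print(commit_message(old, bump(old, "patch")))
    print(commit_message(old, bump(old, "minor")))
```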