statisticsnorway/datadoc-editor

Datadoc

Document datasets in Statistics Norway

Usage

[Image: DataDoc in use]

From Jupyter

  1. Open https://jupyter.dapla-staging.ssb.no or another Jupyter Lab environment
  2. Datadoc comes preinstalled in Statistics Norway environments. Elsewhere, run pip install ssb-datadoc to install it
  3. Upload a dataset to your Jupyter server (e.g. https://github.com/statisticsnorway/datadoc/blob/master/klargjorte_data/befolkning/person_testdata_p2021-12-31_p2021-12-31_v1.parquet)
  4. Run the demo.ipynb notebook
  5. Datadoc will open in the notebook
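
Outside Statistics Norway environments it can be handy to confirm that the package is installed before opening the notebook. The following is a minimal sketch using only the Python standard library; the helper name installed_version is hypothetical and is not part of Datadoc.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "ssb-datadoc" is the distribution name used in the pip command above.
    version = installed_version("ssb-datadoc")
    if version is None:
        print("ssb-datadoc is not installed; run: pip install ssb-datadoc")
    else:
        print(f"ssb-datadoc {version} is installed")
```

Running this in a notebook cell or terminal reports either the installed version or the install command to use.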

Contributing

Local environment

Poetry is used for dependency management. Poe the Poet is used for running tasks within Poetry's virtualenv. After cloning this project, first install the necessary dependencies, then run the tests to verify that everything is working.

1. Prerequisites

  • Python >=3.10
  • Poetry, install via curl -sSL https://install.python-poetry.org | python3 -
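
The prerequisites above can be checked with a short script before installing dependencies. This is a sketch only: the function names are hypothetical, and it merely tests the interpreter version and whether a poetry executable is on the PATH.

```python
import shutil
import sys

# Minimum Python version stated in the prerequisites.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def poetry_on_path() -> bool:
    """Return True if a 'poetry' executable can be found on the PATH."""
    return shutil.which("poetry") is not None


if __name__ == "__main__":
    print("Python >= 3.10:", python_is_supported())
    print("Poetry found:", poetry_on_path())
```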

2. Install dependencies

poetry install

3. Install pre-commit hooks

poetry run pre-commit install

4. Run tests

poetry run poe test

Add dependencies

Main

poetry add <python package name>

Dev

poetry add --group dev <python package name>

Run project locally

To run the project locally:

poetry run poe datadoc

Run project locally in Jupyter

To run the project locally in Jupyter, run:

poetry run poe jupyter

A Jupyter instance should open in your browser. Open and run the cells in the .ipynb file to demo datadoc.

Running the Dockerized Application Locally

docker run -p 8050:8050 \
-v $HOME/.config/gcloud/application_default_credentials.json:/application_default_credentials.json \
-e GOOGLE_APPLICATION_CREDENTIALS="/application_default_credentials.json" \
datadoc
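
The command above mounts the local Google Cloud application default credentials into the container and points GOOGLE_APPLICATION_CREDENTIALS at the mounted file. As a rough sketch, the same invocation can be assembled in Python, which makes it easy to check that the credentials file exists first. The function name docker_run_command is hypothetical, and the image name datadoc is assumed to refer to an image that has already been built locally.

```python
import shlex
from pathlib import Path
from typing import List

# Path inside the container where the credentials file is mounted.
CONTAINER_CREDENTIALS = "/application_default_credentials.json"


def docker_run_command(credentials: Path, image: str = "datadoc", port: int = 8050) -> List[str]:
    """Build the docker run argument list for the given credentials file."""
    return [
        "docker", "run",
        "-p", f"{port}:{port}",
        "-v", f"{credentials}:{CONTAINER_CREDENTIALS}",
        "-e", f"GOOGLE_APPLICATION_CREDENTIALS={CONTAINER_CREDENTIALS}",
        image,
    ]


if __name__ == "__main__":
    creds = Path.home() / ".config" / "gcloud" / "application_default_credentials.json"
    if not creds.is_file():
        print(f"Credentials file not found: {creds}")
    # Print the command rather than executing it, so it can be inspected first.
    print(shlex.join(docker_run_command(creds)))
```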

Release process

Run the relevant version command on a branch, e.g. one of:

poetry version patch
poetry version minor

Commit with a message like Bump version x.x.x -> y.y.y.

Open and merge a PR.

Use GitHub to tag and release.
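
To illustrate how the patch and minor rules change the version number and what the resulting commit message looks like, here is a simplified sketch. It is not Poetry's implementation: it assumes a plain MAJOR.MINOR.PATCH version with no pre-release suffix, and the function names are hypothetical.

```python
def bump(version: str, rule: str) -> str:
    """Apply a 'patch', 'minor', or 'major' bump to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if rule == "patch":
        patch += 1
    elif rule == "minor":
        # A minor bump resets the patch component.
        minor, patch = minor + 1, 0
    elif rule == "major":
        # A major bump resets both lower components.
        major, minor, patch = major + 1, 0, 0
    else:
        raise ValueError(f"Unsupported bump rule: {rule}")
    return f"{major}.{minor}.{patch}"


def commit_message(old: str, new: str) -> str:
    """Format the commit message in the style used by the release process."""
    return f"Bump version {old} -> {new}"


if __name__ == "__main__":
    old = "0.2.3"
    print(commit_message(old, bump(old, "patch")))
    print(commit_message(old, bump(old, "minor")))
```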