Demonstration of a project generated with the Data Science project extension of PyScaffold
A longer description of your project goes here...
In order to set up the necessary environment:
- review and uncomment what you need in `environment.yml` and create an environment `demo-dsproject` with the help of conda:
  conda env create -f environment.yml
- activate the new environment with:
  conda activate demo-dsproject
NOTE: The conda environment will have demo-dsproject installed in editable mode. Some changes, e.g. in `setup.cfg`, might require you to run `pip install -e .` again.
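To confirm that the package is visible in the active environment after `pip install -e .`, a small check along these lines can help. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration, and the distribution name `demo-dsproject` is assumed to match the name declared in `setup.cfg`.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the distribution is registered as "demo-dsproject".
    version = installed_version("demo-dsproject")
    if version is None:
        print("demo-dsproject is not installed; try `pip install -e .`")
    else:
        print(f"demo-dsproject {version} is installed")
```

Run it inside the activated `demo-dsproject` environment; a missing package usually means the environment was not activated or the editable install needs to be repeated.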
Optional and needed only once after `git clone`:
- install several pre-commit git hooks with:
  pre-commit install # You might also want to run `pre-commit autoupdate`
  and check out the configuration under `.pre-commit-config.yaml`. The `-n, --no-verify` flag of `git commit` can be used to deactivate pre-commit hooks temporarily.
- install nbstripout git hooks to remove the output cells of committed notebooks with:
  nbstripout --install --attributes notebooks/.gitattributes
  This is useful to avoid large diffs due to plots in your notebooks. A simple `nbstripout --uninstall` will revert these changes.
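To make concrete what "removing the output cells" means, the sketch below clears outputs and execution counts from a notebook held as a plain dictionary in the Jupyter nbformat 4 JSON layout. It only illustrates the idea and is not nbstripout's actual implementation; the function name `strip_outputs` and the sample notebook are hypothetical.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed."""
    # Deep copy via a JSON round-trip so the input is left untouched.
    stripped = json.loads(json.dumps(notebook))
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


if __name__ == "__main__":
    # Hypothetical two-cell notebook, made up for this example.
    example = {
        "nbformat": 4,
        "nbformat_minor": 5,
        "metadata": {},
        "cells": [
            {
                "cell_type": "code",
                "execution_count": 3,
                "metadata": {},
                "source": ["print('hello')"],
                "outputs": [
                    {"output_type": "stream", "name": "stdout", "text": ["hello\n"]}
                ],
            },
            {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        ],
    }
    print(json.dumps(strip_outputs(example), indent=1))
```

Because plots are stored as large encoded blobs inside `outputs`, dropping that field is what keeps the committed diffs small.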
Then take a look into the `scripts` and `notebooks` folders.
- Always keep your abstract (unpinned) dependencies updated in `environment.yml` and eventually in `setup.cfg` if you want to ship and install your package via `pip` later on.
- Create concrete dependencies as `environment.lock.yml` for the exact reproduction of your environment with:
  conda env export -n demo-dsproject -f environment.lock.yml
  For multi-OS development, consider using `--no-builds` during the export.
- Update your current environment with respect to a new `environment.lock.yml` using:
  conda env update -f environment.lock.yml --prune
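As a rough illustration of what `--no-builds` changes: conda entries in an exported environment file typically take the form `name=version=build`, and the build string is the part that tends to differ between operating systems. The sketch below drops that third field from such a spec. The function name `drop_build_string` and the sample specs are hypothetical, and this is not how conda itself implements the flag.

```python
def drop_build_string(spec: str) -> str:
    """Reduce a conda-style `name=version=build` spec to `name=version`.

    Specs using `==` (as in a pip section) are returned unchanged.
    """
    if "==" in spec:
        return spec
    parts = spec.split("=")
    return "=".join(parts[:2])


if __name__ == "__main__":
    # Made-up example entries; the build string is illustrative only.
    for spec in ["numpy=1.21.0=py39hdbf815f_0", "pandas=1.4.2", "python"]:
        print(spec, "->", drop_build_string(spec))
```

The pinned versions stay in place, so the lock file remains reproducible at the version level while no longer depending on platform-specific builds.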
├── AUTHORS.md              <- List of developers and maintainers.
├── CHANGELOG.md            <- Changelog to keep track of new features and fixes.
├── CONTRIBUTING.md         <- Guidelines for contributing to this project.
├── Dockerfile              <- Build a docker container with `docker build .`.
├── LICENSE.txt             <- License as chosen on the command-line.
├── README.md               <- The top-level README for developers.
├── configs                 <- Directory for configurations of model & application.
├── data
│   ├── external            <- Data from third party sources.
│   ├── interim             <- Intermediate data that has been transformed.
│   ├── processed           <- The final, canonical data sets for modeling.
│   └── raw                 <- The original, immutable data dump.
├── docs                    <- Directory for Sphinx documentation in rst or md.
├── environment.yml         <- The conda environment file for reproducibility.
├── models                  <- Trained and serialized models, model predictions,
│                              or model summaries.
├── notebooks               <- Jupyter notebooks. Naming convention is a number (for
│                              ordering), the creator's initials and a description,
│                              e.g. `1.0-fw-initial-data-exploration`.
├── pyproject.toml          <- Build configuration. Don't change! Use `pip install -e .`
│                              to install for development or to build `tox -e build`.
├── references              <- Data dictionaries, manuals, and all other materials.
├── reports                 <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures             <- Generated plots and figures for reports.
├── scripts                 <- Analysis and production scripts which import the
│                              actual PYTHON_PKG, e.g. train_model.
├── setup.cfg               <- Declarative configuration of your project.
├── setup.py                <- [DEPRECATED] Use `python setup.py develop` to install for
│                              development or `python setup.py bdist_wheel` to build.
├── src
│   └── demo_dsproject      <- Actual Python package where the main functionality goes.
├── tests                   <- Unit tests which can be run with `pytest`.
├── .coveragerc             <- Configuration for coverage reports of unit tests.
├── .isort.cfg              <- Configuration for git hook that sorts imports.
└── .pre-commit-config.yaml <- Configuration of pre-commit git hooks.
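The notebook naming convention described in the tree (a number for ordering, the creator's initials, and a description) can be checked mechanically. The sketch below is one possible reading of that convention, assuming lowercase initials and a hyphen-separated lowercase description as in `1.0-fw-initial-data-exploration`; the names `NotebookName` and `parse_notebook_name` are hypothetical and not part of the generated project.

```python
import re
from typing import NamedTuple, Optional


class NotebookName(NamedTuple):
    order: str
    initials: str
    description: str


# Assumed pattern: <number>-<initials>-<description>
_PATTERN = re.compile(
    r"^(?P<order>\d+(?:\.\d+)*)"
    r"-(?P<initials>[a-z]+)"
    r"-(?P<description>[a-z0-9]+(?:-[a-z0-9]+)*)$"
)


def parse_notebook_name(stem: str) -> Optional[NotebookName]:
    """Split a notebook file stem into its parts, or return None if it does not match."""
    match = _PATTERN.match(stem)
    if match is None:
        return None
    return NotebookName(**match.groupdict())


if __name__ == "__main__":
    print(parse_notebook_name("1.0-fw-initial-data-exploration"))
    print(parse_notebook_name("Untitled"))
```

A check like this could be run over the stems of the files in `notebooks` to flag names that would break the intended ordering.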
This project has been set up using PyScaffold 4.2.2 and the dsproject extension 0.7.2.post1+g9295912.