DataLad aims to deliver a data distribution. The original motivation was to provide a platform for harvesting data from online portals and exposing the collected data in a readily usable form through git-annex repositories, while the data load itself is still fetched from the original data providers.
It is currently in a "prototype" state, i.e. a mess. It is functional for many use cases but not widely used, since its organization and configuration are subject to considerable reorganization and standardization. The primary purpose of current development is to capture the major use cases and try to address them, in order to get a better understanding of the ultimate specs and design.
Unfortunately there are not many unit tests yet, but there are a few "functionality" tests that aim to cover the main use cases.
Some tests use testing repositories, which are available as submodules under the datalad/tests/testrepos submodule (two tiers, so as not to pollute the top repository's submodule namespace). To enable those tests, run
git submodule update --init --recursive
or clone with the --recursive option in the first place.
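To confirm that the test repositories were actually checked out, the output of `git submodule status --recursive` can be inspected: git prefixes each submodule that has not been initialized with a `-`. The following is a minimal sketch, not part of DataLad; the function name `count_uninitialized` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of DataLad): read `git submodule status`
# output on stdin and print how many entries are still uninitialized.
# git marks an uninitialized submodule with a leading "-".
count_uninitialized() {
    awk '/^-/ { n++ } END { print n + 0 }'
}

# Usage, from the top of a DataLad clone:
#   git submodule status --recursive | count_uninitialized
```

A result of 0 suggests that every submodule, including the nested ones under datalad/tests/testrepos, has been initialized.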
On Debian-based systems we recommend enabling NeuroDebian, since we use it to provide backports of recent, fixed versions of the external modules we depend upon:
apt-get install patool python-bs4 python-git python-joblib git-annex
Otherwise, you can install the Python modules with pip:
pip install -r requirements.txt
In that case you will also need to install git-annex by whatever means is appropriate for your OS.
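Because the pip route does not install git-annex, it can help to check for the external tools before installing the Python modules. The following is a rough sketch under the assumption that `git` and `git-annex` are the only external commands to check for; the helper names `have_cmd` and `check_prereqs` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: report on each external tool and return non-zero
# if any of them is missing. The list of tools is an assumption.
check_prereqs() {
    missing=0
    for cmd in git git-annex; do
        if have_cmd "$cmd"; then
            echo "found: $cmd"
        else
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_prereqs && pip install -r requirements.txt
```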
MIT/Expat
It is in a prototype stage -- nothing is set in stone yet -- but already usable in a limited scope.