Datapipe is a real-time, incremental ETL library for Python with record-level dependency tracking.
The library is designed for describing data processing pipelines and is capable of tracking dependencies for each record in the pipeline. This ensures that tasks within the pipeline receive only the data that has been modified, thereby improving the overall efficiency of data handling.
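The idea of passing only modified records to a task can be illustrated with a small, library-independent sketch. This is not Datapipe's API: the names `record_hash`, `changed_keys`, and `run_step` are hypothetical, and content hashing is only one assumed way to detect that a record has changed.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """Stable content hash of a record, used to detect modifications."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def changed_keys(records: dict, seen_hashes: dict) -> list:
    """Return keys whose content is new or differs from the last processed version."""
    return [k for k, rec in records.items() if seen_hashes.get(k) != record_hash(rec)]


def run_step(records: dict, seen_hashes: dict, transform, output: dict) -> list:
    """Apply `transform` only to changed records and remember their hashes."""
    todo = changed_keys(records, seen_hashes)
    for key in todo:
        output[key] = transform(records[key])
        seen_hashes[key] = record_hash(records[key])
    return todo


def double(rec: dict) -> dict:
    """Hypothetical transformation step used for illustration."""
    return {"value": rec["value"] * 2}


source = {"a": {"value": 1}, "b": {"value": 2}}
seen, out = {}, {}

first = run_step(source, seen, double, out)   # both records are new
source["b"] = {"value": 5}                    # modify a single record
second = run_step(source, seen, double, out)  # only "b" is reprocessed
```

On the second run only the modified record is transformed again; the untouched record is skipped. The sketch does not handle deleted records, which a real pipeline would also need to track.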
At the moment these branches are active:
master - current development state, will be promoted to the 0.13.x release series once ready
v0.13 - current stable version
v0.11 - legacy stable version (v0.12 was skipped)
At the moment, the datapipe library is under active development, and all
releases are in the v0.*.* series.
It should be expected that each minor version is not backward compatible with
the previous one. That is, v0.7.0 is not compatible with v0.6.1. Dependencies
should be pinned to the exact minor version.
After stabilization and transition to the major version v1.*.*, the common
rules will apply: all versions with the same major component are compatible.
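The compatibility rules above can be sketched as a small check. The helper names `parse_version` and `is_compatible` are hypothetical and not part of datapipe; the sketch assumes plain three-component version strings with an optional leading "v".

```python
def parse_version(version: str) -> tuple:
    """Parse 'v0.7.0' or '0.7.0' into a (major, minor, patch) tuple of ints."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def is_compatible(installed: str, required: str) -> bool:
    """Apply the rules described above.

    - v0.*.*: compatible only when the minor component matches.
    - v1.*.* and later: compatible when the major component matches.
    """
    a, b = parse_version(installed), parse_version(required)
    if a[0] != b[0]:
        return False          # different major versions never match
    if a[0] == 0:
        return a[1] == b[1]   # pre-1.0: minor versions must be identical
    return True               # 1.0 and later: same major is enough


print(is_compatible("v0.7.0", "v0.6.1"))  # False
print(is_compatible("v0.7.3", "v0.7.0"))  # True
```

In a requirements file, the pre-1.0 rule corresponds to a pin on the minor series, for example a specifier such as `==0.13.*`.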