ETL - Extract / Translate / Load

Provides a basic directory structure and template files for setting up a DataLoader using the ETL methodology.

Installation

pip install git+https://gitlab.com/jayemar/etl.git

Basic Usage

from etl.dataloader import DataLoader

dl = DataLoader()

# <ml_cfg> is a placeholder: substitute your own config (see "Config File" below)
train_gen = dl.retrieve_data(<ml_cfg>)
test_gen = dl.get_test_data()
valid_gen = dl.get_validation_data()

Config File

The config file can be in either JSON or YAML format. Fields are optional unless otherwise stated.

Fields

data_dir: directory where the data is located; the path can be absolute or relative to the directory containing task.py
batch_size: number of records per batch
epochs: number of epochs to run during training
train_size: fraction of the data used for training, given as a decimal
test_size: fraction of the data used for testing, given as a decimal
valid_size: fraction of the data used for validation, given as a decimal
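
As an illustration of the fields above, here is a minimal sketch that writes a JSON config and reads it back using only the Python standard library. Only the field names come from this README. The values, the file name ml_cfg.json, and the helpers check_splits and write_and_reload are hypothetical and are not part of the etl package. The sanity check also assumes that each ratio lies between 0 and 1 and that the ratios together do not exceed 1; the library itself may or may not enforce this.

```python
import json
import os
import tempfile

# Hypothetical values for illustration; only the field names come from the README.
example_cfg = {
    "data_dir": "data",
    "batch_size": 32,
    "epochs": 10,
    "train_size": 0.7,
    "test_size": 0.2,
    "valid_size": 0.1,
}

SPLIT_FIELDS = ("train_size", "test_size", "valid_size")


def check_splits(cfg):
    """Return the split ratios present in cfg after basic sanity checks.

    Assumption (not confirmed by the README): each ratio is in [0, 1]
    and the ratios that are present sum to at most 1.
    """
    # Fields are optional, so only inspect the ones that were supplied.
    splits = {name: cfg[name] for name in SPLIT_FIELDS if name in cfg}
    for name, ratio in splits.items():
        if not 0.0 <= ratio <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {ratio}")
    # Small tolerance for floating-point rounding.
    if sum(splits.values()) > 1.0 + 1e-9:
        raise ValueError("split ratios sum to more than 1")
    return splits


def write_and_reload(cfg):
    """Write cfg to a temporary JSON file and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ml_cfg.json")  # hypothetical file name
        with open(path, "w") as fh:
            json.dump(cfg, fh, indent=2)
        with open(path) as fh:
            return json.load(fh)


loaded = write_and_reload(example_cfg)
splits = check_splits(loaded)
```

The same fields could be written in YAML instead. Whether retrieve_data expects a file path or an already-parsed object is not stated here, so check the package source before passing either.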
