
Releases: carefree0910/carefree-learn

carefree-learn 0.4.1

22 Apr 01:50

Release Notes

We're happy to announce the release of carefree-learn v0.4.x, which is much clearer, more unified, and much more lightweight!

Main Changes

In the v0.3.x era, Pipelines (e.g., MLPipeline, CVPipeline) were implemented in a tricky and unpleasant way: they relied too heavily on inheritance, resulting in large amounts of duplicated, unneeded, and workaround code.

The same problems occurred in the data module: MLData was powerful but nobody could maintain it, and CVData made good use of third-party libraries but nobody knew how to use it.

In v0.4.x, we abstracted out the true Pipeline structure: it should simply be a series of Blocks. When the Pipeline runs, each Block does its own job; when saving/loading, each Block saves/loads its own assets.

Under this design, we were able to refactor the original Pipelines into a unified one. What's even more exciting is that the data module could also be refactored into the new Pipeline structure.
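To make the idea concrete, here is a minimal sketch of the Block-based design described above. The `Block` and `Pipeline` classes here are illustrative stand-ins, not the actual carefree-learn implementation:

```python
# A minimal sketch of the Block-based Pipeline idea: a Pipeline is just a
# series of Blocks, and each Block does its own job when running and
# saves/loads its own assets. Illustrative only, not carefree-learn's API.
from typing import Any, List


class Block:
    def run(self, data: Any) -> Any:
        return data

    def save(self, folder: str) -> None:
        """Each Block would persist its own assets here."""

    def load(self, folder: str) -> None:
        """...and restore them here."""


class AddOneBlock(Block):
    def run(self, data: Any) -> Any:
        return data + 1


class DoubleBlock(Block):
    def run(self, data: Any) -> Any:
        return data * 2


class Pipeline:
    """Runs its Blocks in order, each doing its own job."""

    def __init__(self, blocks: List[Block]):
        self.blocks = blocks

    def run(self, data: Any) -> Any:
        for block in self.blocks:
            data = block.run(data)
        return data


pipeline = Pipeline([AddOneBlock(), DoubleBlock()])
print(pipeline.run(3))  # (3 + 1) * 2 = 8
```

Because each Block owns both its computation and its serialization, a unified Pipeline only needs to iterate over its Blocks, which is what removes the inheritance-heavy duplication of v0.3.x.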

Documentation

v0.4.x is still under heavy development, so the documentation is currently out of date. I'll try to update it ASAP; until then, the examples are a good place to start, as they cover quite a few use cases!

carefree-learn 0.3.2

03 Oct 09:50

Miscellaneous fixes and updates.

carefree-learn 0.3.1

13 Sep 07:09

Miscellaneous fixes and updates.

carefree-learn 0.3.0

20 Jun 12:21

We're happy to announce the release of carefree-learn v0.3.x, which is much more lightweight!

carefree-learn 0.2.5

20 Jun 06:04

Miscellaneous fixes and updates.

carefree-learn 0.2.4

16 Jun 04:56

Miscellaneous fixes and updates.

carefree-learn 0.2.3

29 Apr 10:36

Miscellaneous fixes and updates.

carefree-learn 0.2.2

29 Jan 03:42

Miscellaneous fixes and updates.

carefree-learn 0.2.1

29 Oct 05:39

Release Notes

We're happy to announce the release of carefree-learn v0.2.x, which can solve not only tabular tasks but also other general deep learning tasks!

Introduction

Deep Learning with PyTorch made easy 🚀!

Like many similar projects, carefree-learn can be treated as a high-level library to help with training neural networks in PyTorch. However, carefree-learn does more than that.

  • carefree-learn is highly customizable for developers. We have wrapped (almost) every single functionality / process into a single module (a Python class), and each can be replaced or enhanced either directly from the source code or from local code, with the help of some pre-defined functions provided by carefree-learn (see Register Mechanism).
  • carefree-learn supports easy-to-use saving and loading. By default, everything will be wrapped into a .zip file, and onnx format is natively supported!
  • carefree-learn supports Distributed Training.

Apart from these, carefree-learn also has quite a few specific advantages in each area:

Machine Learning 📈

  • carefree-learn provides an end-to-end pipeline for tabular tasks that AUTOMATICALLY deals with the following (this part is mainly handled by carefree-data, though):
    • Detection of redundant feature columns that can be excluded (all SAME, all DIFFERENT, etc.).
    • Detection of feature column types (whether a feature column is a string column / numerical column / categorical column).
    • Imputation of missing values.
    • Encoding of string columns and categorical columns (Embedding or One Hot Encoding).
    • Pre-processing of numerical columns (Normalize, Min Max, etc.).
    • And much more...
  • carefree-learn can help you deal with almost ANY kind of tabular dataset, no matter how dirty and messy it is. It can be trained either directly on numpy arrays, or indirectly on files located on your machine. This makes carefree-learn stand out from similar projects.

When we say ANY, it means that carefree-learn can even train on one single sample.

For example

import cflearn

toy = cflearn.ml.make_toy_model()
data = toy.data.cf_data.converted
print(f"x={data.x}, y={data.y}")  # x=[[0.]], y=[[1.]]


This is especially useful when we need to write unit tests or to verify whether our custom modules (e.g. custom pre-processing) are correctly integrated into carefree-learn.

For example

import cflearn
import numpy as np

# here we implement a custom processor
@cflearn.register_processor("plus_one")
class PlusOne(cflearn.Processor):
    @property
    def input_dim(self) -> int:
        return 1

    @property
    def output_dim(self) -> int:
        return 1

    def fit(self, columns: np.ndarray) -> cflearn.Processor:
        return self

    def _process(self, columns: np.ndarray) -> np.ndarray:
        return columns + 1

    def _recover(self, processed_columns: np.ndarray) -> np.ndarray:
        return processed_columns - 1

# we need to specify that we use the custom process method to process our labels
toy = cflearn.ml.make_toy_model(cf_data_config={"label_process_method": "plus_one"})
data = toy.data.cf_data
y = data.converted.y
processed_y = data.processed.y
print(f"y={y}, new_y={processed_y}")  # y=[[1.]], new_y=[[2.]]

There is one more thing we'd like to mention: carefree-learn is Pandas-free. The reasons why we excluded Pandas are listed in carefree-data.
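To make the pre-processing steps listed above (one-hot encoding, min-max scaling, etc.) concrete, here is a hand-rolled sketch in plain numpy. This is purely illustrative of what the pipeline automates; it is not the carefree-data implementation:

```python
# Hand-rolled sketch of two steps the tabular pipeline automates:
# one-hot encoding a categorical column and min-max scaling a numerical one.
import numpy as np

categorical = np.array(["red", "green", "red", "blue"])
numerical = np.array([10.0, 20.0, 15.0, 30.0])

# one-hot encode: map each unique category to its own column
categories, indices = np.unique(categorical, return_inverse=True)
one_hot = np.eye(len(categories))[indices]

# min-max scale to [0, 1]
scaled = (numerical - numerical.min()) / (numerical.max() - numerical.min())

print(one_hot.shape)      # (4, 3)
print(scaled.tolist())    # [0.0, 0.5, 0.25, 1.0]
```

The value of the pipeline is that it detects column types and picks such transforms on its own, so none of this bookkeeping has to be written by hand.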


Computer Vision 🖼️

  • carefree-learn also provides an end-to-end pipeline for computer vision tasks, and:
    • Supports native torchvision datasets.

      data = cflearn.cv.MNISTData(transform="to_tensor")

      Currently only mnist is supported, but more will be added in the future (if needed)!

    • Focuses on the ImageFolderDataset for customization, which:

      • Automatically splits the dataset into train & valid.
      • Supports generating labels in parallel, which is very useful when calculating labels is time-consuming.

      See IFD introduction for more details.

  • carefree-learn supports various kinds of Callbacks, which can be used for saving intermediate visualizations / results.
    • For instance, carefree-learn implements an ArtifactCallback, which can carefully dump artifacts to disk during training.
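The callback pattern behind this can be sketched as follows. The hook name `on_epoch_end` and the `train` loop here are illustrative assumptions, not carefree-learn's actual Callback interface:

```python
# A generic sketch of the training-callback pattern: the trainer invokes
# hooks at fixed points, and callbacks record or dump whatever they need.
# Hook names are illustrative, not carefree-learn's actual interface.
from typing import Dict, List


class Callback:
    def on_epoch_end(self, epoch: int, metrics: Dict[str, float]) -> None:
        pass


class LoggingCallback(Callback):
    """Collects metrics every epoch (an ArtifactCallback would write to disk instead)."""

    def __init__(self):
        self.history: List[dict] = []

    def on_epoch_end(self, epoch: int, metrics: Dict[str, float]) -> None:
        self.history.append({"epoch": epoch, **metrics})


def train(num_epochs: int, callbacks: List[Callback]) -> None:
    for epoch in range(num_epochs):
        metrics = {"loss": 1.0 / (epoch + 1)}  # dummy training step
        for callback in callbacks:
            callback.on_epoch_end(epoch, metrics)


logger = LoggingCallback()
train(3, [logger])
print(len(logger.history))  # 3
```

Because the trainer only sees the base `Callback` interface, saving intermediate visualizations or results is just a matter of plugging in another subclass.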

Examples

Machine Learning 📈

import cflearn
import numpy as np
x = np.random.random([1000, 10])
y = np.random.random([1000, 1])
m = cflearn.api.fit_ml(x, y, carefree=True)

Computer Vision 🖼️

import cflearn
data = cflearn.cv.MNISTData(batch_size=16, transform="to_tensor")
m = cflearn.api.resnet18_gray(10).fit(data)

Please refer to Quick Start and Developer Guides for detailed information.

Migration Guide

From v0.1.x to v0.2.x, the design principle of carefree-learn changed in two aspects:

Framework

  • The DataLayer in v0.1.x has changed to the more general DataModule in v0.2.x.
  • The Model in v0.1.x, which was constructed from pipes, has changed to a more general Model.

These changes were made because we want carefree-learn to be compatible with general deep learning tasks (e.g. computer vision tasks).

Data Module

Internally, the Pipeline trains & predicts on a DataModule in v0.2.x, but carefree-learn also provides useful APIs to keep the user experience as close to v0.1.x as possible:

Train

v0.1.x

import cflearn
import numpy as np
x = np.random.random([1000, 10])
y = np.random.random([1000, 1])
m = cflearn.make().fit(x, y)

v0.2.x

import cflearn
import numpy as np
x = np.random.random([1000, 10])
y = np.random.random([1000, 1])
m = cflearn.api.fit_ml(x, y, carefree=True)

Predict

v0.1.x

predictions = m.predict(x)

v0.2.x

predictions = m.predict(cflearn.MLInferenceData(x))

Evaluate

v0.1.x

cflearn.evaluate(x, y, metrics=["mae", "mse"], pipelines=m)

v0.2.x

cflearn.ml.evaluate(cflearn.MLInferenceData(x, y), metrics=["mae", "mse"], pipelines=m)

Model

It's not very straightforward to migrate models from v0.1.x to v0.2.x, so if you require such a migration, feel free to submit an issue and we will analyze the problem case by case!

carefree-learn 0.1.16

09 Apr 01:12

Release Notes

carefree-learn 0.1.16 improved overall performance.

Optimizer

MADGRAD (4466c9f) & Ranger (acdeec4) are now introduced.

Reference: Best-Deep-Learning-Optimizers.

Misc

  • Fixed ddp when np.ndarray is provided (969a6c8).
  • Fixed RNN when bidirectional is True (be974df) (6ef49f7).