Roadmap

Perhaps the best analogy for the scope and extensions of ML is scipy and the scikits, which build on top of numpy: ML likewise provides an abstraction on top of pytorch to support GPU acceleration whenever possible. The tentative roadmap for ML is therefore composed of five packages: core, learn, vision, audio and text.

The core package focuses on general I/O and computing support, covering a variety of media formats, storage efficiency, and distributed and parallel execution across multiple nodes and GPU devices. The learn package aims to support many machine learning and optimization techniques. The vision, audio and text packages provide a consistent interface to state-of-the-art (SOTA) models in the three modalities. The feature set listed below for each package is targeted for version v1.0.

Package Dependencies and Distributions

External dependencies inevitably increase the storage requirement significantly. Whenever one is available, a package manager such as conda should handle package and dependency management explicitly. This also addresses the major concern of deploying to platforms with or without GPU/CUDA support: installing ML through conda with cpuonly saves the space otherwise required by the cudatoolkit-related packages.
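Since the same code may run on a cpuonly install or on a CUDA-enabled one, the device has to be chosen at runtime. The following is a minimal sketch of such a fallback, not part of ML itself: the helper name pick_device is hypothetical, and it assumes only that torch, when installed, exposes torch.cuda.is_available().

```python
import importlib.util


def pick_device(prefer_gpu: bool = True) -> str:
    """Return "cuda" if a CUDA-enabled torch build can see a GPU, else "cpu".

    Hypothetical helper for illustration; it is not an ML API.
    """
    if not prefer_gpu:
        return "cpu"
    # torch may be missing entirely on a minimal deployment image.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    # A cpuonly build reports no CUDA support, so this falls back to "cpu".
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("selected device:", pick_device())
```

Keeping the check in one place means the rest of the code can stay identical across CPU-only and GPU deployments.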

Specific dependencies must be made explicit with custom recipes if necessary, for example:

TensorRT
MXNet

Dependency Conflicts

Known dependency conflicts with pytorch should be avoided whenever possible:

jpeg<=9b with opencv requiring jpeg>=9d from conda-forge

Features

Learning

Online hard example mining
Annotation UI

Training

Management
- Experiment and trial management
- Progress and results visualization
Optimization
- Resource coordination
- Mixed precision
- Quantization

IO

Ease of access to diverse data sources
Efficient preprocessing pipeline
Efficient data storage

Runtime

Profiling
Debugging
Deployment
- Scalability from edge devices to cloud backend

APIs

app.py*: torch for dist.init_process_group
argparse.py
cv.py*: PIL, cv2, torch, torchvision for accimage
extension.py: torch
io.py*: h5py, tables, torch
logging.py
math.py*: round with tensor
profiler.py: line_profiler
random.py*: torch for seeding
shutil.py
statistics.py: pandas
vis.py: visdom
tasks: ignite, torch
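As an illustration of what a module such as random.py might cover, the sketch below seeds the common random number generators in one call. It is an assumption about the intent, not the actual ML implementation: the function name seed_everything is hypothetical, and numpy and torch are treated as optional.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed Python's RNG and, when installed, numpy's and torch's RNGs.

    Hypothetical helper for illustration; it is not an ML API.
    """
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # numpy is optional in this sketch
    try:
        import torch

        torch.manual_seed(seed)  # seeds the default torch generator
    except ImportError:
        pass  # torch is optional in this sketch


if __name__ == "__main__":
    seed_everything(42)
    first = random.random()
    seed_everything(42)
    # Re-seeding reproduces the same draw.
    print(first == random.random())
```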

Extensions

nn

Learn

supervised
self-supervised or unsupervised
reinforcement
meta
few-shot
analytics
- sequence event analytics engine
- gradient descent based nearest neighbor search

Data

data
annotation

Modalities

av: high level interfaces to vision and audio
vision
- Real-time tracking and detection
  - batch detection
  - SiamMask
audio
text
- BERT and variants

Optimizations

compression: model compression
nas: neural architecture search
tuning: hyperparameter tuning

System

cuda
distributed
multiprocessing
requests
sys
utils*: yaml, psutil, torch for grad*

Native

csrc: native CPU/CUDA source
- nms
- RoIAlign
- RoIPool
ops: native operations wrapper
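For reference, the behaviour expected of the nms operation can be sketched in pure Python. This is only a readable baseline for greedy non-maximum suppression under assumed conventions (boxes as (x1, y1, x2, y2), suppression when IoU exceeds the threshold); it is not the native CPU/CUDA code in csrc, and the function names are illustrative.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too strongly.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))  # the second box overlaps the first and is dropped
```

A baseline like this is quadratic in the number of boxes, which is one reason a native implementation is worthwhile; it can still serve as a correctness check for the compiled operation.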
