This project contains tools for deep neural net co-design for custom hardware accelerators.
In essence, this is an extension of the PyTorch framework that implements hardware-efficient features, including:
- custom low-precision numerical formats and arithmetic,
- fine-grained structured weight sparsity, and
- custom operator approximation logic.
In addition, the project provides a set of optimization tools for co-design using the above features.
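The first two features above can be pictured with a small, self-contained sketch. This is a conceptual illustration only, not dmx-compressor's actual implementation: it simulates low-precision numerics via symmetric fake quantization to a small integer grid, and fine-grained structured sparsity via the common N:M pattern (keep the N largest-magnitude weights out of every group of M).

```python
# Conceptual sketch only -- NOT dmx-compressor's implementation.

def fake_quantize(x, num_bits=4):
    """Round x onto a symmetric signed grid with 2**num_bits levels, scaled to max |x|."""
    qmax = 2 ** (num_bits - 1) - 1              # e.g. 7 for 4-bit
    scale = max(abs(v) for v in x) / qmax or 1.0
    return [round(v / scale) * scale for v in x]

def n_of_m_mask(w, n=2, m=4):
    """Within each group of m weights, keep the n largest in magnitude, zero the rest."""
    out = []
    for i in range(0, len(w), m):
        group = w[i:i + m]
        keep = sorted(range(len(group)), key=lambda j: abs(group[j]), reverse=True)[:n]
        out.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return out

weights = [0.9, -0.1, 0.4, 0.05, -0.7, 0.2, -0.3, 0.6]
sparse = n_of_m_mask(weights, n=2, m=4)     # 2:4 pattern: half the weights become zero
quantized = fake_quantize(sparse, num_bits=4)
```

In practice such transformations are applied inside the model's layers (and during training, so the model can adapt to them); the library exposes them through the configuration mechanism described below.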
Install from PyPI:

```shell
pip install dmx-compressor
```
Given a PyTorch model, e.g. `Net()`, wrap it in a `DmxModel` container:

```python
from dmx.compressor.modeling import DmxModel

model = DmxModel.from_torch(Net())
```
Here `model` is functionally equivalent to `Net()`, and all `torch` functionality remains available, but `model` is equipped with d-Matrix-specific features, making it ready for co-design configuration and/or optimization, at training time or post-training. See the advanced topics for further details.
`model.dmx_config` is a dictionary that contains all, and only, those configurations that make the model's functional behavior differ from that of the original `Net()`. Use the method `model.transform()` to set these configurations through the application of configuration rules; see the advanced topics for how to engineer configuration rules.
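Conceptually, rule application can be sketched as follows. The names, keys, and semantics here are illustrative assumptions, not dmx-compressor's actual API: a configuration is modeled as a plain dictionary, and each rule is a function that maps a configuration to an updated one.

```python
# Hypothetical sketch of applying configuration rules -- illustrative only,
# NOT dmx-compressor's actual implementation.

def transform(config, *rules):
    """Apply each configuration rule in order, returning the resulting config."""
    for rule in rules:
        config = rule(config)
    return config

def baseline_rule(config):
    # mirrors the spirit of a dummy rule set: a no-op that preserves behavior
    return dict(config)

def basic_rule(config):
    # mirrors the spirit of a hardware-targeting rule set
    updated = dict(config)
    updated["weight_format"] = "low_precision"  # hypothetical key and value
    return updated

config = {"weight_format": "float32"}
config = transform(config, baseline_rule, basic_rule)
# config is now {"weight_format": "low_precision"}
```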
There are two predefined special rule sets, `config_rules.BASELINE` and `config_rules.BASIC`; the former is a dummy that leaves the original model's functional behavior unchanged, whereas the latter brings the model to a functional state equivalent to basic-mode execution on d-Matrix hardware, e.g.

```python
from dmx.compressor import config_rules

model = model.transform(
    model.dmx_config,
    *config_rules.BASIC,
)
```
To leverage the popularity of Hugging Face's pipeline API for inference, we extend `transformers.pipeline()` to `dmx.compressor.modeling.hf.pipeline()`, which retains all existing pipeline functionality while enabling model transformation and configuration for deployment on d-Matrix hardware.

```python
from dmx.compressor.modeling.hf import pipeline

pipe = pipeline(
    task="text-generation",
    model="facebook/opt-125m",
    dmx_config="BASIC",  # make the model deployable on the d-Matrix backend
    ...
)
# Deploy pipe the same way as any Hugging Face pipeline.
```
For more detailed information, see the following documents on specific topics; more usage examples can be found here.
- Configurations
- Numerics
- Weight sparsity
- Custom approximation logic