DJL v0.14.0 release
DJL v0.14.0 updates the PyTorch engine to 1.9.1 and introduces several new features:
Key Features
- Upgrades PyTorch engine to 1.9.1
- Adds support for Neuron SDK 1.16.1
- Adds autoscaling in djl-serving for AWS Inferentia models
- Introduces an OpenCV extension to provide high-performance image processing
- Adds support for older versions of the PyTorch engine; users can now run PyTorch 1.8.1 with the latest DJL (see the version-pinning sketch after this list)
- Adds support for the precxx11 PyTorch native library in auto-detection mode
- Adds AWS Inferentia support in djl-bench
- Adds support for the TorchServe .mar format; users can deploy TorchServe model archives in djl-serving
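The older-version support works through a version override. Below is a minimal sketch, assuming the PYTORCH_VERSION system property (also honored as an environment variable) selects the native library and is set before the engine first initializes:

```java
import ai.djl.engine.Engine;

public class PinPyTorchVersion {

    public static void main(String[] args) {
        // Select the 1.8.1 native library instead of the default 1.9.1.
        // This must happen before the PyTorch engine is loaded; the value
        // can also be exported as the PYTORCH_VERSION environment variable.
        System.setProperty("PYTORCH_VERSION", "1.8.1");

        // Resolving the engine triggers native library detection/download.
        Engine engine = Engine.getEngine("PyTorch");
        System.out.println("PyTorch native version: " + engine.getVersion());
    }
}
```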
Enhancements
- Introduces several new features in djl-serving:
- Adds an autoscaling feature for AWS Inferentia (#31)
- Creates a SageMaker hosting compatible Docker image for AWS Inferentia (#36)
- Adds auto-detection of the number of Neuron cores for AWS Inferentia (#34)
- Adds autoscaling support for SageMaker-style .tar.gz models (#35)
- Adds support for loading TorchServe models (#32)
- Adds support for pip-installed dependencies per model (#37)
- Adds custom environment variable support for the Python engine (#29)
- Adds nested folder support in model archive files (#38)
- Improves model status with model version support (#25)
- Adds a model warm-up feature for the Python engine (#23)
- Adds WorkLoadManager.unregisterModel (#33)
- Adds a testing tool to test Python models locally (#22)
- Adds the ability to set the Python executable path for the Python engine (#21)
- Creates Workflow for ModelServing (#26)
- Adds an OpenCV extension (#1331) (see the image-loading sketch at the end of this section)
- Introduces several new features in djl-bench:
- Adds support for AWS Inferentia (#1329)
- Introduces several new features in Apache MXNet engine:
- Implements LayerNorm for Apache MXNet (#1342)
- Introduces several new features in PyTorch engine:
- Upgrades PyTorch to 1.9.1 (#1297)
- Implements padding for the BERT tokenizer (#1328)
- Makes pytorch-native-auto package optional (#1326)
- Adds support for using different versions of the PyTorch native library (#1323)
- Adds map_location support when loading a model from an InputStream (#1314) (see the loading sketch at the end of this section)
- Makes map_location optional (#1312)
- Introduces several new features in TensorFlow Lite engine:
- Makes tflite-native-auto package optional (#1301)
- Introduces several API improvements:
- Adds support for nested folders in model archives (#1349)
- Improves the Translator output error message (#1348)
- Improves the Predictor API to support prediction on a specified device (#1346)
- Improves BufferedImageFactory.fromNDArray performance (#1339)
- Adds support for downloading .mar files (#1338)
- Adds a debugging toString to Input and Output (#1327)
- Refactors BERT Translator and Tokenizer (#1318)
- Makes the question answering model serving-ready (#1311)
- Refactors minMaxWorkers from ModelInfo to WorkerPool (#30)
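Two of the changes above benefit from short examples. First, the OpenCV extension (#1331): a minimal image-loading sketch, assuming the ai.djl.opencv:opencv artifact is on the classpath so that ImageFactory.getInstance() resolves to the OpenCV-backed factory (without it, DJL falls back to the BufferedImage-based implementation); the image URL is only an example:

```java
import ai.djl.modality.cv.Image;
import ai.djl.modality.cv.ImageFactory;

public class OpenCvImageDemo {

    public static void main(String[] args) throws Exception {
        // With the OpenCV extension on the classpath, the default factory is
        // expected to be OpenCV-backed, so decode/resize run in native code
        // rather than in java.awt.
        ImageFactory factory = ImageFactory.getInstance();
        Image image = factory.fromUrl("https://resources.djl.ai/images/kitten.jpg");
        System.out.println("Loaded " + image.getWidth() + "x" + image.getHeight()
                + " via " + factory.getClass().getSimpleName());
    }
}
```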
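Second, the map_location changes (#1312, #1314): a minimal loading sketch, assuming the PyTorch engine honors a "mapLocation" load option that remaps saved tensors onto the requested device; the model URL is a placeholder:

```java
import ai.djl.Device;
import ai.djl.inference.Predictor;
import ai.djl.modality.Classifications;
import ai.djl.modality.cv.Image;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

public class MapLocationDemo {

    public static void main(String[] args) throws Exception {
        Criteria<Image, Classifications> criteria = Criteria.builder()
                .setTypes(Image.class, Classifications.class)
                .optModelUrls("https://example.com/my-model.zip") // placeholder
                .optEngine("PyTorch")
                .optDevice(Device.gpu())
                // Remap tensors to the requested device at load time, even if
                // the TorchScript file was saved on a different device.
                .optOption("mapLocation", "true")
                .build();

        // A Translator matching Image -> Classifications would normally be
        // configured on the Criteria as well.
        try (ZooModel<Image, Classifications> model = criteria.loadModel();
                Predictor<Image, Classifications> predictor = model.newPredictor()) {
            // predictor.predict(image) now runs on the GPU.
        }
    }
}
```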
Documentation and examples
- Adds a Hugging Face Inferentia serving example (#184)
- Adds AWS SageMaker hosting documentation
- Adds a Python hybrid engine demo (#182)
- Improves the DJL examples project Gradle build script (#1344)
Breaking changes
- PyTorch 1.9.1 no longer supports Amazon Linux 2; AL2 users have to use pytorch-native-cpu-precxx11
- Image.Type is removed, and Image.duplicate() no longer takes Image.Type as input
- Image.getSubimage() is renamed to Image.getSubImage() (see the migration sketch after this list)
- PaddlePaddle model loading may break due to prefix changes.
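A minimal migration sketch for the Image API changes above (the image URL is only an example):

```java
import ai.djl.modality.cv.Image;
import ai.djl.modality.cv.ImageFactory;

public class ImageApiMigration {

    public static void main(String[] args) throws Exception {
        Image img = ImageFactory.getInstance()
                .fromUrl("https://resources.djl.ai/images/kitten.jpg");

        // Before 0.14.0:
        //   Image copy = img.duplicate(Image.Type.TYPE_INT_ARGB);
        //   Image crop = img.getSubimage(0, 0, 64, 64);

        // From 0.14.0 on:
        Image copy = img.duplicate();               // Image.Type argument removed
        Image crop = img.getSubImage(0, 0, 64, 64); // method renamed
        System.out.println(copy.getWidth() + " / " + crop.getWidth());
    }
}
```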
Bug Fixes
- Fixes a bug where the second inference throws an exception (#1351)
- Fixes the SigmoidBinaryCrossEntropyLoss calculation when fromSigmoid is set (#1345)
- Fixes a jar model URL download bug (#1336)
- Fixes a memory issue in Trainer.checkGradients (#1319)
- Fixes an "NDManager is closed" bug (#1308)
- Fixes PyTorch GPU model loading issue (#1302)
- Fixes MXNet EngineException message (#1300)
- Fixes a GPU bug in the Python resnet18 demo model (#24)
- Fixes a Python engine get_as_bytes() bug (#20)
Contributors
This release is thanks to the following contributors:
- Frank Liu (@frankfliu)
- Jake Lee (@stu1130)
- enpasos (@enpasos)
- Qing Lan (@lanking520)
- Zach Kimberg (@zachgk)