Updated installation instructions for on-device training package #192

carzh · 2024-06-25T21:36:43Z

Updates the examples to reflect the new installation instructions for the onnxruntime training package. Also adds a requirements.txt for the mobilebert example.

Addresses ONNXRuntime issue #21149

GeorgeS2019 · 2024-06-29T04:45:03Z

@carzh

This notebook, masked_language_modeling/mobilebert_offline.ipynb, needs to be updated to the latest 1.18.1 version.
The examples need clear requirements information on how to run them, especially with CUDA, in both the notebook and the C# code.
https://github.com/microsoft/onnxruntime-training-examples/blob/master/on_device_training/desktop/csharp/masked_language_modeling/mobilebert_offline.ipynb

carzh · 2024-06-29T05:03:47Z

What do you mean by updating the notebook to the latest 1.18.1 version? Are you running into issues when running the notebook with 1.18.1?

The README for the masked language modeling example already includes the updated installation instructions for ONNXRuntime, and this example was written for CPU EP only.

GeorgeS2019 · 2024-06-29T05:09:01Z

Try this with the latest 1.18.* onnxruntime-training:
import onnxruntime.training.onnxblock as onnxblock

GeorgeS2019 · 2024-06-29T05:10:26Z

The whole problem with onnxruntime-training is the lack of specific information on the requirements for CUDA 12.*, and the lack of testing of that configuration from C#.

microsoft/onnxruntime#21212
microsoft/onnxruntime#21197

carzh · 2024-07-01T17:23:11Z

I'll update the notebook & add a requirements.txt file for the on-device training example.

We can add a CUDA C# example to the backlog. I understand your frustration with the lack of documentation, and we do need to improve the ONNXRuntime documentation, but creating good documentation and examples also takes time.

GeorgeS2019 · 2024-07-01T17:35:32Z

@carzh
Thanks for updating the documentation and training code. Onnxruntime training will become increasingly important because it can be combined with generative AI, e.g. Phi3, through AI orchestration using Semantic Kernel.

carzh · 2024-07-01T18:31:19Z

> Try this with the latest 1.18.* onnxruntime-training:
> import onnxruntime.training.onnxblock as onnxblock

Ah, I gave it a try and did not run into any issues with that import line. What error do you run into, and with which version of the ONNXRuntime package? (i.e., are you using onnxruntime-training-cpu?)
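
One way to answer the version question is to list which ONNX Runtime distributions pip has installed and then attempt the import. The following is a minimal sketch, not part of the example repository: the helper name `installed_versions` is hypothetical, and the list of distribution names is an assumption based only on the package names mentioned in this thread.

```python
from importlib import metadata

# Distribution names mentioned in this thread (assumed list, not exhaustive).
CANDIDATES = ("onnxruntime", "onnxruntime-training", "onnxruntime-training-cpu")


def installed_versions(names=CANDIDATES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
    try:
        # The import line under discussion in this thread.
        import onnxruntime.training.onnxblock as onnxblock  # noqa: F401
        print("onnxblock import succeeded")
    except ImportError as exc:
        print(f"onnxblock import failed: {exc}")
```

Pasting the printed output into a comment would show at a glance whether more than one ONNX Runtime distribution is installed side by side.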

GeorgeS2019 · 2024-07-02T20:48:02Z

@carzh

I cannot use Python to upgrade onnxruntime-training from 1.15.1 to 1.18.1.
No Windows version is available.

The link to download the Windows version is hard to find:
microsoft/onnxruntime#21149 (comment)
pip install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT/pypi/simple/ onnxruntime-training-cpu

mobilebert-uncased.ckpt is not created. Only a file named checkpoint is written.

https://github.com/carzh/onnxruntime-training-examples/blob/cebaf5cb5077007163d16720f139c5c847df66e6/on_device_training/desktop/csharp/masked_language_modeling/csharp_console_app/Program.cs#L36C84-L36C96

carzh · 2024-07-02T21:01:14Z

@GeorgeS2019 The link to download the Windows version is available at onnxruntime.ai. All the documentation should point to that installation table.

Try the following:

pip uninstall onnxruntime-training -y
python -m pip install cerberus flatbuffers h5py "numpy>=1.16.6" onnx packaging protobuf sympy "setuptools>=41.4.0"
pip install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT/pypi/simple/ onnxruntime-training-cpu --no-cache-dir

I appended the --no-cache-dir flag to ensure that pip doesn't pick up any locally cached onnxruntime-training Python packages.
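
The same three steps can also be driven from Python. This is a sketch under stated assumptions: the helper names `build_install_commands` and `run_install` are hypothetical, while the index URL, package names, and flags are copied from the commands above. Passing each argument as a list element means version specifiers such as `numpy>=1.16.6` reach pip unchanged, because no shell is involved to interpret the `>` character.

```python
import subprocess
import sys

# Index URL and package names copied from the commands in this comment.
ORT_INDEX = "https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT/pypi/simple/"
DEPENDENCIES = [
    "cerberus", "flatbuffers", "h5py", "numpy>=1.16.6", "onnx",
    "packaging", "protobuf", "sympy", "setuptools>=41.4.0",
]


def build_install_commands(python=sys.executable):
    """Return the three pip invocations as argument lists (no shell quoting needed)."""
    pip = [python, "-m", "pip"]
    return [
        pip + ["uninstall", "onnxruntime-training", "-y"],
        pip + ["install", *DEPENDENCIES],
        pip + ["install", "-i", ORT_INDEX, "onnxruntime-training-cpu", "--no-cache-dir"],
    ]


def run_install():
    """Run the commands in order; raises CalledProcessError if pip exits non-zero."""
    for command in build_install_commands():
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only print the commands here; call run_install() explicitly to execute them.
    for command in build_install_commands():
        print(" ".join(command))
```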

GeorgeS2019 · 2024-07-02T21:03:37Z

@carzh

I managed to download the onnxruntime-training Python package for Windows: 1.18.0.
I could run the notebook, but the generated checkpoint file is simply named checkpoint.

Problem 1:
mobilebert-uncased.ckpt is not created. Only a file named checkpoint is written.
Problem 2:
All attempts to load the created file fail. I am not sure which Onnxruntime and Onnxruntime-Training NuGet packages to use.

string checkpointPath = Path.Combine(parentDir, "training_artifacts", "mobilebert-uncased.ckpt");

https://github.com/carzh/onnxruntime-training-examples/blob/cebaf5cb5077007163d16720f139c5c847df66e6/on_device_training/desktop/csharp/masked_language_modeling/csharp_console_app/Program.cs#L36C84-L36C96
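
To see what the notebook actually wrote before pointing the C# app at it, a small check along these lines may help. It is only a sketch: the helper name `find_checkpoint` is hypothetical, and the directory name `training_artifacts` and the two candidate names are taken from this thread rather than from any documented contract.

```python
from pathlib import Path

# Names seen in this thread: the one Program.cs expects, then the one actually written.
CANDIDATE_NAMES = ("mobilebert-uncased.ckpt", "checkpoint")


def find_checkpoint(artifacts_dir, names=CANDIDATE_NAMES):
    """Return the first existing candidate path under artifacts_dir, or None."""
    root = Path(artifacts_dir)
    for name in names:
        candidate = root / name
        if candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    found = find_checkpoint("training_artifacts")
    if found is None:
        print("no checkpoint found in training_artifacts")
    else:
        kind = "directory" if found.is_dir() else "file"
        print(f"found {kind}: {found}")
```

If only `checkpoint` is reported, the name passed to `Path.Combine` in Program.cs no longer matches what the notebook produced. Whether the C# loader accepts the file once the names agree is not established in this thread.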

GeorgeS2019 · 2024-07-02T21:07:17Z

There is confusion about CUDA 12 support.
Which version of onnxruntime and onnxruntime-training can I use for CUDA 12?

When running the notebook in Python, which version of the Python package do I need for CUDA 12? 1.18.1?

GeorgeS2019 · 2024-07-02T21:14:08Z

It seems that onnxruntime-training on Windows does not support CUDA 12, including the Python package.

carzh · 2024-07-02T22:31:49Z

> There is confusion about CUDA 12 support. Which version of onnxruntime and onnxruntime-training can I use for CUDA 12?

Use the most up-to-date package unless the example specifies otherwise. The latest release includes CUDA 12 support, but CUDA 12 is not supported for all configurations of ORT, as it looks like you've discovered.

For example, on-device training with Python for Linux supports CUDA 12, but on-device training with Python for Windows does not. The easiest way to check if a configuration supports CUDA 12 is with the installation table on onnxruntime.ai.

> When running the notebook in Python, which version of the Python package do I need for CUDA 12? 1.18.1?

The README specifies the Python package for CPU. The C# example was written for the CPU EP, not CUDA.

I'll try reproducing the issues you are running into later today and I'll push some updates to make it clearer that the masked language modeling example is for CPU.
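
To check what an installed package actually offers, the execution providers can be queried at runtime with `onnxruntime.get_available_providers()`. The sketch below assumes nothing about the example itself: `choose_provider` is a hypothetical helper that prefers CUDA when the installed build lists it and otherwise falls back to CPU.

```python
def choose_provider(available):
    """Prefer CUDA when the installed build lists it, otherwise fall back to CPU."""
    if "CUDAExecutionProvider" in available:
        return "CUDAExecutionProvider"
    return "CPUExecutionProvider"


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed")
    else:
        available = ort.get_available_providers()
        print("available providers:", available)
        print("would use:", choose_provider(available))
```

A listed provider indicates that the build includes it. It does not by itself guarantee that the matching CUDA libraries load on the machine.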

GeorgeS2019 · 2024-07-12T16:12:18Z

> I'll try reproducing the issues you are running into later today

Any update?

updated installation instructions

eaacf04

added requirements.txt for the offline step for mobilebert example

cebaf5c
