create tensor of float16 as input using the Java API #7003
Comments
There is no way to do that in the Java API at the moment. It supports the outbound transformation (i.e. a model can produce an fp16 output and it will be converted into a float output on the way out). What's your fp16 input type in Java? Are you storing them in shorts?
The input type in Java is a `FloatBuffer` of float values.
So you'd like to be able to pass in a `FloatBuffer` and have it converted to fp16 on the native side?
Yes. Since the fp16 model expects fp16 input, I'm currently getting a type mismatch error when I pass in float input.
OK. If you don't need to persist fp16 values in Java and are fine with storing floats on the Java side, then it's easier to implement, but it will still require a bunch of changes to the Java side of ORT. I'll put it on the list of things to do.
Hi, any updates on this?
The support is still not available in the ORT Java API, though Java 20 will likely have fp32 <-> fp16 conversion methods which will make it easier to implement (https://download.java.net/java/early_access/jdk20/docs/api/java.base/java/lang/Float.html#floatToFloat16(float)). For the time being you can take an ONNX model in fp16 and add an fp32 -> fp16 cast node to the start of it using the ONNX Python tooling (or edit the protobuf in Java).
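For reference, the Java 20 conversion methods linked above look like this (a minimal sketch, assuming a JDK 20+ runtime):

```java
// Java 20+ only: scalar fp32 <-> fp16 conversions on java.lang.Float.
short half = Float.floatToFloat16(0.5f);  // fp16 bit pattern stored in a short
float back = Float.float16ToFloat(half);  // back to fp32
System.out.println(back);                 // prints 0.5
```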
Hey @Craigacp, thanks for the quick reply. I have a float16 model and would like to add a node to convert the float32 input to float16. Is there any sample code available? My model has multiple named input nodes and I would like to preserve the naming.
On a side note, is it possible to implement a method that just takes a byte buffer as input and lets the user specify what type it is? The conversion would take place on the C/C++ side in the JNI layer.
I don't have sample code, but you can load the protobuf and add cast nodes. Preserving the names will be trickier, as you'll need to rename the existing input layer and that tends to ripple. We could implement something that accepted a byte buffer and a type, but you'd still have to prepare the fp16 values in Java to put into the byte buffer, and it would be significantly easier to mess things up by accidentally specifying the wrong type or getting the endianness wrong.
Ended up running this script in reverse to convert the whole model to float32. Needed that anyway, because most ONNX-to-other-format model converters don't support the Cast operator.
### Description
The Java API currently only supports fp16 output tensors, which it automatically casts to floats on the way out. This PR adds support for creating fp16 and bf16 tensors (from `java.nio.Buffer` objects or as the output of models; creation from Java short arrays is not supported), along with efficient methods for casting `FloatBuffer` into `ShortBuffer` filled with fp16 or bf16 values and vice versa. The fp16 conversions use a trick to pull in the efficient conversion methods added to Java 20, falling back to ports of the MLAS methods otherwise. The Java 20 methods can be special-cased by the C2 JIT compiler to emit the single instruction on x86 and ARM which converts fp32<->fp16, or the vectorized versions thereof, so they should be quite a bit faster than the MLAS ports.

### Motivation and Context
fp16 and bf16 are increasingly popular formats and we've had several requests for this functionality. Fixes #7003.

cc @yuslepukhin @cassiebreviu

Co-authored-by: Scott McKay <[email protected]>
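As a rough sketch of the conversion side of this (illustrative only; the helper names below are hypothetical rather than the API added by the PR, and the PR's built-in methods also have a pre-Java-20 fallback, whereas this assumes Java 20's `Float.floatToFloat16`/`float16ToFloat`):

```java
import java.nio.FloatBuffer;
import java.nio.ShortBuffer;

// Hand-rolled equivalents of the FloatBuffer <-> fp16 ShortBuffer casts the
// PR describes; class and method names here are illustrative.
public final class Fp16BufferCasts {
  // fp32 values -> fp16 bit patterns stored in shorts.
  public static ShortBuffer floatsToFp16(FloatBuffer input) {
    ShortBuffer output = ShortBuffer.allocate(input.remaining());
    while (input.hasRemaining()) {
      output.put(Float.floatToFloat16(input.get())); // Java 20+
    }
    output.rewind();
    return output;
  }

  // fp16 bit patterns stored in shorts -> fp32 values.
  public static FloatBuffer fp16ToFloats(ShortBuffer input) {
    FloatBuffer output = FloatBuffer.allocate(input.remaining());
    while (input.hasRemaining()) {
      output.put(Float.float16ToFloat(input.get())); // Java 20+
    }
    output.rewind();
    return output;
  }
}
```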
Sorry, I noticed that this issue has been closed, but how do I create a tensor of float16 as input? I haven't found any interfaces I can use in onnxruntime_gpu 1.17.3. An exception (java.lang.ClassCastException: class java.nio.HeapByteBuffer cannot be cast to class java.nio.ShortBuffer) occurs when I try to use OnnxTensor.createTensor(this.env, inputBuffer, INPUT_SHAPE, OnnxJavaType.FLOAT16)
There's an example in the tests - https://github.com/microsoft/onnxruntime/blob/main/java/src/test/java/ai/onnxruntime/OnnxTensorTest.java#L298 - which shows creating a direct byte buffer, taking a short buffer view of it and then writing fp16 values into it. Alternatively you can use this method to prepare a suitable `ShortBuffer` from your float data.
Thank you for your reply. I tried to create an OnnxTensor from a ByteBuffer as in https://github.com/microsoft/onnxruntime/blob/main/java/src/test/java/ai/onnxruntime/OnnxTensorTest.java#L298.
You need to pass in a direct ByteBuffer allocated with `ByteBuffer.allocateDirect`, rather than a heap buffer.
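Putting the advice above together, a minimal sketch of building an fp16 input tensor (the shape and values are hypothetical; it assumes the four-argument `OnnxTensor.createTensor(env, buffer, shape, OnnxJavaType.FLOAT16)` overload referenced above accepts the `ShortBuffer` view, and Java 20's `Float.floatToFloat16`):

```java
import ai.onnxruntime.OnnxJavaType;
import ai.onnxruntime.OnnxTensor;
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtException;

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;

public class Fp16TensorExample {
  public static OnnxTensor createFp16Tensor(OrtEnvironment env, float[] pixels, long[] shape)
      throws OrtException {
    // A direct buffer (not a heap buffer) in native byte order avoids the
    // HeapByteBuffer ClassCastException seen above.
    ByteBuffer bytes = ByteBuffer.allocateDirect(pixels.length * 2)
        .order(ByteOrder.nativeOrder());
    ShortBuffer fp16 = bytes.asShortBuffer();
    for (float f : pixels) {
      fp16.put(Float.floatToFloat16(f)); // Java 20+
    }
    fp16.rewind();
    return OnnxTensor.createTensor(env, fp16, shape, OnnxJavaType.FLOAT16);
  }
}
```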
Hi team,
In order to run an fp16 model, do we have a way to create a tensor of float16 as input using the Java API? e.g.,
OnnxTensor.createTensor(env, FloatBuffer.wrap(pixels), shape, ***);
Thanks,