[fix] add data type check to array setter #1975

Merged
merged 4 commits into from
Sep 4, 2022

Conversation

@KexinFeng KexinFeng commented Aug 29, 2022

Description

Fix #1970. The test code is therein.

details

A potential place to convert the data type from the input data type to the target array data type is in

public static void copyBuffer(Buffer src, ByteBuffer target) {

Here it does the job of:

2. Multiple DataTypes map to the same Buffer type, so we have to handle them differently

But it is not ideal to convert the Buffer type inside Java:
https://stackoverflow.com/questions/38745123/bytebufferasshortbuffer-cannot-be-cast-to-java-nio-floatbuffer
Cannot cast 'java.nio.HeapIntBuffer' to 'java.nio.FloatBuffer'

So here we simply throw an exception when the data types don't match.
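
A minimal sketch of why the cast in the linked question fails, and of what a fail-fast check might look like. It uses only `java.nio`; the class `BufferTypeCheck` and the helper `requireFloatBuffer` are hypothetical names for illustration and are not part of DJL.

```java
import java.nio.Buffer;
import java.nio.FloatBuffer;
import java.nio.IntBuffer;

final class BufferTypeCheck {

    /** Hypothetical helper: rejects any buffer that is not a FloatBuffer instead of casting it. */
    static FloatBuffer requireFloatBuffer(Buffer data) {
        if (!(data instanceof FloatBuffer)) {
            throw new IllegalArgumentException(
                    "Expected FloatBuffer but got " + data.getClass().getSimpleName());
        }
        return (FloatBuffer) data;
    }

    /** Returns true if a blind cast of an IntBuffer to FloatBuffer fails at runtime. */
    static boolean blindCastFails() {
        Buffer ints = IntBuffer.wrap(new int[] {1, 2, 3});
        try {
            // Compiles because both are Buffers, but the concrete classes are unrelated.
            FloatBuffer unused = (FloatBuffer) ints;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("blind cast fails: " + blindCastFails());
        try {
            requireFloatBuffer(IntBuffer.wrap(new int[] {1, 2, 3}));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The explicit check turns an obscure `ClassCastException` deep in the call stack into an error raised at the setter, which is the behavior this description argues for.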

@@ -212,6 +212,16 @@ public NDArray get(NDIndex index) {
/** {@inheritDoc} */
@Override
public void set(Buffer data) {
DataType arrayType = getDataType();

Contributor

I don't think we can check it like this:

  1. We should always allow a ByteBuffer to be set for any DataType
  2. Multiple DataTypes map to the same Buffer type, so we have to handle them differently
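
One way to picture these two points is a mapping from buffer class to the set of element types it could hold. This is only a sketch: `ElemType` is a hypothetical stand-in enum, not DJL's `DataType`, and the mapping covers just a few buffer classes.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.FloatBuffer;
import java.nio.IntBuffer;
import java.util.EnumSet;
import java.util.Set;

final class BufferTypeMapping {

    /** Hypothetical stand-in for a tensor element type enum; not DJL's DataType. */
    enum ElemType { FLOAT32, INT32, UINT8, INT8, BOOLEAN }

    /** Element types that a buffer of the given class could plausibly carry. */
    static Set<ElemType> candidates(Buffer buffer) {
        if (buffer instanceof FloatBuffer) {
            return EnumSet.of(ElemType.FLOAT32);
        }
        if (buffer instanceof IntBuffer) {
            return EnumSet.of(ElemType.INT32);
        }
        if (buffer instanceof ByteBuffer) {
            // Raw bytes: the buffer class alone cannot tell the element type apart.
            return EnumSet.allOf(ElemType.class);
        }
        throw new IllegalArgumentException("Unsupported buffer: " + buffer.getClass());
    }

    public static void main(String[] args) {
        System.out.println(candidates(FloatBuffer.allocate(1)));
        System.out.println(candidates(ByteBuffer.allocate(4)));
    }
}
```

Because the ByteBuffer case is ambiguous, an equality test on a single inferred type cannot express "always allow ByteBuffer".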

Contributor Author

I find that in the implementations of set(Buffer data) in the different engines (MXNet, PyTorch, TensorFlow), the buffer is already converted to a ByteBuffer before being passed to the engine. Do you mean we should add this to NDArrayAdapter?

Contributor Author

Does checking for those cases fall outside the goal of automatically converting the input data type into the target array data type?

codecov-commenter commented Aug 30, 2022

Codecov Report

Merging #1975 (b3beab7) into master (bb5073f) will decrease coverage by 2.27%.
The diff coverage is 67.74%.

@@             Coverage Diff              @@
##             master    #1975      +/-   ##
============================================
- Coverage     72.08%   69.81%   -2.28%     
- Complexity     5126     5899     +773     
============================================
  Files           473      584     +111     
  Lines         21970    26159    +4189     
  Branches       2351     2824     +473     
============================================
+ Hits          15838    18262    +2424     
- Misses         4925     6521    +1596     
- Partials       1207     1376     +169     
Impacted Files Coverage Δ
api/src/main/java/ai/djl/modality/cv/Image.java 69.23% <ø> (-4.11%) ⬇️
...rc/main/java/ai/djl/modality/cv/MultiBoxPrior.java 76.00% <ø> (ø)
...rc/main/java/ai/djl/modality/cv/output/Joints.java 71.42% <ø> (ø)
.../main/java/ai/djl/modality/cv/output/Landmark.java 100.00% <ø> (ø)
...main/java/ai/djl/modality/cv/output/Rectangle.java 72.41% <0.00%> (ø)
...i/djl/modality/cv/translator/BigGANTranslator.java 21.42% <0.00%> (-5.24%) ⬇️
...odality/cv/translator/BigGANTranslatorFactory.java 33.33% <0.00%> (+8.33%) ⬆️
.../cv/translator/InstanceSegmentationTranslator.java 0.00% <0.00%> (-86.59%) ⬇️
...nslator/InstanceSegmentationTranslatorFactory.java 7.14% <0.00%> (-11.04%) ⬇️
.../cv/translator/SemanticSegmentationTranslator.java 0.00% <0.00%> (ø)
... and 499 more

@KexinFeng KexinFeng requested a review from frankfliu August 30, 2022 05:00
@frankfliu frankfliu force-pushed the issue_type branch 2 times, most recently from 65c2548 to 8bdf9a1 Compare September 1, 2022 21:29
Change-Id: Iac7155e469cc5c2918c4452eb95b4c9a2ef9cb43
@@ -404,9 +404,18 @@ NDManager getAlternativeManager() {
* @throws IllegalArgumentException if buffer size is invalid
*/
public static void validateBufferSize(Buffer buffer, DataType dataType, int expected) {
boolean isByteBuffer = buffer instanceof ByteBuffer;
DataType type = DataType.fromBuffer(buffer);
if (!isByteBuffer && type != dataType) {

Contributor Author

@KexinFeng KexinFeng Sep 2, 2022

The following mismatch case will escape the check:
type != dataType and the buffer is a ByteBuffer, but dataType is not one of the byte types {DataType.UINT8, DataType.INT8, DataType.BOOLEAN}

Suggested change
- if (!isByteBuffer && type != dataType) {
+ if (type != dataType) {
+ DataType[] types = {DataType.UINT8, DataType.INT8, DataType.BOOLEAN};
+ if (!isByteBuffer || Arrays.stream(types).noneMatch(x -> x == dataType)) {
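
To make the difference between the two conditions concrete, here is a small sketch that evaluates both as plain predicates. `ElemType` is a hypothetical stand-in for `DataType`, and the example assumes that a ByteBuffer is inferred as `INT8`; neither is taken from the DJL source.

```java
import java.util.EnumSet;
import java.util.Set;

final class MismatchChecks {

    /** Hypothetical stand-in for a tensor element type enum; not DJL's DataType. */
    enum ElemType { FLOAT32, INT32, UINT8, INT8, BOOLEAN }

    static final Set<ElemType> BYTE_TYPES =
            EnumSet.of(ElemType.UINT8, ElemType.INT8, ElemType.BOOLEAN);

    /** Lenient rule from the diff: a ByteBuffer input is never rejected. */
    static boolean lenientRejects(boolean isByteBuffer, ElemType type, ElemType dataType) {
        return !isByteBuffer && type != dataType;
    }

    /** Stricter rule from the suggestion: a ByteBuffer passes only for byte-sized targets. */
    static boolean strictRejects(boolean isByteBuffer, ElemType type, ElemType dataType) {
        return type != dataType && (!isByteBuffer || !BYTE_TYPES.contains(dataType));
    }

    public static void main(String[] args) {
        // ByteBuffer (assumed to be inferred as INT8) written into a FLOAT32 array.
        System.out.println(lenientRejects(true, ElemType.INT8, ElemType.FLOAT32));
        System.out.println(strictRejects(true, ElemType.INT8, ElemType.FLOAT32));
    }
}
```

The two rules agree for typed buffers and differ only when a ByteBuffer targets a non-byte element type, which is exactly the case debated in this thread.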

Contributor

ByteBuffer should always be allowed. It is the memory representation of all data types, and every NDArray has a toByteBuffer() method. The JNI only accepts ByteBuffer; if we only accepted a matching Buffer type, we would have to convert the ByteBuffer to a typed Buffer and then copy that Buffer back into a ByteBuffer to pass it to JNI. That is not efficient for the hybrid engine. We are trying to achieve zero copy between the PyTorch and ONNX engines, and that relies on ByteBuffer.
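
As a rough illustration of why ByteBuffer works as a common representation, the sketch below writes floats through a typed view that shares memory with a direct ByteBuffer. It uses only `java.nio`; the class and method names are hypothetical, and it does not model DJL's actual JNI path.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

final class ByteBufferViews {

    /** Writes floats through a typed view; the bytes land in the backing ByteBuffer without a copy. */
    static ByteBuffer floatsAsBytes(float[] values) {
        ByteBuffer bytes =
                ByteBuffer.allocateDirect(values.length * Float.BYTES)
                        .order(ByteOrder.nativeOrder());
        FloatBuffer view = bytes.asFloatBuffer(); // shares memory with bytes
        view.put(values);
        // The view keeps its own position, so bytes is still positioned at 0.
        return bytes;
    }

    public static void main(String[] args) {
        ByteBuffer bytes = floatsAsBytes(new float[] {1.5f, -2.0f});
        System.out.println(bytes.remaining() + " bytes, first float = " + bytes.getFloat(0));
    }
}
```

A typed view over a ByteBuffer is cheap, whereas going from an arbitrary typed Buffer back to a ByteBuffer generally requires copying, which is the extra step the comment above wants to avoid.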

@KexinFeng KexinFeng merged commit 5873328 into deepjavalibrary:master Sep 4, 2022
@KexinFeng KexinFeng deleted the issue_type branch October 31, 2022 16:41
Development

Successfully merging this pull request may close these issues.

set Function of NDArray, like set(int[] data) should compare the datatype of NDArray and datatype of data
3 participants