Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate bytes with Uint8List #219

Open
truongsinh opened this issue Mar 12, 2019 · 9 comments
Open

Generate bytes with Uint8List #219

truongsinh opened this issue Mar 12, 2019 · 9 comments
Labels
feature request perf Related to runtime performance

Comments

@truongsinh
Copy link

truongsinh commented Mar 12, 2019

Ref: https://groups.google.com/a/dartlang.org/forum/#!topic/misc/JpNkRRLI9_w, daegalus/dart-uuid#35

More and more packages are explicitly switching to Uint8List

Right now, with the folling proto

message BluetoothCharacteristic {
  bytes value = 6;
}

The generated dart code (by Activated protoc_plugin 16.0.1.) is

import 'dart:core' show int, bool, double, String, List, Map, override;

import 'package:protobuf/protobuf.dart' as $pb;

class BluetoothCharacteristic extends $pb.GeneratedMessage {
  static final $pb.BuilderInfo _i = new $pb.BuilderInfo('BluetoothCharacteristic')
    ..a<List<int>>(6, 'value', $pb.PbFieldType.OY)
    ..hasRequiredFields = false
  ;

  BluetoothCharacteristic() : super();
  BluetoothCharacteristic.fromBuffer(List<int> i, [$pb.ExtensionRegistry r = $pb.ExtensionRegistry.EMPTY]) : super.fromBuffer(i, r);
  BluetoothCharacteristic.fromJson(String i, [$pb.ExtensionRegistry r = $pb.ExtensionRegistry.EMPTY]) : super.fromJson(i, r);
  BluetoothCharacteristic clone() => new BluetoothCharacteristic()..mergeFromMessage(this);
  BluetoothCharacteristic copyWith(void Function(BluetoothCharacteristic) updates) => super.copyWith((message) => updates(message as BluetoothCharacteristic));
  $pb.BuilderInfo get info_ => _i;
  static BluetoothCharacteristic create() => new BluetoothCharacteristic();
  BluetoothCharacteristic createEmptyInstance() => create();
  static $pb.PbList<BluetoothCharacteristic> createRepeated() => new $pb.PbList<BluetoothCharacteristic>();
  static BluetoothCharacteristic getDefault() => _defaultInstance ??= create()..freeze();
  static BluetoothCharacteristic _defaultInstance;

  List<int> get value => $_getN(0);
  set value(List<int> v) { $_setBytes(0, v); }
  bool hasValue() => $_has(0);
  void clearValue() => clearField(6);
}

We can see that we still have

  List<int> get value => $_getN(0);
  set value(List<int> v) { $_setBytes(0, v); }

Expected generated code:

  Uint8List get value => $_getN(0);
  set value(Uint8List v) { $_setBytes(0, v); }

One concern might be that this is more or less a breaking change in strong mode.

@sigurdm
Copy link
Collaborator

sigurdm commented Mar 13, 2019

Can you be more specific?
GeneratedMessage.writeToBuffer already returns a Uint8List.

@truongsinh
Copy link
Author

@sigmundch I updated the description to have current behavior and desired behavior.

@njskalski
Copy link

Just hit the same wall.

@rajveermalviya
Copy link

@sigurdm Any traction on this?

most of the dart:core is using Uint8List for bytes, instead of List<int>, since Dart v2.5.0.

ref: dart-lang/sdk#36900

@osa1 osa1 self-assigned this Apr 11, 2022
@osa1
Copy link
Member

osa1 commented Jul 15, 2022

Relatedly, we currently do not convert List<int> values to Uint8List when a field is set. For example:

  @$pb.TagNumber(15)
  set optionalBytes($core.List<$core.int> v) {
    $_setBytes(14, v);
  }

which calls

/// For generated code only.
void $_setBytes(int index, List<int> value) => _fieldSet._$set(index, value);

So the value is stored as List<int>, which is wasteful because I suspect most subtypes won't be as compact as Uint8List. Also when serializing, we allocate a Uint8List every time we serialize a bytes field:

case PbFieldType._BYTES_BIT:
_writeBytesNoTag(
value is TypedData ? value : Uint8List.fromList(value));
break;

I think the "ideal" solution would be to take Uint8List values in setters and return Uint8List in getters, and store bytes values as Uint8List. Uint8List is the most precise type for what a "bytes" field is, and it's more efficient than storing each byte as an int.

That's a breaking change though and I'm not sure if we can afford it internally. So a close second could be keeping the API as-is, but storing the values as Uint8List. I think technically this would not be considered a breaking change, though I wouldn't surprise if it breaks some code because someone sets a bytes field an int array or something like that, and downcasts the getter return value.

@osa1 osa1 added the perf Related to runtime performance label Aug 10, 2022
@osa1
Copy link
Member

osa1 commented Aug 11, 2022

Also when serializing, we allocate a Uint8List every time we serialize a bytes field:

I just realized that this is not true, because Uint8List is a subtype of TypedData, so in the code

case PbFieldType._BYTES_BIT:
_writeBytesNoTag(
value is TypedData ? value : Uint8List.fromList(value));
break;

We don't allocate a Uint8List if the field value is already an Uint8List.

Another strange thing that I realized while reading the 4 lines shown above is that when we set a bytes field an List<int> with integers larger than max value of a byte, truncation is done when serializing, not when setting the field. Example:

syntax = "proto3";

message M1 {
  bytes b = 1;
}
M1 m = M1();
m.b = <int>[99999];
print(m.b); // [99999]
print(m.writeToBuffer()); // [10, 1, 159]

The reason why this is strange is because if I send this message to another program I would expect the local value for the message and the received message value to be the same, but if both the sender and the receiver use the bytes field for the same operation they will potentially do different things as the bytes values are different.

Oh, and also, the JSON map and proto3 JSON serializers throw an exception when serializing this field.. So the library is also inconsistent in handling of such values.

@brandsimon
Copy link

I understand that this could break someones code and is not backwards compatible, but I think having an Uint8List as bytes is the correct way to do it. So how about adding a comment to the .proto file or a command-line option for the protobuf-compiler which changes the resulting class? So people can choose which type they want.

@osa1 osa1 removed their assignment Dec 30, 2022
@hagerf
Copy link

hagerf commented Mar 19, 2023

Hi
Is there any development on this? I agree with everyone here that Uint8List is the proper type for bytes. But is it probable that these changes will be made anytime or are we probably stuck with current implementation for the foreseeable future? Thanks

@temeddix
Copy link

This is indeed a serious issue. List<int> is known to be much slower than Uint8List. Any progress?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request perf Related to runtime performance
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants