refactoring: allow usage of UBX protocol related stuff only #3
Conversation
Use a common workspace for the two crates to speed up the build; plus it is possible to use the override section in the workspace's Cargo.toml, so the real crates (not the workspace) can be published without any modification to Cargo.toml.
Right now there is a problem with the generated code: for some reason it generates trait implementations with wrong method names.
Now we don't build two different versions of syn (caused by outdated num_derive dependencies).
I removed the strange no_mangle and inline(never). Was this an attempt to emulate `cargo expand` with gdb?
This is needed to make refactoring of ublox_derive much simpler.
I'm not sure whether splitting via crates or via feature flags is better. Part of me thinks since we have the separate crates already for |
I don't plan to go too far here. I would create an MVP to represent my view of how the UBX protocol parser/generator + derive should work together, on the basis of a few packets. Then we can discuss how to split the syntax, how to divide things into crates and modules, and so on. As for splitting into crates versus using feature flags, |
Alright sounds good... looks reasonable so far. |
I reached the point where it is time to discuss the "derive syntax".
cargo install cargo-expand
cd ubx_protocol && cargo expand
I would like to discuss the syntax first, and then move on to the crate-splitting issue. The main idea, compared with the previous derive syntax, is a self-contained description, so that we can generate most of the code automatically. Side note: the idea is that
StructPayload {
    field1: value1,
    field2: value2
}.to_packet();
gives you a packet with checksum and so on; we convert this human-readable form. What do you think? |
I also tried to make the attribute syntax as similar to serde as possible. |
I added some notes and questions.
I would be curious to hear your thoughts on the TestPacket3
code in ublox_derive/test/test.rs and the associated bitfields
test. Unfortunately, the unit tests for that subcomponent seem to be super broken, I don't remember leaving it that way :/
Either way, you can run cargo expand --lib
on the main component to get the expansion for NavPosLLH. It generates two traits, NavPosLLHGetter
and NavPosLLHSetter
, and then several structs - a NavPosLLHRef
struct for reading data, a NavPosLLHMutRef
struct for modifying buffers, and a NavPosLLH
struct for lazy people with lots of memory.
ublox_derive/tests/test.rs
Outdated
GpsFixOk = bit0,
DiffSoln = bit1,
WknSet = bit2,
TowSet = bit3,
I don't see an example offhand, is it possible to specify a field which spans multiple bits? (for example, mapMatching
in NavStatus). In my prototyping I had used hi:lo
syntax, which I stole from Verilog, which uses that for the same thing.
Related question, is it possible to have a bitfield map to an enum type? In the above example, I might want the enum values None, Valid, Used, and DeadReckoning.
is it possible to specify a field which spans multiple bits?
Yes, for example:
HotStart = mask0,
The idea for bitflags is to mimic the code generated by https://docs.rs/bitflags/1.2.1/bitflags/, the most popular crate I know of for this kind of problem. In other words, create strong typing for working with bits to prevent errors.
Related question, is it possible to have a bitfield map to an enum type? In the above example, I might want the enum values None, Valid, Used, and DeadReckoning.
Not sure what you mean; enum and bitflags are divided because an enum has exactly one value, while bitflags can have several values set.
For bitflags you can give an alias to any subset of bits via the mask syntax.
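To make the "mimic bitflags" idea concrete, here is a minimal sketch of the kind of strongly typed code such a derive could generate. The type name `NavStatusFlags` and the constant names are hypothetical, and real `bitflags` output has many more methods:

```rust
// Hypothetical generated newtype over the raw byte, so users never
// touch bits directly.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct NavStatusFlags(u8);

impl NavStatusFlags {
    pub const GPS_FIX_OK: Self = NavStatusFlags(1 << 0); // bit0
    pub const DIFF_SOLN: Self = NavStatusFlags(1 << 1); // bit1

    // True if every bit of `other` is set in `self`.
    pub fn contains(self, other: Self) -> bool {
        self.0 & other.0 == other.0
    }
}

impl std::ops::BitOr for NavStatusFlags {
    type Output = Self;
    fn bitor(self, rhs: Self) -> Self {
        NavStatusFlags(self.0 | rhs.0)
    }
}

fn main() {
    let flags = NavStatusFlags::GPS_FIX_OK | NavStatusFlags::DIFF_SOLN;
    assert!(flags.contains(NavStatusFlags::GPS_FIX_OK));
    assert_eq!(flags.0, 0b11);
}
```

The point of the newtype is that there is no way to set an undefined bit without going through the named constants.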
Ah, I see, HotStart
isn't a field that spans multiple bits per se, it's a specific combination of setting all the bits. But each field is still a one-bit field.
How would you represent the mapMatching
field in the fixStat
bitfield in NavStatus? Would that be:
None = mask0,
Valid = mask64,
Used = mask128,
DR = mask192,
?
How would you represent the mapMatching field in the fixStat bitfield in NavStatus? Would that be:
Yep, general idea is something like this:
Valid = mask64,
Used = mask128,
DR = mask192,
Though we need some proper syntax to represent zero bits: None is not mask0, it is actually 0 in bits 7 and 6.
Maybe something like this:
None = 0_bit7 | 0_bit6
Having to specify the decimal number 192 for a bitfield type seems error-prone to me (what if we typo and type 191?)
Having the ability to directly retrieve those two bits and treat them as an enum is still type-safe.
Having to specify the decimal number 192 for a bitfield type seems error-prone to me (what if we typo and type 191?)
Having the ability to directly retrieve those two bits and treat them as an enum is still type-safe.
When I designed this syntax I had in mind a specification like:
0x0000 Hotstart
0x0001 Warmstart
0xFFFF Coldstart
so the user can just copy/paste:
Hotstart = Mask0x0000
Warmstart = Mask0x0001
Coldstart = Mask0xFFFF
and that's all.
Obviously we can also support syntax like:
X = bit1 | bit0
where the specification uses the notion of bits, not an exact value.
To summarize:
X = mask IntegerConstant
X = 0_bitN | ...
X = bitN | ..
Hm, yes, there are a few cases where we have a known "preset" for a bitfield. How does this interact with the rest of the flags? Is Warmstart | Clkd
acceptable?
I think Hotstart/Warmstart/Coldstart might belong in a higher layer than raw protocol parsing. We could provide some convenience methods to make some different types of packet.
Is Warmstart | Clkd acceptable?
If we follow bitflags interface then yes.
I think Hotstart/Warmstart/Coldstart might belong in a higher layer than raw protocol parsing.
What use-case do you have in mind? For me this is just a convenient pre-defined bitset, nothing more.
So I can pass:
nav_bbr_mask: NavBbrMask::Warmstart
and if I want to use a pre-defined bitset with one more bit, why not?
nav_bbr_mask: NavBbrMask::Warmstart | NavBbrMask::Clkd
Doing this is clearly user error: nav_bbr_mask: NavBbrMask::ColdStart | NavBbrMask::Clkd (or at least, sloppy code on the user's part), but if we allow NavBbrMask::Warmstart | NavBbrMask::Clkd then we have to allow both.
ublox_derive/tests/test.rs
Outdated
NoFix = 0,
DeadReckoningOnly = 1,
Fix2D = 2,
Fix3D = 3,
GpsDeadReckoningCombine = 4,
TimeOnlyFix = 5,
What advantage do you get from specifying this inline versus referencing enums specified elsewhere?
Some enums are used in multiple places; for example, this gpsFix field of NavStatus has the same values as gpsFix in NavSol, and ostensibly the same values as the fixType field in NavPvt (the latter seems to be more GNSS-oriented vs. the GPS-specific type).
(here and elsewhere, I'm looking at the u-blox 7 protocol specification, that's what I have handy)
What advantage do you get from specifying this inline versus referencing enums specified elsewhere?
The whole idea is to have a "safe" interface: the user will not work with bits directly, only via a restricted interface. For this to work, ublox_derive has to know all members of the enum; plus, it can then generate a proper enum:
#[repr(TYPE_FROM_PACKET)]
enum Foo {
// user specified values
Unknown(TYPE_FROM_PACKET)
}
Plus, without knowledge of the enum variants it is impossible to generate a TYPE_FROM_PACKET -> Foo converter like:
let x: TYPE_FROM_PACKET;
match x {
    0 => Foo::Val1,
}
It is possible to convert Foo -> TYPE_FROM_PACKET via as TYPE_FROM_PACKET, but not the other way around.
I don't take "shared" enums into consideration; for them I suppose it is possible to use:
#[derive(UbxDefineSubTypes)]
enum Foo {
}
to pass information about this enum into ublox_derive.
But the "inline" syntax is the nicer way, because we cannot edit already existing types via a proc_macro compiler plugin, so the user would have to add things like the Unknown variant by hand.
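A minimal sketch of the converter such generated code could contain, using a hypothetical GpsFix enum with the Unknown catch-all variant discussed here (variant names and values are illustrative, not the generator's actual output):

```rust
// Hypothetical generated enum: user-specified variants plus a
// catch-all Unknown variant carrying the raw wire value.
#[derive(Debug, PartialEq)]
enum GpsFix {
    NoFix,
    Fix2D,
    Fix3D,
    Unknown(u8),
}

impl From<u8> for GpsFix {
    fn from(raw: u8) -> Self {
        match raw {
            0 => GpsFix::NoFix,
            2 => GpsFix::Fix2D,
            3 => GpsFix::Fix3D,
            // Any value the generator doesn't know about is preserved.
            other => GpsFix::Unknown(other),
        }
    }
}

fn main() {
    assert_eq!(GpsFix::from(3), GpsFix::Fix3D);
    assert_eq!(GpsFix::from(0x13), GpsFix::Unknown(0x13));
}
```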
Well, the "user" in this case is users of ublox_derive, which should really just be contributors to this project. End users shouldn't need to define their own enums. So I'm not too hurt if we have to hand-specify our own Unknown variant for enums where it makes sense.
Deriving a trait on the enum for safety should work just fine; for example, the FromPrimitive trait I believe gives us a safe way to do it (although instead of Unknown, it throws an error, which I think might be preferable - if the device gives us a packet we can't fully understand, should we parse it successfully?)
I gave it another thought; I suppose it is almost impossible for a proc_macro script.
#[derive(UbxDefineSubTypes)]
enum Foo {
}
#[derive(UbxDefineSubTypes)]
struct Pack {
}
ublox_derive gets Pack and Foo via separate function calls, and the order is also unspecified:
#[proc_macro_derive(UbxDefineSubTypes, attributes(ubx))]
pub fn ubx_packet_subtypes(input: TokenStream) -> TokenStream {
In other words, ubx_packet_subtypes can be invoked in parallel, in separate threads, for Foo and Pack, and while processing Pack we cannot get any information about what Foo is.
But it would be possible via build.rs; I suppose using a build script even allows us to do better, like combining checks of all defined packets automatically, so we can generate:
match (class, id) {
    (NavStatus::class, NavStatus::id) => ...,
    (NavPosLLH::class, NavPosLLH::id) => ...,
}
without human involvement.
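A sketch of what that generated dispatcher could look like. The class/id constants below are the UBX values for NAV-STATUS (0x01/0x03) and NAV-POSLLH (0x01/0x02); the packet_name helper and the associated-const layout are purely illustrative:

```rust
// Hypothetical generated code: each packet type carries its class/id
// as associated consts, and a build script emits the match over them.
struct NavPosLLH;
struct NavStatus;

impl NavPosLLH {
    const CLASS: u8 = 0x01;
    const ID: u8 = 0x02;
}

impl NavStatus {
    const CLASS: u8 = 0x01;
    const ID: u8 = 0x03;
}

// Constants are usable directly as match patterns.
fn packet_name(class: u8, id: u8) -> Option<&'static str> {
    match (class, id) {
        (NavStatus::CLASS, NavStatus::ID) => Some("NAV-STATUS"),
        (NavPosLLH::CLASS, NavPosLLH::ID) => Some("NAV-POSLLH"),
        _ => None,
    }
}

fn main() {
    assert_eq!(packet_name(0x01, 0x02), Some("NAV-POSLLH"));
    assert_eq!(packet_name(0x05, 0x01), None);
}
```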
Deriving a trait on the enum to make safety should work just fine, for example the
FromPrimitive trait I believe gives us a safe way to do it
Actually, I am not fond of tiny crates for things like 2+2.
I have a reasonably small project workspace of 40 private packages, and its dependency tree from crates.io is something like ~350 crates. And I have a bunch of duplicates which I cannot fix, like three num_traits: 0.1.43, 0.2.11...
Why use a third-party crate for something as simple as:
match n {
1 => Some(MyEnum::V1),
2 => Some(MyEnum::V2),
_ => None,
}
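For completeness, the hand-written alternative really is that small; a runnable sketch around the match above (MyEnum variants from the snippet, the function name is arbitrary):

```rust
#[derive(Debug, PartialEq)]
enum MyEnum {
    V1,
    V2,
}

// Hand-written replacement for a num_traits::FromPrimitive impl:
// a total function from the raw value to Option<MyEnum>.
fn my_enum_from_u8(n: u8) -> Option<MyEnum> {
    match n {
        1 => Some(MyEnum::V1),
        2 => Some(MyEnum::V2),
        _ => None,
    }
}

fn main() {
    assert_eq!(my_enum_from_u8(1), Some(MyEnum::V1));
    assert_eq!(my_enum_from_u8(3), None);
}
```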
if the device gives us a packet we can't fully understand, should we parse it successfully?
I suppose we should parse it successfully in the case where the unknown bits are "reserved" ones.
The purpose of "reserved" is exactly this: allowing the protocol to change without breaking software.
Why oppose that? I suppose the developer who cares about protocol changes also doesn't expect that behavior.
If the user cares about the protocol version, he/she can request the version via the UBX protocol and check it.
Reporting an error for 10 types of packets because they contain bits in a reserved area is not a nice thing, I suppose.
And I have bunch of duplicates which I can not fix
Coming from a C/C++/Python world, I can see how this is unfortunate. But how bad is it here in Rust? Unless you do cargo clean a lot, you would never even know.
I suppose we should parse it successfully in case of unknown bits are "reserved" one
There are two types of "reserved" we have to care about:
- Reserved fields being populated. For example, the reserved1 field of NavPvt, or bits 7 through 3 of the valid bitfield in NavPvt.
- Receiving new values for existing enums, e.g. if we got 0x13 for the fixType field of NavPvt.
In the first case, I agree, we should ignore set bits and pass them on. In the second case, though, I'm less convinced - if a user is relying on fixType, it complicates their code to have to handle the Unknown case. And what can they do when they get that? At best, they can error anyway. I don't want user code to look like this:
match fixType {
NoFix => { fail() }
DeadReckoning => { fail() }
Unknown(0x13) => {
// ublox doesn't know about this packet @TODO: Move upstream
do_something()
}
Unknown(_) => {
fail();
}
}
Or whatever. If a user needs to get value 0x13 out of fixType, I want them to post an issue on this repo saying "I need to get 0x13 from this packet." (They can, of course, still do the modification locally so they don't have to wait for a fix.) That way their code stays cleaner, and we get notified (and ideally get a PR!) whenever u-blox changes up the protocol.
In other words, ubx_packet_subtypes can be invoked in parallel
As long as the enum and the struct are separated by a trait the enum implements (such as FromPrimitive), is this a problem?
It seems like it'd be better to centralize all of the code into X
But there is NO centralization into X: in general, only the last version of X receives a fix.
If A depends on X1, B depends on X2, and C depends on X3, only C receives the proper fix, which I can get via cargo update; I need to manually patch B and A via override in the proper cargo section.
So in an ideal world it may work, but in reality every crate in the dependency tree is a problem, because there is always a chance that you will have to become its maintainer.
Yeah, for dates it gets complicated quickly. So I guess we can take it on a case-by-case basis, which is fine - some enums get Unknown fields (such as fixType), some don't (such as NavDgps::status).
But at least you can patch B and A, and you know that they have the bug, and you know how to fix the bug. If A, B, and C each have separate implementations of X, and C's implementation has a bug, how would you ever find the bug in A or B?
From the point of view of the owners of A, B, and C, reimplementing X may be a lot of work - not only in writing the code, but in finding all the roadblocks that X has already solved for them.
For complicated things, the benefit grows (such as HTTP servers) - for simple things, the risk is low (what sort of bug are you worried about in num_traits
that will affect your end-user code?)
But at least you can patch B and A, and you know that they have the bug
I don't know that they have the bug; I only know that X3 had a bug and it was fixed.
I have to dig into X1 and X2 to find out. In that case, if A and B provided functionality similar to X and didn't use X as a dependency, I could also reread their code.
And after each major release the code may be completely rewritten. So I don't see a big difference: whether A and B "embed" the functionality of X, or use an outdated X version, in both cases I should look at their code.
For complicated things, the benefit grows
Don't get me wrong: I am only against tiny crates like num_traits that don't contain any complex functionality; for that kind of crate there are more minuses than pluses.
For complex or "big" enough functionality a separate crate is obviously better.
what sort of bug are you worried about in num_traits that will affect your end-user code?
It provides some functions, and I am not sure that all corner cases for all possible number types are covered. And it is a "trait" package. That means that:
//A crate
impl num_traits::XYZ for A {}
//B crate
impl num_traits::XYZ for B {}
//my crate
fn some_func<T: num_traits::XYZ>(_: T) {}
doesn't work, because num_traits::XYZ from different versions is not the same trait (unless, of course, the "semver trick" was used).
I can also reread [A and B's] code
This does not sound easier than checking to see if they depend on some particular version of a crate. Checking for old crate versions can even be done automatically, without having to look at their code.
doesn't work, because of num_traits::XYZ for different versions are not the same
Sure, and this is inconvenient, but if B had implemented their own B::XYZ trait, you'd be no better off.
ublox_derive/src/output.rs
Outdated
}
impl #payload_struct {
    #[inline]
    fn to_packet(self) -> [u8; #packet_size] {
For the fixed-size case, should we, instead of returning a value, take a &mut [u8; #packet_size], so that a user can populate a packet into a buffer that they already have?
Continuing on this, a user may not want to instantiate an extra struct if they already have a transmit buffer.
so that a user can populate a packet into a buffer that they already have?
This would not be a nice interface, I suppose.
The user's buffer may not be exactly the same size, in which case it would be impossible to get &mut [u8; #packet_size] without unsafe: &mut [u8] cannot be converted to &mut [u8; #packet_size] without unsafe.
Also, from the compiler-optimization point of view this doesn't help a lot:
#[inline]
fn foo(s: S) -> [u8; X] { ... }
let mut my_buf: [u8; X];
my_buf = foo(S {});
After inlining foo, s: S should be copied exactly into my_buf without the use of an intermediate buffer - something like "copy elision" from C++.
Actually, the idea for the sender side is that the compiler eliminates almost all of this, so it would be like this:
struct PackPayload {...}
impl PackPayload {
    pub fn to_packet(self) -> [u8; X] { ... }
}
let buf = PackPayload { x1: f1, x2: f2 }.to_packet();
->
let mut buf: [u8; X];
buf[..] = init_header;
buf[6] = to_bits(f1);
buf[end] = calc_crc();
So in the end it would be code without copying any data around.
Continuing on this, a user may not want to instantiate an extra struct, if they already have a transmit buffer.
I didn't consider this use-case, because I think the user can only send "configure" packets.
And I don't see much sense in this use-case of changing the configuration many times, like this:
cfg1.pref1 = 1;
cfg1.pref1 = 2;
cfg1.pref1 = 3;
I suppose the configuration will be set only once after the start of the program, and that's all.
For that use-case I suppose there is a need for your Setter - but what is the use-case for it? What packet would the user have to send in this way?
In this case it would be impossible to get &mut [u8; #packet_size] without unsafe
I believe try_into works for this:
use std::convert::TryInto;
pub fn bar(data: &mut [u8; 12]) {
data[0] = 5;
}
pub fn foo(data: &mut [u8]) {
bar(data.try_into().unwrap());
}
Of course, I unwrap there, but you should return the Result on error.
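A variant that propagates the size mismatch instead of panicking might look like this (PACKET_SIZE and write_packet are placeholder names, not project code):

```rust
use std::convert::TryInto;

// Hypothetical fixed packet size, for illustration only.
const PACKET_SIZE: usize = 12;

fn write_packet(buf: &mut [u8; PACKET_SIZE]) {
    buf[0] = 5; // placeholder for real serialization
}

// Returns an error instead of panicking when the caller's slice
// has the wrong length.
fn write_into(data: &mut [u8]) -> Result<(), std::array::TryFromSliceError> {
    let fixed: &mut [u8; PACKET_SIZE] = data.try_into()?;
    write_packet(fixed);
    Ok(())
}

fn main() {
    let mut ok_buf = [0u8; PACKET_SIZE];
    assert!(write_into(&mut ok_buf).is_ok());
    assert_eq!(ok_buf[0], 5);

    let mut short_buf = [0u8; 4];
    assert!(write_into(&mut short_buf).is_err());
}
```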
For this use-case I suppose there is need for your Setter, but what use-case for this,
what packet user have to send in this way?
I had imagined a scenario where e.g. you have a DMA buffer for your UART, which you've already allocated for previous transfers. For example, some chips in NXP's LPC series allow DMA to the UART (not any that I'm using, but I would if I could).
Even if you haven't already allocated the DMA buffer, you still need to pin it in memory somewhere, so that you can pop the stack before the DMA completes.
I agree that the usecase of "I have a particular config packet which I need to send multiple times with slight variations" is unusual.
It may be that with the compiler optimizations you can still call to_packet
into a local variable, then copy the local variable into the DMA buffer and have it still be zero copy. However, the code doesn't "look" zerocopy, so it takes more brainpower for me to think "oh, this code is actually pretty fast."
I suppose in this case it is better to accept std::io::Write as an argument.
It is already implemented for Vec<u8>, and it is a nicer way to do things; for example, the user can reallocate and pin the DMA buffer if there is no space.
And common users can just pass a Vec<u8> without having to ask for a suitable size, allocate it, and then call foo. Also, I suppose std::io::Write is implemented for &mut [u8].
Secretly, I want to permit no_std in the future, so std::io::Write might be a bit heavyweight. Also it might not still be zerocopy? I don't know.
Secretly, I want to permit no_std in the future, so std::io::Write might be a bit heavyweight. Also it might not still be zerocopy? I don't know.
We actually need only one function, so I suppose for the no_std case it is possible to define it like this (with the error as a struct, since an empty enum has no values to return):
#[derive(Debug)]
struct NotEnoughMem;
trait MyWrite {
    fn write(&mut self, buf: &[u8]) -> Result<usize, NotEnoughMem>;
}
impl MyWrite for Vec<u8> {
    fn write(&mut self, buf: &[u8]) -> Result<usize, NotEnoughMem> {
        self.extend_from_slice(buf);
        Ok(buf.len())
    }
}
impl<'a> MyWrite for &'a mut [u8] {
    fn write(&mut self, buf: &[u8]) -> Result<usize, NotEnoughMem> {
        if self.len() < buf.len() {
            return Err(NotEnoughMem);
        }
        // Write into the front of the slice, then shrink it,
        // mirroring std's io::Write impl for &mut [u8].
        let (head, tail) = core::mem::take(self).split_at_mut(buf.len());
        head.copy_from_slice(buf);
        *self = tail;
        Ok(buf.len())
    }
}
ublox_derive/src/output.rs
Outdated
}
impl #payload_struct {
    #[inline]
    fn to_packet(self, out: &mut Vec<u8>) {
As above, we should be able to take &mut [u8] here.
As above, we should be able to take &mut [u8] here.
I should note that this is the interface for dynamically-sized packets.
Changing it to &mut [u8] makes it squishy: what if there is not enough space in the buffer? In that case we should return an error, and the user should allocate more space; plus this is Rust, so he/she has to zero-initialize the newly allocated bytes or use unsafe.
This makes the interface squishy, while usage of Vec<u8> makes it super simple: we allocate as much space as we need without "the second call", and write data right after allocation without extra zero assignments.
Maybe there is no need to send variable-sized packets at all?
The user may not be able to allocate more memory, or allocating more memory may be more complicated than expanding a vector. What if the packet had a get_formatted_size method that returned the size?
I think some of the AID packets are variable-sized to the chip. Otherwise I can't think of anything off the top of my head.
What if the packet had a get_formatted_size
I suppose std::io::Write would be better, because get_formatted_size can either be efficient with a bad API:
impl FixedSizePack {
const fn get_formatted_size() -> usize;
}
impl VarSizePack1 {
const fn get_formatted_size(variable_field1_size: usize) -> usize;
}
impl VarSizePack2 {
const fn get_formatted_size(variable_field2_size: usize, variable_field3_size: usize) -> usize;
}
Or not efficient with a good API:
trait GetPacketSizeInsideWire {
fn get_formatted_size(&self) -> usize;
}
In the first case the user can allocate space on the stack, without VLA (variable-length arrays) or heap; in the second case the API is consistent, but the user has to use "heavy" methods of memory allocation.
I agree, either the user will have to "just allocate enough" on the stack, or will have to use heap options. If we pass a Vec then we're forcing the user to use a Vec (I'd have to think more about std::io::Write, but as mentioned above, I don't think it's no_std).
I saw them before starting work on this. And as I see it, there are several problems with it:
I suppose it is not fun to dig that I need multiply to
Side note: |
I agree, I like the idea of handlers you mentioned elsewhere - I think simply adding some handlers for this conversion would solve this pretty well.
What if the As mentioned elsewhere, I think |
What if the Setter trait, instead of having a set for each field, had a single set method that took a parameter for each field?
There are no named arguments in Rust, so:
fn foo(a: u8, b: u8)
would be called as foo(a, b),
and you can mix up a and b in the call if they have the same type,
while with struct initialization you describe exactly what you want:
Foo {
    a: value1,
    b: value2,
}
(not sure how many neighbouring fields with the same type we have, though).
The other reason is the optimizer: when you pass a struct by value, there is a high chance of the callee reusing the caller's stack frame, plus inlining; when passing many parameters via a trait call, I am not sure whether that gets optimized.
Plus, clippy will not be fond of methods with more than N arguments, and I suppose users are also not fond of using such an API when there are many arguments.
Alright. My main worry in this regard is that, although some tests with godbolt do seem to suggest that the compiler inlines everything very well, I can't convince myself that it will always inline in every situation (and in the cases it does inline, it's not clear to the user that it is zero copy). I'll have to play with the compiler some more to see if I can find any corner cases. |
@lkolbly I suppose our use case is a perfect fit for the optimizer:
So I think if you find a "corner case", it would be a bug or missed optimization in the optimizer. |
Well, I'm imagining some situation where somebody passes a constructed packet to some other function to be sent:
It's less clear to me how this could be optimized to zerocopy, assuming |
But in which situation is this possible? fn full_reset(dev: &mut Device) -> ... {
    CfgRst {
    }.to_packet()
    ..
} When you want to send a saved almanac or ephemeris or something like this, you prefer to pass that data, not the packet itself. So why would a user want to pass the packet itself, rather than its binary representation (already packed) or the key parameters for packet construction? |
If sending the packet to the device requires complication (like configuring DMA streams or whatever), you may want a generic function to handle sending packets:
|
But this require |
That's a fair point. Should we generate such a trait though, for this usage? It seems like a reasonable use case to me. |
Even so, though, the API still doesn't look like it's zerocopy, to the user. Maybe it's conventional in Rust to assume that more things get inlined and I'm just old and wrong, but I think it would be nice for the API (at least at this level) to look like it zerocopies. |
Let's summarize use-cases:
CfgXPayload1 {
}.to_packet(dma_buf_allocator);
transfer_data(dma_buf_allocator.buf);
wait_ack();
repeat for CfgX2, CfgX3
{
let buf_with_known_i_dont_want_to_care = CfgX1 {}.to_packet();
spi_transfer(&buf_with_known_i_dont_want_to_care)?;
wait_ack()?;
}
repeat for CfgX2, CfgX3 On the one hand, there is not a lot of gain in replacing 3 lines with one function, |
I'm fine with offering both methods as options. |
But isn't all of Rust's stdlib, and Rust itself, constructed in a similar way? it.skip(1).map(|x| x + 1).take(5).any(|x| x == Y) It is safe because of the usage of iterators instead of indexes, and fast because of the compiler. We actually do exactly the same. Safe:
you cannot miss a parameter or pass a wrong one. So isn't the idea of this API similar to Rust itself? |
But the implementation would be a tricky one. It can duplicate code (so we increase compile time).
Maybe change the naming, to indicate that this structure is just for passing parameters. |
I'd have to think about it some more when I get home, but would it work to generate a NavPosLLHBuilder struct which can be converted into the NavPosLLH struct which is generated now? |
And then |
So I think I like your argument with the iterators. But, playing around in godbolt, is it possible to make a trait for a function which returns Notably, |
Rough idea: trait UbxEncodable {
type Output;
fn encode(self) -> Self::Output;
}
struct Pack;
impl UbxEncodable for Pack {
type Output = [u8; 4];
fn encode(self) -> Self::Output {
unimplemented!();
}
} |
Ah, okay, I was trying to make the trait variable be the number of bytes returned, which I couldn't get to work. You'd have to add some trait bounds to
won't compile out of the box. I think for our use cases, taking a |
It can be fixed in such way: trait UbxEncodable {
type Output: Sized + AsRef<[u8]>;
fn encode(self) -> Self::Output;
}
struct Pack;
impl UbxEncodable for Pack {
type Output = [u8; 4];
fn encode(self) -> Self::Output {
unimplemented!();
}
}
fn foo<T: UbxEncodable>(data: &mut Vec<u8>, packet: T) {
let encoded = packet.encode();
data.extend_from_slice(encoded.as_ref());
} |
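Restating the trait so the snippet is self-contained, a caller can then use such a helper with static dispatch; `AckAck` and `append_packet` are illustrative names, not the crate's actual API:

```rust
trait UbxEncodable {
    type Output: AsRef<[u8]>;
    fn encode(self) -> Self::Output;
}

// Hypothetical two-byte packet body, just to exercise the trait.
struct AckAck {
    class: u8,
    id: u8,
}

impl UbxEncodable for AckAck {
    type Output = [u8; 2];
    fn encode(self) -> Self::Output {
        [self.class, self.id]
    }
}

// Monomorphized per packet type: no trait object, no heap allocation in encode().
fn append_packet<T: UbxEncodable>(data: &mut Vec<u8>, packet: T) {
    data.extend_from_slice(packet.encode().as_ref());
}
```

Because the `Output` array size is an associated type, each packet type returns its own fixed-size buffer while sharing one generic appending function.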
The last commit should address almost all issues:
Plus it contains a real implementation of all methods except

The only part that is partly implemented is bit flags vs predefined masks.
define not all bits, but the Coldstart mask is

Also, a relatively big change is the switch from "derive" to a build script. This is because I want to:
Switching to a build script also gives the ability to use the log crate and to write unit tests. |
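A minimal sketch of what the build-script route could look like, assuming the generator is factored into an ordinary function (so it can be unit-tested and can use the log crate); all names here are illustrative:

```rust
// The generator is a plain function: testable, debuggable, free to log.
fn generate_packets_module() -> String {
    // A real generator would parse a packet description here.
    String::from("pub const UBX_SYNC: [u8; 2] = [0xB5, 0x62];\n")
}

// build.rs would then be roughly:
//
//     fn main() {
//         let out = std::path::PathBuf::from(std::env::var("OUT_DIR").unwrap());
//         std::fs::write(out.join("packets.rs"), generate_packets_module()).unwrap();
//         println!("cargo:rerun-if-changed=build.rs");
//     }
//
// and src/lib.rs pulls the generated items in with:
//
//     include!(concat!(env!("OUT_DIR"), "/packets.rs"));
```

Unit tests can call `generate_packets_module()` directly and inspect its output without involving cargo's build machinery.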
I would prefer to avoid this if possible, build scripts are inelegant and subject to less clear rules to the user (of the build script).
Assuming we wanted to do this (it doesn't sound too hard to do by hand - an element in an enum, and a branch here or there), could we just generate the struct matcher using a build script, and use derive macros for everything else?
This lets us write tests that the build script outputs exactly some particular sequence of tokens, but is that valuable to test? Suppose we write the script so that it generates some code that looks like: pub fn dgps_status(&self) -> DGPSCorrectionStatus {
let val = self.0[5usize];
<DGPSCorrectionStatus>::from_unchecked(val)
} but then in a few years we decide we want to generate the code like this: fn raw_dgps_status(&self) -> u8 { self.0[5usize] }
pub fn dgps_status(&self) -> DGPSCorrectionStatus {
let val = self.raw_dgps_status();
<DGPSCorrectionStatus>::from_unchecked(val)
} Will the tests fail, and would we want the tests to fail, in this situation? I think we actually do want tests that look like: let data = [...];
let s = StatusRef(data);
assert!(s.dgps_status() == DGPSCorrectionStatus::Foobar);

which directly tests the behaviour that we actually want.
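Such a behaviour-level test could look like the following self-contained sketch; `StatusRef`, `dgps_status`, the byte offset, and the enum variants are hand-written stand-ins for the generated code:

```rust
#[derive(Debug, PartialEq)]
enum DGPSCorrectionStatus {
    None,
    PRPlusPRR,
}

// Zero-copy view over a raw packet buffer, mimicking a generated *Ref type.
struct StatusRef<'a>(&'a [u8]);

impl<'a> StatusRef<'a> {
    fn dgps_status(&self) -> DGPSCorrectionStatus {
        // Offset and value mapping are illustrative, not the real UBX layout.
        match self.0[5] {
            0 => DGPSCorrectionStatus::None,
            _ => DGPSCorrectionStatus::PRPlusPRR,
        }
    }
}
```

A test then fills a buffer and asserts on the decoded value, so it stays valid across internal refactorings of the code generator.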
I'm not sure I follow what you mean here. Also, where is |
I'm not sure that I agree. For example, a "derive crate" has to be added as a dependency, while build script dependencies are part of |
"By hand" is error prone, if someday we will support all packets of ubx protocol it would be huge amount of cases, so probability of error in case usage of "by hand" will be big enough.
It is possible, but why? If we have already decided to use a build script, why use something else? That makes the code more complex, because there are two partly overlapping ways to process code. Why not use just one?
I think so. For example, the compiler's test suite doesn't just compile and run small cases; it also generates assembly and matches against it, like this one. For example, we can validate that there is no index-out-of-bounds error in our code. Rust checks this at runtime, but I suppose you agree that it is better to detect it at compile time. In other words, we can check different invariants at the code generation stage compared with the code running stage.
But isn't that how all unit tests work? If you refactor code, in many cases (not all, of course) you also have to change the unit tests; this is one of the disadvantages of unit tests.
But actually we don't need this
in fact, by the protocol definition (and as we discussed earlier), there are reserved bits in the field described by this type, but how can we represent them:
is invalid Rust syntax (a compile-time error),
But we lose many things:
enum GpsFix {
|
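One hedged way around the reserved-bits problem is to keep the raw byte and expose only the defined values, rather than forcing every bit pattern into an enum. The variant names and values below are illustrative, not the actual UBX GpsFix encoding:

```rust
#[derive(Debug, PartialEq)]
enum GpsFix {
    NoFix,
    Fix2D,
    Fix3D,
}

// Newtype over the raw field: reserved bit patterns survive a round-trip.
struct GpsFixRaw(u8);

impl GpsFixRaw {
    // None for reserved / unknown values instead of an invalid enum variant.
    fn decode(&self) -> Option<GpsFix> {
        match self.0 {
            0 => Some(GpsFix::NoFix),
            2 => Some(GpsFix::Fix2D),
            3 => Some(GpsFix::Fix3D),
            _ => None,
        }
    }
}
```

The cost is an `Option` at every access; the gain is that no bit pattern from the wire is unrepresentable.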
This is true, there are some cases where you have to refactor the unit tests. But I would prefer to keep those cases to a minimum. Making the unit test "matching a token stream" means that pretty much any change requires changing the unit tests. In the case of the compiler, the output of the compiler actually is a specific stream of instructions, so if you changed the compiler and ended up changing the output, that would be a change the user cares about. The user of ublox_derive doesn't care one bit about the token stream, only about how the generated structs behave. To your point about detecting index errors at compile-time vs. runtime, unit tests basically run at compile time anyway, so we wouldn't detect these errors any sooner or later. And if we can detect these errors at unit-test-time, that would be automatic, whereas having to write the correct, exact token stream by hand is more error prone (if the written token stream to test against is wrong, we wouldn't detect any bugs unless there were more unit tests elsewhere of the actual behaviour)
By hand is error prone, yes. But if that becomes a problem we might be able to use a macro for this after all, if we do something like:
This is true, but we don't need a build script to filter out CfgRate in this example.
I haven't tried it so I might be wrong, but it seems like we should be able to do this just fine with a derive crate (well, maybe the term isn't really "derive" crate, since it isn't a "derive" macro). The derive crate used here adds all sorts of methods to the struct, and the resulting struct has none of the same fields. It seems like we should be able to do similar magic with enums.
Ah, thank you, one of the pitfalls of using ctrl+f to search through reviews :P |
Obviously I will write unit tests, but I'm not sure that I have time to write unit tests with 100% coverage and run some kind of fuzz testing. You, I suppose, miss a point that became obvious to me while writing another "code generation" project: For this kind of unit test it is great to fail from time to time, because then they print the input and output and you review the generated code. With code generation you lose the "code review" step, and unit tests can be a full replacement for it. So if you change something in code generation, it is really great if a test fails, and it is really important to match simple test cases exactly, so you can make sure that the code generator works as expected. For example: let mut output_buffer = [0u8; X];
let packet_len_bytes = packet_len.to_le_bytes();
output_buffer[4] = packet_len_bytes[0];
output_buffer[4] = packet_len_bytes[0];

you see the last line should be
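One way generated code can rule out this class of copy-paste bug is to write both length bytes in a single slice copy instead of element by element; the buffer layout (length at offsets 4..6) is assumed from the snippet above:

```rust
// Write the little-endian packet length in one call: no per-index duplication.
fn write_len(output_buffer: &mut [u8], packet_len: u16) {
    output_buffer[4..6].copy_from_slice(&packet_len.to_le_bytes());
}
```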
Oh I see, I used only |
The ubx_protocol crate has almost reached the point where I can use it in my project.
What do you think about the rest: the code generated to support packet parsing and generation, and the parsing algorithm (it is the algorithm that I used in the benchmark here)? |
I haven't had a chance to look at everything, but just looking at the generated code, it looks reasonable, except that I'm not sure you need to take a
? Also, for the To your point about unit tests, I see now the benefit of having one test checking the token output (although really, there should be unit tests for packets longer than 256 bytes). But I wouldn't want all of the testing to rely on tests like that. |
At the packet formatting stage we can already calculate the amount of memory,
Actually, I suppose we can use an interface like this:
And then we can write to
but these semantics say that we consume the writer. But I suppose you worry about dynamic dispatch vs static dispatch?
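A sketch of the writer-based interface under discussion, assuming the writer is taken by `&mut` (so the caller keeps ownership) and as a generic parameter (so dispatch is static); `MemWriter`, `NotEnoughMemory`, and `create_packet` are illustrative names:

```rust
#[derive(Debug)]
struct NotEnoughMemory;

trait MemWriter {
    fn write(&mut self, bytes: &[u8]) -> Result<(), NotEnoughMemory>;
}

// A growable buffer can never run out of space in this sketch.
impl MemWriter for Vec<u8> {
    fn write(&mut self, bytes: &[u8]) -> Result<(), NotEnoughMemory> {
        self.extend_from_slice(bytes);
        Ok(())
    }
}

// Generic over W: monomorphized per writer type, no vtable lookup.
fn create_packet<W: MemWriter>(out: &mut W, payload: &[u8]) -> Result<(), NotEnoughMemory> {
    out.write(&[0xB5, 0x62])?; // sync chars; real code would add length/checksum
    out.write(payload)
}
```

A fixed-size DMA buffer would implement the same trait and return `Err(NotEnoughMemory)` when full, so the packet-formatting code stays identical for both targets.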
? So let's move to our disagreements.
For me this is not important. Let's use one crate as you suggested (actually two: one for the "code generator", plus the already existing ublox), and use features to disable all functionality except the UBX protocol. Agree? |
Hm, I guess leave NotEnoughMemory as it is. The create_packet signature looks good to me (would it make sense to call it
I am closing this PR, and will create the new one to better match the results of our discussion. |
Alright, sounds good. I'm planning (but haven't started) to make a crate which might have some overlap with the functionality we're talking about here, basically just a crate to handle describing and accessing device register maps on embedded devices. Some aspects of which I think are very similar to what we're discussing here (specifically, the syntax of defining what bits are where). Would you want to move this discussion toward this new crate? If not that's fine, it'll probably be a few weekends at least before I'm able to start any real prototyping, so I think it still is fine to make a PR for adding this functionality here. |
In fact, if it were a 100% "for fun" project, I would not use any proc macro here at all. So if it were a 100% for-fun project for you, I would recommend this approach: maybe, as discussed at some Linux kernel conference, generate the whole code, including the algorithm, from a hardware description. |
Eh, I've written pdf parsers before. It's very scary. In my ideal world, we get ublox to export their register maps directly as XML or something machine-readable. |
It is not that bad, because there is no need to write a PDF parser.
There is also the stext format (XML), but it is not as easy to read.
But it is "social hacking" project, not 100% pure developer project. |
This PR should add the ability to use only UBX protocol related stuff in dependent crates.
As discussed in #1 this PR has several goals:
from ublox crate
u8
addition and so on.