Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RTS: remove repr(packed) attributes from RTS structs #2764

Merged
merged 10 commits into from
Sep 10, 2021

Conversation

osa1
Copy link
Contributor

@osa1 osa1 commented Sep 6, 2021

This is to remove some of the potential undefined behaviors (UB). It will also
remove some of the syntactic noise in #2761.

Rust has alignment restrictions for types and fields beyond the hardware
limitations. This means even on Wasm (which has no alignment restrictions) we
have some restrictions to deal with.

Recently the compiler started to check for some of the obvious sources of UB
(rust-lang/rust#82523). One of these is when taking a
pointer or reference to a packed struct. Since packed means no padding
between fields, the fields may be unaligned, in which case getting a reference
to them would be UB.

To fix this, this PR removes packed attributes and refactors the Bits64
type to make sure that the layout is as before. It turns out for types other
than Bits64 we don't need packed: all fields are word sizes so the compiler
does not add any padding. For Bits64, we had a u64 field which is aligned
on 64-bit boundary. To avoid this we now split 64-bit payload into two 32-bit
fields.

A new module static_checks added to make sure struct sizes are as expected.
Assertions in this module are checked in compile time.

No extra work needed to align objects. All objects need 4 bytes alignment
(checked with core::mem::align_of). The compiler already aligns static
objects
, and in runtime we only allocate whole words. So both static and
dynamic objects are always aligned. Debug mode assertions added in GC to check
object alignment.

@osa1 osa1 requested review from crusso, nomeata and ulan September 6, 2021 08:34
@github-actions
Copy link

github-actions bot commented Sep 6, 2021

Comparing from 0dc1ac9 to 3b43827:
In terms of gas, 1 tests improved and the mean change is -0.0%.
In terms of size, 3 tests improved and the mean change is -0.0%.

@RalfJung
Copy link

RalfJung commented Sep 6, 2021

Since packed means no padding between fields

It means that, and it means that the required alignment of the struct itself is 1. So e.g.

#[repr(packed)]
struct My64(u64);

Here, My64 has a required alignment of 1, so if you put it in another struct, no padding will be added. This matches the "packed" attribute in C.

Note that we also need to make sure all references and pointers need to be
aligned (should be on word boundary, but I can't find this mentioned in
anywhere in the reference).

Alignment is not on word boundaries, alignment is a per-type property. You are already familiar with size being a per-type property: every type has a size, which can differ depending on the target. Exactly the same is true with alignment: every type has an alignment requirement, which can differ depending on the target.

So a &u16 has to always be aligned for the alignment requirement of a u16 (which is target-specific and can be obtained via align_of::<u16>; on x86-64 Linux this is 2), but a &u64 has to be aligned for the alignment requirement of a u64 (which will typically be higher; on x86-64 Linux this is 8).

The compiler mostly takes care of this. All static are aligned according to the requirement of their type, similar for all let-bound variables and all Box. When you create your own heap allocation, the Layout passed to the allocation function defines both the required size and alignment of the new allocation.

When you only write safe code, the compiler already ensures that this is all done correctly. The documentation of unsafe operations should all explicitly say that alignment is required. If you think there is a place in the reference that should talk more explicitly about alignment, please open an issue!

// We have two 32-bit fields instead of one 64-bit to avoid aligning the fields on 64-bit
// boundary.
bits_hi: u32,
bits_lo: u32,
Copy link

@RalfJung RalfJung Sep 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative here might be to use [u8; 8], and then do the load with something like bits.as_ptr().cast::<u64>().read_unaligned(); that should translate to a single wasm load.

(But be aware of endianess if you access the bytes directly... I am a bit surprised to see a new getter but not a new setter/constructor that properly encodes data into these fields.)

Copy link
Contributor Author

@osa1 osa1 Sep 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit surprised to see a new getter but not a new setter/constructor that properly encodes data into these fields

These objects are allocated by the generated code and the field is written using Wasm i64.store. We don't need to allocate them in the RTS so we don't have setter or allocation function for it.

Wasm is little endian so we have low bits first, then high bits. (just updated the code changing the order of fields, but the preview above your comment does not seem to be updated)

@osa1
Copy link
Contributor Author

osa1 commented Sep 6, 2021

Alignment is not on word boundaries, alignment is a per-type property. You are already familiar with size being a per-type property: every type has a size, which can differ depending on the target. Exactly the same is true with alignment: every type has an alignment requirement, which can differ depending on the target.

So a &u16 has to always be aligned for the alignment requirement of a u16 (which is target-specific and can be obtained via align_of::; on x86-64 Linux this is 2), but a &u64 has to be aligned for the alignment requirement of a u64 (which will typically be higher; on x86-64 Linux this is 8).

Thanks for pointing this out again. I updated PR description for what we need to do. Object we allocate all need 4 byte alignment, so we need to make sure static objects are aligned on word boundary (4 bytes). In runtime we only allocate words, so once we align heap base on word boundary all allocations will be aligned.

@RalfJung
Copy link

RalfJung commented Sep 6, 2021

So it seems like pointers and references need to be on word boundary regardless of the pointed type.

You deleted that comment so maybe it is not worth replying any more, but just to clear up the understanding: the alignment of T is what pointers/references to T have to uphold. So the alignment of &i32 is relevant for &&i32. (Or maybe I misunderstood and this is what you meant.)

Object we allocate all need 4 byte alignment, so we need to make sure static objects are aligned on word boundary (4 bytes). In runtime we only allocate words, so once we align heap base on word boundary all allocations will be aligned.

So I take it your allocation happens somewhere outside of the Rust code? Then yes you have to ensure that all your structs are happy with whatever alignment that allocator provides.

@osa1
Copy link
Contributor Author

osa1 commented Sep 6, 2021

You deleted that comment so maybe it is not worth replying any more, but just to clear up the understanding: the alignment of T is what pointers/references to T have to uphold. So the alignment of &i32 is relevant for &&i32. (Or maybe I misunderstood and this is what you meant.)

Yeah I deleted it because I realized that I misunderstood this part.

So I take it your allocation happens somewhere outside of the Rust code? Then yes you have to ensure that all your structs are happy with whatever alignment that allocator provides.

Correct.

@nomeata
Copy link
Collaborator

nomeata commented Sep 7, 2021

Hi @RalfJung, how nice to see you here! What brought you to this part of the internet? Dfinity didn't poach you, did they? :-)

pub struct Bits64 {
pub header: Obj,
pub bits: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So was the RTS code actually broken before? Or did the repr(packed) attribute save us?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It had undefined behavior because alignment of this field was not as expected by the compiler. We don't know any bugs that this caused so far.

repr(packed) is to avoid padding between fields, so it potentially breaks alignment. For this object we don't want any padding between header and the payload so we either need repr(packed) or to split it into two u32 fields. The former causes undefined behavior because u64 fields need to be aligned on 8 byte boundaries (as reported by core::mem::align_of::<u64>()), so we now use two u32 fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for the explanation!

Copy link
Contributor Author

@osa1 osa1 Sep 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't know any bugs that this caused so far.

I've added some more assertions in the last commits. It turns out all objects are already aligned (compiler aligns static objects, see PR description) as expected so we don't have any bugs.

@RalfJung
Copy link

RalfJung commented Sep 8, 2021 via email

@osa1
Copy link
Contributor Author

osa1 commented Sep 9, 2021

Any volunteers to review this? Please check the updated PR description first.

Copy link
Collaborator

@nomeata nomeata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me

rts/motoko-rts/src/static_checks.rs Show resolved Hide resolved
@osa1 osa1 added the automerge-squash When ready, merge (using squash) label Sep 10, 2021
Copy link
Contributor

@crusso crusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rubberstamping

@mergify mergify bot merged commit f3d1a87 into master Sep 10, 2021
@mergify mergify bot deleted the osa1/remove_struct_packing branch September 10, 2021 09:00
@mergify mergify bot removed the automerge-squash When ready, merge (using squash) label Sep 10, 2021
@osa1 osa1 mentioned this pull request Sep 10, 2021
osa1 added a commit that referenced this pull request Sep 22, 2021
This is in preparation for #2790 and #2706. With #2790 we will start
using rest of the headers for GC metadata (set some of the high bits)
which will break compacting GC as we won't be able to distinguish a
heap address from an object header by checking if the value is larger
than the max. tag value. This check assumes a heap address cannot be
smaller than the max. tag value, which holds because we have at least 64
KiB Rust stack, and then static data for the canister.

With the high bits of headers set, it's possible that some of the
headers will have a value larger than 64 * 1024. So the current check no
longer works.

To allow distinguishing heap locations from headers, this PR refactors
objects tags so that they will all have the least significant bit set.
Since objects and fields are all word aligned (so have the lowest 2 bits
unset, this invariant was established in #2764), we can now check the lowest
bit and distinguish an address from a header.
mergify bot pushed a commit that referenced this pull request Sep 22, 2021
In compacting GC we need to distinguish a heap location (object or field
address) from object headers. Currently this is done by checking if the value
is smaller than or equal to the largest tag. Because first 64 KiB of the heap
is for Rust stack, as long as the largest tag is smaller than 65,536, we can
assume that values smaller than 65,536 are headers.

This way of checking if a value is a header or an address causes problems when
we want to use rest of the object headers to store more information. Examples:

- In #2706 we will use one bit in the header to mark large objects. At least
  initially, we won't be compacting large objects, so mark-compact GC won't see
  large objects and so won't have to care about large header values. But we may
  want to do compaction on large objects, or store other information (maybe
  mark bits, or generation numbers).

- We may want to store number of untagged (scalar) and tagged fields in object
  headers and merge some of the different object types. For example, instead of
  having 3 tags for `Variant`, `Some`, and `MutBox`, we could have one tag, and
  use rest of the headers to indicate that variants will have one scalar, one
  tagged fields, mutable objects will have just one tagged field, etc.

- We could have `SmallBlob` and `SmallArray` types for blobs and arrays with
  lenghts smaller than 65,535 (16 bits length field). This would save us one
  word for small blobs and arrays.

- We don't have to rely on Rust stack being large enough so that largest tag
  will still be small enough to be a valid address in heap.

In this PR we update tags so that they always have the lowest bit set. Since
objects and fields are all word aligned (so have the lowest 2 bits unset, this
invariant was established in #2764), this allows checking the lowest bit to
distinguish an address from a header. With this we can freely use the rest of
the bits in headers.

While this PR currently does not unblock any PRs, it's nice to have this
flexibility for the future changes, and these changes do not have any
downsides. (mo-rts.wasm grows 0.03%, 58 bytes)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants