Skip to content

Array Structs in Steel

Denis Merigoux edited this page May 28, 2020 · 1 revision

Motivation

The C language allows the definition of flat data structure composed of nested struts and arrays. These flat data structures are used everywhere in systems programming, as their size can be computed at compile time and they can be statically or stack-allocated. In the following, we will refer to these data structures as arraystructs.

Low* v2 does not offer the feature of defining and storing arraystruct in memory. This has led in Low* application to a painful design pattern of emulating an arraystruct with a flat buffer of bytes and a lot of accessor functions for each field or cell of the arraystruct.

The reason why the arraystruct feature could not be added to Low* v2 is that Low* v2 only allows types in universe 0 to be stored in memory as a legacy of how the memory model was designed at the time. One of the downsides of this universe restriction is that we cannot store inductive types inside a memory reference. But the catch is that arraystruct are inherently recursive in their nature, and need an inductive datatype to be modeled in F* (or any proof assistant). A workaround experiment was attempted in August 2017 by Tahina, but never caught on because the workaround is based on restricting what you can store in your arraystruct so that the description can fit in universe 0.

Now that Steel is around, this universe-0 restriction is gone from the memory model and we can store higher-universe values in the memory. Hence another attempt at arraystructs! The file corresponding to this new attempt is here : https://github.com/FStarLang/FStar/blob/denismerigoux_pcm/ulib/experimental/Steel.PCM.ArrayStruct.fst.

First attempt: with descriptors

This section describes the contents of Steel.PCM.ArrayStruct.fsti

We have to describe first the shape of the arraystructs that will be stored in memory. For that, we set up an inductive datatype (note the higher universe for the resulting type) :

noeq type array_struct_descriptor : Type u#(a+1) =
  | Base: a: Type u#a -> array_struct_descriptor
  | Array: cell_descriptor:array_struct_descriptor u#a -> len:usize -> array_struct_descriptor
  | Struct: field_descriptors: (field_id ^-> option (array_struct_descriptor u#a)) ->
    array_struct_descriptor

From the descriptor, we can infer a type : Seq.lseq for arrays, DependenMap.t for structs. The user can then define its own PCM to govern the values, but we offer easy operators for defining "pointwise" PCMs for structs and arrays given the PCMs at the leaves.

Drawbacks : we're going away from the very shallow embedding of the language that we want here. Then we would have to define paths for accessing the structure, etc and dealing with dependent maps is cumbersome.

Second attempt: brainstorming session

Proposal 1

We don't do arraystructs and rely on the existing Low* mechanisms to extract F* structs into C records. However, the F* records could not be "split" in memory. Here are the objectives that the Steel team agreed to set for the arraystructs :

  • we want to be able to take the address of fields or indexes and pass them around as first-class pointers
  • we want to update the values of those sub-objects into functions that are not aware of the parent object context

Because of this requirement, we have to come up with a system to have separation-logic compatible arraystructs that also extract.

Proposal 2

Something that was investigated a while ago was yo view the sub-objects (fields, indexes) as lenses to the parent object. However, this experiment didn't work out, partly because you have to store the parent object in the reference itself, making the reference type higher-universe and severely limiting the storing of references of references of references. There were maybe other limiting factors ? (@Aseem, @Nik ?)

Proposal 3

We rely on Everparse. Structs are declared in the Everparse DSL, then compiled to F* code that provides getters and setters for arrays indexes and struct fields. We would have to tweak Everparse to make this compatible with default logic though, because Everparse models its structs as big array of bytes, whereas we have a very structured memory model in Steel.

Proposal 4

This proposal is the first attempt described earlier.

Proposal 5

In this proposal, the user if free to define its own arraystructs as F* records and seq, and PCM to govern those arraystructs. A helper library like the one in proposal 4 could be used to lower the barrier for new users to use those.

The question of extraction now becomes orthogonal of the question of PCMs and separation logic. However, we have to map the user-defined arraystruct to things that Kremlin can understand and extract as field updates, index-taking, etc. For that we define a language of attributes coming with proof obligation that the user can use to decorates the types they declare.

What is the attribute language for extraction that we need ? Here are some items: * address of field * field update * field read * address of array index * index read * index update

Clone this wiki locally