feat: Implement copy on write optimization for arrays in brillig #3118

jfecher · 2023-10-11T21:18:36Z

Description

Problem*

Use of array_set in a loop can greatly slow down brillig code since under the hood on master, brillig will currently copy the entire array in order to return a new array with one element changed.

Resolves (couldn't find an existing issue for this, do we have one?)

Summary*

This PR adds the copy on write optimization to brillig in order to optimize this usecase. To implement this, arrays in brillig are now reference counted and there is a new Instruction::IncrementRc instruction in SSA to increment reference count (1). During an array_set in brillig, the reference count is checked. If it is 1, brillig will mutate the existing array instead of copying to a new one. If not, brillig will fall back to creating a new array, although crucially this new array will always have reference count 1, allowing each subsequent set to be a mutated one. This leads to exponential speedup in loops using array_set.

The reference count is increased when creating a new variable via let and when passing values to functions. The reference count is also (currently) never decreased - so the analysis is rather pessimistic. We could look into changing this in the future, but for now it is sufficient since the first array_set performs a copy and the rest will still be optimized.

The following example:

fn main() {
    let array = [0; 5000];
    assert(foo(array) != 100);
    assert(array[5] == 0);
}

unconstrained fn foo<N>(mut array: [Field; N]) -> Field {
    let copy = array;
    let mut sum = 0;

    for i in 0 .. array.len() {
        array[i] = i;
        sum += i;
    }

    assert(copy[5] == 0);
    sum
}

Runs in 58s on master (release) on my machine, and 0.8s with the optimization on this branch. Results may vary of course, and will be slower when compiling nargo on the debug profile.

Note that this test also includes array aliases so we can see the mutations are safe to perform without affecting other values.

Documentation

This PR requires documentation updates when merged.
- I will submit a noir-lang/docs PR.
- I will request for and support Dev Rel's help in documenting this PR.

Additional Context

Regarding (1), a possible alternate design is to encode & expose reference counts directly in SSA rather than keeping them only in brillig. For example, arrays could instead be represented as a tuple of (reference_count, array) where reference_count: &mut Field. From here, we could have each array_set either translate to an if under the hood:

  v1 = load reference_count
  v2 = eq v1, Field 1
  jmp_if v2, then: b2, else: b3
b2:
  // array has 1 rc
  array_set_mut array, index, value
  jmp b4(array)
b3:
  // array has > 1 rc
  v3 = array_set array, index, value
  jmp b4(v3)
b4(v4: [Field]):
  ... continue with v4 for the result of the array set ...

This would greatly bloat the ssa however, so I decided against it. An alternative would be modifying the array_set instruction to accept the reference count as an argument:

v1 = load reference_count
v2 = array_set array, index, value, v1

Then internally (in brillig, as it is ignored in ssa) we would check the reference count for each array set operation similarly to how is currently done in this PR.

Additionally, this is still a draft because I've left some TODO's / messy bits in the code to clean up. The core approach should be ready for review though or we can discuss whether the alternative approach above would be better.

PR Checklist*

I have tested the changes locally.
I have formatted the changes with Prettier and/or cargo fmt on default settings.

jfecher · 2023-10-11T21:24:42Z

Looks like there's an odd value in one of the brillig tests. Anyone have any insight on this? Very possible I added the wrong expected registers for each reference count in some of the brillig tests.

TomAFrench · 2023-10-11T21:54:02Z

This is gonna need an update the ACIR serialisation logic in bb unfortunately.

kevaundray · 2023-10-11T22:35:30Z

I think the approach here is reasonable -- Alvaro who has been thinking about this for a while is away for a week, so he won't be able to chime in at the moment.

The original issue was because of the noir-rsa program, could we get a comparison vs master with that program: https://github.com/SetProtocol/noir-rsa ?

jfecher · 2023-10-12T19:10:27Z

When testing on the noir-rsa repo, I'm currently getting an out of bounds read for brillig memory on this PR for some reason. When handling the Load opcode it is trying to read memory at index 2262, but the length is also only 2262.

sirasistant

Only skimmed through the code, I might have misunderstood something. Will make a more complete review later today or early tomorrow 😄

compiler/noirc_evaluator/src/brillig/brillig_gen/brillig_block.rs

sirasistant · 2023-10-17T13:58:12Z

compiler/noirc_evaluator/src/brillig/brillig_ir.rs

            RegisterOrMemory::HeapArray(HeapArray { pointer, .. }) => {
                self.load_instruction(pointer, variable_pointer);
            }
-            RegisterOrMemory::HeapVector(HeapVector { pointer, size }) => {
+            RegisterOrMemory::HeapVector(HeapVector { pointer, size, reference_count: _ }) => {


I think we should read reference_count from stored array and vector variables

sirasistant · 2023-10-17T13:58:28Z

compiler/noirc_evaluator/src/brillig/brillig_ir.rs

+            RegisterOrMemory::HeapArray(HeapArray { pointer, size, reference_count: _ }) => {
                self.store_instruction(variable_pointer, pointer);
                let size_constant = self.make_constant(Value::from(size));
                self.store_instruction(size_pointer, size_constant);
                self.deallocate_register(size_constant);
            }
-            RegisterOrMemory::HeapVector(HeapVector { pointer, size }) => {
+            RegisterOrMemory::HeapVector(HeapVector { pointer, size, reference_count: _ }) => {


symmetric to the load, I think we need to store those

We also would need to update allocate_variable_instruction implementation to take into account that now variables can be up to 3 items (pointer, length and reference count)

I don't understand these load and store methods or how I'd extend them to add a reference count. It looks like they only take in a source_variable for the array pointer, and a address_register that we use to store the size? I'm not sure why we'd store the size where the address goes, nor am I sure how this would be extended to store a reference count as well.

acvm-repo/brillig/src/opcodes.rs

jfecher · 2023-10-17T17:02:53Z

I've found that this PR fails the following program at the assert:

global M = 5000;

fn main() {
    let array = [0; M];
    let res = foo(array);
    assert(res == M * 2, "foo(array) != M * 2");
}

unconstrained fn foo<N>(mut array: [Field; N]) -> Field {
    let copy = array;
    let mut sum = 0;

    // dep::std::println(array.len());

    for i in 0 .. array.len() {
        array[i] = 2;
        sum += array[i];
    }

    sum
}

Yet, if the println is uncommented, it will then pass.

Edit: Nevermind, this was fixed by fixing extract_registers

jfecher · 2023-10-17T18:30:40Z

After some debugging, it looks like the rc == 1 checks in the brillig vm are all being seen as array_size == 1 or 5000 == 1 in this case, so they're never triggered. In the most recent example above.

sirasistant · 2023-10-18T09:41:31Z

After more thinking I think the approach taken for brillig_gen won't work. If we add one more register to store the reference count, consider the following SSA:

brillig fn foo f1 {
  b0(v0: [Field; 5000]):
    v57 = allocate
    store v0 at v57
    inc_rc v0
    v58 = allocate
    store Field 0 at v58
    inc_rc v0
    jmp b1(Field 0)
  b1(v6: Field):
    v59 = lt v6, Field 5000
    jmpif v59 then: b2, else: b3
  b2():
    v61 = load v57
    v62 = array_set v61, index v6, value Field 2
    v63 = add v6, Field 1
    store v62 at v57
    v64 = load v58
    v65 = array_get v62, index v6
    v66 = add v64, v65
    store v66 at v58
    jmp b1(v63)
  b3():
    v60 = load v58
    return v60
}

v0 would get assigned two registers, let's say, R0 and R1. R0 would have the pointer to the items and R1 would have the reference count value. When storing v0 inside v57, we'd store in memory the pointer and the initial RC value (1). When processing inc_rc v0 we'd increase R1 but the reference counter stored in memory in the previous instruction wouldn't get incremented.
I see two approaches to be able to implement this:

make arrays and slices a reference structure. They would take a single register that would be a reference to (pointer, reference_count) or (pointer, size, reference_count) in the case of slices.
make reference count a reference itself, so it's shared across all places the array is stored

jfecher · 2023-10-18T13:13:06Z

@sirasistant ah, good point. Instead of storing the rc in a register then, I'll try out the original idea I had which was to store it before the first element of the array.

Savio-Sou · 2023-11-22T14:18:01Z

Superseded by #3522.

jfecher added 5 commits October 11, 2023 10:35

Add element type to Type::Reference

e5f46ff

Add inc_rc instruction

13b9319

Add inc_rc instructions to ssa programs

2bafb5d

Implement inc_rc in brillig

946b3e6

Implement copy on write optimization in brillig

995faa4

jfecher requested review from sirasistant, kevaundray and vezenovm October 11, 2023 21:18

Refactor sections

f32889f

sirasistant reviewed Oct 17, 2023

View reviewed changes

Fix initial reference count

d5c80e0

jfecher added 3 commits October 17, 2023 12:30

Add reference count to extract_registers

cd8be3c

Remove dbg panic

d83cf03

Cleanup branch_instruction

a7727cc

jfecher and others added 2 commits October 18, 2023 09:34

Start refactor to store rc in the first element of the array

4ef445f

Add reference type to tests

37654e6

Savio-Sou assigned sirasistant and jfecher Nov 18, 2023

Savio-Sou closed this Nov 22, 2023

Savio-Sou mentioned this pull request Nov 22, 2023

feat: Copy on write optimization for brillig #3522

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Implement copy on write optimization for arrays in brillig #3118

feat: Implement copy on write optimization for arrays in brillig #3118

jfecher commented Oct 11, 2023 •

edited

Loading

jfecher commented Oct 11, 2023

TomAFrench commented Oct 11, 2023

kevaundray commented Oct 11, 2023

jfecher commented Oct 12, 2023

sirasistant left a comment

sirasistant Oct 17, 2023

sirasistant Oct 17, 2023

sirasistant Oct 17, 2023

jfecher Oct 17, 2023

jfecher commented Oct 17, 2023 •

edited

Loading

jfecher commented Oct 17, 2023

sirasistant commented Oct 18, 2023

jfecher commented Oct 18, 2023

Savio-Sou commented Nov 22, 2023

feat: Implement copy on write optimization for arrays in brillig #3118

feat: Implement copy on write optimization for arrays in brillig #3118

Conversation

jfecher commented Oct 11, 2023 • edited Loading

Description

Problem*

Summary*

Documentation

Additional Context

PR Checklist*

jfecher commented Oct 11, 2023

TomAFrench commented Oct 11, 2023

kevaundray commented Oct 11, 2023

jfecher commented Oct 12, 2023

sirasistant left a comment

Choose a reason for hiding this comment

sirasistant Oct 17, 2023

Choose a reason for hiding this comment

sirasistant Oct 17, 2023

Choose a reason for hiding this comment

sirasistant Oct 17, 2023

Choose a reason for hiding this comment

jfecher Oct 17, 2023

Choose a reason for hiding this comment

jfecher commented Oct 17, 2023 • edited Loading

jfecher commented Oct 17, 2023

sirasistant commented Oct 18, 2023

jfecher commented Oct 18, 2023

Savio-Sou commented Nov 22, 2023

jfecher commented Oct 11, 2023 •

edited

Loading

jfecher commented Oct 17, 2023 •

edited

Loading