Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Discussion] how to handleconst-eval and promoted in K #65

Closed
1 of 2 tasks
yanliu18 opened this issue Feb 28, 2023 · 6 comments
Closed
1 of 2 tasks

[Discussion] how to handleconst-eval and promoted in K #65

yanliu18 opened this issue Feb 28, 2023 · 6 comments
Assignees
Labels
discussion anything like a suggestion, a question, a proposal

Comments

@yanliu18
Copy link
Contributor

yanliu18 commented Feb 28, 2023

Problem

The MIR file emitted by rustc could contain special constructs such as promoted[0], fn main::const_eval()... in multiple occurence or comments //MIR for CTFE before const_eval(). These are all related to const-eval in CTFE.

Such constructs are silently added at compile time, that there is no appearance at surface syntax. We consider it as compiler time optimization, where there sould be no occurence of such construct for no-op MIR file (to be validated). It also seems to be a feature added by miri while rustc calls miri for code checking.

However, when it comes to definitions in K, how should we treat CTFE behaviors in kmir (considering its semantics and relation to the semantics of the original rust program)?
[To be investigated]

references

  • A systematic introduction of const-eval
  • Discussions towards the development of CTFE, developping the restriction of applying const-eval.
  • Related RFCs: RFC1414

Related Issues

Investigation Todos

  • A minimized example in testdata/optimizations/CTFE, it is good to have examples dedicated to the three category of programs enables const-eval
  • compare the difference between MIR files procuded with no-op/ skip miri step or CTFE step and after op.
@bbyalcinkaya
Copy link
Member

My notes on CTFE

CTFE

CTFE: Compile time function evaluation

The CTFE mechanism checks for undefined behaviors during the evaluation (using Miri). In case of a UB, CTFE is terminated with an error. To avoid an optimization changing the undefinedness of a program, CTFE uses unoptimized MIR functions (RFC-3016). Thus, the compiler emits both optimized and unoptimized MIR (// MIR FOR CTFE) for the following definitions:

Since constants remain unevaluated in the MIR output, we need "MIR for CTFE" functions for constant evaluation.

const fn functions

When emitting MIR for a const fn, the compiler emits both the optimized MIR and the unoptimized MIR for CTFE (// MIR FOR CTFE). Thus, there will be two functions with the same signature in the output.

Example

Rust:

const fn foo(x: i32, y: i32) -> i32 {
    x + y
}

fn main() {
    const A: i32 = foo(1, 2 + 34);
    const B: i32 = foo(1, 2 + 5);
    let x = A;
    let y = B;
}
Click to see MIR
// WARNING: This output format is intended for human consumers only
// and is subject to change without notice. Knock yourself out.
fn foo(_1: i32, _2: i32) -> i32 {
    debug x => _1;                       // in scope 0 at const-fn.rs:1:14: 1:15
    debug y => _2;                       // in scope 0 at const-fn.rs:1:22: 1:23
    let mut _0: i32;                     // return place in scope 0 at const-fn.rs:1:33: 1:36
    let mut _3: (i32, bool);             // in scope 0 at const-fn.rs:2:5: 2:10

    bb0: {
        _3 = CheckedAdd(_1, _2);         // scope 0 at const-fn.rs:2:5: 2:10
        assert(!move (_3.1: bool), "attempt to compute `{} + {}`, which would overflow", _1, _2) -> bb1; // scope 0 at const-fn.rs:2:5: 2:10
    }

    bb1: {
        _0 = move (_3.0: i32);           // scope 0 at const-fn.rs:2:5: 2:10
        return;                          // scope 0 at const-fn.rs:3:2: 3:2
    }
}

// MIR FOR CTFE
fn foo(_1: i32, _2: i32) -> i32 {
    debug x => _1;                       // in scope 0 at const-fn.rs:1:14: 1:15
    debug y => _2;                       // in scope 0 at const-fn.rs:1:22: 1:23
    let mut _0: i32;                     // return place in scope 0 at const-fn.rs:1:33: 1:36
    let mut _3: i32;                     // in scope 0 at const-fn.rs:2:5: 2:6
    let mut _4: i32;                     // in scope 0 at const-fn.rs:2:9: 2:10
    let mut _5: (i32, bool);             // in scope 0 at const-fn.rs:2:5: 2:10

    bb0: {
        StorageLive(_3);                 // scope 0 at const-fn.rs:2:5: 2:6
        _3 = _1;                         // scope 0 at const-fn.rs:2:5: 2:6
        StorageLive(_4);                 // scope 0 at const-fn.rs:2:9: 2:10
        _4 = _2;                         // scope 0 at const-fn.rs:2:9: 2:10
        _5 = CheckedAdd(_3, _4);         // scope 0 at const-fn.rs:2:5: 2:10
        assert(!move (_5.1: bool), "attempt to compute `{} + {}`, which would overflow", move _3, move _4) -> bb1; // scope 0 at const-fn.rs:2:5: 2:10
    }

    bb1: {
        _0 = move (_5.0: i32);           // scope 0 at const-fn.rs:2:5: 2:10
        StorageDead(_4);                 // scope 0 at const-fn.rs:2:9: 2:10
        StorageDead(_3);                 // scope 0 at const-fn.rs:2:9: 2:10
        return;                          // scope 0 at const-fn.rs:3:2: 3:2
    }
}

fn main() -> () {
    let mut _0: ();                      // return place in scope 0 at const-fn.rs:5:11: 5:11
    let _1: i32;                         // in scope 0 at const-fn.rs:8:9: 8:10
    scope 1 {
        debug x => _1;                   // in scope 1 at const-fn.rs:8:9: 8:10
        let _2: i32;                     // in scope 1 at const-fn.rs:9:9: 9:10
        scope 2 {
            debug y => _2;               // in scope 2 at const-fn.rs:9:9: 9:10
        }
    }

    bb0: {
        _1 = const _;                    // scope 0 at const-fn.rs:8:13: 8:14
        _2 = const _;                    // scope 1 at const-fn.rs:9:13: 9:14
        return;                          // scope 0 at const-fn.rs:10:2: 10:2
    }
}

const A: i32 = {
    let mut _0: i32;                     // return place in scope 0 at const-fn.rs:6:14: 6:17
    let mut _1: i32;                     // in scope 0 at const-fn.rs:6:27: 6:33
    let mut _2: (i32, bool);             // in scope 0 at const-fn.rs:6:27: 6:33

    bb0: {
        StorageLive(_1);                 // scope 0 at const-fn.rs:6:27: 6:33
        _2 = CheckedAdd(const 2_i32, const 34_i32); // scope 0 at const-fn.rs:6:27: 6:33
        assert(!move (_2.1: bool), "attempt to compute `{} + {}`, which would overflow", const 2_i32, const 34_i32) -> bb1; // scope 0 at const-fn.rs:6:27: 6:33
    }

    bb1: {
        _1 = move (_2.0: i32);           // scope 0 at const-fn.rs:6:27: 6:33
        ConstEvalCounter;                // scope 0 at const-fn.rs:6:20: 6:34
        _0 = foo(const 1_i32, move _1) -> bb2; // scope 0 at const-fn.rs:6:20: 6:34
                                         // mir::Constant
                                         // + span: const-fn.rs:6:20: 6:23
                                         // + literal: Const { ty: fn(i32, i32) -> i32 {foo}, val: Value(<ZST>) }
    }

    bb2: {
        StorageDead(_1);                 // scope 0 at const-fn.rs:6:33: 6:34
        return;                          // scope 0 at const-fn.rs:6:5: 6:35
    }
}

const B: i32 = {
    let mut _0: i32;                     // return place in scope 0 at const-fn.rs:7:14: 7:17
    let mut _1: i32;                     // in scope 0 at const-fn.rs:7:27: 7:32
    let mut _2: (i32, bool);             // in scope 0 at const-fn.rs:7:27: 7:32

    bb0: {
        StorageLive(_1);                 // scope 0 at const-fn.rs:7:27: 7:32
        _2 = CheckedAdd(const 2_i32, const 5_i32); // scope 0 at const-fn.rs:7:27: 7:32
        assert(!move (_2.1: bool), "attempt to compute `{} + {}`, which would overflow", const 2_i32, const 5_i32) -> bb1; // scope 0 at const-fn.rs:7:27: 7:32
    }

    bb1: {
        _1 = move (_2.0: i32);           // scope 0 at const-fn.rs:7:27: 7:32
        ConstEvalCounter;                // scope 0 at const-fn.rs:7:20: 7:33
        _0 = foo(const 1_i32, move _1) -> bb2; // scope 0 at const-fn.rs:7:20: 7:33
                                         // mir::Constant
                                         // + span: const-fn.rs:7:20: 7:23
                                         // + literal: Const { ty: fn(i32, i32) -> i32 {foo}, val: Value(<ZST>) }
    }

    bb2: {
        StorageDead(_1);                 // scope 0 at const-fn.rs:7:32: 7:33
        return;                          // scope 0 at const-fn.rs:7:5: 7:34
    }
}

Tuple structs

Tuple struct and variant constructors are traited as const fns, so the compiler generates a MIR function (+ its MIR FOR CTFE) for every constructor.

struct Foo(i32, i64);

fn main() {
    let a = Foo(1,2); // not a function call
}
Click to see MIR
// WARNING: This output format is intended for human consumers only
// and is subject to change without notice. Knock yourself out.
fn main() -> () {
    let mut _0: ();                      // return place in scope 0 at tuple-struct.rs:3:11: 3:11
    let _1: Foo;                         // in scope 0 at tuple-struct.rs:4:9: 4:10
    scope 1 {
        debug a => _1;                   // in scope 1 at tuple-struct.rs:4:9: 4:10
    }

    bb0: {
        _1 = Foo(const 1_i32, const 2_i64); // scope 0 at tuple-struct.rs:4:13: 4:21
        return;                          // scope 0 at tuple-struct.rs:5:2: 5:2
    }
}

fn Foo(_1: i32, _2: i64) -> Foo {
    let mut _0: Foo;                     // return place in scope 0 at tuple-struct.rs:1:1: 1:11

    bb0: {
        _0 = Foo(move _1, move _2);      // scope 0 at tuple-struct.rs:1:1: 1:11
        return;                          // scope 0 at tuple-struct.rs:1:1: 1:11
    }
}

// MIR FOR CTFE
fn Foo(_1: i32, _2: i64) -> Foo {
    let mut _0: Foo;                     // return place in scope 0 at tuple-struct.rs:1:1: 1:11

    bb0: {
        _0 = Foo(move _1, move _2);      // scope 0 at tuple-struct.rs:1:1: 1:11
        return;                          // scope 0 at tuple-struct.rs:1:1: 1:11
    }
}

Example: Using constructor as a function

struct Foo(i32,i32);

fn zero(x: i32, f: fn(i32,i32)->Foo) -> Foo {
    f(0, x)
}

fn main() {
    let p = zero(3, Foo);
    let r = Foo(1, 3);  
}
Click to see MIR
// WARNING: This output format is intended for human consumers only
// and is subject to change without notice. Knock yourself out.
fn zero(_1: i32, _2: fn(i32, i32) -> Foo) -> Foo {
    debug x => _1;                       // in scope 0 at src/main.rs:8:9: 8:10
    debug f => _2;                       // in scope 0 at src/main.rs:8:17: 8:18
    let mut _0: Foo;                     // return place in scope 0 at src/main.rs:8:41: 8:44

    bb0: {
        _0 = _2(const 0_i32, _1) -> bb1; // scope 0 at src/main.rs:9:5: 9:12
    }

    bb1: {
        return;                          // scope 0 at src/main.rs:10:2: 10:2
    }
}

fn main() -> () {
    let mut _0: ();                      // return place in scope 0 at src/main.rs:12:11: 12:11
    let _1: Foo;                         // in scope 0 at src/main.rs:13:9: 13:10
    let mut _2: fn(i32, i32) -> Foo;     // in scope 0 at src/main.rs:13:21: 13:24
    scope 1 {
        debug p => _1;                   // in scope 1 at src/main.rs:13:9: 13:10
        let _3: Foo;                     // in scope 1 at src/main.rs:14:9: 14:10
        scope 2 {
            debug r => _3;               // in scope 2 at src/main.rs:14:9: 14:10
        }
    }

    bb0: {
        _2 = Foo as fn(i32, i32) -> Foo (Pointer(ReifyFnPointer)); // scope 0 at src/main.rs:13:21: 13:24
                                         // mir::Constant
                                         // + span: src/main.rs:13:21: 13:24
                                         // + literal: Const { ty: fn(i32, i32) -> Foo {Foo}, val: Value(<ZST>) }
        _1 = zero(const 3_i32, move _2) -> bb1; // scope 0 at src/main.rs:13:13: 13:25
                                         // mir::Constant
                                         // + span: src/main.rs:13:13: 13:17
                                         // + literal: Const { ty: fn(i32, fn(i32, i32) -> Foo) -> Foo {zero}, val: Value(<ZST>) }
    }

    bb1: {
        _3 = Foo(const 1_i32, const 3_i32); // scope 1 at src/main.rs:14:13: 14:22
        return;                          // scope 0 at src/main.rs:15:2: 15:2
    }
}

fn Foo(_1: i32, _2: i32) -> Foo {
    let mut _0: Foo;                     // return place in scope 0 at src/main.rs:2:1: 2:11

    bb0: {
        _0 = Foo(move _1, move _2);      // scope 0 at src/main.rs:2:1: 2:11
        return;                          // scope 0 at src/main.rs:2:1: 2:11
    }
}

// MIR FOR CTFE
fn Foo(_1: i32, _2: i32) -> Foo {
    let mut _0: Foo;                     // return place in scope 0 at src/main.rs:2:1: 2:11

    bb0: {
        _0 = Foo(move _1, move _2);      // scope 0 at src/main.rs:2:1: 2:11
        return;                          // scope 0 at src/main.rs:2:1: 2:11
    }
}

Comments and questions

  • In the first example, bb0 of the main function is supposed to use constants A and B as RHS of the assignments. However, both constants appear as const _. It looks like the MIR output does not carry enough information to identify consts. How should we deal with this?
  • In the MIR output, constants remain unevaluated. We should keep the MIR for CTFE functions for constant evaluation. This requires minor (?) changes in the preprocessor and syntax to distinguish between two functions.
  • Is it possible to make rustc evaluate constants before emitting mir?

@virgil-serbanuta
Copy link
Contributor

The two const _ values are not actually needed, which may explain why they are "unevaluated constants" (I think that this is the code that produces the underscore: https://github.com/rust-lang/rust/blob/master/compiler/rustc_middle/src/mir/mod.rs#L2809 ).

@yanliu18
Copy link
Contributor Author

yanliu18 commented Mar 2, 2023

Thank you, @bbyalcinkaya for preparing the examples with mir output embeded nicely!
To answer your questions and the issues we had before:

  • Your question 1. Agree with @virgil-serbanuta, const _ appears in the unoptimized MIR and marks for "unevaluated contants". However, I have the same question like you- how is x(binded to _1) in main() linked to const B? Am I missing any information?

    TODO: pull out the unoptimised mir to compare with the one generated for CTFE

  • Your question 2 and issue Handle MIR FOR CTFE functions #62 and Figure out why rustc produces multiple functions with the same signature in some tests #63, I guess we can skip the function generated for CTFE. Either we can preprocess it or we can excude the miri or borrow checking pass when we emit MIR (preferred).

    TODO: explore emit option dump-mir-exclude-pass-number=val -- exclude the pass number when dumping MIR (used in tests) (default: no), view from rustc -Z help.

Observation There is something I have noticed in the above example: CheckedAdd is a function executing + at debug mode. It will not be applied in release mode. This something to be aware at the time of defining execution semantics for MIR. The compile mode would interfere with the execution behavior.

TODO: consider compilation mode when defining execution semantics.

  • Your question 3, isn't that the same idea as CTFE. Out of our scope in near furture though.

@bbyalcinkaya
Copy link
Member

@virgil-serbanuta Do you mean it is because they are assigned to unused variables? I tried adding print statements, but they are still unevaluated.

@yanliu18 Yes, my 3rd question is the same as CTFE. If I understand correctly, CTFE is applied during code generation, after this MIR output is created. That's why we have unevaluated constants.

How is x linked to const B, compiler's internal data structures keep this information but they are not printed. In some cases (e.g. promoted constants) this info is given in the comments. Search for Unevaluated(main, [], Some(promoted[0]) in test files. This is where those extra comments are printed. It does not print extra comments for integers.

@bbyalcinkaya
Copy link
Member

bbyalcinkaya commented Mar 2, 2023

This is an interesting example:

const A: usize = 111 + 222;    // 333
const B: usize = A + 444;      // 777
    
fn main() {
    let x = A;
    let y = B;
    let arr = [0 ; B];
    println!("{}", x);
    println!("{}", y);
}

Constant evaluation is lazy. The compiler only evaluates the array length because it is part of the type. Other constants will eventually be evaluated in codegen.

    _1 = const _;                    // scope 0 at const.rs:5:13: 5:14
    _2 = const _;                    // scope 1 at const.rs:6:13: 6:14
    _3 = [const 0_i32; 777];         // scope 2 at const.rs:7:15: 7:22

@yanliu18
Copy link
Contributor Author

yanliu18 commented Jun 28, 2023

I've been learning about the mir optimizations (far from done yet) and options to turn them off. Ive find some answers to the questions in our discussion, thus I am composing an update to this issue.

quoted @virgil-serbanuta's comments: The two const _ values are not actually needed, which may explain why they are "unevaluated constants"

  • The consts are always presented in the formconst _ even if they are used. They are actually attempted to be evaluated, for example the const A block of code. Try this example:
const fn foo(x: i32, y: i32) -> i32 {
    x + y
}

fn main() {
    const A: i32 = foo(1, 2 + 34);
    assert!(A == 37);
    return
}

to @bbyalcinkaya 's question: Is it possible to make rustc evaluate constants before emitting mir?

  • Yes and no. Yes -> Enable ConstProp optimization using flag '-Zmir-opt-level=2'; No -> This pass is no longer in the stable rustc versions. It was mentioned this pass might be overlapping with const-eval and might have some issues (I am not sure about the details though).

  • I mentioned the debug mode and release mode, where the former does overflow checking by default but the latter does not. And the latter is the production code. I've found ways to turn that debug assertion off, by passing the flag -C debug-assertions=off or -C overflow-checks=no to rustc. However, it seems CTFE sometimes excape these flags. Besides, we should check out the settings for --release mode.

  • Promote vs CTFE seems to be different things, but I suspect they overlap sometimes. The former can be turned off by disabling the promote temp pass, using the flag -Zmir-enable-passes=-PromoteTemps, though again I am not sure if it applied in all cases. The rustc-dump-mir options are unstable. Furthermore, I have no idea how to skip the latter.

  • We might be able to ignore promote and //MIR for CTFE. The latter outputs code most similar to MIR with mir-opt-level=0. And we will evetually using the non-optimized MIR for analysis, not relying on any optimization passes, unless they are formally proved.

  • I still have no idea, how const _ links to const A. After examining all the intermdeiate outputs (I could possible miss the important information too), I still have no clue. A future step (when we had to deal with it), we would wrap our tool on top of rustc eventually.

@yanliu18 yanliu18 added the discussion anything like a suggestion, a question, a proposal label Jun 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion anything like a suggestion, a question, a proposal
Projects
None yet
Development

No branches or pull requests

4 participants