Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor type-checking to not inline everything #1557

Closed
mitchmindtree opened this issue May 16, 2022 · 4 comments
Closed

Refactor type-checking to not inline everything #1557

mitchmindtree opened this issue May 16, 2022 · 4 comments
Labels
big this task is hard and will take a while code quality compiler General compiler. Should eventually become more specific as the issue is triaged

Comments

@mitchmindtree
Copy link
Contributor

mitchmindtree commented May 16, 2022

Currently, type-checking inlines everything into the entry point of the root module. As a result a call to compile_to_ast (or more specifically, TypedParseTree::type_check) leaves us with a single TypedParseTree representing only the root module with its inlined entry point. The TypedParseTrees for individual submodules are dropped during compile_inner_dependency.

Why inline everything (and why during type-check)?

The original motivation behind inlining everything was to simplify the old code generation step which translated the AST directly to ASM. By inlining everything into the entry point during type-checking, the burden of having to think about how to represent function calls in Fuel VM assembly was removed altogether. @sezna pls correct me if I'm wrong or missing something here!

Issues with inlining everything

We are starting to see some bubbling issues arise from the current approach of inlining everything, and in particular inlining everything during type checking.

We also see some shoe-horning of unrelated compilation steps into type-checking as a result. E.g. currently we pass the dead_code_graph through the entirety of type-checking solely because that's the only time we have access to the fully typed submodules. As a result there are mentions of dead_code_graph everywhere throughout semantic_analysis, when ideally it could be a small, distinct step that follows type-checking.

Potential Solution

Change type-checking to return a more structured, non-inlined representation of the typed program. E.g. I'm imagining something like the following pseudo:

pub struct TypedProgram {
    root: TypedModule,
    // TypedProgramKind - script/predicate/contract/library, indicates main fn, abi entries.
}

pub struct TypedModule {
    items: Vec<TypedItem>,
    submodules: BTreeMap<DepName, TypedModule>,
}

Having something like this would help a lot with the sway-lsp issue and help with the separation of concerns like the graph construction for control flow analysis.

I'm less clear on what the implications would be for sway-ir or what this would mean for keeping function mappings in code generation that could be used by the debugger - @otrho you might have some more insight?

Even if we do end up determining that it's useful to have an inlined AST for code generation, It would be beneficial to separate this into a dedicated AST-inlining step that operates on the structured, non-inlined AST.

@mitchmindtree mitchmindtree added compiler General compiler. Should eventually become more specific as the issue is triaged code quality labels May 16, 2022
@adlerjohn adlerjohn moved this to Todo in Fuel Network May 16, 2022
@otrho otrho added the big this task is hard and will take a while label May 16, 2022
@otrho
Copy link
Contributor

otrho commented May 16, 2022

The IR would like to organise a program into a tree. Instructions gathered into blocks, gathered into functions, gathered into modules. So having distinct functions and all their meta available to IR would be beneficial structurally and analytically.

To the best of my knowledge, the debugger is mostly interested in attributing spans to instructions, but being able to resolve full paths of symbols to instructions is important to set breakpoints (or inspect memory values). Without the absolute naming of functions this becomes impossible. @Dentosal knows more.

@sezna
Copy link
Contributor

sezna commented May 21, 2022

For some context: function declarations were originally inlined upon application to assist with a 1:1 codegen. You're given a fully typed node and you just serialize it to opcodes. You are also right in that it allows us to skip implementing a function call stack in the Fuel VM. It has certainly caused a fair share of hurdles, though.

@sezna
Copy link
Contributor

sezna commented May 21, 2022

I dig your proposed module format for sure. I'm wondering if FunctionApplications could throw away their monomorphized function copies after type checking, or store them separately and denote them as their resolved type version, so both the plugin and the type system get their respective benefits? The inlining stuff does have pretty sweet gains for type resolution and simplicity of IR for our little DSL, so if we can find an API that resolves the issues we have with it, I think it would help us avoid some churn.

@emilyaherbert
Copy link
Contributor

This will be solved by #1821.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
big this task is hard and will take a while code quality compiler General compiler. Should eventually become more specific as the issue is triaged
Projects
Status: Done
Development

No branches or pull requests

5 participants