[Discussion] Macro representation in Marker #47

xFrednet · 2023-09-26T07:22:11Z

Macros are an essential and cool part of Rust, but they are also difficult to handle during linting.

Technical background for rustc

Marker currently only operates during rustc's LateLintPass, where all macros have already been expanded. Whether an AST node comes from a macro can be determined by checking its Span. However, the information available from this check is very limited. AFAIK, there is no intended way to retrieve the input tokens of a macro expansion.

Rustc also has an EarlyLintPass, which can be registered to run before the expansion of macros. However, it is soft-deprecated and has apparently caused several problems in the past. The EarlyLintPass is also less flexible than the late lint pass, since it doesn't have access to TyCtxt.

Do we need macros?

I believe that representing macros in some shape or form is important for Marker. I can imagine custom lints that want to check the correct usage of a macro. For this, they should be able to retrieve the input tokens passed to the macro.

It's possible to check the AST nodes that the macro expanded to. This is acceptable in some circumstances, but can cause problems if it's an external macro whose internals can change over time. This solution will often also just be more complicated than checking the input.
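To illustrate why input access matters, here is a minimal sketch of a check that only looks at what the user wrote inside the macro call. It assumes the input is available as plain text, which is a simplification of a real token stream, and both function names are hypothetical. The check is independent of whatever the macro expands to, so it keeps working if an external macro changes its internals.

```rust
/// Splits the raw input of a macro call into its top-level arguments.
/// Commas nested inside `()`, `[]` or `{}` do not split an argument.
/// Simplification: string and char literals containing commas or
/// brackets are not handled, a real token stream would avoid this.
pub fn split_top_level_args(input: &str) -> Vec<String> {
    let mut args = Vec::new();
    let mut depth = 0usize;
    let mut current = String::new();
    for ch in input.chars() {
        match ch {
            '(' | '[' | '{' => {
                depth += 1;
                current.push(ch);
            }
            ')' | ']' | '}' => {
                depth = depth.saturating_sub(1);
                current.push(ch);
            }
            // A comma outside of all brackets ends the current argument
            ',' if depth == 0 => {
                args.push(current.trim().to_string());
                current.clear();
            }
            _ => current.push(ch),
        }
    }
    // Keep the last argument, unless the input ended with a trailing comma
    if !current.trim().is_empty() {
        args.push(current.trim().to_string());
    }
    args
}

/// Hypothetical lint condition: was the macro called with exactly
/// `expected` top-level arguments?
pub fn has_arg_count(input: &str, expected: usize) -> bool {
    split_top_level_args(input).len() == expected
}
```

For the input `1, 2` this yields the two arguments `1` and `2`. Doing the same check on the expanded nodes would require knowing that the macro currently expands to a binary operation.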

So I say: Yes, Marker needs to support macros somehow.

How should they be represented?

As mentioned, rustc (to my knowledge) currently doesn't provide a good way to handle macros during linting. This section will describe my two ideas, regardless of how simple they are to implement:

Suggestion 1: Macros are just nodes in the AST

Rust restricts where macros can occur in the AST. Usually, they can only occur in places where a full AST node can be. A macro also has to expand to a node that's valid in the given location.

I currently believe the best solution would be to represent macros as normal nodes at the locations where they can occur. For example, the following code:

// The macro rules are not important for the issue, I just wanted
// a simple macro for the explanation.
macro_rules! example_macro {
    ($x:expr, $y:expr) => {
        $x + $y
    }
}

fn main() {
    example_macro!(1, 2);
}

The body of fn main() could be represented like this:

Body {
    expr: ExprKind::Block(
        BlockExpr{
            // [...] CommonExprData,
            stmts: [
                StmtKind::Macro( /* example_macro!(1, 2); */
                    MacroStmt {
                        // [...] CommonStmtData
                        
                        // The macro input is available as a token stream
                        input: TokenStream("1, 2"),

                        // The output, the macro was expanded to:
                        output: [
                            StmtKind::Expr( /* 1 + 2 */
                                ExprStmt {
                                    expr: ExprKind::BinOp(
                                        // ...
                                    )
                                }
                            )
                        ]
                    }
                )
            ]
        }
    )
}

Modeling them as full AST nodes feels like the best representation right now. It might also be better for other drivers that don't expand macros, as they can simply place a MacroStmt node in the AST and leave the output blank.
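From a lint's point of view, this could look roughly like the following sketch. The types are heavily simplified stand-ins and not Marker's actual API: the `name` field, the `IntLit` variant, the `macro_inputs` helper and the use of a `String` for the token stream are all assumptions made for illustration.

```rust
/// Hypothetical, heavily simplified expression node.
#[derive(Debug)]
pub enum ExprKind {
    IntLit(i64),
    BinOp(Box<ExprKind>, char, Box<ExprKind>),
}

/// A macro call in statement position.
#[derive(Debug)]
pub struct MacroStmt {
    /// Name of the invoked macro, e.g. `example_macro`.
    pub name: String,
    /// The macro input, kept as text here instead of a real token stream.
    pub input: String,
    /// The nodes the macro expanded to. A driver that doesn't expand
    /// macros would leave this empty.
    pub output: Vec<StmtKind>,
}

#[derive(Debug)]
pub enum StmtKind {
    Expr(ExprKind),
    Macro(MacroStmt),
}

/// Collects the input of every call to `macro_name`, including calls
/// that appear in the output of other macros.
pub fn macro_inputs<'a>(stmts: &'a [StmtKind], macro_name: &str) -> Vec<&'a str> {
    let mut found = Vec::new();
    for stmt in stmts {
        if let StmtKind::Macro(mac) = stmt {
            if mac.name == macro_name {
                found.push(mac.input.as_str());
            }
            // Macros can expand to further macro calls
            found.extend(macro_inputs(&mac.output, macro_name));
        }
    }
    found
}

/// Builds the statements of `fn main()` from the example above.
pub fn example_body() -> Vec<StmtKind> {
    vec![StmtKind::Macro(MacroStmt {
        name: "example_macro".to_string(),
        input: "1, 2".to_string(),
        output: vec![StmtKind::Expr(ExprKind::BinOp(
            Box::new(ExprKind::IntLit(1)),
            '+',
            Box::new(ExprKind::IntLit(2)),
        ))],
    })]
}
```

A lint interested in `example_macro!` would call `macro_inputs(&example_body(), "example_macro")` and get the single input `1, 2`, without having to look at the expanded `1 + 2` at all.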

One problem I see with this approach is that it will be a significant breaking change once it's added to Marker's API. It would also require some changes in rustc to store and expose the macro information.

Suggestion 2: Macros are separate from the AST

Macros can be stored separately from the AST and then be checked on demand. For example, if a node has a Span from an expansion, a lint might be able to request the macro information using the expansion ID. The information would contain the input tokens and probably link to the created AST nodes.

One advantage of this approach is that it's just an addition to the API and wouldn't break anything.
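A minimal sketch of such an on-demand lookup is shown below. Everything in it is hypothetical: `ExpnId`, `Span`, `MacroInfo` and `MacroTable` are simplified stand-ins, not rustc's or Marker's types, and the link to the created AST nodes is left out.

```rust
use std::collections::HashMap;

/// Hypothetical ID of a single macro expansion.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ExpnId(pub u32);

/// Simplified span that only records which expansion, if any, produced it.
#[derive(Debug, Clone, Copy)]
pub struct Span {
    pub expn: Option<ExpnId>,
}

/// Information stored per expansion, separate from the AST.
#[derive(Debug)]
pub struct MacroInfo {
    pub name: String,
    /// The macro input, kept as text here instead of a real token stream.
    pub input: String,
}

/// Side table that the driver fills and lints query on demand.
#[derive(Debug, Default)]
pub struct MacroTable {
    expansions: HashMap<ExpnId, MacroInfo>,
}

impl MacroTable {
    /// Called by the driver for every expansion it knows about.
    pub fn record(&mut self, id: ExpnId, info: MacroInfo) {
        self.expansions.insert(id, info);
    }

    /// Returns the macro information for a span, or `None` if the span
    /// doesn't come from a (known) expansion.
    pub fn macro_info(&self, span: Span) -> Option<&MacroInfo> {
        span.expn.and_then(|id| self.expansions.get(&id))
    }
}
```

Existing node types stay untouched in this design. A lint that doesn't care about macros never calls `macro_info`, which is why this variant would be a purely additive API change.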


xFrednet added A-stable-api Area: Stable API, How it should look and what should be included in it C-discussion labels Sep 26, 2023

xFrednet mentioned this issue Sep 26, 2023

API: Rename AstContext -> MarkerContext (Let's break everything 💥) rust-marker/marker#256

Merged
