As I said before (https://stackoverflow.com/questions/78901061/ignoring-whitespace-everywhere-except-for-specific-rule#comment139111216_78901061), you need to either use lexer modes for the C# code blocks or define a parser grammar for them. You can then extract the text, whitespace included, from the char stream. For an example of the lexer mode solution, see the antlr4 grammar in grammars-v4, which switches to a lexer mode when a left curly brace is found: https://github.com/antlr/grammars-v4/blob/e07dbbf3445d31da61af5f54f04df78ea40ab9f8/antlr/antlr4/ANTLRv4Lexer.g4#L114 In TargetLanguageAction, characters are tokenized individually, but you could just as well "more()" the characters onto one large token for the entire C# code block.
Why not just get the start index of your '{' token and the start index of your '}' token, and extract the text between them from your input stream? You will need real tokens for this, though (named lexer rules for the braces), not literal strings in the parser rules.
On Aug 29, 2024, at 04:59, Ero ***@***.***> wrote:
Thanks for your time, but I'm giving up.
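The offset-based suggestion in this reply can be sketched without any parser at all. The following is a minimal illustration, not the poster's or ANTLR's implementation: it finds each `name { ... }` block in the original script text, matches braces by counting, and slices the raw body out of the untouched input so that whitespace, comments, and line numbers survive. The names `Block` and `extract_blocks`, and the block names `startup` and `update` in the usage note, are hypothetical. It also assumes braces are balanced and does not account for braces inside C# string literals or comments. With ANTLR, the two offsets would come from the start indices of the brace tokens instead of from manual scanning.

```python
import re
from typing import NamedTuple


class Block(NamedTuple):
    name: str        # identifier immediately before the opening brace
    body: str        # raw text between the braces, exactly as written
    start_line: int  # 1-based line on which the body text begins


# Hypothetical block header: an identifier followed by '{'.
_HEADER = re.compile(r"([A-Za-z_]\w*)\s*\{")


def extract_blocks(source: str) -> list[Block]:
    """Return every top-level `name { ... }` block with its raw body."""
    blocks = []
    pos = 0
    while (match := _HEADER.search(source, pos)) is not None:
        body_start = match.end()  # offset just past the opening '{'
        depth = 1
        i = body_start
        # Naive brace counting; braces in strings/comments are NOT handled.
        while i < len(source) and depth > 0:
            if source[i] == "{":
                depth += 1
            elif source[i] == "}":
                depth -= 1
            i += 1
        if depth != 0:
            raise ValueError(f"unclosed block {match.group(1)!r}")
        body_end = i - 1  # offset of the matching '}'
        # Slice the original text so nothing inside the block is altered.
        body = source[body_start:body_end]
        start_line = source.count("\n", 0, body_start) + 1
        blocks.append(Block(match.group(1), body, start_line))
        pos = i  # continue scanning after the closing brace
    return blocks
```

For example, `extract_blocks("startup {\n    int x = 1;\n}\n")` yields one block whose `name` is `startup` and whose `body` is the text between the braces, newlines and indentation included. The `start_line` field is what would let compiler diagnostics be mapped back to the right place in the original script.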
Hello, I would like to parse the following:
In short: the contents of the `state` blocks are custom. I can already parse those and they aren't a problem. However, the contents of all other blocks are plain C# code that is compiled using Roslyn. I therefore need to preserve the whitespace, comments, etc. so that the code compiles correctly and reported line numbers point to the correct place in the original script.
How do I do this? I need to be able to access both the block's name and its raw contents from the generated code.