Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lexer: allow TABS for indentation #2392

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

pkitszel
Copy link

Allow TAB character to be used for indentation.

This is useful to have TAB character used as indentation, especially when parts of build.ninja are hand-written as HEREDOCs in otherwise TAB-indented file (either mandated by style for other part of project, or required by language itself).

Changing lexer is easy thanks to the use of re2c, syntax is perhaps a bit too permissive now, but that is job of the parser to reject use of mixed indentation.

Let's stop complaining that:
ninja: error: build.ninja:3: expected 'command =' line when it is exactly:
command = cc $cflags -c $in -o $out

Closes #1598
Signed-of-by: Przemek Kitszel [email protected]

token = lexer.ReadToken();
EXPECT_EQ(Lexer::ERROR, token);
EXPECT_EQ("tabs are not allowed, use spaces", lexer.DescribeLastError());
Copy link
Contributor

@digit-google digit-google Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests to also verify that:

  • line continuations with tabs actually work. E.g.:
build foo: rule_name $
<tab>input1 $
<tab>input2
  • Tabs that appear after indentation are still not allowed, except when they are in comments. E.g.
rule correct_rule
<tab>command = foo bar

rule incorrect_rule
<tab>command = foo<tab>bar

build incorrect_build: rule_name<tab>input1<tab>input2

The most important reason to avoid tabs is to avoid using them in command = definitions because this will be equally confusing to users.

Finally, please update manual.doc appropriately to document the new behavior.

Copy link
Author

@pkitszel pkitszel Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I have indeed need to tweak the indentation after a linebreak.

However, even without my changes command = foo<tab>bar is fine to use.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, then no need to test for this then. Thank you for double-checking.

@pkitszel
Copy link
Author

I have extended PR to allow TABs used for variables, thank you @digit-google

I have my local test file in misc/, but I have no idea if those are automatically executed in some flows. It also creates tpm files. Is it worth to include it here too?

src/lexer_test.cc Outdated Show resolved Hide resolved
src/lexer.in.cc Outdated Show resolved Hide resolved
@digit-google
Copy link
Contributor

Thank you, I added two small comments. Can you also squash the result into a single commit, this will keep git history simpler for something small like this.

I am not sure about your local test file under misc/, since it is not part of this PR. Normally pytest-3 is invoked in the CMake build directory, pointing to the root source directory, and all *_test.py will be invoked automatically, including those under misc/ (this is not ideal imho, since it can pick up broken test files from third-party dependencies). See the content of the files under .github/workflows/ for details.

Finally, I don't know about tpm files, what are these?

@pkitszel
Copy link
Author

Thank you, I added two small comments. Can you also squash the result into a single commit, this will keep git history simpler for something small like this.

Sure, the bigger commit already does two things, which is a tid bit above my liking, but I agree that the smaller one does not really decouple anything substantial :)

I am not sure about your local test file under misc/, since it is not part of this PR. Normally pytest-3 is invoked in the CMake build directory, pointing to the root source directory, and all *_test.py will be invoked automatically, including those under misc/ (this is not ideal imho, since it can pick up broken test files from third-party dependencies). See the content of the files under .github/workflows/ for details.

Finally, I don't know about tpm files, what are these?

ugh, sorry, 'tmp' files, for temporary

the file would be misc/tabs-test.ninja , and the passing criteria is that ./ninja -v -f ./misc/tabs-test.ninja runs fine

I would be happy to extend current PR to add this test (perhaps without temporaries even) if those "misc/*.ninja" tests are already executed somehow (I didn't found that).

But ninja_syntax.py is a generator, that does not need to use TABs, so no point in changing it (and it's existing tests should cover what we have in this PR).

@digit-google
Copy link
Contributor

Ah, I suggest you create a temporary directory and run your Ninja test invocations there. Look at the run() method in misc/output_test.py to see how this is done currently. You could add regression tests directly there too.

Note that pytest-3 is never invoked on Windows though in our CI workflows, and I suspect the current output_test.py would not work well there.

@pkitszel pkitszel force-pushed the allow-tabs branch 2 times, most recently from 4b7b59f to f5f33f0 Compare March 19, 2024 10:38
@@ -162,6 +162,30 @@ def test_tool_inputs(self):
out2
''')

def test_tabs_indent(self):
content = '''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It is really difficult to see the tabs in this input. I suggest you use explicit <TAB> markers and str.replace() to replace them with real tabs, as in:

    # For clarity, tabs are written as <TAB> in the literal string below, then
    # replaced with a real "\t" value.
    content = '''
    rule exec
    <TAB>command = $cmd
    ....
    """.replace("<TAB>", "\t")

@digit-google
Copy link
Contributor

No worries for the double-push. Can you modify doc/manual.asciidoc to mention that TABs are allowed since 1.12 (next planned release IIRC).

@pkitszel
Copy link
Author

No worries for the double-push. Can you modify doc/manual.asciidoc to mention that TABs are allowed since 1.12 (next planned release IIRC).

I have no idea if it is safe to left $ escape rules not specified or I should go further this route ;)

OTOH I have spot a problem that some (TAB containing) variables that are used without qouting are passed as two params to the final command, sounds like a problem. (Not present if there is space instead of TAB)

so, please do not merge yet

@pkitszel
Copy link
Author

so, please do not merge yet

I was somehow testing with a system ninja instead of local build - turns out that code is fine as is.

I did a rebase + applied suggestion in test to make it visible.

doc/manual.asciidoc Outdated Show resolved Hide resolved
@digit-google
Copy link
Contributor

Nicely done, thanks.

@pkitszel
Copy link
Author

BTW, is there any prefered way to measure performance impact of any change/PR?

doc/manual.asciidoc Outdated Show resolved Hide resolved
doc/manual.asciidoc Outdated Show resolved Hide resolved
@jhasse jhasse added the feature label Apr 14, 2024
@jhasse jhasse added this to the 1.13.0 milestone Apr 14, 2024
Allow TAB character to be used for indentation.

This is useful to have TAB character used as indentation, especially when
parts of build.ninja are hand-written as HEREDOCs in otherwise
TAB-indented file (either mandated by style for other part of project, or
required by language itself).

Changing lexer is easy thanks to the use of re2c, syntax is perhaps a bit
too permissive now, but that is job of the parser to reject use of mixed
indentation.

Let's stop complaining that:
ninja: error: build.ninja:3: expected 'command =' line
when it is exactly:
	command = cc $cflags -c $in -o $out

tests: print string instead of int for Lexer::Token

Extend C++ tests by a wrapper for Lexer::Token printing, to have string
value reported instead of numeric value, that gives "newline" instead of
"8" in test results.
Original line numbers are kept in error messages as they were before.

Extend python functional test to have TABs used in both variable and build
statements.

Closes ninja-build#1598
Signed-of-by: Przemek Kitszel <[email protected]>
@jhasse
Copy link
Collaborator

jhasse commented Apr 27, 2024

I'm not sure if this is a good idea yet, since it will result in breaking backwards compatibility (all build.ninja files with tabs won't work with older ninja versions). Please correct me if I'm wrong.

My suggestion would be to still put out a warning when there are tabs, but not fail. Then in the next release remove the warning.

@pkitszel
Copy link
Author

pkitszel commented May 15, 2024

I'm not sure if this is a good idea yet, since it will result in breaking backwards compatibility (all build.ninja files with tabs won't work with older ninja versions). Please correct me if I'm wrong.

My suggestion would be to still put out a warning when there are tabs, but not fail. Then in the next release remove the warning.

I believe that users should describe/generators should emit the minimal version required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Is there a good reason to err on tabs in build files after all?
3 participants