Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gherkin: Fix parsing of commented tags #880

Merged
merged 23 commits into from
Feb 6, 2020

Conversation

mpkorstanje
Copy link
Contributor

@mpkorstanje mpkorstanje commented Jan 30, 2020

Summary

Tag parsing in gherkin suffers from four problems

  1. Gherkin does not support comment anywhere but at the start of a line.
  2. IDEs and syntax highlighters suggest that it is possible.
  3. The Java implementation splits tags on white space rather then @.
  4. While other languages split on @, tag expression do not support white space.

To illustrate:

Feature: Example
  @Example Tag and #Comment
  Scenario: Example

In Java this example is tagged with the strings @Example, Tag, and, #Comment. In other languages this would be tagged with just @Example tag and #Comment.

When trying to select this scenario using the tag expression @Example only the Java scenario would be executed. Trying to use @Example tag and #Comment as a tag expression results in a parse error. This means that tags are single words.

So by first removing the comments from the tag line by splitting on \s# and then splitting uncommented tags on @ we can parse tags in line with IDE expectations. Invalidating tags containing white space takes care of non-selectable tags.

Motivation and Context

Fixes: #721

Types of changes

  • Bug fix (non-breaking change which fixes an issue).
  • New feature (non-breaking change which adds functionality).
  • Breaking change (fix or feature that would cause existing functionality to not work as expected).

Checklist:

  • The change has been ported to Java.
  • The change has been ported to Ruby.
  • The change has been ported to JavaScript.
  • The change has been ported to Go.
  • I've added tests for my code.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have updated the CHANGELOG accordingly.

@mpkorstanje mpkorstanje changed the title gherkin/java: Fix tag parsing gherkin: Fix tag parsing Jan 30, 2020
@aslakhellesoy
Copy link
Contributor

To create new cross-platform acceptance tests, this is the easiest way at the moment:

  1. TDD the change (like you've done in Java) in gherkin/go
  2. Create some new testdata files (in this case a few testdata/bad/*.feature ones)
  3. Regenerate the approval test data from gherkin/go with GOLDEN=1 make golden
  4. cp -R testdata ../testdata && rsync_files

Since you started with Java, we could do all of this from gherkin/java if we add the same golden task in gherkin/java/Makefile

@mpkorstanje
Copy link
Contributor Author

Cheers! I'll check it out tomorrow.

@@ -3,29 +3,29 @@
"astNodeIds": [
"4"
],
"id": "24",
"id": "34",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aslakhellesoy, @vincent-psarga I don't understand why these ids changed. These scenarios didn't change position. Is there an implicit ordering going on?

@mpkorstanje
Copy link
Contributor Author

mpkorstanje commented Jan 31, 2020

@luke-hill sorry. Took me like 45 minutes to do the ruby. Getting better at!

@mpkorstanje mpkorstanje marked this pull request as ready for review January 31, 2020 20:40
@mpkorstanje mpkorstanje changed the title gherkin: Fix tag parsing gherkin: Fix parsing of commented tags Jan 31, 2020
@mpkorstanje mpkorstanje merged commit 7e68045 into master Feb 6, 2020
@mpkorstanje mpkorstanje deleted the gherkin-java-fix-tag-parsing branch February 6, 2020 17:02
Ajwah pushed a commit to Ajwah/ex-gherkin that referenced this pull request Jul 23, 2020
In the past 6 months, `cucumber/cucumber`-repo has seen a variet of
changes all of which have been fixed with this commit:
  * background to have `id`
  * rule to have `id`
  * examples to have `id`
  * ensure to only strip leading and trailing spaces in table cells.
  Keep newline intact
  * ensure escaped delimiter within doc_string is not recorded as escaped
  * doc_string has field change: contentType -> mediaType
  * ignore comments on tags

See:
  cucumber/common#891
  cucumber/common#880
  cucumber/common#889
  cucumber/common#800

At the time of this commit, master for `cucumber/cucumber` was at:
a7c593f479e7ae739b49108f79bba4853352d99c
@gasparnagy
Copy link
Member

@mpkorstanje @aslakhellesoy @sebrose I run into this change wen I tried to align the .NET Gherkin parser to the others, but I don't understand it. Regarding my knowledge Gherkin was a language that only supported full line comments (so # could only appear at the beginning of the line, except whitespace). We can debate whether this was a good decision or not, but this is how we decided. The motivation was that we wanted the scenario authors to be able to write # in the step text without escaping.

This PR changes this behavior, but only partially. It allows partial line comments, but only in tag lines. (At least based on the test files (tags.feature), I haven't checked the code.)

This seems to be inconsistent for me. Is this really what we wanted?

@mpkorstanje
Copy link
Contributor Author

mpkorstanje commented Feb 16, 2021

It allows partial line comments, but only in tag lines.

It's limited to tags only. Handled in the same code that splits the tags into individual items.

Is this really what we wanted?

Yes. It's the defacto behavior of various Gherkin highlighters and parsers. IDEA, Eclipse and the syntax highlighter on Github all have this behavior:

Feature: example
  @Tag #comment
  Scenario: example #not a comment

This is confusing users (#721) and I feel that avoiding this confusion is more important then being consistent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

gherkin: Commented Cucumber tags still being picked up in getTags()
3 participants