Markdown support (alternative to Gherkin) #1209

aslakhellesoy · 2020-10-04T20:49:11Z

Summary

This PR adds support for Markdown as an alternative to Gherkin.

Details

The Gherkin lexer/tokenizer has been modified to recognise Markdown. This has been done by adding a new dialect named md.

The idea is that the rest of the toolchain remains mostly unchanged:

- The parser is the same
- Cucumber is the same (except it needs to modify the glob logic to load both `**/*.feature` and `**/*.md`)
- Formatters are mostly the same
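The glob change mentioned for Cucumber could look roughly like the sketch below. It is not taken from any Cucumber implementation: `SPEC_SUFFIXES`, `is_spec_file` and `find_spec_files` are hypothetical names, and Python is used only for illustration.

```python
from pathlib import Path, PurePath
from typing import List

# Hypothetical constant: the suffixes treated as executable specifications.
SPEC_SUFFIXES = {".feature", ".md"}


def is_spec_file(path: str) -> bool:
    """Return True when the path ends in one of the recognised suffixes."""
    return PurePath(path).suffix in SPEC_SUFFIXES


def find_spec_files(root: str) -> List[Path]:
    """Walk root recursively, roughly **/*.feature plus **/*.md."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and is_spec_file(p.name)
    )
```

Every `.md` file under the root is picked up by this sketch, whether or not it contains scenarios.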

The biggest change will be in the HTML formatter - or more specifically in @cucumber/react. It needs a whole new way to render documents:

- Use a Markdown library to render the source instead of our own custom React components to render a GherkinDocument AST.
- Decorate the rendered Markdown DOM with results, attachments etc. from other messages

Motivation and Context

Adding prose, diagrams and other rich markup to Gherkin documents is cumbersome at best.

Although the Gherkin grammar doesn't make it explicit, you can put anything in the description section of a Feature, Scenario etc., and some formatters (such as @cucumber/react / the HTML formatter) will process this as Markdown.

In other words, it's possible to put small snippets of Markdown inside a Gherkin document.

This isn't how people work with Markdown. If you want to use Markdown, it's much more natural if the entire document is Markdown. This gives you more flexibility to write a readable document.

We use existing Markdown constructs to recognise scenarios:

- `##` is a Scenario
- `###` is Examples
- `*` (list item) is a step (Given, When, Then)
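As a rough illustration of the three rules above, a line-by-line classifier might look like this. It is a sketch only: the token names (`ScenarioLine`, `ExamplesLine`, `StepLine`, `Empty`) and the `classify` function are hypothetical and do not claim to mirror the actual Gherkin token matcher.

```python
import re
from typing import Tuple

# Hypothetical token names; the real Gherkin token types may differ.
# The ### rule is listed first so it is tried before the ## rule.
RULES = [
    (re.compile(r"^###\s+(.*)$"), "ExamplesLine"),
    (re.compile(r"^##\s+(.*)$"), "ScenarioLine"),
    (re.compile(r"^\s*\*\s+(.*)$"), "StepLine"),
]


def classify(line: str) -> Tuple[str, str]:
    """Return (token_type, text) for a single line of a Markdown document."""
    for pattern, token_type in RULES:
        match = pattern.match(line)
        if match:
            return token_type, match.group(1).strip()
    # Anything else (prose, other heading levels) carries no Gherkin meaning.
    return "Empty", line.strip()
```

For example, `classify("## Buy a ticket")` yields a `ScenarioLine` token, while a paragraph of prose or an h1 falls through to `Empty`.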

Types of changes

Bug fix (non-breaking change which fixes an issue).
New feature (non-breaking change which adds functionality).
Breaking change (fix or feature that would cause existing functionality to not work as expected).

Checklist:

The change has been ported to Java.
The change has been ported to Ruby.
The change has been ported to JavaScript.
The change has been ported to Go.
The change has been ported to .NET.
I've added tests for my code.
My change requires a change to the documentation.
I have updated the documentation accordingly.
I have updated the CHANGELOG accordingly.

aslakhellesoy · 2020-10-04T22:28:02Z

Here is an example of IntelliJ IDEA running Cucumber which executes a Markdown document! - cucumber/cucumber-jvm#2140

mpkorstanje · 2020-10-04T23:45:50Z

What happens if I put more information in a Markdown document than just scenarios? Like for example if I were to have scenarios at a different heading than h2? Or if I were to have documentation at a scenario heading. Or if I were to have a folder of .md files some with scenarios, some with just documentation.

An example of all these at once:

Hello world
===========

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat.

## The world is round

Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
fugiat nulla pariatur.

### Scenario: Something about math
 * step one
 * step two
 * step three

### Scenario: Something about gravity
 * step one
 * step two
 * step three

## The world is wet

Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.

### Scenario: Something about chemistry
 * step one
 * step two
 * step three

### Scenario: Something about weather
 * step one
 * step two
 * step three

## Recommended accommodations

Vivamus eget magna eros. Mauris feugiat elit a lectus vulputate eleifend
a sed nulla. 

### If you are adventurous

Nunc sed auctor sem. Quisque vitae ligula commodo quam
vehicula mollis ut at magna. 

### Only in 2020

ed semper feugiat turpis in vulputate. Praesent varius
leo a enim sollicitudin lobortis. Nam a gravida ex. Nulla
interdum orci purus, vitae pellentesque lorem ultricies ac. 

## Tourist attractions

 Pellentesque efficitur turpis lorem, in tempor metus varius vitae. Proin
lectus dolor, luctus eget pulvinar vitae, commodo nec sapien. Donec
mattis quis mi sit amet auctor.

With that in mind I would strongly recommend a separate parser for Gherkin in Markdown. It will make evolving and experimenting with the parser easier. It also removes some of the edge cases that are leaking out of the parser abstraction and into other parts right now (e.g. the json file not being usable to generate annotations).

aslakhellesoy · 2020-10-05T10:33:22Z

> What happens if I put more information in a Markdown document than just scenarios?

These lines would just be ignored by the scanner. Probably marked as Empty.

> Like for example if I were to have scenarios at a different heading than h2?

I've added GHERKIN_MARKDOWN.md which describes the proposed syntax in more detail. Scenarios would only be recognised with ## or possibly ### if the ## above is interpreted as a rule.

> Or if I were to have a folder of .md files some with scenarios, some with just documentation.

I'm not sure. If the user doesn't specify what files to include, we could just parse them all. The "documentation" ones might end up having a lot of "undefined" scenarios. If that becomes a hassle we could come up with a more restrictive syntax to reduce the likelihood of this happening.

> With that in mind I would strongly recommend a separate parser for Gherkin in Markdown.

Maybe. I think it's too early to make this decision. It's an investment I would like to defer until we have more data. If we can make this work without writing a new parser I would prefer that.

mpkorstanje · 2020-10-05T10:53:15Z

There were two ideas that underpinned my questions. I think they've both gone unaddressed.

1. The proposed Markdown syntax is very inflexible. Why should I always define my scenarios at h2? I'm writing a document that contains features, not a feature file that happens to look like Markdown. There seems to be no point in using Markdown otherwise.
2. The implementation is a feature file with different keywords that happen to intersect with Markdown, but it isn't actually Markdown. The Gherkin Markdown parser should accept all valid Markdown documents but only provide pickles if there are indeed scenarios contained within. Currently the parser will reject anything that isn't structured like a feature file.

aslakhellesoy · 2020-10-05T11:29:40Z

> Why should I define my scenarios always at h2

I think it's important that users understand what's regarded as a scenario. The parsing rules must be easy to retain and understand for humans. If we make this flexible, as you seem to suggest, I think it will be harder for people to retain and understand the parsing rules.

Gauge is a tool inspired by Cucumber that has supported Markdown from the start. It's not as popular as Cucumber, but it seems to have a healthy user base. I take this as a sign that the Markdown syntax they have settled on works for end users, and that it would be safe for us to adopt a similar (or perhaps identical) syntax. Gauge defines a scenario as an h2.

In order to support the alternative syntax for h2 (text underlined with -----) we'd need a more advanced parser. I'd like to try a third-party CommonMark parser when/if we decide to add support for this. But as I said above, I think we can defer this until we have more feedback.
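The underlined heading form cannot be recognised one line at a time, because the text only becomes a heading once the following line has been seen. The sketch below shows the simplest case with one line of lookahead, rewriting underlined headings to the `#` form. `normalise_setext` is a hypothetical helper, and it deliberately ignores multi-line paragraphs, lazy continuation and other CommonMark edge cases, which is presumably where a full third-party parser would be needed.

```python
import re
from typing import List

# Underline patterns for the two heading levels (simplified).
SETEXT_H1 = re.compile(r"^ {0,3}=+\s*$")
SETEXT_H2 = re.compile(r"^ {0,3}-+\s*$")

# Lines starting with these markers are treated as non-paragraph text.
NON_TEXT_PREFIXES = ("#", "*", "-", ">", "|")


def normalise_setext(lines: List[str]) -> List[str]:
    """Rewrite 'Text' + underline pairs as '# Text' / '## Text' lines."""
    out: List[str] = []
    i = 0
    while i < len(lines):
        line = lines[i]
        following = lines[i + 1] if i + 1 < len(lines) else ""
        is_text = bool(line.strip()) and not line.lstrip().startswith(NON_TEXT_PREFIXES)
        if is_text and SETEXT_H1.match(following):
            out.append("# " + line.strip())
            i += 2  # consume the underline as well
        elif is_text and SETEXT_H2.match(following):
            out.append("## " + line.strip())
            i += 2
        else:
            out.append(line)
            i += 1
    return out
```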

> Currently the parser will reject anything that isn't structured like a feature file.

Yes, I think we need to improve the Markdown parser to be more tolerant of documents that don't use the Gherkin structure.

slavcodev · 2020-10-05T11:38:22Z

gherkin/GHERKIN_MARKDOWN.md

+- `* {Keyword}` - Given, When, Then, And and But
+- `|` - Tables (DataTable and ExamplesTable)
+- `\`\`\`` - DocString
+- `>` - prefix for @tags


Block quotes may not be very appropriate for tags; they may be useful for descriptions of features or scenarios.

Please consider allowing tags in HTML comments, especially for features, e.g.

 # Feature  ## Scenario

aslakhellesoy · 2020-10-05

Using > for descriptions of scenarios and features wouldn't be necessary. You'd just use normal paragraphs for that:

## This is a scenario

This is the description over a few lines...

* a step
* another step

Are there other reasons we want to consider something other than > to prefix tags?

If we put tags inside HTML comments, they won't be rendered once the Markdown is converted to HTML, and I think most users would expect them to be rendered.

slavcodev · 2020-10-06

Sorry, maybe I was not very clear with my bad English. I did not mean > as a marker for the feature description; I meant that > is a quote in Markdown, and I imagine I would want to add some quotes to the document that describes a feature.

> If we put tags inside HTML comments, they won't be rendered once the Markdown is converted to HTML, and I think most users would expect them to be rendered.

Yeah, this is what I also came to after thinking more about my proposal.

In general I think > is fine (if the parser looks for tags that start with @ inside the quotes).

The only thing that is unclear to me: is the parser going to look for tags before # Feature or after it? I ask because putting quotes before a level 1 header is not how we usually write documents.

aslakhellesoy · 2020-11-17

Yeah, I think it would look better if the tags came after the heading. However, that changes the grammar a bit and might complicate things.

This could work:

# @foo @bar
# Feature: Hello

According to research:

> Water boils at 200C

## @zap
## Scenario: Hello

It admittedly looks a bit weird in Markdown and the default rendering, but we could style it so that # @tag headers are rendered more like small tags above the real header.
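Both tag placements discussed in this thread come down to recognising a line that holds nothing but @-prefixed words after a marker, so that an ordinary block quote or heading is left alone. A minimal sketch, assuming a hypothetical `parse_tag_line` helper that is not part of any Gherkin library:

```python
import re
from typing import List, Optional

# A tag line is a '>' or heading marker followed only by @-prefixed words.
TAG_LINE = re.compile(r"^(?:>|#{1,6})\s*(@\S+(?:\s+@\S+)*)\s*$")


def parse_tag_line(line: str) -> Optional[List[str]]:
    """Return the tags on a tag line, or None for any other line."""
    match = TAG_LINE.match(line)
    if match is None:
        return None
    return match.group(1).split()
```

With this rule, `> @foo @bar` and `# @foo @bar` both yield tags, while `> Water boils at 200C` and `# Feature: Hello` remain an ordinary quote and heading.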

mpkorstanje · 2020-10-05T16:06:52Z

> If we make this flexible, as you seem to suggest, I think it will be harder for people to retain and understand the parsing rules.
>
> I take this as a sign that the Markdown syntax they have settled for works for end users, and that it would be safe for us to adopt a similar (or perhaps identical) syntax.

I'm expecting it to be flexible because it is called Markdown. The example is meant to illustrate that. Gauge, on the other hand, calls its specifications "Gauge specifications" and uses the .spec extension with a syntax similar to Markdown (but not actually Markdown). This avoids the problems of mixing Markdown and specifications, and manages expectations.

aslakhellesoy · 2020-10-05T20:34:32Z

One idea is to define scenarios like this:

Bla bla

# Feature: Addition

Bla bla

## Scenario: 2+3

Bla bla

* Given I have entered 2

Bla bla

People could use any number of #; we would match based on the Scenario keyword. This also allows internationalising the Markdown parsers.

I think this kind of “overlaying” Gherkin on top of Markdown might work better.

It addresses the concern you had about pure documentation documents.

It also makes it easier for authors and consumers to spot which parts of the documentation are executable.

I also think this would be easy to implement.
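A rough sketch of the overlay idea, assuming keywords are matched at any heading level and everything else is treated as Empty. The `DIALECTS` table is a hypothetical, heavily abbreviated stand-in for the real Gherkin dialect data, and `tokenize` and `executable` are illustrative names only.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical, abbreviated dialect table (the real one has many more keywords).
DIALECTS: Dict[str, Dict[str, List[str]]] = {
    "en": {
        "feature": ["Feature"],
        "scenario": ["Scenario", "Example"],
        "step": ["Given", "When", "Then", "And", "But"],
    },
    "fr": {
        "feature": ["Fonctionnalité"],
        "scenario": ["Scénario", "Exemple"],
        "step": ["Soit", "Quand", "Alors", "Et", "Mais"],
    },
}

Token = Tuple[str, str]


def tokenize(markdown: str, language: str = "en") -> List[Token]:
    """Overlay Gherkin on Markdown: keyword lines become tokens, the rest Empty."""
    dialect = DIALECTS[language]
    headers = "|".join(re.escape(k) for k in dialect["feature"] + dialect["scenario"])
    steps = "|".join(re.escape(k) for k in dialect["step"])
    heading_re = re.compile(r"^#{1,6}\s+(%s):\s*(.*)$" % headers)
    step_re = re.compile(r"^\s*\*\s+((?:%s)\s.*)$" % steps)

    tokens: List[Token] = []
    for line in markdown.splitlines():
        heading = heading_re.match(line)
        step = step_re.match(line)
        if heading:
            kind = "FeatureLine" if heading.group(1) in dialect["feature"] else "ScenarioLine"
            tokens.append((kind, heading.group(2).strip()))
        elif step:
            tokens.append(("StepLine", step.group(1).strip()))
        else:
            tokens.append(("Empty", line))
    return tokens


def executable(tokens: List[Token]) -> List[Token]:
    """Drop Empty tokens, leaving only the parts that would be executed."""
    return [t for t in tokens if t[0] != "Empty"]
```

Under these assumptions a pure documentation file produces only Empty tokens, so no scenarios would be reported for it.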


aslakhellesoy

Reviewed with @aurelien-reeves and @mattwynne

- Call it "Markdown with Gherkin" (MdG)
  - Rename token matcher
- Document why we're not matching dialect (it has to be set globally by Cucumber; cucumber-js has a command-line option for this).
- Document that JSON formatters won't have descriptions for Markdown documents.
  - Tell people who want it that the JSON formatter is in maintenance mode - use message format instead.
  - Maybe add descriptions to AST, it might be easy...

Make lexer handle basic Markdown

59bc13e

aslakhellesoy added library: gherkin type: feature labels Oct 4, 2020

aslakhellesoy mentioned this pull request Oct 4, 2020

Support Markdown cucumber/cucumber-jvm#2140

Closed

Add markdown source envelope, document syntax.

06424d2

slavcodev reviewed Oct 5, 2020

Aslak Hellesøy and others added 17 commits October 8, 2020 11:01

WIP

7d69bfc

Merge branch 'master' into markdown

1caf2d4

Don't assume .feature extenstion

fd9f1c3

Use a separate markdown tokenizer

3e029e9

Extract ITokenMatcher interface

7be0198

Add new Markdown tokenizer

4e17488

Merge branch 'master' into markdown

4cf018f

Recognise non-keyword lines as Empty

db0d9a9

WIP

9853dee

Merge branch 'master' into markdown

0766138

Merge branch 'master' into markdown

cfbc2af

Merge branch 'master' into markdown

017d14c

Reduce size of generated Languages.pm module by 70%

f2cc54e

Rationale: fewer spaces to parse means more efficient loading.

Use Carton (Perl's Bundler) to store project-local dependencies

89e3f12

Update dependency org.mockito:mockito-junit-jupiter to v3.7.0

a704da7

Increase stale time

d5fb5c2

Fix eslint warnings

13e54cb

Aslak Hellesøy added 3 commits May 7, 2021 09:34

Merge branch 'markdown' of github.com:cucumber/cucumber into markdown

9ad00c9

Merge branch 'master' into markdown

41dcf23

Roll back changes to @cucumber/react. We'll redo it when #1391 lands

2c0bf4c

aslakhellesoy commented May 7, 2021

aslakhellesoy and others added 22 commits May 7, 2021 16:04

Merge branch 'master' into markdown

a742a17

Merge branch 'master' into markdown

380d2fe

Update documentation

c2fc9b6

Renames

82b6ecd

Document why we're not recognising language directives

630c9f6

Capture descriptions. Make Feature header optional.

94fd5a7

Reformat, merge main

d9513e7

Merge remote-tracking branch 'origin/master' into markdown

2d9078d

Update MARKDOWN_WITH_GHERKIN.md

ba60e40

Fix a typo

Merge remote-tracking branch 'origin/master' into markdown

6902df9

Remove dewscription from Markdown

963dcbf

Merge branch 'markdown' of github.com:cucumber/cucumber into markdown

60f1dae

Merge master

e090a61

Update dependencies

0b3874b

Merge branch 'master' into markdown

62fefff

Merge remote-tracking branch 'origin/master' into markdown

25ceeb3

Update dependencies

224f1ab

Merge branch 'master' into markdown

5ca0d0b

Update dependencies

7e279a4

Update package-lock.json

2beae34

Merge master

9a7f3e2

Document why the AST has no description

f6a9593

aslakhellesoy merged commit 763170c into master May 14, 2021

aslakhellesoy deleted the markdown branch May 14, 2021 12:56

slavcodev Oct 5, 2020

aslakhellesoy Oct 5, 2020

slavcodev Oct 6, 2020

aslakhellesoy Nov 17, 2020
