Gherkin: Delegate to Gherkin-Go prebuilt binaries #425

aslakhellesoy · 2018-07-03T11:36:48Z

aslakhellesoy · 2018-07-03T11:51:07Z

FWIW there are only 348 SLOC left in gherkin-java after I deleted 10K SLOC.

What @cucumber/committers want to help out with the remaining implementations? Nothing is more satisfying than deleting code :-)

Also note that gherkin now supports a new Rule keyword (#416), so if you want that to be available in the next version of cucumber-X - this has to be done.

d-led · 2018-07-03T11:57:54Z

the bullet points are editable. Is this by design?

what about platforms where there are no gherkin-go libraries? E.g. as there's no viable Go compiler, is it out of scope?

aslakhellesoy · 2018-07-03T12:04:23Z

Go's toolchain can cross-compile binaries for most platforms, from any OS that has a Go compiler. We're currently doing that as part of the CI process. This is what we're currently building:

gherkin-darwin-386
gherkin-darwin-amd64
gherkin-freebsd-386
gherkin-freebsd-amd64
gherkin-freebsd-arm
gherkin-linux-386
gherkin-linux-amd64
gherkin-linux-arm
gherkin-linux-mips
gherkin-linux-mips64
gherkin-linux-mips64le
gherkin-linux-mipsle
gherkin-linux-s390x
gherkin-netbsd-386
gherkin-netbsd-amd64
gherkin-netbsd-arm
gherkin-openbsd-386
gherkin-openbsd-amd64
gherkin-windows-386.exe
gherkin-windows-amd64.exe

The CI process also uploads them to S3, where they are publicly accessible. Try this:

curl https://s3.eu-west-2.amazonaws.com/io.cucumber/gherkin-go/master/gherkin-darwin-amd64 -o gherkin-darwin-amd64
chmod +x gherkin-darwin-amd64
./gherkin-darwin-amd64 --json features/*.feature

If there are additional platforms we need to support I'm sure we can make it work for most of them.

aslakhellesoy · 2018-07-03T12:05:23Z

the bullet points are editable. Is this by design?

Yes, but only for committers. See https://blog.github.com/2013-01-09-task-lists-in-gfm-issues-pulls-comments/

muggenhor · 2018-07-03T16:21:52Z

Go's toolchain can cross-compile binaries for most platforms, from any OS that has a Go compiler. We're currently doing that as part of the CI process. This is what we're currently building:

This is not relevant for me at work yet, but we're targeting aarch64 (aka arm64) and I'm expecting to see more of that in the future, not less. So adding that as a target CPU seems desirable.

aslakhellesoy · 2018-07-03T16:22:53Z

@muggenhor PR for that would be much appreciated

gasparnagy · 2018-07-03T18:19:40Z

As I have mentioned earlier, having a process-execution-based parser is great for maintainability, but there are other concerns (performance, security, working offline) that make still room for native parsers. As far as I understand now the decision has been made that this repo will contain the go wrappers. Is there any advice on how/where the native parsers should be maintained (if the community wishes to do that of course)? Shall we host them under the "cucumber" organization (e.g. cucumber/gherkin-dotnet-native) or shall we move it to other organizations/users. I would keep it within cucumber for more consistency. With @SabotageAndi, we are committed to keep maintaining the .NET native parser, and personally, I would like to see a native javascript parser too (I yet have to see how I can contribute to that).

aslakhellesoy · 2018-07-03T18:50:35Z

Regarding performance - I haven't done any throrough benchmarking, but here is a small one:

# Gherkin-Java 5.1.0
time ./bin/gherkin testdata/good/*.feature > /dev/null

real	0m0.299s
user	0m0.367s
sys	0m0.068s

# Gherkin-Java 6.0.0-SNAPSHOT (master)
time ./bin/gherkin testdata/good/*.feature > /dev/null

real	0m0.313s
user	0m0.388s
sys	0m0.065s

For Java there is no noticeable degradation in performance. The go binary is very fast:

# Gherkin-Go (master)
time ./bin/gherkin testdata/good/*.feature > /dev/null

real	0m0.021s
user	0m0.007s
sys	0m0.006s

Regarding working offline: The jar currently bundles binaries for windows/linux/osx + amd64/x86_32. It only tries to download binaries from S3 for other platforms. We should probably bundle all the binaries so we never have to go online. It would increase the size of the jar a little, but disk is cheap.

@gasparnagy I hope this adresses your concerns about performance and the ability to work offline.

Let's focus on security issues next and see where that brings us. What securtity concerns do you have?

l3pp4rd · 2018-07-03T19:12:26Z

have you considered using upx for go binary compression? UPX may decrease the executable startup performance by a tiny fraction, but may reduce the binary size by more than a half, I was successfully using it for more than a year, there was one moment when it was producing a failing binary, but that was an issue on go end.

aslakhellesoy · 2018-07-03T19:16:48Z

Thanks for the upx pointer l3pp4rd - sounds like a good idea. Do you want to tweak the build to use it?

l3pp4rd · 2018-07-03T19:25:56Z

in a day or two I will be able to, if it can wait. I have a baby now, who changed my free time dramatically :) that is why I'm not that frequent contributor. but whenever I can make a small thing, I will be happy to.

aslakhellesoy · 2018-07-03T19:35:29Z

Congratulations on the baby @l3pp4rd!!!!! I don't think upx is urgent at all, take your time :-)

mxygem · 2018-07-04T01:27:25Z

Pass all the tests :-)

Are we including the Windows tests for Ruby, too ;)

aslakhellesoy · 2018-07-04T05:05:56Z

I’m referring to the shared approval tests for gherkin - comparing output with files in ‘testdata’.

mxygem · 2018-07-04T05:19:04Z

Ah, yeah. Makes sense!

mxygem · 2018-07-04T23:54:00Z

−8,020 in my PR for the Ruby stuff so far. :)

aslakhellesoy · 2018-11-09T16:10:12Z

@SabotageAndi the security issues are going to be a concern. See this thread: https://groups.google.com/d/msg/cukes/ZYDs13qoVTY/OtvCvl_fBQAJ

I think we're at an impasse here.

We need to reduce the amount of duplicated development effort
Bundling cross-compiled executables will be blocked in many IT departments

I'd like to explore another avenue. Installing the executable separately. This is how WebDriver works. The various native language implementations are just thin shims that talk to a daemon running locally (chromedriver, geckodriver etc). This has to be installed separately.

Another option is to write the shared code in a language that can be compiled to many targets, such as Haxe. I don't think it's a viable option though. Few people know Haxe, so maintenance would be hard.

What do you think @charlierudolph @brasmusson @SabotageAndi @gasparnagy @mattwynne?

mpkorstanje · 2018-11-09T16:24:42Z

Installing the executable separately. This is how WebDriver works. The various native language implementations are just thin shims that talk to a daemon running locally (chromedriver, geckodriver etc). This has to be installed separately.

This makes it hard to keep projects selfcontained. It'd require some maven/gradle plugin to fetch the executable as part of the build. E.g: https://github.com/Ardesco/selenium-standalone-server-plugin

This solution would be less then perfect though. Most build systems in the java ecosystem can only download artifacts through a proxy/cache like Nexus.

aslakhellesoy · 2018-11-09T16:32:07Z

Do you have any suggestions @mpkorstanje?

mpkorstanje · 2018-11-09T17:57:51Z

I think there are exactly four options: transpile, interpret, generate and delegate. And we're evaluating all these options against the same set of constraints: should be usable and should minimize effort.

So far we've found generation and delegation to be lacking. Only interpretation and transpilation have not yet been tested.

You've added an ease of maintenance requirement. But I don't Haxe is going to any harder then it already is. We're already dealing with a parser generators for a relatively nice language. If there were more people then we wouldn't be trying to reduce the duplication of effort in the first place. So depending on how well it transpiles Haxe would be an option.

As for interpretation: I couldn't find a JavaScript interpreter for most languages.

aslakhellesoy · 2018-11-09T20:56:00Z

By delegation I assume you mean delegating to a binary executable, communicating over streams (what's in gherkin 6).

There are some problems with our current implementation:

Downloading (some malware/virus scanners don't allow downloading of binary executables)
Execution (some OSes are configured to only allow execution of "approved")

If we can address these problems, I don't think delegation is going to be a significant problem.

I propose we stop bundling the executables inside the platform-specific packages. We'll refactor the native code so it can use either of the following strategies for talking to the executable:

Over a socket (assuming it's already running)
Over STDIN/STDOUT (it would be launched just like now)

Users would have to download the binary executable in addition to installing the native package. We would need to build version negotiation into the protocol - the executable and the native package may drift out of sync.

This is the model WebDriver uses. It's a bit clunky, but it works. I can't think of a better strategy right now.

aslakhellesoy · 2018-11-09T21:09:51Z

Just adding for the record that transpilation is not an option. I don’t want to build this new infrastructure and exclude all the platforms that are unsupported by the transpiler.

mpkorstanje · 2018-11-09T22:31:46Z

I can't think of a better strategy right now.

Me neither. But I don't think it's a winning one.

mattwynne · 2018-11-09T22:42:07Z

Would it be possible to use the shared code as libraries rather than executables? Could the Go code be compiled into something that each platform could use as a library instead?

aslakhellesoy · 2018-11-09T23:39:02Z

It’s possible to compile Go code to shared libraries.

We’d still need to bundle them, so bundle sizes would be similar. Not sure if it would shut up malware detectors.

It might be worth a try!

mpkorstanje · 2018-11-10T11:25:20Z

Not sure if it would shut up malware detectors.

Should be easy to test this if we can find a jar with some native code in it. And we do!

https://jogamp.org/wiki/index.php/Downloading_and_installing_JOGL#Native_JARs_vs._native_library_files

Perhaps people can test this from behind a firewall?

https://search.maven.org/remotecontent?filepath=org/jogamp/jogl/jogl-all/2.3.2/jogl-all-2.3.2-natives-linux-amd64.jar
https://search.maven.org/remotecontent?filepath=org/jogamp/jogl/jogl-all/2.3.2/jogl-all-2.3.2-natives-windows-amd64.jar

cyocum · 2018-11-10T11:31:20Z

I have been very busy lately but I have picked up on this as I have seen that languages that have not moved to the new structure will be removed in January. Honestly, I have little time right now to get Cucumber.ml into this structure. Mostly, and I hate to admit this, but I do not understand how to get the protobuf messages information from the monorepo to my repo. I have tried twice now and I am, at this point, just avoiding this because I do not have the time to figure it out on my own. I will need someone who has time to hand-hold me through the process because it seems out of my ability. If someone has the time and patience to help me, I would appreciate it but right now I do not have the time as I have another project that has my time right now. I am hoping to get it out of the way but I will need a long paring session to understand what exactly I need to do to get information that my project needs from the monorepo into mine to avoid deletion.

brasmusson · 2018-11-10T12:05:03Z

I think we should allow for a combination of approaches. The standardized test suite (feature files and corresponding golden json files) has been very successful, enabling us to growing 8 different parser implementations fulfilling the same gherkin language specification. Whatever we do we must not loose this ability. Side note: I think json golden files are better for qualifying a parser implementation than protocol buffer golden files, since it requires less of the parser implementation to show compliance to the gherkin language specification.

For the sake of argument, lets assume that we come up with an approach that works for Java, Javascript and Ruby (that is gherkin parsers to be used in Cucumber-JVM, Cucumber-JS and Cucumber-Ruby), but that approach is not suitable for the .NET environment. Then the gherkin parsers for Java, Javascript and Ruby can be developed and released together (using the mono-repo), and the gherkin for .NET can be developed and released separately still qualified for compliance to the gherkin language specification, and still hosted in the cucumber github project. The hosting in the cucucmber github project should probably require the used of the berp parser generated so that the is some consistency of implementation between the gherkin parsers hoisted here, even though not all are developed and released together.

This is similar to that not all Cucumber-JVM modules are developed and released together anymore. The non-Java language modules are released separately. Similar to a issue discussed here, this change was done to reduce the maintenance burden for the Cucumber-JVM core maintainers.

SabotageAndi · 2018-11-12T08:42:41Z

@aslakhellesoy

I'd like to explore another avenue. Installing the executable separately. This is how WebDriver works. The various native language implementations are just thin shims that talk to a daemon running locally (chromedriver, geckodriver etc). This has to be installed separately.

On .NET, Selenium and all of the webdrivers are distributed by packages. See: https://www.nuget.org/packages?q=selenium
And this is good that way. No idea how I could explain my users, that they have to install an additional tool, when you want to use some library.

Additionally, you can then only have one version installed on your system. There will be breaking changes in the IPC.
And how to you install it on third party hosted build agents (as the AzureDevOps/VSTS hosted pipelines are)?

So having a separate binary to install is also not an option in .NET.

As @mpkorstanje wrote in #425 (comment) I think it is essential, that projects are self-contained. It is really annoying for me, when I have to install a long list of additional tools/libraries to get a project to compile.

@brasmusson Don't we have this golden files here: https://github.com/cucumber/cucumber/tree/master/gherkin/testdata

About shared libraries:

AFAIK they should work on .NET Standard. I have to do some experiments. But native stuff in a .NET program makes always problems. And the size increase of the package is still there. I am still not sold, that this would be good for .NET.

brasmusson · 2018-11-12T16:33:47Z

@SabotageAndi Yes, we have the golden files, my point is just that we shall make sure not to loose them, even if we move away form that all gherkin parsers are stored in the cucumber/cucumber (mono-repo) project, and are released together.

laeubi · 2018-11-30T16:25:09Z

there are only 348 SLOC left in gherkin-java after I deleted 10K SLOC

But in contrast, the jar blows up from 342 kb to 45 MB(!), I understand that it is hard to support many parsers, but is it really the way to go to package such large binaries instead? Won'T it be better to write the parser in some kind of Parser-Language and the generate code for each language from that?
Also spawning new processes might degrade performance compared to a "native" parser.
This really hurts currently the cucumber-eclipse where I try to integrate the new parser, where parsing is used on several places to support the user, now parsind would include to spawn a new go-process for each parsing request.

mpkorstanje · 2018-11-30T21:33:44Z

This really hurts currently the cucumber-eclipse where I try to integrate the new parser, where parsing is used on several places to support the user, now parsind would include to spawn a new go-process for each parsing request.

Unrelated but I don't think it's worth switching to Gherkin 6 just yet for the plugin. Cucumber JVM also can't use it.

srvance · 2018-12-23T21:51:32Z

I'm looking to implement Gherkin compatibility in a new test tool and need to be able to turn feature specs to pickles in a browser. Delegating to an external Go process clearly won't work and I don't want the overhead of having to call a back end API for the functionality.

The gherkin README still documents library use, which no longer seems to exist.
What version has the older library code?

stale · 2019-02-21T21:54:30Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs.

mxygem · 2019-02-21T21:59:04Z

Removed stale and added Slow Burner to keep things open for now. Consider this particular issue on hold at least until Cukenfest where there'll hopefully be some more discussions about how to address things.

Sorry folks, but thank you for the patience!

Ruschnik · 2019-03-13T21:36:46Z

McAfee reported an “Artemis!ACC6568E95DA” regarding artifact .m2\repository\io\cucumber\gherkin\6.0.14\gherkin-6.0.14.jar\gherkin-go-windows-386.exe.
Because of that we downloaded gherkin-6.0.14.jar from maven central and tested it with McAfee and also here McAfee reported the same threat.

Is there anything wrong with the file regarding viruses or is it a false positive?

mpkorstanje · 2020-05-20T00:16:15Z

The Gherkin parser is currently implemented in too many languages. It's too much to maintain. Each language implementation should spawn a child process (gherkin-go), read its STDOUT and deserialise the output to language-native objects using Protocol Buffers.

Informally we have decided against continuing this approach and the latest versions of Gherkin no longer include the go executables.

aslakhellesoy added language: ruby language: java language: javascript library: gherkin language: c# language: go language: objective-c language: swift language: python language: perl labels Jul 3, 2018

ptrthomas mentioned this issue Nov 9, 2018

Karate engine re-write - exit from Cucumber but retain 100% syntax backwards compatibility karatelabs/karate#444

Closed

stale bot added the ⌛ stale Will soon be closed by stalebot unless there is activity label Feb 21, 2019

mxygem added 🧷 pinned Tells Stalebot not to close this issue and removed ⌛ stale Will soon be closed by stalebot unless there is activity labels Feb 21, 2019

dixonwille mentioned this issue Jul 22, 2019

messages: Go generated proto messages not checked in #660

Closed

laeubi mentioned this issue Jul 26, 2019

.feature doesn't work with import from io.cucumber but work with cucumber.api cucumber/cucumber-eclipse#368

Open

mpkorstanje closed this as completed May 20, 2020

mpkorstanje removed the 🧷 pinned Tells Stalebot not to close this issue label May 20, 2020

eggyal mentioned this issue Jul 21, 2020

Splitting into Gherkin parser + Pickle executor? cucumber-rs/cucumber#57

Closed

Gherkin: Delegate to Gherkin-Go prebuilt binaries #425

Gherkin: Delegate to Gherkin-Go prebuilt binaries #425

Comments

aslakhellesoy commented Jul 3, 2018 • edited Loading

aslakhellesoy commented Jul 3, 2018 • edited Loading

d-led commented Jul 3, 2018

aslakhellesoy commented Jul 3, 2018

aslakhellesoy commented Jul 3, 2018

muggenhor commented Jul 3, 2018

aslakhellesoy commented Jul 3, 2018

gasparnagy commented Jul 3, 2018

aslakhellesoy commented Jul 3, 2018

l3pp4rd commented Jul 3, 2018

aslakhellesoy commented Jul 3, 2018

l3pp4rd commented Jul 3, 2018 • edited Loading

aslakhellesoy commented Jul 3, 2018

mxygem commented Jul 4, 2018

aslakhellesoy commented Jul 4, 2018

mxygem commented Jul 4, 2018

mxygem commented Jul 4, 2018

aslakhellesoy commented Nov 9, 2018 • edited Loading

mpkorstanje commented Nov 9, 2018

aslakhellesoy commented Nov 9, 2018

mpkorstanje commented Nov 9, 2018

aslakhellesoy commented Nov 9, 2018

aslakhellesoy commented Nov 9, 2018

mpkorstanje commented Nov 9, 2018

mattwynne commented Nov 9, 2018

aslakhellesoy commented Nov 9, 2018

mpkorstanje commented Nov 10, 2018 • edited Loading

cyocum commented Nov 10, 2018

brasmusson commented Nov 10, 2018

SabotageAndi commented Nov 12, 2018

brasmusson commented Nov 12, 2018

laeubi commented Nov 30, 2018

mpkorstanje commented Nov 30, 2018

srvance commented Dec 23, 2018

stale bot commented Feb 21, 2019

mxygem commented Feb 21, 2019

Ruschnik commented Mar 13, 2019

mpkorstanje commented May 20, 2020

aslakhellesoy commented Jul 3, 2018 •

edited

Loading

aslakhellesoy commented Jul 3, 2018 •

edited

Loading

l3pp4rd commented Jul 3, 2018 •

edited

Loading

aslakhellesoy commented Nov 9, 2018 •

edited

Loading

mpkorstanje commented Nov 10, 2018 •

edited

Loading