Clean up ATN serialization: rm UUID and shifting by value of 2 #3515

parrt · 2022-01-29T23:11:47Z

I think we don't need the UUID in the serialization, since it has not changed in a decade. We can bump the version number and remove the UUID.
I did some tests and there seems to be no reason to shift the values in the serialized ATN by 2 for the purposes of improving the UTF-8 encoding for the Java target.

If you guys agree, we can make this small change for cleanup purposes. I'm happy to do it if you guys don't want to. The second fix will require changes to each target but it's trivial to fix.
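To make the shift in question concrete, here is a minimal Java sketch. It is not ANTLR's serializer, and the class and method names are made up for illustration. It assumes the usual rationale for the optimization: Java's modified UTF-8 needs more than one byte for the values 0 and 0xFFFF (how -1 is stored in 16 bits), so adding 2 maps them to 2 and 1, which take one byte each. It says nothing about whether the saving matters for a whole serialized ATN, which is what the tests mentioned above looked at.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the ANTLR runtime.
class ShiftDemo {
    /** Bytes that Java's modified UTF-8 uses for a single UTF-16 code unit. */
    static int modifiedUtf8Length(char c) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new DataOutputStream(bytes).writeUTF(String.valueOf(c));
            return bytes.size() - 2; // writeUTF prepends a 2-byte length field
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** The +2 shift under discussion, wrapping within 16 bits. */
    static char shift(int value) {
        return (char) ((value + 2) & 0xFFFF);
    }

    public static void main(String[] args) {
        int[] common = {0, 0xFFFF}; // 0 and -1 (stored as 0xFFFF)
        for (int v : common) {
            System.out.printf("value 0x%04X: %d byte(s) unshifted, %d byte(s) shifted%n",
                v, modifiedUtf8Length((char) v), modifiedUtf8Length(shift(v)));
        }
    }
}
```

Running `main` reports 2 bytes versus 1 for the value 0, and 3 bytes versus 1 for 0xFFFF.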

ericvergnaud · 2022-01-30T09:56:30Z

Wouldn’t that break when old runtimes (expecting a UUID) try to read new grammars (without a UUID)?

KvanTTT · 2022-01-30T11:28:54Z

It does not matter since, fortunately, generated parsers are not backward-compatible because of the version check in the runtime.

KvanTTT · 2022-01-30T11:39:52Z

Ok, I can fix the issue.

ericvergnaud · 2022-01-30T14:05:08Z

That just raises a warning. With the proposed change they are likely to crash.

KvanTTT · 2022-01-30T14:17:33Z

Yes, it will crash, but with a clear message (Could not deserialize ATN with version 3 (expected 4)). Maybe it makes sense to add info about regenerating the lexer/parser with a new tool. BTW, some users don't like warning messages either: #3278
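The suggestion above, a hard failure whose message also tells the user to regenerate, could look roughly like the following sketch. Everything here is hypothetical: the class name, the hint wording, and the assumption that the version is the first serialized value are illustrative only and are not taken from the ANTLR runtime.

```java
// Hypothetical sketch; not the actual ATNDeserializer.
final class AtnVersionCheck {
    /** Serialization version this imaginary runtime understands. */
    static final int SERIALIZED_VERSION = 4;

    /** Builds the error text: the existing message plus a regeneration hint. */
    static String mismatchMessage(int found, int expected) {
        return String.format(
            "Could not deserialize ATN with version %d (expected %d). "
            + "Regenerate the lexer/parser with the tool version that matches this runtime.",
            found, expected);
    }

    /** Fails fast if the first serialized value (assumed to be the version) does not match. */
    static void checkVersion(int[] serializedAtn) {
        int found = serializedAtn[0];
        if (found != SERIALIZED_VERSION) {
            throw new UnsupportedOperationException(mismatchMessage(found, SERIALIZED_VERSION));
        }
    }

    public static void main(String[] args) {
        checkVersion(new int[] {4, 0, 1}); // matching version: no exception
        try {
            checkVersion(new int[] {3, 0, 1}); // data produced by an older tool
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Keeping the original sentence at the start of the message means anyone searching for the existing error text would still find it.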

ericvergnaud · 2022-01-30T15:22:19Z

We probably want to test this before making assertions? I believe it would crash in Swift.

parrt · 2022-01-30T18:06:35Z

I think that right now we have three tests for compatibility: the runtime version of the entire tool, the ATN serialization version, and the UUID. I never understood what the UUID was for. I don't think we need three checks. We probably don't even need the ATN serialization version, given the warning when people mix the runtime with generated code from a different tool. We can leave the version number in there along with the runtime check. The runtime mismatch will give a warning, I think, but we can have the version number mismatch crash. But then that raises the question of why the first one doesn't crash if it's not going to work. I guess it makes sense that we could tweak the runtime libraries across multiple versions of the software and keep the ATN serialization the same. If there's a version difference in the serialization, it should definitely crash.

Per @ericvergnaud I will go try this. It definitely crashed earlier when I tweaked the UUID, but I will check with just a version number.
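The two-tier policy described here (warn on a tool/runtime mismatch, crash on a serialization version mismatch) can be summarized in a small sketch. The class, enum, and exact string comparison are assumptions made for illustration; the real runtime may compare versions differently.

```java
// Hypothetical sketch of the warn-versus-crash policy; not ANTLR code.
final class CompatibilityPolicy {
    enum Outcome { OK, WARN, FAIL }

    /**
     * A serialization version mismatch is fatal because the data cannot be read.
     * A tool/runtime version mismatch alone only warrants a warning, since the
     * serialized format may be unchanged between those releases.
     */
    static Outcome evaluate(String toolVersion, String runtimeVersion,
                            int atnVersion, int expectedAtnVersion) {
        if (atnVersion != expectedAtnVersion) {
            return Outcome.FAIL;
        }
        return toolVersion.equals(runtimeVersion) ? Outcome.OK : Outcome.WARN;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("4.9.3", "4.9.3", 4, 4)); // same tool and runtime
        System.out.println(evaluate("4.9.2", "4.9.3", 4, 4)); // runtime differs, format unchanged
        System.out.println(evaluate("4.9.3", "4.10", 3, 4));  // serialized format differs
    }
}
```

The three calls in `main` yield OK, WARN, and FAIL respectively, which matches the reasoning that the runtime libraries can change across releases while the ATN serialization stays the same.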

parrt · 2022-01-30T18:38:04Z

> We probably want to test this before making assertions? I believe it would crash in Swift.

It definitely crashes as we want at least for Java:

java.lang.UnsupportedOperationException: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with version 3 (expected 4).

	at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:90)
	at org.antlr.v4.runtime.misc.InterpreterDataReader.parseFile(InterpreterDataReader.java:133)
	at org.antlr.v4.test.runtime.java.TestInterpreterDataReader.testParseFile(TestInterpreterDataReader.java:24)

KvanTTT · 2022-01-30T18:39:52Z

It crashes with the same error in all runtimes after my changes.

parrt · 2022-01-30T18:40:51Z

Ok, that's good news. Let me poke around with the unit tests that have no source g4 files.

ericvergnaud · 2022-01-30T20:37:46Z

Mmmm… Swift doesn’t check version, so that’s a bit surprising...

parrt · 2022-01-30T20:59:07Z

I see this in swift ATNDeserializer.swift:

if version != ATNDeserializer.SERIALIZED_VERSION {
    let reason = "Could not deserialize ATN with version \(version) (expected \(ATNDeserializer.SERIALIZED_VERSION))."
    throw ANTLRError.unsupportedOperation(msg: reason)
}

KvanTTT · 2022-01-30T21:03:19Z

Yes, Swift checks the version in both the binary and the JSON deserializer methods. On master, Swift uses JSON serialization, but I've removed it in one of the latest PRs: #3513

ericvergnaud · 2022-01-30T21:04:48Z

Fine then.

parrt · 2022-01-30T21:12:43Z

Actually, concerning #3513, why not use the binary string in the parser file like the others? The Swift build is now much more complicated.

KvanTTT · 2022-01-30T21:16:20Z

I've introduced multiline string serialization into Swift in that PR, like the other runtimes. It removes a lot of serialization code and decreases the size of the build output. I don't know exactly why the Swift target authors introduced JSON serialization (maybe because there were a couple of errors in binary serialization); it looks excessive.

luisespino · 2022-02-07T22:29:27Z

absolutely stellar support!

@bendgk Absolutely, despite having other jobs, they still continue with the project.

parrt · 2022-02-07T22:32:36Z

Not rollback, just split on dev and master (create dev on current master and hard reset master on 4.9.3)

Yeah, that definitely makes sense but then I should change the default PR branch to dev. Hmm... I should think about this before I make a mistake. Have to run off at the moment...

parrt · 2022-02-08T01:40:43Z

I think I did it right!

parrt · 2022-02-08T01:50:38Z

Hmm... I just set the default branch to dev. @bendgk does the Go fetch

$ go get github.com/antlr/antlr4/runtime/Go/antlr

now just go to dev instead of master? If so, we have not changed anything except made the repository more complicated!!!

Maybe we can change the documentation to

$ go get github.com/antlr/antlr4/runtime/Go/antlr@master

I think this works with go 1.11. This seems to confirm it.

parrt · 2022-02-08T01:53:51Z

I started updating the README when I ran into this issue of the default branch.

@jcking, @mike-lischke, @marcospassos, @ericvergnaud Please note the change to the repository, which might have been a mistake ha ha

Now I'm wondering whether I should try to undo the damage and not push this change to the README. I'll wait to hear from the Go Target users.

luisespino · 2022-02-08T02:12:15Z

> I'll wait to hear from the Go Target users.

@parrt Hi, I am working with ANTLR and Go. I fixed my problem by downloading the ZIP and making a local package, because if you use go get github.com/antlr/antlr4/runtime/Go/antlr it downloads the default branch, that is, dev, which still causes a problem when using the visitor because of the ATN deserialization version. I think the master branch should be the default branch so that import ("github.com/antlr/antlr4/runtime/Go/antlr") can be used.

Greetings.

parrt · 2022-02-08T02:12:58Z

Hi! Well, can you try the

go get github.com/antlr/antlr4/runtime/Go/antlr@master

??? Hopefully that will make everything work properly

(We can't use the last release branch as the default because then PRs will all be created from that and not our current development branch.) :(

parrt · 2022-02-08T02:14:58Z

Ok, I have pushed the documentation changes to the master and dev branches. I did a hard reset on master after cutting the dev branch from our latest changes. Then I force-pushed master to upstream antlr/antlr4. Hope that's right and that Go can use the @master module query. :)

luisespino · 2022-02-08T02:17:18Z

@parrt using @master in go get works fine, thanks for the support!

parrt · 2022-02-08T02:21:44Z

Oh, @KvanTTT and @ericvergnaud, we need to change the continuous integration scripts so that they use dev, not master. I just fixed the GitHub CI for dev but I'm not sure how to do that for CircleCI.

ericvergnaud · 2022-02-08T08:12:40Z

From CircleCI on GitHub: "We will check the default branch and update it for your project on each commit pushed. If you change the default branch it will be updated on the next commit push to CircleCI."

mike-lischke · 2022-02-08T08:32:07Z

Odd, my fork (https://github.com/mike-lischke/antlr4) does not show the dev branch, only master. Does this mean all forks have to re-fork again?

ericvergnaud · 2022-02-08T08:59:56Z

No, you need to go to GitHub and change your default branch from master to dev.

mike-lischke · 2022-02-08T09:05:22Z

I cannot switch the default branch if dev is not even available:

KvanTTT · 2022-02-08T09:38:22Z

> now just go to dev instead of master? If so, we have not changed anything except made the repository more complicated!!!

It looks like switching branches didn't solve any problem. Go users can use the @release or @4.9.2 branches instead of @master as well. Maybe roll everything back and add info about @release to the Go documentation?

BTW, maybe it also makes sense to rename master to main.

KvanTTT · 2022-02-08T10:09:34Z

Now every merge request contains a lot of unrelated changes, and everyone should update the default branch in their forks (if it's possible). I suggest rolling everything back and adding info about the @release or @stable branch to the Go documentation.

go get github.com/antlr/antlr4/runtime/Go/antlr@release

parrt · 2022-02-08T16:45:06Z

Damn. Ok, your suggestion makes sense. Anybody else think we should keep this new branching mechanism? Unfortunately, I think it causes more problems than it solves.

mike-lischke · 2022-02-08T18:12:56Z

Maybe it's just a temporary problem? Using a master/main branch as well as a dev branch (and many more) is common practice and I'd stay with at least these two, if possible.

KvanTTT · 2022-02-08T18:31:05Z

I'm not sure everyone will rebase their branch to dev to get rid of a lot of unrelated commits (take a look at any recent PR). Also, naming depends on a repository, it's possible to use master/main with dev as well as release with master.

parrt · 2022-02-08T19:19:56Z

We do have a lot of PRs in the queue. Can those easily be reset to compare against dev instead of master? Looks like people can click edit and then reset to compare to dev. Hm... it affects many people to switch now. @ericvergnaud what do you think? @jcking?

parrt · 2022-02-08T19:50:16Z

Ok, talked to @jcking just now; it won't cause any problems internally for the galactic overlords. I also like the idea of having a separate master and development branch, although it is a bit awkward that the default branch has to be set to the development branch. In an ideal world the master branch would be the latest stable version and the default, but PRs would pull from a development branch.

Anyway, in a nutshell, I'm going to leave it as is. Existing PRs can't easily tweak to base upon dev not master so let's proceed as we have it now.

Note: Watch out for git push origin dev not master haha

parrt · 2022-02-08T22:00:26Z

Also wondering if the default branch should be master while we keep the dev branch. People in the future would need to fork and then create PRs from dev. As it is now, if we change the documentation or README during development, we are in fact changing the landing page or documentation that is visible by default, even though the software has not yet been updated. For example, we have now removed the case-insensitive parsing document but have yet to release the software.

Yes, I think most people are consumers and not developers of ANTLR, and so I think the default branch should be master, which will be the last stable release. This means we don't have to update the Go documentation, for example, but I will update the contributing-to-antlr doc.

KvanTTT · 2022-02-08T22:10:47Z

It might push some users away if they see the date of the last update to master. Also, some new users will probably make pull requests to the master branch by default, which is not good. In the documentation, it's possible to add a "since version" note for new features, and they are quite rare.

parrt · 2022-02-08T22:27:04Z

This is a tricky situation. I think we can simply put a first line in the README that says they're looking at the last stable release, not the development branch. Some people will definitely make a mistake and edit the master branch, but of course we will see this in the PR during evaluation, and I will update the documentation. With luck they can simply merge their changes with dev, and then the PR will look like it came originally from dev.

parrt · 2022-02-08T23:39:19Z

I've actually switched master back to being the default now and will update documentation/readme next

parrt · 2022-02-09T00:16:54Z

Ok, until we can stabilize these PRs and branches etc... Please be careful with pulling and pushing :) I have updated the readme and contributing doc: https://github.com/antlr/antlr4/blob/master/CONTRIBUTING.md

KvanTTT · 2022-02-09T10:59:36Z

My name is missing in Authors and major contributors in both branches :) It was added some time ago by you.

parrt · 2022-02-09T16:35:37Z

Oh! Damn. Sorry! Will fix asap. I edited the wrong branch and then cherry-picked.

parrt · 2022-02-09T17:36:57Z

Fixed @KvanTTT !!

@OverRide

* Get rid of reflection in CodeGenerator * Rename TargetType -> Language * Remove TargetType enum, use String instead as it was before Create CodeGenerator only one time during grammar processing, refactor code * Add default branch to appendEscapedCodePoint for unofficial targets (Kotlin) * Remove getVersion() overrides from Targets since they return the same value * Remove getLanguage() overrides from Targets since common implementation returns correct value * [again] don't use "quiet" option for mvn tests...hard to figure out what's wrong when failed. * normalize targets to 80 char strings for ATN serialization, except Java which needs big strings for efficiency. * Update actions.md fixed a small typo * Rename `CodeGenerator.createCodeGenerator` to `CodeGenerator.create` * Replace constants on string literals in `appendEscapedCodePoint` * Restore API of Target getLanguage(): protected -> public as it was before appendUnicodeEscapedCodePoint(int codePoint, StringBuilder sb, boolean escape): protected -> private (it's a new helper method, no need for API now) Added comment for appendUnicodeEscapedCodePoint * Introduce caseInsensitive lexer rule option, fixes #3436 * don't ahead of time compile for DART. See 8ca8804#commitcomment-62642779 * Simplify test rig related to timeouts (#3445) * remove all -q quiet mvn options to see output on CI servers. * run the various unit test classes in parallel rather than each individual test method, all except for Swift at the moment: `-Dparallel=classes -DthreadCount=4` * use bigger machine at circleci * No more test groups like parser1, parser2. 
* simplify Swift like the other tests * fix whitespace issues * use 4.10 not 4.9.4 * improve releasing antlr doc * Add Support For Swift Package Manager (#3132) * Add Swift Package Manager Support * Swift Package Dynamic * 【fix】【test】Fix run process path Co-authored-by: Terence Parr <[email protected]> * use src 11 for tool, but 8 for plugin/runtime (#3450) * use src 11 for tool, but 8 for plugin/runtime/runtime-tests. * use 11 in CI builds * cpp/cmake: Fix library install directories (#3447) This installs DLLs in bin directory instead of lib. * Python local import fixes (#3232) * Fixed pygrun relative import issue * Added name to contributors.txt Co-authored-by: Terence Parr <[email protected]> * Update javadoc to 8 and 11 (#3454) * no need for plugin in runtime, always gen svg from dot for javadoc, gen 1.8 not 1.7 doc for runtime. Gen 11 for tool. * tweak doc for 1.8 runtime. Test rig should gen 1.8 not 1.7 * [Go] Fix (*BitSet).equals (#3455) * set tool version for testing * oops reversion tool version as it's not sync'd with runtime and not time to release yet. 
* Remove unused variable from generated code (#3459) * [C++] Fix bugs in UnbufferedCharStream (#3420) * Escape bad words during grammar generation (#3451) * Escape reserved words during grammar generation, fixes #1070 (for -> for_ but RULE_for) Deprecate USE_OF_BAD_WORD * Make name and escapedName consistent across tool and codegen classes Fix other pull request notes * Rename NamedActionChunk to SymbolRefChunk * try out windows runners * rename workflow * Update windows.yml Fix cmd line issue * fix maven issue on windows * use jdk 11 * remove arch arg * display Github status for windows * try testing python3 on windows * try new run for python3 windows * try new run for python3 windows (again) * try new run for python3 windows (again2) * try new run for python3 windows (again3) * try new run for python3 windows (again4) * try new run for python3 windows (again5) * try new run for python3 windows * try new run for python3 windows * try new run for python3 windows * ugh i give up. python won't install on github actions. 
* Update windows.yml try python 3 * Update windows.yml * Update run-tests-python3.cmd * Update run-tests-python3.cmd * Create run-tests-python2.cmd * Update windows.yml * Update run-tests-python2.cmd * Update windows.yml * Update windows.yml * Update windows.yml * Create run-tests-javascript.cmd * Update run-tests-javascript.cmd * Update windows.yml * Update windows.yml * Update windows.yml * Update windows.yml * Update windows.yml * Update windows.yml * Create run-tests-csharp.cmd * Update windows.yml * fix warnings in C# CI * Update windows.yml * Update windows.yml * Create run-tests-dart.cmd * Update windows.yml * Update windows.yml * Update windows.yml * Update windows.yml * Update run-tests-dart.cmd * Update run-tests-dart.cmd * Update run-tests-dart.cmd * Update run-tests-dart.cmd * Update windows.yml * Update windows.yml * Update windows.yml * Create run-tests-go.cmd * Update windows.yml * Update windows.yml * Update windows.yml * GitHub action php (#3474) * Update windows.yml * Create run-tests-php.cmd * Update run-tests-php.cmd * Update run-tests-php.cmd * Update run-tests-php.cmd * Update run-tests-php.cmd * Update windows.yml * Update windows.yml * Update windows.yml * Update run-tests-php.cmd * Update windows.yml * Cleanup ci (#3476) * Delete .appveyor directory * Delete .travis directory * Improve CI concurrency (#3477) * Update windows.yml * Update windows.yml * Update windows.yml * Optimize toArray replace toArray(new T[size]) with toArray(new T[0]) for better performance https://shipilev.net/blog/2016/arrays-wisdom-ancients/#_conclusion * add contributor * resolve conflicts * fix-maven-concurrency (#3479) * fix-maven-concurrency * Update windows.yml * Update windows.yml * Update windows.yml * Update windows.yml * Update windows.yml * Update windows.yml * Update windows.yml * Update run-tests-python2.cmd * Update run-tests-python3.cmd * Update windows.yml * Update windows.yml * Update windows.yml * Update windows.yml * Update windows.yml * Update 
windows.yml * Update windows.yml * Update windows.yml * Update windows.yml * Update run-tests-php.cmd * Update windows.yml * Update run-tests-dart.cmd * Update run-tests-csharp.cmd * Update run-tests-go.cmd * Update run-tests-java.cmd * Update run-tests-javascript.cmd * Update run-tests-php.cmd * Update run-tests-python2.cmd * Update run-tests-python3.cmd * increase Windows CI concurrency for all targets except Dart * Preserve line separators for input runtime tests data (#3483) * Preserve line separators for input data in runtime tests, fix test data Refactor and improve performance of BaseRuntimeTest * Add LineSeparator (\n, \r\n) tests * Set up .gitattributes for LineSeparator_LF.txt (eol=lf) and LineSeparator_CRLF.txt (eol=crlf) * Restore `\n` for all input in runtime tests, add extra LexerExec tests (LineSeparatorLf, LineSeparatorCrLf) * Add generated LargeLexer test, remove LargeLexer.txt descriptor * tweak name to be GeneratedLexerDescriptors * [JavaScript] Migrate from jest to jasmine * [C++] Fix Windows min/max macro collision * [C++] Update cmake README.md to C++17 * remove unnecessary comparisons. * Add useful function writeSerializedATNIntegerHistogram for writing out information concerning how many of each integer value appear in a serialized ATN. * fix comment indicating what goes in the serialized ATN. * move writeSerializedATNIntegerHistogram out of runtime. * follow guidelines * Fix .interp file parsing test for the Java runtime. Also includes separating the generation of the .interp file from writing it out so that we can use both independently. * Delete files no longer needed. 
Should have been part of #3520 * [C++] Optimizations and cleanups and const correctness, oh my * [C++] Optimize LL1Analyzer * [C++] Fix missing virtual destructors * Remove not used PROTECTED, PUBLIC, PRIVATE tokens from ANTLRLexer.g * Remove ANTLR 3 stuff from ANTLR grammars, deprecate ANTLR 3 errors * Remove not used imaginary tokens from ANTLRParser.g * Fix misprints in grammars * ATN serialized data: remove shifting by 2, remove UUID; fix #3515 Regenerate XPathLexer files * Disable native runtime tests (see #3521) * Implement Java-specific ATN data optimization (+-2 shift) * [C++] Remove now unused antlrcpp::Guid * pull new branch diagram from master * use dev not master branch for CI github * update doc from master * add back missing author * [C++] Fix const correctness in ATN and DFA * keep getSerializedATNSegmentLimit at max int * Fixes #3259 make InErrorRecoveryMode public for go * Change code gen template to capitalize InErrorRecoveryMode * [C++] Improve multithreaded performance, fix TSAN error, and fix profiling ATN simulator setup bug * Get rid of unnecessary allocations and calculations in SerializedATN * Get rid of excess char escaping in generated files, decrease size of output files Fix creation of excess fragments for Dart, Cpp, PHP runtimes * Swift: fix binary serialization and use instead of JSON * Fix targetCharValueEscape, make them final and static * [C++] Cleanup ATNDeserializer and remove related deprecated methods from ATNSimulator * Fix for #3557 (getting "go test" to work again). * Convert Python2/3 to use int arrays not strings for ATN encodings (#3561) * Convert Python2/3 to use int arrays not strings for ATN encodings. Also make target indicate int vs string. * rename and reverse ATNSerializedAsInts * add override * remove unneeded method * [C++] Drastically improve multi-threaded performance (#3550) Thanks guys. A major advancement. 
* [C++] Remove duplicate includes and remove unused includes (#3563) * [C++] Lazily deserialize ATN in generated code (#3562) * [Docs] Update Swift Docs (#3458) * Add Swift Package Manager Support * Swift Package Dynamic * 【fix】【test】Fix run process path * [Docs] [Swift] update link, remove expired descriptions Co-authored-by: Terence Parr <[email protected]> * Ascii only ATN serialization (#3566) * go back to generating pure ascii ATN serializations to avoid issues where target compilers might assume ascii vs utf-8. * forgot I had to change php on previous ATN serialization tweak. * change how we escapeChar() per target. * oops; gotta use escapeChar method * rm unneeded case * add @OverRide * use ints not chars for C# (#3567) * use ints not chars for C# * oops. remove 'quotes' * regen from XPathLexer.g4 * simplify ATN with bypass alts mechanism in Java. * Change string to int[] for serialized ATN for C#; removed unneeded `use System` from XPathLexer.g4; regen that grammar. * [C++] Use camel case name in generated lexers and parsers (#3565) * Change string to int array for serialized ATN for JavaScript (#3568) * perf: Add default implementation for Visit in ParseTreeVisitor. (#3569) * perf: Add default implementation for Visit in ParseTreeVisitor. 
Reference: https://github.com/antlr/antlr4/blob/ad29539cd2e94b2599e0281515f6cbb420d29f38/runtime/Java/src/org/antlr/v4/runtime/tree/AbstractParseTreeVisitor.java#L18 * doc: add contributor * Don't use utf decoding...these are just ints (#3573) * [Go] Cleanup and fix ATN deserialization verification (#3574) * [C++] Force generated static data type name to titlecase (#3572) * Use int array not string for ATN in Swift (#3575) * [C++] Fix generated Lexer static data constructor (#3576) * Use int array not string for ATN in Dart (#3578) * Fix PHP codegen to support int ATN serialization (#3579) * Update listener documentation to satisfy the discussion about improving exception handling: #3162 * tweak * [C++] Remove unused LexerATNSimulator::match_calls (#3570) * [C++] Remove unused LexerATNSimulator::match_calls * Remove match_calls from other targets * [Java] Preserve serialized ATN version 3 compatibility (#3583) * add jcking to the contributors list * Update releasing-antlr.md * [C++] Avoid using dynamic_cast where possible by using hand rolled RTTI (#3584) * Revert "[Java] Preserve serialized ATN version 3 compatibility (#3583)" This reverts commit 01bc811. * [C++] Add ANTLR4CPP_PUBLIC attributes to various symbols (#3588) * Update editorconfig for c++ (#3586) * Make it easier to contribute: Add c++ configuration for .editorconfig. Using the observed style with 2 indentation spaces. Signed-off-by: Henner Zeller <[email protected]> * Add hzeller to contributors.txt Signed-off-by: Henner Zeller <[email protected]> * Fix code style and typing to support PHP 8 (#3582) * [Go] Port locking algorithm from C++ to Go (#3571) * Use linux DCO not our old contributors certificate of origin * [C++] Fix bugs in SemanticContext (#3595) * [Go] Do not export Array2DHashSet which is an implementation detail (#3597) * Revert "Use linux DCO not our old contributors certificate of origin" This reverts commit b0f8551. 
* Use signed ints for ATN serialization not uint16, except for java (#3591) * refactor serialize so we don't need comments * more cleanup during refactor * store language in serializer obj * A lexer rule token type should never be -1 (EOF). 0 is fragment but then must be > 0. * Go uses int not uint16 for ATN now. java/go/python3 pass * remove checks for 0xFFFF in Go. * C++ uint16_t to int for ATN. * add mac php dir; fix type on accept() for generated code to be mixed. * Add test from @KvanTTT. This PR fixes #3555 for non-Java targets. * cleanup and add big lexer from #3546 * increase mvn mem size to 2G * increase mvn mem size to 8G * turn off the big ATN lexer test as we have memory issues during testing. * Fixes #3592 * Revert "C++ uint16_t to int for ATN." This reverts commit 4d2ebbf. # Conflicts: # runtime/Cpp/runtime/src/atn/ATNSerializer.cpp # runtime/Cpp/runtime/src/tree/xpath/XPathLexer.cpp * C++ uint16_t to int32_t for ATN. * rm unnecessary include file, updating project file. get rid of the 0xFFFF does in the C++ deserialization * rm refs to 0xFFFF in swift * javascript tests were running as Node...added to ignore list. * don't distinguish between 16 and 32 bit char sets in serialization; Python2/3 updated to work with this change. * update C++ to deserialize only 32-bit sets * 0xFFFF -> -1 for C++ target. * get other targets to use 32-bit sets in serialization. tests pass locally. * refactor to reduce code size * add comment * oops. comment out call to writeSerializedATNIntegerHistogram(). I wonder if this is why it ran out of memory during testing? * all but Java, Node, PHP, Go work now for the huge lexer file; I have set them to ignore. note that the swift target takes over a minute to lex it. I've turned off Node but it does not seem to terminate but it could terminate eventually. * all but Java, Node, PHP, Go work now for the huge lexer file; I have set them to ignore. note that the swift target takes over a minute to lex it. 
I've turned off Node but it does not seem to terminate but it could terminate eventually. * Turn off this big lexer because we get memory errors during continuous integration * Intermediate commit where I have shuffled around all of the -1 flipping and bumping by two. work still needs to be done because the token stream rewriter stuff fails. and I assume the other decoding for human readability testing if doesn't work * convert decode to use int[]; remove dead code. don't use serializeAsChar stuff. more tests pass. * more tests passing. simplify. When copying atn, must run ATN through serializer to set some state flags. * 0xFFFD+ are not valid char * clean up. tests passing now * huge clean up. Got Java working with 32-bit ATNs!Still working on cleanup but I want to run the tests * Cleanup the hack I did earlier; everything still seems to work * Use linux DCO not our old contributors certificate of origin * remove bump-by-2 code * clean up per @KvanTTT. Can't test locally on this box. Will see what CI says. * tweak comment * Revert "Use linux DCO not our old contributors certificate of origin" This reverts commit b0f8551. * see if C++ works in CI for huge ATN * Use linux DCO not our old contributors certificate of origin (#3598) * Use linux DCO not our old contributors certificate of origin * Revert "Use linux DCO not our old contributors certificate of origin" This reverts commit b0f8551. 
* use linux DCO * use linux DCO * Use linux DCO not our old contributors certificate of origin * update release documentation Signed-off-by: Terence Parr <[email protected]> * Equivalent of #3537 * clean up setup * clean up doc version * [Swift] improvements to equality functions (#3302) * fix default equality * equality cases * optional unwrapping * [Swift] Use for in loops (#3303) * common for in loops * reversed loop * drop first loop * for in with default BitSet * [Go] Fix symbol collision in generated lexers and parsers (#3603) * [C++] Refactor and optimize SemanticContext (#3594) * [C++] Devirtualize hand rolled RTTI for performance (#3609) * [C++] Add T::is for type hierarchy checks and remove some dynamic_cast (#3612) * [C++] Avoid copying statically generated serialized ATNs (#3613) * [C++] Refactor PredictionContext and yet more performance improvements (#3608) * [C++] Cleanup DFA, DFAState, LexerAction, and yet more performance improvements (#3615) * fix dependabot issues * [Swift] use stdlib (single pass) (#3602) * this was added to the stdlib in Swift 5 * &>> is defined as lhs >> (rhs % lhs.bitwidth) * the stdlib has these * reduce loops * use indices * append(contentsOf:) * Array literal init works for sets too! * inline and remove bit query functions * more optional handling (#3605) * [C++] Minor improvements to PredictionContext (#3616) * use php runtime dev branch to test dev * update doc to be more explicit about the interaction between lexer actions and semantic predicates; Fixes #3611. Fixes #3606. 
Signed-off-by: Terence Parr <[email protected]>
* Refactor js runtime in preparation of future improvements
* refactor, 1 file per class, use import, use module semantics, use webpack 5, use eslint
* all tests pass
* simplifications and alignment with standard js idioms
* simplifications and alignment with standard js idioms
* support reading legacy ATN
* support both module and non-module imports
* fix failing tests
* fix failing tests
* No longer necessary to generate sets or single atom transit that are bigger than 16bits. (#3620)
* Updated getting started with Cpp documentation. (#3628)
  Included specific examples of using ANTLR4_TAG and ANTLR4_ZIP_REPOSITORY in the sample CMakeLists file.
* [C++] Free ATNConfig lookup set in readonly ATNConfigSet (#3630)
* [C++] Implement configurable PredictionContextMergeCache (#3627)
* Allow to choose to switch off building tests in C++ (#3624)
  The new option to cmake ANTLR_BUILD_CPP_TESTS is default on (so the behavior is as before), but it provides a way to switch off if not needed. The C++ tests pull in an external dependency (googletests), which might conflict if ANTLR is used as a subproject in another cmake project.
  Signed-off-by: Henner Zeller <[email protected]>
* Fix NPE for undefined label, fix #2788
* An interval ought to be a value
  Interval was a pointer to 2 Ints; it ought to be just 2 Ints, which is smaller and more semantically correct, with no need for a cache. However, this technically breaks metadata and AnyObject conformance but people shouldn't be relying on those for an Interval.
* [C++] Remove more dynamic_cast usage
* [C++] Introduce version macros
* add license prefix
* Prep 4.10 (#3599)
* Tweak doc
* Swift was referring to hardcoded version
* Start version update script.
* add files to update
* clean up setup
* clean up setup
* clean up setup
* don't need file
* don't need file
* Fixes #3600. add instructions and associated code necessary to build the xpath lexers.
* clean up version nums
* php8
* php8
* php8
* php8
* php8
* php8
* php8
* php8
* tweak doc
* ok, i give up. php won't bump up to v8
* tweak doc
* version number bumped to 4.10 in runtime.
* Change the doc for releasing and update to use latest ST 4.3.2
* fix dart version to 4.10.0
* cmd files Cannot use export bash command.
* try fixing php ci again
* working on deploy
  Signed-off-by: Terence Parr <[email protected]>
* php8 always install.
* set js to 4.10.0 not 4.10
* turn off apt update for php circleci
* try w/o cimg/php
* try setting branch
* ok i give up
* tweak
* update docs for release.
* php8 circleci
* use 3.5.3 antlr
* use 3.5.3-SNAPSHOT antlr
* use full 3.5.3 antlr
* [Swift] reduce Optionals in APIs (#3621)
* ParserRuleContext.children see comment in removeLastChild
* TokenStream.getText
* Parser._parseListeners this might require changes to the code templates?
* ATN {various}
* make computeReachSet return empty, not nil
* overrides refine optionality
* BufferedTokenStream getHiddenTokensTo{Left, Right} return empty not nil
* Update Swift.stg
* avoid breakage by adding overload of `getText` in extension
* tweak to kick off build
  Signed-off-by: Terence Parr <[email protected]>
* try parallelism: 4 circleci
* Revert "[Swift] reduce Optionals in APIs (#3621)"
  This reverts commit b5ccba0.
* tweaks to doc
* Improve the deploy script and tweak the released doc.
* use 4.10 not Snapshot for scripts

Co-authored-by: Ivan Kochurkin <[email protected]>
Co-authored-by: Alexandr <[email protected]>
Co-authored-by: 100mango <[email protected]>
Co-authored-by: Biswapriyo Nath <[email protected]>
Co-authored-by: Benjamin Spiegel <[email protected]>
Co-authored-by: Justin King <[email protected]>
Co-authored-by: Eric Vergnaud <[email protected]>
Co-authored-by: Harry Chan <[email protected]>
Co-authored-by: Ken Domino <[email protected]>
Co-authored-by: chenquan <[email protected]>
Co-authored-by: Marcos Passos <[email protected]>
Co-authored-by: Henner Zeller <[email protected]>
Co-authored-by: Dante Broggi <[email protected]>
Co-authored-by: chris-miner <[email protected]>

parrt added atn-analysis type:cleanup labels Jan 29, 2022

parrt added this to the 4.10 milestone Jan 29, 2022

parrt assigned KvanTTT, ericvergnaud and parrt and unassigned KvanTTT and ericvergnaud Jan 29, 2022

parrt mentioned this issue Jan 29, 2022

Optimize ATN serialization data #3513

Merged

KvanTTT added a commit to KvanTTT/antlr4 that referenced this issue Jan 30, 2022

ATN serialized data: remove shifting by 2, remove UUID; fix antlr#3515

e3b5e76

KvanTTT mentioned this issue Jan 30, 2022

ATN serialized data: remove shifting by 2, remove UUID #3516

Merged

KvanTTT added a commit to KvanTTT/antlr4 that referenced this issue Jan 30, 2022

ATN serialized data: remove shifting by 2, remove UUID; fix antlr#3515

88e10ac
