Clean up ATN serialization: rm UUID and shifting by value of 2 #3515

Closed
parrt opened this issue Jan 29, 2022 · 68 comments · Fixed by #3516

Comments

@parrt
Member

parrt commented Jan 29, 2022

  • I think we don't need the UUID in the serialization, since it has not changed in a decade. We can bump the version number and remove the UUID.
  • I did some tests and there seems to be no reason to shift the values in the serialized ATN by 2 for the purposes of improving the UTF-8 encoding for the Java target.

If you guys agree, we can make this small change for cleanup purposes. I'm happy to do it if you guys don't want to. The second fix will require changes to each target but it's trivial to fix.
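For readers unfamiliar with the shift being discussed, here is a minimal Python sketch of the idea. It is an illustration under stated assumptions, not ANTLR's actual code: the function names are hypothetical, values are assumed to be 16-bit with -1 stored as 0xFFFF, and the first element (the version number) is assumed to be left unshifted. The apparent intent of the shift was to map the common values 0 and 0xFFFF to 2 and 1 before the data was written as a Java string.

```python
def shift_by_two(values):
    """Add 2 (mod 2**16) to every value except the leading version number."""
    head, rest = values[:1], values[1:]
    return head + [(v + 2) & 0xFFFF for v in rest]


def unshift_by_two(values):
    """Inverse of shift_by_two: subtract 2 (mod 2**16) from all but the first value."""
    head, rest = values[:1], values[1:]
    return head + [(v - 2) & 0xFFFF for v in rest]


# Hypothetical data: version 3, then the values 0, -1 (stored as 0xFFFF) and 7.
serialized = [3, 0, 0xFFFF, 7]
shifted = shift_by_two(serialized)    # 0 becomes 2, 0xFFFF wraps around to 1
restored = unshift_by_two(shifted)    # every target had to undo the shift
```

Removing the shift means each target's deserializer no longer needs the `unshift_by_two` step, which is why the second change touches every runtime.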

@ericvergnaud
Contributor

ericvergnaud commented Jan 30, 2022 via email

@KvanTTT
Member

KvanTTT commented Jan 30, 2022

It does not matter since, fortunately, generated parsers are not backward compatible anyway because of the version check in the runtime.

@KvanTTT
Member

KvanTTT commented Jan 30, 2022

Ok, I can fix the issue.

@ericvergnaud
Contributor

ericvergnaud commented Jan 30, 2022 via email

@KvanTTT
Member

KvanTTT commented Jan 30, 2022

Yes, it will crash, but with a clear message (Could not deserialize ATN with version 3 (expected 4)). Maybe it makes sense to add a hint about regenerating the lexer/parser with the new tool. BTW, some users don't like warning messages either: #3278

@ericvergnaud
Contributor

We probably want to test this before making assertions ? I believe it would crash in Swift.

@parrt
Member Author

parrt commented Jan 30, 2022

I think that right now we have three tests for compatibility: the runtime version of the entire tool, the ATN serialization version, and the UUID. I never understood what the UUID was for. I don't think we need three checks. We probably don't even need the ATN serialization version, given the warning when people mix the runtime with generated code from a different tool. We can leave the version number in there along with the runtime check. The runtime mismatch will give a warning, I think, but we can have the version number crash. But then it raises the question of why the first one didn't crash if it's not going to work. I guess it makes sense that we could tweak the runtime libraries for multiple versions of the software and keep the ATN serialization the same. Definitely, if there's a version difference in the serialization, it should crash.

Per @ericvergnaud I will go try this. It definitely crashed earlier when I tweaked the UUID, but I will check with just a version number.
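The two remaining checks described above (a soft warning on a runtime mismatch, a hard failure on a serialization version mismatch) can be sketched as follows. This is a hypothetical Python illustration, not code from any ANTLR runtime: `SERIALIZED_VERSION`, `RUNTIME_VERSION`, both function names, and the assumption that the version is the first serialized integer are made up for the example. Only the error message format is taken from this thread.

```python
import warnings

SERIALIZED_VERSION = 4    # assumed: the serialization version the runtime expects
RUNTIME_VERSION = "4.10"  # hypothetical runtime version string


def check_runtime_version(generated_with):
    """Soft check: warn, but keep going, if tool and runtime versions differ."""
    if generated_with != RUNTIME_VERSION:
        warnings.warn(
            f"Generated with ANTLR {generated_with}, "
            f"but the runtime is {RUNTIME_VERSION}."
        )
        return False
    return True


def check_serialized_version(data):
    """Hard check: refuse to deserialize an ATN with an unexpected version."""
    version = data[0]  # assumed: the version is the first serialized integer
    if version != SERIALIZED_VERSION:
        raise ValueError(
            f"Could not deserialize ATN with version {version} "
            f"(expected {SERIALIZED_VERSION})."
        )
    return version
```

With only these two checks, a stale generated parser still fails early and with a readable message, so the UUID adds no extra protection in this sketch.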

@parrt
Member Author

parrt commented Jan 30, 2022

> We probably want to test this before making assertions ? I believe it would crash in Swift.

It definitely crashes as we want at least for Java:

java.lang.UnsupportedOperationException: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with version 3 (expected 4).

	at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:90)
	at org.antlr.v4.runtime.misc.InterpreterDataReader.parseFile(InterpreterDataReader.java:133)
	at org.antlr.v4.test.runtime.java.TestInterpreterDataReader.testParseFile(TestInterpreterDataReader.java:24)

@KvanTTT
Member

KvanTTT commented Jan 30, 2022

It crashes with the same error in all runtimes after my changes.

@parrt
Member Author

parrt commented Jan 30, 2022

Ok, that's good news. Let me poke around with the unit tests that have no source g4 files.

@ericvergnaud
Contributor

ericvergnaud commented Jan 30, 2022 via email

@parrt
Member Author

parrt commented Jan 30, 2022

I see this in swift ATNDeserializer.swift:

if version != ATNDeserializer.SERIALIZED_VERSION {
    let reason = "Could not deserialize ATN with version \(version) (expected \(ATNDeserializer.SERIALIZED_VERSION))."
    throw ANTLRError.unsupportedOperation(msg: reason)
}

@KvanTTT
Member

KvanTTT commented Jan 30, 2022

Yes, Swift checks the version in both the binary and the JSON deserializer methods. On master, Swift uses JSON serialization, but I've removed it in one of the latest PRs: #3513

@ericvergnaud
Contributor

ericvergnaud commented Jan 30, 2022 via email

@parrt
Member Author

parrt commented Jan 30, 2022

Actually, concerning #3513, why not use the binary string in the parser file like others? Now Swift build is much more complicated.

@KvanTTT
Member

KvanTTT commented Jan 30, 2022

I've introduced multiline string serialization into Swift in that PR, like the other runtimes. It removes a lot of serializing code and decreases the size of the build output. I don't know exactly why the Swift target authors introduced JSON serialization (maybe because there were a couple of errors in the binary serialization); it looks redundant.

@luisespino

> absolutely stellar support!

@bendgk Absolutely; despite having other jobs, they still continue with the project.

@parrt
Member Author

parrt commented Feb 7, 2022

> Not rollback, just split on dev and master (create dev on current master and hard reset master on 4.9.3)

Yeah, that definitely makes sense but then I should change the default PR branch to dev. Hmm... I should think about this before I make a mistake. Have to run off at the moment...

@parrt
Member Author

parrt commented Feb 8, 2022

I think I did it right!

[Screenshot: 2022-02-07 at 5:40:15 PM]

@parrt
Member Author

parrt commented Feb 8, 2022

Hmm... I just set the default branch to dev. @bendgk does the Go fetch

$ go get github.com/antlr/antlr4/runtime/Go/antlr

now just go to dev instead of master? If so, we have not changed anything except made the repository more complicated!!!

Maybe we can change the documentation to

$ go get github.com/antlr/antlr4/runtime/Go/antlr@master

I think this works with go 1.11. This seems to confirm it.

@parrt
Member Author

parrt commented Feb 8, 2022

I started updating the README when I ran into this issue of the default branch.

[Screenshot: 2022-02-07 at 5:52:15 PM]

@jcking, @mike-lischke, @marcospassos, @ericvergnaud Please note the change to the repository, which might have been a mistake ha ha

Now I'm wondering whether I should try to undo the damage and not push this change to the README. I'll wait to hear from the Go Target users.

@luisespino

> I'll wait to hear from the Go Target users.

@parrt Hi, I am working with ANTLR and Go. I fixed my problem by downloading the ZIP and making a local package, because go get github.com/antlr/antlr4/runtime/Go/antlr downloads the default branch, that is, dev, which still causes a problem with the ATN deserialization version when using the visitor. I think the master branch should be the default branch so that import ("github.com/antlr/antlr4/runtime/Go/antlr") works.

Greetings.

@parrt
Member Author

parrt commented Feb 8, 2022

Hi! Well, can you try the

go get github.com/antlr/antlr4/runtime/Go/antlr@master

??? Hopefully that will make everything work properly

(We can't use the last release branch as the default because then PRs will all be created from that and not our current development branch.) :(

@parrt
Member Author

parrt commented Feb 8, 2022

Ok, I have pushed the documentation changes to the master and dev branches. I did a hard reset on master after cutting the dev branch from our latest changes. Then, I force pushed master to upstream antlr/antlr4. Hope that's right and that Go can use the @master module query. :)

@luisespino

luisespino commented Feb 8, 2022

@parrt using @master in go get works fine, thanks for the support!

@parrt
Member Author

parrt commented Feb 8, 2022

Oh, @KvanTTT and @ericvergnaud, we need to change the continuous integration scripts so that they use dev, not master. I just fixed the GitHub CI for dev, but I'm not sure how to do that for CircleCI.

@ericvergnaud
Contributor

ericvergnaud commented Feb 8, 2022 via email

@mike-lischke
Member

mike-lischke commented Feb 8, 2022

Odd, my fork (https://github.com/mike-lischke/antlr4) does not show the dev branch, only master. Does this mean all forks have to re-fork again?

@ericvergnaud
Contributor

No, you need to go to GitHub and change your default branch from master to dev.

@mike-lischke
Member

mike-lischke commented Feb 8, 2022

I cannot switch the default branch if dev is not even available:

[Screenshot: 2022-02-08 at 10:05:38]

@KvanTTT
Member

KvanTTT commented Feb 8, 2022

> now just go to dev instead of master? If so, we have not changed anything except made the repository more complicated!!!

It looks like switching branches didn't solve any problem. Go users can use the @release or @4.9.2 branches instead of @master as well. Maybe roll everything back and add info about @release to the Go documentation?

BTW, maybe it also makes sense to rename master to main.

@KvanTTT
Member

KvanTTT commented Feb 8, 2022

Now every merge request contains a lot of unrelated changes and everyone should update the default branch in forks (if it's possible). I suggest rolling everything back and adding info about @release or @stable branch to Go.

go get github.com/antlr/antlr4/runtime/Go/antlr@release

@parrt
Member Author

parrt commented Feb 8, 2022 via email

@mike-lischke
Member

Maybe it's just a temporary problem? Using a master/main branch as well as a dev branch (and many more) is common practice and I'd stay with at least these two, if possible.

@KvanTTT
Member

KvanTTT commented Feb 8, 2022

I'm not sure everyone will rebase their branch to dev to get rid of a lot of unrelated commits (take a look at any recent PR). Also, naming depends on a repository, it's possible to use master/main with dev as well as release with master.

@parrt
Member Author

parrt commented Feb 8, 2022

We do have a lot of PRs in the queue. Can those easily reset to compare with dev not master? Looks like people can click edit then reset to compare to dev. Hm...affects many people to switch now. @ericvergnaud what do you think? @jcking ?

@parrt
Member Author

parrt commented Feb 8, 2022

Ok, talked to @jcking just now; it won't cause any problems internally for the galactic overlords. I also like the idea of having a separate master and development branch, although it is a bit awkward that the default branch has to be set to the development branch. In an ideal world the master branch would be the latest stable version and the default, but PRs would pull from a development branch.

Anyway, in a nutshell, I'm going to leave it as is. Existing PRs can't easily tweak to base upon dev not master so let's proceed as we have it now.

Note: Watch out for git push origin dev not master haha

@parrt
Member Author

parrt commented Feb 8, 2022

Also wondering if the default branch should be master while we keep the dev branch. People in the future would need to fork and then create PRs from dev. As it is now, if we change the documentation or README during development, we are in fact changing the landing page or documentation that is visible by default. This is despite the fact that the software has not yet been updated. For example, we have now removed the case-insensitive parsing document but have yet to release the software.

Yes, I think most people are consumers and not developers of ANTLR, so I think the default branch should be master, which will be the last stable release. This means we don't have to update the Go documentation, for example, but I will update the contributing-to-antlr doc.

@KvanTTT
Member

KvanTTT commented Feb 8, 2022

It might push some users away if they see the date of the last update of master. Also, some new users will probably make pull requests to the master branch by default, which is not good. In the documentation, it's possible to add a "since version" note for new features, and they are quite rare.

@parrt
Member Author

parrt commented Feb 8, 2022

This is a tricky situation. I think we can simply put a first line in the README that says they're looking at the last stable release, not the development branch. Definitely some people will make a mistake and edit the master branch, but of course we will see this in the PR during evaluation and I will update the documentation. With luck they can simply merge their changes with dev and then the PR will look like it came originally from dev.

@parrt
Member Author

parrt commented Feb 8, 2022

I've actually switched master back to being the default now and will update documentation/readme next

@parrt
Member Author

parrt commented Feb 9, 2022

Ok, until we can stabilize these PRs and branches etc... Please be careful with pulling and pushing :) I have updated the readme and contributing doc: https://github.com/antlr/antlr4/blob/master/CONTRIBUTING.md

@KvanTTT
Member

KvanTTT commented Feb 9, 2022

My name is missing in Authors and major contributors in both branches :) It was added some time ago by you.

@parrt
Member Author

parrt commented Feb 9, 2022

Oh! Damn. Sorry! Will fix asap. I edited the wrong branch and then cherry-picked.

@parrt
Member Author

parrt commented Feb 9, 2022

Fixed @KvanTTT !!

parrt added a commit that referenced this issue Apr 10, 2022
* Get rid of reflection in CodeGenerator

* Rename TargetType -> Language

* Remove TargetType enum, use String instead as it was before

Create CodeGenerator only one time during grammar processing, refactor code

* Add default branch to appendEscapedCodePoint for unofficial targets (Kotlin)

* Remove getVersion() overrides from Targets since they return the same value

* Remove getLanguage() overrides from Targets since common implementation returns correct value

* [again] don't use "quiet" option for mvn tests...hard to figure out what's wrong when failed.

* normalize targets to 80 char strings for ATN serialization, except Java which needs big strings for efficiency.

* Update actions.md

fixed a small typo

* Rename `CodeGenerator.createCodeGenerator` to `CodeGenerator.create`

* Replace constants on string literals in `appendEscapedCodePoint`

* Restore API of Target

getLanguage(): protected -> public as it was before

appendUnicodeEscapedCodePoint(int codePoint, StringBuilder sb, boolean escape): protected -> private (it's a new helper method, no need for API now)

Added comment for appendUnicodeEscapedCodePoint

* Introduce caseInsensitive lexer rule option, fixes #3436

* don't ahead of time compile for DART. See 8ca8804#commitcomment-62642779

* Simplify test rig related to timeouts (#3445)

* remove all -q quiet mvn options to see output on CI servers.

* run the various unit test classes in parallel rather than each individual test method, all except for Swift at the moment: `-Dparallel=classes -DthreadCount=4`

* use bigger machine at circleci

* No more test groups like parser1, parser2.

* simplify Swift like the other tests

* fix whitespace issues

* use 4.10 not 4.9.4

* improve releasing antlr doc

* Add Support For Swift Package Manager (#3132)

* Add Swift Package Manager Support

* Swift Package Dynamic

* 【fix】【test】Fix run process path

Co-authored-by: Terence Parr <[email protected]>

* use src 11 for tool, but 8 for plugin/runtime (#3450)

* use src 11 for tool, but 8 for plugin/runtime/runtime-tests.
* use 11 in CI builds

* cpp/cmake: Fix library install directories (#3447)

This installs DLLs in bin directory instead of lib.

* Python local import fixes (#3232)

* Fixed pygrun relative import issue

* Added name to contributors.txt

Co-authored-by: Terence Parr <[email protected]>

* Update javadoc to 8 and 11 (#3454)

* no need for plugin in runtime, always gen svg from dot for javadoc, gen 1.8 not 1.7 doc for runtime. Gen 11 for tool.

* tweak doc for 1.8 runtime.  Test rig should gen 1.8 not 1.7

* [Go] Fix (*BitSet).equals (#3455)

* set tool version for testing

* oops reversion tool version as it's not sync'd with runtime and not time to release yet.

* Remove unused variable from generated code (#3459)

* [C++] Fix bugs in UnbufferedCharStream (#3420)

* Escape bad words during grammar generation (#3451)

* Escape reserved words during grammar generation, fixes #1070 (for -> for_ but RULE_for)

Deprecate USE_OF_BAD_WORD

* Make name and escapedName consistent across tool and codegen classes

Fix other pull request notes

* Rename NamedActionChunk to SymbolRefChunk

* try out windows runners

* rename workflow

* Update windows.yml

Fix cmd line issue

* fix maven issue on windows

* use jdk 11

* remove arch arg

* display Github status for windows

* try testing python3 on windows

* try new run for python3 windows

* try new run for python3 windows (again)

* try new run for python3 windows (again2)

* try new run for python3 windows (again3)

* try new run for python3 windows (again4)

* try new run for python3 windows (again5)

* try new run for python3 windows

* try new run for python3 windows

* try new run for python3 windows

* ugh i give up. python won't install on github actions.

* Update windows.yml

try python 3

* Update windows.yml

* Update run-tests-python3.cmd

* Update run-tests-python3.cmd

* Create run-tests-python2.cmd

* Update windows.yml

* Update run-tests-python2.cmd

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Create run-tests-javascript.cmd

* Update run-tests-javascript.cmd

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Create run-tests-csharp.cmd

* Update windows.yml

* fix warnings in C# CI

* Update windows.yml

* Update windows.yml

* Create run-tests-dart.cmd

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update run-tests-dart.cmd

* Update run-tests-dart.cmd

* Update run-tests-dart.cmd

* Update run-tests-dart.cmd

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Create run-tests-go.cmd

* Update windows.yml

* Update windows.yml

* Update windows.yml

* GitHub action php (#3474)

* Update windows.yml

* Create run-tests-php.cmd

* Update run-tests-php.cmd

* Update run-tests-php.cmd

* Update run-tests-php.cmd

* Update run-tests-php.cmd

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update run-tests-php.cmd

* Update windows.yml

* Cleanup ci (#3476)

* Delete .appveyor directory

* Delete .travis directory

* Improve CI concurrency (#3477)

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Optimize toArray

replace toArray(new T[size]) with toArray(new T[0]) for better performance

https://shipilev.net/blog/2016/arrays-wisdom-ancients/#_conclusion

* add contributor

* resolve conflicts

* fix-maven-concurrency (#3479)

* fix-maven-concurrency

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update run-tests-python2.cmd

* Update run-tests-python3.cmd

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update windows.yml

* Update run-tests-php.cmd

* Update windows.yml

* Update run-tests-dart.cmd

* Update run-tests-csharp.cmd

* Update run-tests-go.cmd

* Update run-tests-java.cmd

* Update run-tests-javascript.cmd

* Update run-tests-php.cmd

* Update run-tests-python2.cmd

* Update run-tests-python3.cmd

* increase Windows CI concurrency for all targets except Dart

* Preserve line separators for input runtime tests data (#3483)

* Preserve line separators for input data in runtime tests, fix test data

Refactor and improve performance of BaseRuntimeTest

* Add LineSeparator (\n, \r\n) tests

* Set up .gitattributes for LineSeparator_LF.txt (eol=lf) and LineSeparator_CRLF.txt (eol=crlf)

* Restore `\n` for all input in runtime tests, add extra LexerExec tests (LineSeparatorLf, LineSeparatorCrLf)

* Add generated LargeLexer test, remove LargeLexer.txt descriptor

* tweak name to be GeneratedLexerDescriptors

* [JavaScript] Migrate from jest to jasmine

* [C++] Fix Windows min/max macro collision

* [C++] Update cmake README.md to C++17

* remove unnecessary comparisons.

* Add useful function writeSerializedATNIntegerHistogram for writing out information concerning how many of each integer value appear in a serialized ATN.

* fix  comment indicating what goes in the serialized ATN.

* move writeSerializedATNIntegerHistogram out of runtime.

* follow guidelines

* Fix .interp file parsing test for the Java runtime.

Also includes separating the generation of the .interp file from writing it out so that we can use both independently.

* Delete files no longer needed. Should have been part of #3520

* [C++] Optimizations and cleanups and const correctness, oh my

* [C++] Optimize LL1Analyzer

* [C++] Fix missing virtual destructors

* Remove not used PROTECTED, PUBLIC, PRIVATE tokens from ANTLRLexer.g

* Remove ANTLR 3 stuff from ANTLR grammars, deprecate ANTLR 3 errors

* Remove not used imaginary tokens from ANTLRParser.g

* Fix misprints in grammars

* ATN serialized data: remove shifting by 2, remove UUID; fix #3515

Regenerate XPathLexer files

* Disable native runtime tests (see #3521)

* Implement Java-specific ATN data optimization (+-2 shift)

* [C++] Remove now unused antlrcpp::Guid

* pull new branch diagram from master

* use dev not master branch for CI github

* update doc from master

* add back missing author

* [C++] Fix const correctness in ATN and DFA

* keep getSerializedATNSegmentLimit at max int

* Fixes #3259 make InErrorRecoveryMode public for go

* Change code gen template to capitalize InErrorRecoveryMode

* [C++] Improve multithreaded performance, fix TSAN error, and fix profiling ATN simulator setup bug

* Get rid of unnecessary allocations and calculations in SerializedATN

* Get rid of excess char escaping in generated files, decrease size of output files

Fix creation of excess fragments for Dart, Cpp, PHP runtimes

* Swift: fix binary serialization and use instead of JSON

* Fix targetCharValueEscape, make them final and static

* [C++] Cleanup ATNDeserializer and remove related deprecated methods from ATNSimulator

* Fix for #3557 (getting "go test" to work again).

* Convert Python2/3 to use int arrays not strings for ATN encodings (#3561)

* Convert Python2/3 to use int arrays not strings for ATN encodings. Also make target indicate int vs string.

* rename and reverse ATNSerializedAsInts

* add override

* remove unneeded method

* [C++] Drastically improve multi-threaded performance (#3550)

Thanks guys. A major advancement.

* [C++] Remove duplicate includes and remove unused includes (#3563)

* [C++] Lazily deserialize ATN in generated code (#3562)

* [Docs] Update Swift Docs (#3458)

* Add Swift Package Manager Support

* Swift Package Dynamic

* 【fix】【test】Fix run process path

* [Docs] [Swift] update link, remove expired descriptions

Co-authored-by: Terence Parr <[email protected]>

* Ascii only ATN serialization (#3566)

* go back to generating pure ascii ATN serializations to avoid issues where target compilers might assume ascii vs utf-8.

* forgot I had to change php on previous ATN serialization tweak.

* change how we escapeChar() per target.

* oops; gotta use escapeChar method

* rm unneeded case

* add @Override

* use ints not chars for C# (#3567)

* use ints not chars for C#

* oops. remove 'quotes'

* regen from XPathLexer.g4

* simplify ATN with bypass alts mechanism in Java.

* Change string to int[] for serialized ATN for C#; removed unneeded `use System` from XPathLexer.g4; regen that grammar.

* [C++] Use camel case name in generated lexers and parsers (#3565)

* Change string to int array for serialized ATN for JavaScript (#3568)

* perf: Add default implementation for Visit in ParseTreeVisitor.  (#3569)

* perf: Add default implementation for Visit in ParseTreeVisitor.

Reference: https://github.com/antlr/antlr4/blob/ad29539cd2e94b2599e0281515f6cbb420d29f38/runtime/Java/src/org/antlr/v4/runtime/tree/AbstractParseTreeVisitor.java#L18

* doc: add contributor

* Don't use utf decoding...these are just ints (#3573)

* [Go] Cleanup and fix ATN deserialization verification (#3574)

* [C++] Force generated static data type name to titlecase (#3572)

* Use int array not string for ATN in Swift (#3575)

* [C++] Fix generated Lexer static data constructor (#3576)

* Use int array not string for ATN in Dart (#3578)

* Fix PHP codegen to support int ATN serialization (#3579)

* Update listener documentation to satisfy the discussion about improving exception handling: #3162

* tweak

* [C++] Remove unused LexerATNSimulator::match_calls (#3570)

* [C++] Remove unused LexerATNSimulator::match_calls

* Remove match_calls from other targets

* [Java] Preserve serialized ATN version 3 compatibility (#3583)

* add jcking to the contributors list

* Update releasing-antlr.md

* [C++] Avoid using dynamic_cast where possible by using hand rolled RTTI (#3584)

* Revert "[Java] Preserve serialized ATN version 3 compatibility (#3583)"

This reverts commit 01bc811.

* [C++] Add ANTLR4CPP_PUBLIC attributes to various symbols (#3588)

* Update editorconfig for c++ (#3586)

* Make it easier to contribute: Add c++ configuration for .editorconfig.

Using the observed style with 2 indentation spaces.

Signed-off-by: Henner Zeller <[email protected]>

* Add hzeller to contributors.txt

Signed-off-by: Henner Zeller <[email protected]>

* Fix code style and typing to support PHP 8 (#3582)

* [Go] Port locking algorithm from C++ to Go (#3571)

* Use linux DCO not our old contributors certificate of origin

* [C++] Fix bugs in SemanticContext (#3595)

* [Go] Do not export Array2DHashSet which is an implementation detail (#3597)

* Revert "Use linux DCO not our old contributors certificate of origin"

This reverts commit b0f8551.

* Use signed ints for ATN serialization not uint16, except for java (#3591)

* refactor serialize so we don't need comments

* more cleanup during refactor

* store language in serializer obj

* A lexer rule token type should never be -1 (EOF). 0 is fragment but then must be > 0.

* Go uses int not uint16 for ATN now. java/go/python3 pass

* remove checks for 0xFFFF in Go.

* C++ uint16_t to int for ATN.

* add mac php dir; fix type on accept() for generated code to be mixed.

* Add test from @KvanTTT. This PR fixes #3555 for non-Java targets.

* cleanup and add big lexer from #3546

* increase mvn mem size to 2G

* increase mvn mem size to 8G

* turn off the big ATN lexer test as we have memory issues during testing.

* Fixes #3592

* Revert "C++ uint16_t to int for ATN."

This reverts commit 4d2ebbf.

# Conflicts:
#	runtime/Cpp/runtime/src/atn/ATNSerializer.cpp
#	runtime/Cpp/runtime/src/tree/xpath/XPathLexer.cpp

* C++ uint16_t to int32_t for ATN.

* rm unnecessary include file, updating project file. get rid of the 0xFFFF does in the C++ deserialization

* rm refs to 0xFFFF in swift

* javascript tests were running as Node...added to ignore list.

* don't distinguish between 16 and 32 bit char sets in serialization; Python2/3  updated to work with this change.

* update C++ to deserialize only 32-bit sets

* 0xFFFF -> -1 for C++ target.

* get other targets to use 32-bit sets in serialization. tests pass locally.

* refactor to reduce code size

* add comment

* oops. comment out call to writeSerializedATNIntegerHistogram(). I wonder if this is why it ran out of memory during testing?

* all but Java, Node, PHP, Go work now for the huge lexer file; I have set them to ignore.  note that the swift target takes over a minute to lex it.  I've turned off Node but it does not seem to terminate but it could terminate eventually.

* all but Java, Node, PHP, Go work now for the huge lexer file; I have set them to ignore.  note that the swift target takes over a minute to lex it.  I've turned off Node but it does not seem to terminate but it could terminate eventually.

* Turn off this big lexer because we get memory errors during continuous integration

* Intermediate commit where I have shuffled around all of the -1 flipping and bumping by two.  work still needs to be done because the token stream rewriter stuff fails. and I assume the other decoding for human readability testing if doesn't work

* convert decode to use int[]; remove dead code. don't use serializeAsChar stuff. more tests pass.

* more tests passing. simplify. When copying atn, must run ATN through serializer to set some state flags.

* 0xFFFD+ are not valid char

* clean up. tests passing now

* huge clean up. Got Java working with 32-bit ATNs!Still working on cleanup but I want to run the tests

* Cleanup the hack I did earlier; everything still seems to work

* Use linux DCO not our old contributors certificate of origin

* remove bump-by-2 code

* clean up per @KvanTTT. Can't test locally on this box. Will see what CI says.

* tweak comment

* Revert "Use linux DCO not our old contributors certificate of origin"

This reverts commit b0f8551.

* see if C++ works in CI for huge ATN

* Use linux DCO not our old contributors certificate of origin (#3598)

* Use linux DCO not our old contributors certificate of origin

* Revert "Use linux DCO not our old contributors certificate of origin"

This reverts commit b0f8551.

* use linux DCO

* use linux DCO

* Use linux DCO not our old contributors certificate of origin

* update release documentation

Signed-off-by: Terence Parr <[email protected]>

* Equivalent of #3537

* clean up setup

* clean up doc version

* [Swift] improvements to equality functions (#3302)

* fix default equality

* equality cases

* optional unwrapping

* [Swift] Use for in loops (#3303)

* common for in loops

* reversed loop

* drop first loop

* for in with default BitSet

* [Go] Fix symbol collision in generated lexers and parsers (#3603)

* [C++] Refactor and optimize SemanticContext (#3594)

* [C++] Devirtualize hand rolled RTTI for performance (#3609)

* [C++] Add T::is for type hierarchy checks and remove some dynamic_cast (#3612)

* [C++] Avoid copying statically generated serialized ATNs (#3613)

* [C++] Refactor PredictionContext and yet more performance improvements (#3608)

* [C++] Cleanup DFA, DFAState, LexerAction, and yet more performance improvements (#3615)

* fix dependabot issues

* [Swift] use stdlib (single pass) (#3602)

* this was added to the stdlib in Swift 5

* &>> is defined as lhs >> (rhs % lhs.bitwidth)

* the stdlib has these

* reduce loops

* use indices

* append(contentsOf:)

* Array literal init works for sets too!

* inline and remove bit query functions

* more optional handling (#3605)

* [C++] Minor improvements to PredictionContext (#3616)

* use php runtime dev branch to test dev

* update doc to be more explicit about the interaction between lexer actions and semantic predicates; Fixes #3611. Fixes #3606.

Signed-off-by: Terence Parr <[email protected]>

* Refactor js runtime in preparation of future improvements

* refactor, 1 file per class, use import, use module semantics, use webpack 5, use eslint

* all tests pass

* simplifications and alignment with standard js idioms

* simplifications and alignment with standard js idioms

* support reading legacy ATN

* support both module and non-module imports

* fix failing tests

* fix failing tests

* No longer necessary to generate sets or single atom transitions that are bigger than 16 bits. (#3620)

* Updated getting started with Cpp documentation. (#3628)

Included specific examples of using ANTLR4_TAG and ANTLR4_ZIP_REPOSITORY in the sample CMakeLists file.

* [C++] Free ATNConfig lookup set in readonly ATNConfigSet (#3630)

* [C++] Implement configurable PredictionContextMergeCache (#3627)

* Allow to choose to switch off building tests in C++ (#3624)

The new option to cmake ANTLR_BUILD_CPP_TESTS is default
on (so the behavior is as before), but it provides a way to
switch off if not needed.

The C++ tests pull in an external dependency (googletests),
which might conflict if ANTLR is used as a subproject in
another cmake project.

Signed-off-by: Henner Zeller <[email protected]>

* Fix NPE for undefined label, fix #2788

* An interval ought to be a value

Interval was a pointer to 2 Ints
it ought to be just 2 Ints, which is smaller and more semantically correct,
with no need for a cache.

However, this technically breaks metadata and AnyObject conformance but people shouldn't be relying on those for an Interval.

* [C++] Remove more dynamic_cast usage

* [C++] Introduce version macros

* add license prefix

* Prep 4.10 (#3599)

* Tweak doc

* Swift was referring to hardcoded version

* Start version update script.

* add files to update

* clean up setup

* clean up setup

* clean up setup

* don't need file

* don't need file

* Fixes #3600.  add instructions and associated code necessary to build the xpath lexers.

* clean up version nums

* php8

* php8

* php8

* php8

* php8

* php8

* php8

* php8

* tweak doc

* ok, i give up. php won't bump up to v8

* tweak doc

* version number bumped to 4.10 in runtime.

* Change the doc for releasing and update to use latest ST 4.3.2

* fix dart version to 4.10.0

* cmd files cannot use the bash export command.

* try fixing php ci again

* working on deploy

Signed-off-by: Terence Parr <[email protected]>

* php8 always install.

* set js to 4.10.0 not 4.10

* turn off apt update for php circleci

* try w/o cimg/php

* try setting branch

* ok i give up

* tweak

* update docs for release.

* php8 circleci

* use 3.5.3 antlr

* use 3.5.3-SNAPSHOT antlr

* use full 3.5.3 antlr

* [Swift] reduce Optionals in APIs (#3621)

* ParserRuleContext.children

see comment in removeLastChild

* TokenStream.getText

* Parser._parseListeners

this might require changes to the code templates?

* ATN {various}

* make computeReachSet return empty, not nil

* overrides refine optionality

* BufferedTokenStream getHiddenTokensTo{Left, Right} return empty not nil

* Update Swift.stg

* avoid breakage by adding overload of `getText` in extension

* tweak to kick off build

Signed-off-by: Terence Parr <[email protected]>

* try parallelism: 4 circleci

* Revert "[Swift] reduce Optionals in APIs (#3621)"

This reverts commit b5ccba0.

* tweaks to doc

* Improve the deploy  script and tweak the released doc.

* use 4.10 not Snapshot for scripts

Co-authored-by: Ivan Kochurkin <[email protected]>
Co-authored-by: Alexandr <[email protected]>
Co-authored-by: 100mango <[email protected]>
Co-authored-by: Biswapriyo Nath <[email protected]>
Co-authored-by: Benjamin Spiegel <[email protected]>
Co-authored-by: Justin King <[email protected]>
Co-authored-by: Eric Vergnaud <[email protected]>
Co-authored-by: Harry Chan <[email protected]>
Co-authored-by: Ken Domino <[email protected]>
Co-authored-by: chenquan <[email protected]>
Co-authored-by: Marcos Passos <[email protected]>
Co-authored-by: Henner Zeller <[email protected]>
Co-authored-by: Dante Broggi <[email protected]>
Co-authored-by: chris-miner <[email protected]>