Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement VersionCollection in Java #34050

Merged

Conversation

alpar-t
Copy link
Contributor

@alpar-t alpar-t commented Sep 25, 2018

This is not a straight forward rewrite.
It's in preparation of removing the build and snapshot from Version classes in both core and build-tools. As such, the Gradle project will track qualifiers and snapshots in version names, but Elasticsearch will have no knowledge of these.

We used to mark unreleased versions as snapshots in the version collection to differentiate them,
this is no longer possible once we remove that property.
The new implementation replaces the old Groovy one and brings a few changes:

  • does not track unreleased versions by name. Unfortunately we can't generate Gradle project names dynamically, so the named projects that used to build these unreleased versions are now just numbered placeholders to which we bind the versions in order at configuration time.
    • hopefully this approach make the intent of version collection cleaner, and simplifies the implementation of some methods.
    • I found the existing names confusing
    • no longer uses snapshot in any of the terminology, I found that confusing in the context of bwc testing as it clashes with the snapshot functionality in Elasticsearch.
  • relaxes equals constraints on the Version class from the build to disregard qualifier and version
  • different testing strategy based on what's currently in git. These tests were implemented against the Groovy implementation and then ported to Java to assure correctness. Hopefully will make the tests easier to understand.

Parse the versions from the Java source files and apply rules to figure
out which are the wire and index compatible ones from these.
On top of that figure out which are unreleased and the branches these
live on so we can build them, without  coming up with names for each.
Since we can't create the Gradle projects dynamically, use place-holders
and bind unreleased versions to them by index.
Hopefully this is less confusing than the assigning names to the
unreleased versions.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@@ -103,10 +103,6 @@ subprojects {
* in a branch if there are only betas and rcs in the branch so we have
* *something* to test against. */
VersionCollection versions = new VersionCollection(file('server/src/main/java/org/elasticsearch/Version.java').readLines('UTF-8'))
if (versions.currentVersion != VersionProperties.elasticsearch) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check is moved into version collections.
I tried to move checks and decisions there for better visibility.

project("$project.path:build-unreleased-version-$unreleasedVersion.index") {
Version bwcVersion = unreleasedVersion.version
String bwcBranch = unreleasedVersion.branch
apply plugin: 'distribution'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything below this line is just indented without any other changes except:

  • some exceptions include the version number that was built
  • the indenting output stream indents with the version number, so [bwc] becomes e.x. [6.5.0]

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking on this long overdue cleanup! The more we can convert our gradle code to java the better.

I don't think we should remove the named versions of unreleased versions. The names have very specific meaning that maps to the rules we have about bwc across versions that we want to test. Regarding the use of "snapshot", there are many things in elasticsearch and other software projects that context is needed to understand the semantics of a term. Changing to "unreleased" is fine, though, but I would keep the flag available on Version.

Finally, since this is a rewrite, I don't think a TreeSet is needed here at all. @hub-cap and I discussed its use long ago. See VersionUtils.resolveReleasedVersions for an alternative way to determine which versions are unreleased based on our cross branch compatibility rules. I always meant to go back and change VersionCollection to use the same algorithm, but it was not a high priority.

@hub-cap
Copy link
Contributor

hub-cap commented Sep 26, 2018

I agree w @rjernst that the code he wrote in VersionUtils is wayyyy better than using the TreeSet. Can we instead port the code to look more like that?

@alpar-t
Copy link
Contributor Author

alpar-t commented Sep 27, 2018

I adopted the general idea of groupingBy, but also kept the SortedSet around because I liked how expressive the .last() method is when used like that. I also kept an implementation that looks for unreleased versions to emphasize that this is what the build is primarily interested in ( so we can build them).

I'll rename the projects back as well.

@alpar-t
Copy link
Contributor Author

alpar-t commented Sep 28, 2018

I renamed the projects and improved the documentation.
Also moved the versionCheck logic to VersionCollection so I can unit test it, and to keep it all in a single place as much as possible.

@alpar-t
Copy link
Contributor Author

alpar-t commented Sep 28, 2018

@rjernst @hub-cap can you take another look please ?

@hub-cap
Copy link
Contributor

hub-cap commented Sep 28, 2018

Ill have a look after @rjernst +1s the approach. I will say that the code that he wrote was wayyyyy more readable then the hot mess of TreeSet code I originally wrote. He even advised against it, to my chagrin. Boy was I wrong. So, past me wants to let it go, but present me thinks his List<List<Version>> code is much more readable. but i def wont -1 this if @rjernst is happy.

@rjernst
Copy link
Member

rjernst commented Sep 28, 2018

I think the sorting based approach is much more readable (and less complex) and we should not use a TreeSet.

@alpar-t
Copy link
Contributor Author

alpar-t commented Oct 1, 2018

@rjernst TreeSet is now removed.

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the changes so far @atorok! I left a lot of little implementation comments. I also think @hub-cap should look at this, at least for the javadocs in VersionCollection, as I know he spent a lot of time on the previous description, and making sure they make sense for another person would be good.

TESTING.asciidoc Outdated Show resolved Hide resolved
build.gradle Show resolved Hide resolved
build.gradle Outdated
ext.projectSubstitutions["org.elasticsearch.distribution.zip:elasticsearch-oss:${snapshot}"] = snapshotProject
}
// substitute unreleased versions with projects that check out and build locally
bwcVersions.forPreviousUnreleased { VersionCollection.UnreleasedVersionDescription unreleasedVersion ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a foreach closure? I'm fine with the name change to "unreleased" instead of "snapshot", but can't we just have bwcVersions.unreleased as a collection that we can iterate here as before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to avoid creating a collection and returning it when the calling code is really interested in iteration. It could have to be a collection of UnreleasedVersionDescription built each time, or build once cached. I taught this is simpler and easier to read and fits well with Gradle DSL.

We could simply call it unreleased if you think the extra detail in the name makes it actually more confusing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the collection should be created once in the constructor, so there shouldn't be any concern about returning it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that having the collections as different views of the same data makes the class harder to understand and in this particular case, it won't make a difference in performance large enough to be worthwhile. Do you see other advantages in doing so besides not having to compute it multiple times ?

import static java.util.Collections.unmodifiableList;

/**
* Parse the Java source file containing the versions declarations and use the known rules to figure out which are all
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and all javadocs) should start off with a simple one line description. In this case, something like:

A container for elasticsearch supported version information.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first comment here seems like it is specific to the constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add the summary, thanks!
The constructor that does this is the only public one.
I thought of moving this but then kept it from the original javadoc as I think it helps in understanding what's going on and where the information is coming from especially for someone coming over from TESTING. I'm ok to move it to the constructor if you think it's a better fit.


if (version.getRevision() == 0) {
if (releasedMajorGroupedByMinor
.get(releasedMajorGroupedByMinor.keySet().stream().max(Integer::compareTo).orElse(0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list of versions within a major should be sorted, so we should be able to just look at the last element?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's getting the largest key, i.e. the latest minor. We don't have ordering for the keys.

distribution/bwc/build.gradle Outdated Show resolved Hide resolved

public List<Version> getUnreleasedWireCompatible() {
List<Version> unreleasedWireCompatible = new ArrayList<>(getWireCompatible());
unreleasedWireCompatible.retainAll(getUnreleased());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retainAll with 2 lists is going to be quadratic. The unreleased list should be sorted. The same logic suggested above on finding index/wire compatible can be used here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like how well this reads, and the number of versions is finite in practice.

@hub-cap
Copy link
Contributor

hub-cap commented Oct 29, 2018

This is on the top of my queue for tomorrow morning reviews.

@alpar-t
Copy link
Contributor Author

alpar-t commented Oct 30, 2018

@rjernst I responded to or made changes for all your comments.
@hub-cap looking forward !

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I still disagree with some of the internal implementation details, I don't see anything incorrect, and I don't want to block downstream changes that depend on this rewrite. This LGTM, assuming @hub-cap looks at the javadocs and is also ok with the implementation.

@alpar-t alpar-t merged commit f8378d9 into elastic:master Nov 1, 2018
@alpar-t alpar-t deleted the remove-version-qualifier-snapshot-build branch November 1, 2018 15:44
alpar-t added a commit that referenced this pull request Nov 1, 2018
@mark-vieira mark-vieira added the Team:Delivery Meta label for Delivery team label Nov 11, 2020
@PeterYang03110
Copy link

Hello, everyone.
I am so interested elasticsearch and have some question.
How to build the plugin again?
I have find some error when rebuild.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Delivery/Build Build or test infrastructure >non-issue Team:Delivery Meta label for Delivery team v6.6.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants