Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-17353: [Release][R] Validate binaries version #14396

Merged
merged 4 commits into from
Oct 19, 2022

Conversation

kou
Copy link
Member

@kou kou commented Oct 13, 2022

We generate version in Crossbow not CI job.
We can use file name validation feature by using version generated by Crossbow.

We use "X.Y.Z" version for RC version such as "10.0.0" for "10.0.0-rc1".

@github-actions
Copy link

@github-actions
Copy link

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@kou
Copy link
Member Author

kou commented Oct 13, 2022

This is the result of archery crossbow submit r-binary-packages. Outputs have .YYYYMMDD suffix.

Revision: e17c09b

Submitted crossbow builds: ursacomputing/crossbow @ build-685

Task Status
r-binary-packages Github Actions

@kou

This comment was marked as outdated.

@kou
Copy link
Member Author

kou commented Oct 13, 2022

This is the result of archery crossbow submit r-binary-packages --arrow-version 10.0.0-rc1. Outputs don't have .YYYYMMDD suffix.

Revision: 6505489

Submitted crossbow builds: ursacomputing/crossbow @ build-687

Task Status
r-binary-packages Github Actions

@kou
Copy link
Member Author

kou commented Oct 14, 2022

@assignUser What do you think about this approach?

A normal case #14396 (comment) still use .YYYYMMDD suffix.

A RC case #14396 (comment) doesn't use .YYYYMMDD suffix.

Output files are validated by Crossbow in both cases.

Copy link
Member

@assignUser assignUser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kou thanks! Do I understand correctly you added the arrow.repo option so binary verification from the staging area of the artifactory is possible?

The only issue I see is that for R packages the usual convention for dev versions is last.released.version.large_numeric_component compared with the use of future.release.version.dev_int in the other components. Changing that might cause confusion in the R nightly users (not sure how big that group is?). cc @nealrichardson

profile_path <- file.path(getwd(), ".Rprofile")
repo = paste0("file://", getwd(), "/repo")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
repo = paste0("file://", getwd(), "/repo")
repo <- paste0("file://", getwd(), "/repo")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry!
I've fixed.

@nealrichardson
Copy link
Member

@kou thanks! Do I understand correctly you added the arrow.repo option so binary verification from the staging area of the artifactory is possible?

The only issue I see is that for R packages the usual convention for dev versions is last.released.version.large_numeric_component compared with the use of future.release.version.dev_int in the other components. Changing that might cause confusion in the R nightly users (not sure how big that group is?). cc @nealrichardson

Correct, I think we should stick with the last.released.version.large_numeric_component versioning we currently do. If today's nightly were 10.0.0.20221014, when we release 10.0.0 the release will appear to be a lower version number, so a user trying to update packages wouldn't pick it up.

@kou
Copy link
Member Author

kou commented Oct 15, 2022

This is the result of archery crossbow submit r-binary-packages. Outputs use 9.0.0.100000466.

Revision: cd8bc0e

Submitted crossbow builds: ursacomputing/crossbow @ build-691

Task Status
r-binary-packages Github Actions

@kou
Copy link
Member Author

kou commented Oct 15, 2022

This is the result of archery crossbow submit r-binary-packages --arrow-version 10.0.0-rc1. Outputs use 10.0.0 suffix.

Revision: cd8bc0e

Submitted crossbow builds: ursacomputing/crossbow @ build-692

Task Status
r-binary-packages Github Actions

@kou
Copy link
Member Author

kou commented Oct 15, 2022

Do I understand correctly you added the arrow.repo option so binary verification from the staging area of the artifactory is possible?

No. I added it for using the artifacts (built in the current r-binary-packages task) at local.

But we can use it to verify the artifacts at https://apache.jfrog.io/ui/native/arrow/r-rc/ in dev/release/verify-release-candidate.sh. (It's out of scope of this pull request.)

The only issue I see is that for R packages the usual convention for dev versions is last.released.version.large_numeric_component compared with the use of future.release.version.dev_int in the other components. Changing that might cause confusion in the R nightly users (not sure how big that group is?).

OK. I've changed to use 9.0.0.${1_0000_00_00 + dev_int} (without --arrow-version / see #14396 (comment) / 1_0000_00_00 + is just for generating a greater number than YYYYMMDD. We can just use 9000 instead if we don't need to care existing developers who use our nightly packages) and 10.0.0 (with --arrow-version 10.0.0-rc1 / see #14396 (comment) ).

# version: 10.0.0.dev234
# r_version: 9.0.0.9000
# -> 9.0.0.100000234
r_version = re.sub(r"\.9000\Z", f".{r_version_dev}", r_version)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version is only used for nightly builds (which are uploaded to nightlies.a.o) and for crossbow jobs triggered manually right? Is there a reason we need to move to the "dev version" numbering or could we just use datetime to stick with the existing YYYYMMDD numbering?

Suggested change
r_version = re.sub(r"\.9000\Z", f".{r_version_dev}", r_version)
r_version = re.sub(r"\.9000\Z", f".{date.today().strftime('%Y%m%d')}", r_version)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version is only used for nightly builds (which are uploaded to nightlies.a.o) and for crossbow jobs triggered manually right?

Right.

Is there a reason we need to move to the "dev version" numbering or could we just use datetime to stick with the existing YYYYMMDD numbering?

  1. Consistency:
  2. To identify the exact commit:
    • If we have multiple Apache Arrow R related commits (including Apache Arrow C++ related commits) in a day, we need to extra work to identify the commit of a nightly package with YYYYMMDD version.
    • If we use the "dev version", we can identify the commit of a nightly package without extra work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see, thanks for explaining. I have no strong opinion on this. @nealrichardson what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as the x.y.z version number is the same as the current release and the "dev version" in the nightly is > 9000, I don't feel strongly. It's not clear to me how it's less work to turn 100000234 into a specific commit than it is to convert YYYYMMDD--still gotta look that up somewhere--in practice all that matters is that the nightly package version number is greater than last night's release and less than the next major release.

Copy link
Member Author

@kou kou Oct 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry. "without extra work" is exaggerated...

We need to do the following to show the commit for 1000000234:

$ git log "master~$[$(git describe --tags master | cut -d- -f4) - (1000000234 - 1000000000)]"

I'll resolve conflict and merge this for 10.0.0.

If we find any problem with this approach, we can try another approach for 11.0.0 or later.

@kou
Copy link
Member Author

kou commented Oct 19, 2022

This is the result of archery crossbow submit r-binary-packages.

Revision: 1c64902

Submitted crossbow builds: ursacomputing/crossbow @ build-693

Task Status
r-binary-packages Github Actions

@kou
Copy link
Member Author

kou commented Oct 19, 2022

This is the result of archery crossbow submit r-binary-packages --arrow-version 10.0.0-rc1.

Revision: 1c64902

Submitted crossbow builds: ursacomputing/crossbow @ build-694

Task Status
r-binary-packages Github Actions

@kou kou merged commit 3c0a7a6 into apache:master Oct 19, 2022
@kou kou deleted the ci-r-packages branch October 19, 2022 04:36
kou added a commit that referenced this pull request Oct 20, 2022
We generate version in Crossbow not CI job.
We can use file name validation feature by using version generated by Crossbow.

We use "X.Y.Z" version for RC version such as "10.0.0" for "10.0.0-rc1".

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants