ARROW-2676: [Packaging] Deploy build artifacts to github releases #2109
Codecov Report
@@ Coverage Diff @@
## master #2109 +/- ##
==========================================
- Coverage 86.39% 86.39% -0.01%
==========================================
Files 230 230
Lines 40706 40706
==========================================
- Hits 35167 35166 -1
- Misses 5539 5540 +1
Continue to review full report at Codecov.
wheel-linux \
wheel-win \
wheel-osx \
linux-packages
I'm not sure about this. I was thinking we would want to schedule nightlies using a cronjob of some kind. Or at least the manifest of nightly jobs would be specified in a YAML file someplace
This would be just for convenience, to use travis' cron instead of a self hosted one.
BTW I'm just refactoring the branching logic to support tagging / deploying (github releases).
This seems like a fine place to keep the nightly jobs list for now. There's nothing preventing us from moving this to a non-travis cron in the future if there's some reason we would need to do that.
dev/tasks/tasks.yml
Outdated
# branch: defaults to name
# platform: osx|linux|win
# template: path of jinja2 templated yml
# params: optional extra parameters
Having the template be a .travis.yml or .appveyor.yml file only in need of template processing seems potentially inflexible to me. Here is a possible alternative:
- tasks.yml provides the path to a task definition file (which could be YAML)
- The task definition file contains the platform, parameters, and additional information needed to process the task. The idea here would be to make the task definition file more modular; so simple tasks could just be a templated .travis.yml, but we wouldn't be constraining ourselves to that model
My guess is that as time goes on, more of the logic for each task will move from templates to Python code, as a means of improving code reuse.
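The alternative proposed above could look roughly like the following. This is a minimal sketch, not the crossbow implementation: `TaskDefinition`, `load_task`, and the `build-wheel.sh` command are hypothetical names, the definition is passed in as an already-parsed mapping rather than read from YAML, and `string.Template` stands in for jinja2 so the example has no third-party dependencies.

```python
from dataclasses import dataclass, field
from string import Template


@dataclass
class TaskDefinition:
    """Hypothetical in-memory form of one task definition file."""
    name: str
    platform: str                      # e.g. "linux", "osx" or "win"
    template: str                      # text of the CI config template
    params: dict = field(default_factory=dict)

    def render(self):
        # string.Template is a stand-in for jinja2 here; it substitutes
        # $name placeholders with the values from params.
        return Template(self.template).substitute(self.params)


def load_task(name, definition):
    """Build a TaskDefinition from an already-parsed mapping."""
    return TaskDefinition(
        name=name,
        platform=definition["platform"],
        template=definition["template"],
        params=definition.get("params", {}),
    )


task = load_task("wheel-linux", {
    "platform": "linux",
    "template": "script: build-wheel.sh --python $python_version",
    "params": {"python_version": "3.6"},
})
print(task.render())
```

Keeping the platform and parameters next to the template in one object is what would let later logic move from templates into Python without changing `tasks.yml`.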
I'm already lost in the YAML forest :)
I'm trying to update this PR to deploy the artifacts, hopefully by the sync, but these CI test loops are insanely time consuming.
@wesm Right now I'm struggling with getting AppVeyor to pass. Here https://github.com/kszucs/crossbow/blob/build-30/job.yml is an example yml containing the submitted job's state. There is a draft for querying task statuses too.
@kszucs Let's stick with github for now and we can extend to others if the need arises.
Looking mostly good! Couple of questions/comments.
@@ -343,6 +343,9 @@ set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${ARROW_CXXFLAGS}")
# For any C code, use the same flags.
set(CMAKE_C_FLAGS "${CMAKE_CXX_FLAGS}")

# Remove --std=c++11 to avoid errors from C compilers
string(REPLACE "-std=c++11" "" CMAKE_C_FLAGS ${CMAKE_C_FLAGS})
FYI, this is a temporary fix until we can address ARROW-2720. This is fragile if (with `CMAKE_CXX_STANDARD` set to 11) `-std=gnu++11` ends up in `CMAKE_C_FLAGS`.
This is removed in master (I removed it in a patch recently). I think this is a rebase artifact and should be removed
Yep, that removal broke osx builds though. That's why we added it back.
It broke osx wheel and/or conda builds, not CI builds (obviously) though.
That's super weird, how did that flag end up in `CMAKE_C_FLAGS`? What kind of error did it cause?
I don't know exactly what's happening, but I know that we are setting `CMAKE_CXX_STANDARD` to `11` and then setting `CMAKE_C_FLAGS` to `CMAKE_CXX_FLAGS`, which might explain how the std flag is ending up there. However, this doesn't show up on any *nix systems except OSX, which is why I'm not 100% sure what's happening.
The error you get is `error: invalid argument '-std=c++11' not allowed with 'C/ObjC'` when the `ae.c` file is being compiled during the plasma compilation. Without a mac to figure out what's going on, it's pretty hard for me to tell what the heck is happening.
Oh I see. I think gcc and clang just ignore the flag when compiling C code. This suggests we should be setting up the CMAKE_C_FLAGS separate from CXX. I'm opening a JIRA
Great! Thanks.
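The fragility mentioned in this thread comes from matching one exact flag string. As an illustration only (this is Python modelling the string handling, not CMake code, and `c_flags_from_cxx_flags` is a hypothetical helper), a pattern-based filter would drop any C++ dialect flag, including `-std=gnu++11`, while leaving C dialect flags alone:

```python
import re

# Matches -std=<dialect> or --std=<dialect> where the dialect contains "++",
# i.e. a C++ dialect such as c++11 or gnu++11. C dialects (c99, gnu11) do not match.
CXX_STD_FLAG = re.compile(r"--?std=\S*\+\+\S*")


def c_flags_from_cxx_flags(cxx_flags):
    """Return the flag string with every C++ -std flag removed."""
    kept = [flag for flag in cxx_flags.split()
            if not CXX_STD_FLAG.fullmatch(flag)]
    return " ".join(kept)


print(c_flags_from_cxx_flags("-O2 -std=gnu++11 -Wall"))
```

Setting the C flags up separately from the C++ flags, as suggested in the thread, avoids the need for any such filtering.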
dev/tasks/README.md
Outdated
@@ -106,7 +106,7 @@ The script does the following:
$ git clone https://github.com/kszucs/crossbow

$ cd arrow/dev/tasks
$ python crossbow.py
$ python crossbow.py submit <task names>
Can you give an example here instead of a placeholder?
dev/tasks/README.md
Outdated
@@ -115,7 +115,7 @@ The script does the following: | |||
|
|||
```bash | |||
git checkout ARROW-<ticket number> | |||
python dev/tasks/crossbow.py --dry-run | |||
python dev/tasks/crossbow.py submit --dry-run <task names> |
Same as above.
dev/tasks/README.md
Outdated
@@ -124,85 +124,56 @@ The script does the following:
3. Reads and renders the required build configurations with the parameters
   substituted.
2. Create a commit per build configuration to its own branch. For example
Is this still correct? Isn't it now "A branch per task, prefixed with the job id"?
# Configure conda.
- |
  echo ""
Any particular reason why we're echoing a newline here?
Copy paste from conda forge. This newline probably makes the output more readable.
dev/tasks/crossbow.py
Outdated
# format="[%(asctime)s] %(levelname)s Crossbow %(message)s",
# datefmt="%H:%M:%S",
# stream=click.get_text_stream('stdout')
# )
Any reason not to just delete this code instead of leaving it here commented out?
Well, the previous solution used logging instead of `click.echo`. It was Wes' request, but it generated a lot of noise, so I've switched to printing directly to stdout.
Honestly I'd remove it :)
dev/tasks/crossbow.py
Outdated
class Target(object):
class Repo(object):
Are we really trying to be python 2 compatible?
def next_job_id(self, prefix):
    """Auto increments the branch's identifier based on the prefix"""
    pattern = re.compile(prefix + '-(\d+)')
    matches = list(filter(None, map(pattern.match, self.repo.branches)))
Is there a way to get the most recently created branch, from the server, so that we don't have to look at every branch that exists every time we create a new one?
I'm not aware of any, but I don't think we should worry about that - with libgit2 traversing is really fast.
If it's going to be a problem, we can address it later.
Sounds good.
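For reference, the branch scan discussed in this thread can be sketched without a git library. This is a standalone approximation of the quoted `next_job_id` method, not the code in the PR: it takes a plain list of branch names instead of `self.repo.branches`, and it uses `re.escape` and a raw string, which are additions of this sketch.

```python
import re


def next_job_id(branch_names, prefix):
    """Return '<prefix>-<n+1>' where n is the highest id seen so far."""
    # Raw string avoids the invalid-escape warning that '-(\d+)' can trigger.
    pattern = re.compile(re.escape(prefix) + r"-(\d+)")
    matches = filter(None, map(pattern.match, branch_names))
    numbers = [int(match.group(1)) for match in matches]
    # default=0 makes the first job 'prefix-1' when no branch matches yet.
    return "{}-{}".format(prefix, max(numbers, default=0) + 1)


print(next_job_id(["master", "build-1", "build-12", "build-3"], "build"))
```

The scan is linear in the number of branches, which matches the reply above that traversal cost is not expected to matter in practice.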
provider: releases
api_key: $CROSSBOW_GITHUB_TOKEN
file_glob: true
file: /path/to/pachages/*.tar.gz
Does this need to be changed?
After the build passes.
dev/tasks/README.md
Outdated
@@ -106,7 +106,7 @@ The script does the following:
$ git clone https://github.com/kszucs/crossbow

$ cd arrow/dev/tasks
$ python crossbow.py
$ python crossbow.py submit <task names>
Is there a way that we can make the `submit` subcommand show a better error message if the task isn't valid?
Right now I get this:
$ python crossbow.py submit --dry-run linux-packges
Traceback (most recent call last):
File "crossbow.py", line 409, in <module>
crossbow(auto_envvar_prefix='CROSSBOW')
File "/home/phillip/miniconda/envs/pyarrow36/lib/python3.6/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/home/phillip/miniconda/envs/pyarrow36/lib/python3.6/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/home/phillip/miniconda/envs/pyarrow36/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/phillip/miniconda/envs/pyarrow36/lib/python3.6/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/phillip/miniconda/envs/pyarrow36/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "crossbow.py", line 338, in submit
tasks = {name: tasks[name] for name in task_names}
File "crossbow.py", line 338, in <dictcomp>
tasks = {name: tasks[name] for name in task_names}
KeyError: 'linux-packges'
Addressed in kszucs#3
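What the linked patch actually does is not shown in this thread. One plausible shape for such a check, sketched under that caveat, is to validate the names before the dictionary lookup and suggest near matches. `select_tasks` is a hypothetical helper name; inside a click command the `ValueError` would more naturally be a click usage error.

```python
import difflib


def select_tasks(tasks, task_names):
    """Return the requested subset of tasks, or fail with a readable message."""
    unknown = [name for name in task_names if name not in tasks]
    if unknown:
        problems = []
        for name in unknown:
            # Suggest the closest known task name, if any is similar enough.
            close = difflib.get_close_matches(name, list(tasks), n=1)
            hint = " (did you mean '{}'?)".format(close[0]) if close else ""
            problems.append("unknown task '{}'{}".format(name, hint))
        raise ValueError("; ".join(problems))
    return {name: tasks[name] for name in task_names}


known = {"linux-packages": {}, "wheel-linux": {}}
try:
    select_tasks(known, ["linux-packges"])
except ValueError as error:
    print(error)
```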
return Job(tasks=tasks, branch=data.get('branch'))

def files(self):
    return {'job.yml': yaml.dump(self.to_dict(), default_flow_style=False)}


# this should be the mailing list
MESSAGE_EMAIL = '[email protected]'
Can we instead 1) make this configurable via an environment variable, and 2) default to the email of the git user triggering the builds?
I can submit a patch to your fork for these changes.
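The two requests above (an environment variable override, then the git user's email as the default) could be combined roughly as follows. This is a sketch under assumptions: the variable name `CROSSBOW_EMAIL` and the function `message_email` are hypothetical, chosen only to match the `CROSSBOW` prefix visible in the tracebacks on this page.

```python
import os
import subprocess


def message_email(env=None, fallback=None):
    """Resolve the notification address: env var first, then git config."""
    env = os.environ if env is None else env
    configured = env.get("CROSSBOW_EMAIL")  # hypothetical variable name
    if configured:
        return configured
    try:
        # Ask git for the email of the user triggering the builds.
        result = subprocess.run(
            ["git", "config", "user.email"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            check=True,
        )
        git_email = result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        # git is missing or user.email is not set.
        git_email = ""
    return git_email or fallback
```

Passing `env` explicitly keeps the function easy to exercise in tests without touching the real environment.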
@@ -80,12 +80,12 @@ submission. The tasks are defined in `tasks.yml`
6. Install the python dependencies for the script:
One thing that is worth noting in this `README.md` is that you need at least one commit in your queue repo or else this happens:
Traceback (most recent call last):
File "crossbow.py", line 413, in <module>
crossbow(auto_envvar_prefix='CROSSBOW')
File "/home/phillip/miniconda/envs/pyarrow36/lib/python3.6/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/home/phillip/miniconda/envs/pyarrow36/lib/python3.6/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/home/phillip/miniconda/envs/pyarrow36/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/phillip/miniconda/envs/pyarrow36/lib/python3.6/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/phillip/miniconda/envs/pyarrow36/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "crossbow.py", line 356, in submit
queue.put(job)
File "crossbow.py", line 186, in put
branch = self._create_branch(task.branch, files=task.files())
File "crossbow.py", line 158, in _create_branch
parents = parents or [self.head]
File "crossbow.py", line 96, in head
return self.repo.head.target
_pygit2.GitError: reference 'refs/heads/master' not found
Is there any way that we can automate this away? If not, make sure to add a note about it in the README.
We don't need parent commits, it just makes the graph nicer.
Resolved
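How this was resolved is not shown here. Since parent commits are described above as optional, one possible approach is to fall back to an empty parent list when the queue repo has no commits yet. The sketch below assumes a pygit2-style repository object exposing `head_is_unborn` and `head.target`; `parents_for_commit` is a hypothetical helper, and a stand-in object is used so the example runs without pygit2.

```python
from types import SimpleNamespace


def parents_for_commit(repo, parents=None):
    """Pick parent commits, tolerating a repository without any commit."""
    if parents:
        return list(parents)
    if repo.head_is_unborn:
        # Empty repo: create a root commit instead of failing on repo.head.
        return []
    return [repo.head.target]


# Stand-ins for an empty and a non-empty repository.
empty_repo = SimpleNamespace(head_is_unborn=True)
normal_repo = SimpleNamespace(head_is_unborn=False,
                              head=SimpleNamespace(target="abc123"))
print(parents_for_commit(empty_repo), parents_for_commit(normal_repo))
```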
@cpcloud https://github.com/kszucs/crossbow/releases/tag/build-95 26 artifacts, is that correct? (not counting linux packages)
@kszucs There should be two each of pyarrow and arrow-cpp windows packages: one for python 3.5 and one for python 3.6 for a total of 4 windows conda packages + parquet-cpp
Restarted the travis build
@cpcloud All of the conda and whl artifacts are uploaded now https://github.com/kszucs/crossbow/releases
@wesm @xhochy @kszucs has artifacts for all platforms and pythons here if you're interested: https://github.com/kszucs/crossbow/releases/tag/build-115
@kszucs Do we have a JIRA for this?
|
dev/tasks/conda-recipes/appveyor.yml
Outdated
- for %%f in (*.tar.bz2) do ren "%%f" "%%~nf-win-64.tar.bz2"
- for %%f in (*.tar.bz2) do (
    set %%g=%%~nf
    ren "%%f" "%%~ng-win-64.tar.bz2"
Nice. Batch is truly horrifying sometimes.
grgrgr
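The difficulty in the batch version is that `.tar.bz2` is a double extension, so stripping only the last extension leaves `.tar` in the name. For comparison, the intended rename is short in Python. This is an illustrative sketch only, not a proposed replacement for the AppVeyor step; `add_platform_suffix` is a hypothetical name.

```python
from pathlib import Path


def add_platform_suffix(directory, platform="win-64", ext=".tar.bz2"):
    """Rename '<name>.tar.bz2' to '<name>-win-64.tar.bz2' in a directory."""
    renamed = []
    # sorted() materialises the listing before any file is renamed.
    for path in sorted(Path(directory).glob("*" + ext)):
        stem = path.name[:-len(ext)]       # drop the full double extension
        if stem.endswith("-" + platform):
            continue                        # already renamed, leave it alone
        target = path.with_name("{}-{}{}".format(stem, platform, ext))
        path.rename(target)
        renamed.append(target.name)
    return renamed
```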
@cpcloud Created JIRA for that https://issues.apache.org/jira/browse/ARROW-2724
@wesm I'm ready to merge if you are. There are a couple of follow up tasks:
+1 to merge and move on to the next stage of work. We need to update dev/release/RELEASE_MANAGEMENT.md. Maybe when this is done you should advise the mailing list so some others can take this for a test run and see what issues they run into.
@pcmoritz @robertnishihara Have either of you encountered this error before: https://travis-ci.org/apache/arrow/jobs/395810684 I'm guessing that we happened to get a machine with very little memory on that particular run. Wondering if this is transient or an honest-to-goodness bug.
Arg, restarting the build appears to overwrite the previous build's log.
Ok, merging.
Thanks @kszucs! Onwards!
Added a new task which triggers crossbow builds on master@crossbow. See travis output https://travis-ci.org/kszucs/crossbow/builds/388667590
Here are the boxes we need to check:
- Create a separate tagged branch that contains a YAML file indicating the information about each task created as part of the run. There should be one entry for each job that was created: the git hash for the task, the CI service used to run the task, etc. It should also indicate whether one or more artifacts are expected to be uploaded.
- Write a status tool which can query the status of a particular run and determine if the run is complete (needs cleanup).
- Can we run each desired task in a particular CI service?
- We can determine the list of created tasks associated with a particular run.
- Tasks should be configured with the tag name, and artifacts should be uploaded to GitHub under the tag, which should appear as a release on the repo.
- Each task can upload its artifacts to a deterministic central location (e.g. GitHub), where the artifacts are not commingled with any other run. Only linux packages are failing; I suggest resolving that in a subsequent PR (issue https://issues.apache.org/jira/browse/ARROW-2713).
- We can determine whether all the expected artifacts from a particular run have been successfully uploaded (i.e. to GitHub). To be done in https://issues.apache.org/jira/browse/ARROW-2724
- We can download all the artifacts from a successful run and GPG sign them for purposes of a release vote.
Example of artifacts available here https://github.com/kszucs/crossbow/releases
Jobs and tasks here https://github.com/kszucs/crossbow/branches
Job definition here https://github.com/kszucs/crossbow/blob/build-36/job.yml
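The check for whether all expected artifacts of a run have been uploaded could be sketched as below. The layout of the job mapping (a `tasks` key whose entries carry an `artifacts` list) is an assumption made for this example and is not taken from the linked `job.yml`; `missing_artifacts` is a hypothetical helper, and the uploaded file names would come from the GitHub release for the run's tag.

```python
def missing_artifacts(job, uploaded):
    """Map each task name to the expected artifacts that are not uploaded."""
    uploaded = set(uploaded)
    missing = {}
    for task_name, task in job["tasks"].items():
        absent = [name for name in task.get("artifacts", [])
                  if name not in uploaded]
        if absent:
            missing[task_name] = absent
    return missing


# Hypothetical job description and release contents.
job = {"tasks": {
    "wheel-linux": {"artifacts": ["pyarrow-linux.whl"]},
    "wheel-osx": {"artifacts": ["pyarrow-osx.whl"]},
}}
print(missing_artifacts(job, ["pyarrow-linux.whl"]))
```

An empty result would mean the run is complete and its artifacts are ready to be downloaded and signed.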