clean obsolote artifacts lingering around, fixes Jenkins builds #7827

Merged
merged 1 commit on Aug 1, 2018

Conversation

jalvz
Contributor

@jalvz jalvz commented Aug 1, 2018

No description provided.

@jalvz jalvz requested a review from kvch August 1, 2018 11:16
Contributor

@kvch kvch left a comment


WFG. Thanks for fixing the problem!

@jalvz jalvz changed the title clean obsolote artifacts ingering around, fixes Jenkins builds clean obsolote artifacts lingering around, fixes Jenkins builds Aug 1, 2018
@kvch
Contributor

kvch commented Aug 1, 2018

The failure on Travis is unrelated.

@andrewkroh
Member

Which Jenkins failure does this fix?

# Write cached magefile binaries to workspace to ensure
# each run starts from a clean slate.
export MAGEFILE_CACHE="${WORKSPACE}/.magefile"

  • We set MAGEFILE_CACHE to point to the Jenkins workspace, so either this code is not being invoked or the workspace is not being destroyed between runs.
  • MAGEFILE_CACHE isn't being set in the release manager environment AFAIK, but the VM is destroyed between runs like it is on Travis. Do we need it set for the RM? If so we can add it to the release-manager-release/snapshot targets (a sketch follows this list).
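
For illustration only, a minimal sketch of what setting it for the RM could look like, assuming the release manager drives the build through a shell wrapper (the build/.magefile path below is an assumption, and release-manager-snapshot is just one of the two targets mentioned above):

# Hypothetical wrapper snippet: give mage a per-build cache so a stale
# binary from a previous run can never be reused.
export MAGEFILE_CACHE="${PWD}/build/.magefile"   # cache path is an assumption
make release-manager-snapshot                    # or release-manager-release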

@jalvz
Contributor Author

jalvz commented Aug 1, 2018

It fixes https://apm-ci.elastic.co/job/elastic+apm-server+pull-request+multijob-intake/1224/console, and I assume all these: https://beats-ci.elastic.co/job/elastic+apm-server+master+beats-update/

I don't know how it all works; I basically ssh'ed into a Jenkins worker and found that this did the job.
If you get to the root cause and have a cleaner fix, go for it :) For the moment this unblocked us.

FWIW I also stumbled on it locally: two consecutive runs of make update in different branches failed unless I ran make clean in between.

Out of curiosity: why is there a mage target in libbeat/scripts/Makefile and another, almost identical one in the Makefile in the root folder?

@kvch
Contributor

kvch commented Aug 1, 2018

The problem is with the release manager and the apm-server build. I was able to reproduce it locally by having apm-server and beats under the same GOPATH: first run make update in apm-server, and then running make update in beats fails. That's because fields.yml is written to a file named "-" instead of being printed to stdout, since the previously cached binary, built from apm-server's out-of-date mage/fields.go, was run.
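
A rough repro sketch of that sequence, with assumed repository paths (only the order of the make update invocations matters):

cd "$GOPATH/src/github.com/elastic/apm-server" && make update   # caches a mage binary built against apm-server's vendored dev-tools
cd "$GOPATH/src/github.com/elastic/beats" && make update        # reuses the stale cached binary; fields.yml lands in a file named "-"
mage -clean                                                     # clearing the cache (or make clean) works around it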

@andrewkroh
Member

@jalvz It looks like apm-server has its own Jenkins script for the intake jobs that does not reuse the jenkins_setup function from Beats. If the two projects could share the common setup code then this would be good.

@andrewkroh
Member

@kvch Thanks for the explanation. It is very helpful in understanding the issue.

So the mage-related issue is caused by mage not updating its cached binary when apm-server/vendor/github.com/elastic/beats/dev-tools/mage is updated. This relates to magefile/mage#114.

For example, when mage fields is invoked, it compiles a new binary to ~/.magefile/<sha1 hash of magefile.go> and invokes that binary. So if magefile.go is unchanged but a dependency changes, you'll have a stale executable unless you forcefully clean with mage -clean. This is why MAGEFILE_CACHE must be set to $WORKSPACE/.magefile in CI environments, so that the cache is destroyed between job executions.
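
As a hedged illustration of that caching behaviour (nothing below is specific to Beats):

mage fields                                      # compiles and caches a binary under ~/.magefile/<sha1 of magefile.go>
mage fields                                      # reuses the cached binary even if a vendored dependency changed
mage -clean                                      # forces the binary to be rebuilt
export MAGEFILE_CACHE="${WORKSPACE}/.magefile"   # in CI: tie the cache to the per-job workspace instead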

@kvch
Contributor

kvch commented Aug 1, 2018

@andrewkroh so should we get rid of this patch in beats and let apm-server fix their Jenkins script?

@jalvz
Contributor Author

jalvz commented Aug 2, 2018

@andrewkroh Thanks for digging in, all clear now. I understand that when CI was set up in apm-server this kind of thing wasn't foreseen.
I'll see what I can do; then this patch can be reverted if it turns out to be an issue.
