[CI] DebMetadataTests test05CheckLintian failing on libtorch_cpu.so #88090

Closed
grcevski opened this issue Jun 27, 2022 · 10 comments · Fixed by #88228 or #88301
Labels: :Delivery/Packaging, :ml, Team:Delivery, Team:ML, >test-failure

Comments

@grcevski
Contributor

grcevski commented Jun 27, 2022

CI Link

https://gradle-enterprise.elastic.co/s/2bezdlvc2ylgg/tests/:qa:os:destructiveDistroTest.default-deb/org.elasticsearch.packaging.test.DebMetadataTests/test05CheckLintian?top-execution=1

Repro line

[bash, -c, lintian /var/lib/jenkins/workspace/elastic+elasticsearch+pull-request+packaging-tests-unix-sample/PACKAGING_TASK/destructiveDistroTest.packages/os/ubuntu-20.04-packaging/distribution/packages/deb/build/distributions/elasticsearch-8.4.0-SNAPSHOT-amd64.deb]

Does it reproduce?

Didn't try

Applicable branches

master

Failure history

No response

Failure excerpt

I think this issue is similar to #87632, except this time lintian is complaining about libtorch_cpu.so and libstdc++.so.6.

@grcevski added the :Delivery/Packaging, >test-failure, :ml, Team:ML, and needs:triage labels Jun 27, 2022
@elasticmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticmachine added the Team:Delivery label Jun 27, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-delivery (Team:Delivery)

@mark-vieira
Contributor

@droberts195 is this a change to the included ML dependencies, or is this possibly another update to Lintian itself that just complains about more stuff?

@mark-vieira removed the needs:triage label Jun 27, 2022
@grcevski
Contributor Author

I muted the test for now, since a few CI jobs are failing.

@droberts195
Contributor

It must be a side effect of elastic/ml-cpp#2316. We'll investigate what changed.

@droberts195
Contributor

The 3 problems Lintian found (two errors and one warning) were:

E: elasticsearch: binary-or-shlib-defines-rpath usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/lib/libstdc++.so.6 '$ORIGIN'
E: elasticsearch: binary-or-shlib-defines-rpath usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/lib/libtorch_cpu.so '$ORIGIN'
W: elasticsearch: shared-lib-without-dependency-information usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/lib/libmkl_cdft_core.so
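
For anyone triaging similar output, here is a minimal sketch of splitting Lintian lines like the ones above into severity, package, tag and detail. The function name `parse_lintian_line` is made up for illustration, and it assumes the classic `X: package: tag detail` layout shown here rather than every format Lintian can emit.

```python
from collections import Counter


def parse_lintian_line(line):
    """Split one 'E: package: tag detail' line into its parts (hypothetical helper)."""
    # Only split on the first two colons so the detail text is left untouched.
    severity, package, rest = (part.strip() for part in line.split(":", 2))
    tag, _, detail = rest.partition(" ")
    return {"severity": severity, "package": package, "tag": tag, "detail": detail}


LIB_DIR = "usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/lib"
LINES = [
    f"E: elasticsearch: binary-or-shlib-defines-rpath {LIB_DIR}/libstdc++.so.6 '$ORIGIN'",
    f"E: elasticsearch: binary-or-shlib-defines-rpath {LIB_DIR}/libtorch_cpu.so '$ORIGIN'",
    f"W: elasticsearch: shared-lib-without-dependency-information {LIB_DIR}/libmkl_cdft_core.so",
]

findings = [parse_lintian_line(line) for line in LINES]
# Tally how many findings there are per severity letter (E = error, W = warning).
severity_counts = Counter(finding["severity"] for finding in findings)
```

Counting by severity makes it easy to see that the three findings are two errors and one warning.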

For the two about the RPATH being '$ORIGIN', it's not immediately clear what's wrong. However, comparing the libraries before and after elastic/ml-cpp#2316 shows the problem.

Before:

[dave@marple lib]$ readelf -a libtorch_cpu.so | grep 'R.*PATH'
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN]
[dave@marple lib]$ readelf -a libMlApi.so | grep 'R.*PATH'
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN]

Lintian didn't complain about either of these.

After:

[dave@marple lib]$ readelf -a libtorch_cpu.so | grep 'R.*PATH'
 0x000000000000000f (RPATH)              Library rpath: ['$ORIGIN']
[dave@marple lib]$ readelf -a libMlApi.so | grep 'R.*PATH'
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/]

Lintian complains about the first one but not the second one.

So it seems that the bug is that we're now putting $ORIGIN in quotes, which presumably ruins its intended effect.
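
A rough sketch of how the readelf comparison above could be automated: pull the RPATH/RUNPATH entries out of `readelf -d` style output and flag any entry wrapped in literal quote characters. The helper names `extract_rpaths` and `is_quoted` are hypothetical, and this is only an illustration of the check, not something that exists in the ml-cpp or Elasticsearch builds.

```python
import re

# Matches dynamic-section lines such as:
#  0x000000000000000f (RPATH)              Library rpath: [$ORIGIN]
_RPATH_LINE = re.compile(
    r"\((?:RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*)\]"
)


def extract_rpaths(readelf_output):
    """Return every rpath/runpath entry found in readelf output (hypothetical helper)."""
    entries = []
    for line in readelf_output.splitlines():
        match = _RPATH_LINE.search(line)
        if match:
            # Multiple search directories are separated by colons.
            entries.extend(match.group(1).split(":"))
    return entries


def is_quoted(entry):
    """True if the entry starts and ends with the same literal quote character."""
    return len(entry) >= 2 and entry[0] == entry[-1] and entry[0] in "'\""


BEFORE = " 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN]"
AFTER = " 0x000000000000000f (RPATH)              Library rpath: ['$ORIGIN']"

# Keep only the entries that carry stray quotes.
bad_before = [entry for entry in extract_rpaths(BEFORE) if is_quoted(entry)]
bad_after = [entry for entry in extract_rpaths(AFTER) if is_quoted(entry)]
```

With the two libtorch_cpu.so lines from this comment, only the "after" line yields a flagged entry.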

The problem with libmkl_cdft_core.so is not that anything about the library has changed. What's new is that we're shipping it at all, and we shouldn't be.

Before:

[dave@marple lib]$ ls -1 libmkl*
libmkl_avx.so
libmkl_avx2.so
libmkl_avx512.so
libmkl_avx512_mic.so
libmkl_core.so
libmkl_def.so
libmkl_gnu_thread.so
libmkl_intel_lp64.so
libmkl_mc3.so
libmkl_vml_avx.so
libmkl_vml_avx2.so
libmkl_vml_avx512.so
libmkl_vml_avx512_mic.so
libmkl_vml_cmpt.so
libmkl_vml_def.so
libmkl_vml_mc3.so

After:

[dave@marple lib]$ ls -1 libmkl*
libmkl_avx.so
libmkl_avx2.so
libmkl_avx512.so
libmkl_avx512_mic.so
libmkl_cdft_core.so
libmkl_core.so
libmkl_def.so
libmkl_gnu_thread.so
libmkl_intel_lp64.so
libmkl_mc3.so
libmkl_vml_avx.so
libmkl_vml_avx2.so
libmkl_vml_avx512.so
libmkl_vml_avx512_mic.so
libmkl_vml_cmpt.so
libmkl_vml_def.so
libmkl_vml_mc3.so
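
The difference between the two listings can also be computed rather than eyeballed. A small sketch, with the file names copied from the listings above and a made-up helper name `newly_shipped`:

```python
BEFORE = """
libmkl_avx.so libmkl_avx2.so libmkl_avx512.so libmkl_avx512_mic.so
libmkl_core.so libmkl_def.so libmkl_gnu_thread.so libmkl_intel_lp64.so
libmkl_mc3.so libmkl_vml_avx.so libmkl_vml_avx2.so libmkl_vml_avx512.so
libmkl_vml_avx512_mic.so libmkl_vml_cmpt.so libmkl_vml_def.so libmkl_vml_mc3.so
""".split()

# The "after" listing is the same set plus one extra library.
AFTER = BEFORE + ["libmkl_cdft_core.so"]


def newly_shipped(before, after):
    """Return files present in the new listing but absent from the old one."""
    return sorted(set(after) - set(before))


added = newly_shipped(BEFORE, AFTER)
print(added)  # ['libmkl_cdft_core.so']
```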

So there are two bugs to fix:

  1. Don't quote $ORIGIN when updating the RPATH of 3rd party libraries
  2. Don't ship libmkl_cdft_core.so

@droberts195
Contributor

Once elastic/ml-cpp#2337 is merged, the last step to close this will be to unmute the test.

@mark-vieira
Contributor

Thanks for the investigation @droberts195!

edsavage added a commit to edsavage/elasticsearch that referenced this issue Jul 4, 2022
droberts195 added a commit that referenced this issue Jul 4, 2022
The fix for the problems identified in #88090 is in
elastic/ml-cpp#2337, so the test can be unmuted now.

Closes #88090
@droberts195
Contributor

droberts195 commented Jul 4, 2022

There's a new Lintian warning showing up now instead of the original 3 problems:

W: elasticsearch: hardening-no-pie [usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/autodetect

I've re-muted the test while we work on this.
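
For context, the hardening-no-pie tag means the executable was not built as a position-independent executable. One rough way to tell the two apart is the `e_type` field of the ELF header: a traditional non-PIE executable is `ET_EXEC` (2), while a PIE is `ET_DYN` (3). The sketch below is only an illustration under that assumption; it is not how Lintian implements the check, and the helper names are made up.

```python
import struct

ET_EXEC = 2  # fixed-address executable (non-PIE)
ET_DYN = 3   # shared object or position-independent executable


def elf_type(header):
    """Read e_type from the first bytes of an ELF file (hypothetical helper)."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # EI_DATA (byte 5) is 1 for little-endian and 2 for big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_type is the 16-bit field immediately after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return e_type


def looks_non_pie(header):
    """True if the header describes a fixed-address (ET_EXEC) executable."""
    return elf_type(header) == ET_EXEC
```

To try it on a real binary, pass the first 18 bytes of the file, for example `looks_non_pie(open(path, "rb").read(18))`.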

Failure links:

  1. https://gradle-enterprise.elastic.co/s/bz2l3jghrdba4/tests/:qa:os:destructiveDistroTest.default-deb/org.elasticsearch.packaging.test.DebMetadataTests/test05CheckLintian?top-execution=1
  2. https://gradle-enterprise.elastic.co/s/qhcumlpvkjrag/tests/:qa:os:destructiveDistroTest.default-deb/org.elasticsearch.packaging.test.DebMetadataTests/test05CheckLintian?top-execution=1

@droberts195 droberts195 reopened this Jul 4, 2022
droberts195 added a commit that referenced this issue Jul 6, 2022
The most recent problem should be fixed by elastic/ml-cpp#2346

Fixes #88090
Fixes #88252
@droberts195
Contributor

The second fix was in elastic/ml-cpp#2346
