[SPARK-27178][k8s] add nss to the spark/k8s Dockerfile #24111

Closed

shaneknapp
Contributor

@shaneknapp shaneknapp commented Mar 15, 2019

What changes were proposed in this pull request?

while performing some tests on our existing minikube and k8s infrastructure, i noticed that the integration tests were failing. i dug in and discovered the following message buried at the end of the stacktrace:

  Caused by: java.io.FileNotFoundException: /usr/lib/libnss3.so
  	at sun.security.pkcs11.Secmod.initialize(Secmod.java:193)
  	at sun.security.pkcs11.SunPKCS11.<init>(SunPKCS11.java:218)
  	... 81 more

after i added the nss package to resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile, everything worked.
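
a minimal sketch of a guard for this failure, assuming a POSIX shell inside the image. `require_nss_lib` and its optional root argument are hypothetical names used for illustration (they are not in the Spark Dockerfile); the path is the one from the stacktrace above.

```shell
#!/bin/sh
# Hypothetical helper: fail loudly if the NSS shared library that the
# JDK's SunPKCS11 provider tries to load is missing.
# $1 is an optional filesystem root (an assumption, added so the check
# can be exercised outside a container); it defaults to "/".
require_nss_lib() {
  root="${1:-}"
  lib="${root%/}/usr/lib/libnss3.so"
  if [ -e "$lib" ]; then
    echo "found $lib"
  else
    echo "missing $lib (is the nss package installed?)" >&2
    return 1
  fi
}
```

the same check can be written inline as `test -e /usr/lib/libnss3.so` at the end of the `RUN` step that installs packages, so a missing library breaks the image build instead of surfacing deep in an integration-test stacktrace.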

this is also impacting current builds. see: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/8959/console

How was this patch tested?

i tested locally before pushing, and the build system will test the rest.

@SparkQA

SparkQA commented Mar 15, 2019

Test build #103556 has started for PR 24111 at commit 48e7a0c.

@SparkQA

SparkQA commented Mar 15, 2019

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/8967/

@shaneknapp
Contributor Author

test this please

@SparkQA

SparkQA commented Mar 15, 2019

Test build #103558 has finished for PR 24111 at commit 48e7a0c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@erikerlandson
Contributor

Was this introduced by some other dependency version bump?

@SparkQA

SparkQA commented Mar 15, 2019

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/8969/

@shaneknapp
Contributor Author

Was this introduced by some other dependency version bump?

the only other PR merged that touched k8s after the client bump on wednesday was this one:
#23380

...and that doesn't touch anything i care about. :\

since i don't want to ruin my weekend, i will investigate further next week.

@felixcheung
Member

sun.security - is it running with a new JDK?

Member

@srowen srowen left a comment

Seems OK to me, but I don't know much about this.

@shaneknapp
Contributor Author

sun.security - is it running with a new JDK?

i dunno... it's in the docker container, which is pulling upstream from alpine linux's dockerhub repo.

@markhamstra
Contributor

Seems that there is an argument to be made that the Dockerfile should lock down a much more specific version of Alpine and JDK. Using what is essentially a floating tag leaves us guessing any time these dependencies change.
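
As a rough illustration of what "locking down" could look like in practice, the sketch below lists the base images in a Dockerfile that are referenced by tag rather than by an immutable `@sha256:` digest. The function names are hypothetical, and the parsing is deliberately simple: it skips `--platform`-style flags but does not resolve `ARG` substitutions, and it will also report `scratch` or a named build stage.

```shell
#!/bin/sh
# Succeeds only when the image reference carries a content digest.
is_digest_pinned() {
  case "$1" in
    *@sha256:*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print each FROM image in the given Dockerfile that is not digest-pinned.
unpinned_base_images() {
  awk 'toupper($1) == "FROM" {
         # The image is the first argument that is not a --flag.
         for (i = 2; i <= NF; i++)
           if ($i !~ /^--/) { print $i; break }
       }' "$1" |
  while read -r ref; do
    is_digest_pinned "$ref" || echo "$ref"
  done
}
```

Running `unpinned_base_images` against a Dockerfile prints one floating reference per line, so an empty result would mean every base image is pinned by digest.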

@shaneknapp
Contributor Author

Seems that there is an argument to be made that the Dockerfile should lock down a much more specific version of Alpine and JDK. Using what is essentially a floating tag leaves us guessing any time these dependencies change.

yep, i agree completely. it's not at all where we want to be.

three things i will do tomorrow:

  1. open a jira to discuss locking any dockerfiles to a specific version (maybe host our own?)

  2. open a new PR for 2.4.x w/this fix

  3. look into any potential dep changes in the image

sorry, four things:

  4. pick up a bottle of bourbon to help with (3)

:)

@attilapiros
Contributor

ping @vanzin

@vanzin
Contributor

vanzin commented Mar 18, 2019

Actually I wonder if this has been fixed already upstream. Since both the jdk and the base libs come from the base image, it would be a bug in the base image if this library were missing.

With the latest openjdk:8-alpine:

$ docker run -it openjdk:8-alpine
/ # java -version
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (IcedTea 3.10.0) (Alpine 8.191.12-r0)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
/ # ls -l /usr/lib/libnss3.so
lrwxrwxrwx    1 root     root            13 Mar  8 02:13 /usr/lib/libnss3.so -> libnss3.so.41

Not sure if the IT scripts are forcing the image to refresh or use a cached version, though...
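
One way to rule out a stale local cache, sketched under assumptions: pull the tag first, then run the same two checks non-interactively. `inspect_base_image` and the `DOCKER` override variable are hypothetical (the override exists only so the function can be tried against a stub); `docker pull` and `docker run --rm` are the standard CLI commands.

```shell
#!/bin/sh
# Hypothetical helper: refresh an image tag, then print the JDK version
# and the libnss3.so entry from a throwaway container.
# DOCKER (an assumption) lets a caller substitute another binary or a stub.
inspect_base_image() {
  image="$1"
  docker_bin="${DOCKER:-docker}"
  # Pull first so the result reflects the registry, not a cached copy.
  "$docker_bin" pull "$image" >/dev/null || return 1
  # java -version writes to stderr, so fold it into stdout.
  "$docker_bin" run --rm "$image" \
    sh -c 'java -version 2>&1; ls -l /usr/lib/libnss3.so'
}
```

For example, `inspect_base_image openjdk:8-alpine` would repeat the session above against a freshly pulled image.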

@shaneknapp
Contributor Author

shaneknapp commented Mar 18, 2019

Actually I wonder if this has been fixed already upstream. Since both the jdk and the base libs come from the base image, it would be a bug in the base image if this library were missing.

With the latest openjdk:8-alpine:

$ docker run -it openjdk:8-alpine
/ # java -version
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (IcedTea 3.10.0) (Alpine 8.191.12-r0)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
/ # ls -l /usr/lib/libnss3.so
lrwxrwxrwx    1 root     root            13 Mar  8 02:13 /usr/lib/libnss3.so -> libnss3.so.41

Not sure if the IT scripts are forcing the image to refresh or use a cached version, though...

hmm, lemme remove the nss package install and re-run the k8s integration tests locally and see what happens.

@shaneknapp
Contributor Author

ok... so the last time this image on dockerhub was updated was march 1st, which doesn't fit the timing of these failures. i'm still mildly confused.

i'm also wondering if this is something that's minikube-specific... since we're using the convoluted path of minikube -> kvm2 -> docker-machine, things may not be behaving as expected.

also, i re-ran the integration tests w/nss removed from apk add and the jobs fail "as expected"; they pass w/nss added to the Dockerfile.

@vanzin
Contributor

vanzin commented Mar 18, 2019

that's weird because "apk info nss" on the bare image shows that it is installed, so this change should be a no-op.

$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
openjdk             8-alpine            e9ea51023687        10 days ago         105MB

@vanzin
Contributor

vanzin commented Mar 18, 2019

Ah, I think this is it:

 ---> Running in 23e86476ab1c
+ apk upgrade --no-cache
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/community/x86_64/APKINDEX.tar.gz
(1/7) Upgrading libcrypto1.1 (1.1.1a-r1 -> 1.1.1b-r1)
(2/7) Upgrading libssl1.1 (1.1.1a-r1 -> 1.1.1b-r1)
(3/7) Upgrading openjdk8-jre-base (8.191.12-r0 -> 8.201.08-r0)
(4/7) Upgrading openjdk8-jre (8.191.12-r0 -> 8.201.08-r0)
(5/7) Purging nss (3.41-r0)

Now why the upgrade is purging the nss package, I have no idea... seems like a bug in some alpine package (probably the updated openjdk8-jre 8.201.08-r0), but this seems ok as a workaround.
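
A build could surface this kind of silent removal by scanning the `apk upgrade` output for `Purging` lines. The sketch below assumes the output keeps the `(n/m) Purging <package> (<version>)` shape seen in the log above; `purged_packages` is a hypothetical helper name.

```shell
#!/bin/sh
# Read apk output on stdin and print the name of every purged package.
purged_packages() {
  awk '$1 ~ /^\([0-9]+\/[0-9]+\)$/ && $2 == "Purging" { print $3 }'
}
```

For example, `apk upgrade --no-cache | purged_packages` prints one package name per line, which a build script could compare against the packages it depends on.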

@shaneknapp
Contributor Author

ok cool. i'll test this one more time and then merge it to master. 2.4.1 will be its own PR.

@shaneknapp
Contributor Author

test this please

@SparkQA

SparkQA commented Mar 18, 2019

Test build #103633 has finished for PR 24111 at commit 48e7a0c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 18, 2019

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/9034/

@shaneknapp
Contributor Author

alright, merging to master.

@shaneknapp
Contributor Author

actually i'm going to wait for a couple of hours just to be sure my access to repos is properly synced.

vanzin pushed a commit that referenced this pull request Mar 18, 2019
## What changes were proposed in this pull request?

see also:  #24111

while performing some tests on our existing minikube and k8s infrastructure, i noticed that the integration tests were failing. i dug in and discovered the following message buried at the end of the stacktrace:

```
  Caused by: java.io.FileNotFoundException: /usr/lib/libnss3.so
  	at sun.security.pkcs11.Secmod.initialize(Secmod.java:193)
  	at sun.security.pkcs11.SunPKCS11.<init>(SunPKCS11.java:218)
  	... 81 more
```
after i added the `nss` package to `resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile`, everything worked.

this is also impacting current builds.  see:  https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/8959/console

## How was this patch tested?

i tested locally before pushing, and the build system will test the rest.

Closes #24137 from shaneknapp/add-nss-package.

Authored-by: shane knapp <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>
markhamstra pushed a commit to markhamstra/spark that referenced this pull request Mar 21, 2019
@shaneknapp shaneknapp deleted the add-nss-package-to-dockerfile branch April 19, 2019 17:35
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Jul 23, 2019
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Jul 25, 2019
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Aug 1, 2019
zhongjinhan pushed a commit to zhongjinhan/spark-1 that referenced this pull request Sep 3, 2019
(cherry picked from commit 342e91f)