BUG: container kaniko exit 1 Substra-backend crash #809

Open
SantiagoMoreno-UdeA opened this issue Jan 30, 2024 · 3 comments

Comments

@SantiagoMoreno-UdeA

What are you trying to do?

Deploy Substra locally by following the steps in: https://docs.substra.org/en/stable/how-to/developing-substra/local-deployment.html

Issue Description (what is happening?)

skaffold crashes with a permissions error when run without sudo.
I then ran all the steps with sudo, but when I try to launch substra-backend (sudo skaffold run) the process crashes. error_SubstraBackend.txt

I then tried "sudo skaffold run" again, and this was the result: error_SubstraBackend_secondTry.txt

CPU: Intel Core i7-1355U
RAM: 16GB
OS: Ubuntu 22.04.3 LTS 64-bit

Expected Behavior (what should happen?)

The Substra backend service should launch.

Reproducible Example

No response

Operating system

Ubuntu 22.04

Python version

3.10.112

Installed Substra versions

substra==0.49.0
substrafl==0.42.0
substratools==0.21.0

Installed versions of dependencies

helm == v3.14.0
skaffold == v2.1.0
numpy == 1.24.3
pytorch == 2.0.1+cu117

Logs / Stacktrace

error_SubstraBackend.txt
error_SubstraBackend_secondTry.txt

@guilhem-barthes
Contributor

guilhem-barthes commented Jan 30, 2024

Hi there,

In both logs we can see the line `error building image: Get "https://ghcr.io/v2/": dial tcp: lookup ghcr.io on 10.43.0.10:53: server misbehaving`. A "server misbehaving" error on port 53 usually points to a misconfigured DNS resolver. Judging by the IP, it looks like you are not using an external one. Perhaps you could try changing your DNS resolver to an external provider (dns0.eu, Google DNS, Cloudflare DNS).
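To make the reading of that log line concrete, here is a minimal sketch that pulls the looked-up host and the resolver address out of the error and reports whether the resolver sits in a private range. The function name `parse_lookup_error` and the regular expression are illustrative assumptions, not part of Substra, skaffold, or kaniko, and the pattern only handles IPv4 resolver addresses.

```python
import ipaddress
import re

# Shape of the Go resolver error quoted above (IPv4 resolver addresses only).
LOOKUP_ERROR = re.compile(
    r"lookup (?P<host>\S+) on (?P<server>\d+(?:\.\d+){3}):(?P<port>\d+): (?P<reason>.+)$"
)


def parse_lookup_error(line):
    """Return the details of a failed DNS lookup, or None if the line has none."""
    match = LOOKUP_ERROR.search(line.strip())
    if match is None:
        return None
    server = ipaddress.ip_address(match["server"])
    return {
        "host": match["host"],
        "server": str(server),
        "port": int(match["port"]),
        "reason": match["reason"],
        # True for private ranges such as 10.0.0.0/8, i.e. not a public resolver.
        "private_resolver": server.is_private,
    }


LINE = ('error building image: Get "https://ghcr.io/v2/": dial tcp: '
        'lookup ghcr.io on 10.43.0.10:53: server misbehaving')
print(parse_lookup_error(LINE))
```

For the line from the logs, `private_resolver` is true: 10.43.0.10 is in a private range. It resembles the default in-cluster DNS address of a k3s setup, though that is an assumption about this particular deployment.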

@SantiagoMoreno-UdeA
Author

SantiagoMoreno-UdeA commented Feb 1, 2024

Hi @guilhem-barthes!

I changed the DNS resolver and added Google DNS. But now I'm having trouble with the previous step: when I run "sudo skaffold run" for the orchestrator, the process waits for the Helm release manager installation and then crashes.

Logs for the services launcher: logs_K3SLaunch.txt. I do not see anything unusual in them.

There is not much information; the skaffold process just stops: error_SubstraOrchestator.txt

Second try for the skaffold orchestrator:
error_SubstraOrchestator_secondTry.txt

Sorry if this is something trivial; I'm a rookie at handling network deployment and Substra.

@guilhem-barthes
Contributor

Hey @SantiagoMoreno-UdeA!

No worries about the questions. For the new problem, I think you should check whether `kubectl get pods -n ingress-nginx --selector=app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx` returns a list containing one pod with STATUS Running. Is there a specific reason why you run as sudo?
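If it helps, the check on the kubectl output can be scripted. The sketch below assumes the default table output of `kubectl get pods` (a header row with NAME and STATUS columns); `running_pods` and `fetch_controller_pods` are hypothetical helper names, and the sample pod name is made up for illustration.

```python
import subprocess

SELECTOR = (
    "app.kubernetes.io/component=controller,"
    "app.kubernetes.io/instance=ingress-nginx,"
    "app.kubernetes.io/name=ingress-nginx"
)


def fetch_controller_pods():
    """Run the kubectl command from the comment above and return its stdout."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", "ingress-nginx", f"--selector={SELECTOR}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def running_pods(kubectl_output):
    """Return the names of pods whose STATUS column reads Running."""
    lines = [line for line in kubectl_output.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    if "NAME" not in header or "STATUS" not in header:
        return []  # not the table format this sketch expects
    name_col = header.index("NAME")
    status_col = header.index("STATUS")
    running = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) > status_col and fields[status_col] == "Running":
            running.append(fields[name_col])
    return running


# Hypothetical sample output, for illustration only.
SAMPLE = """\
NAME                                        READY   STATUS    RESTARTS   AGE
ingress-nginx-controller-6d5f55986b-abcde   1/1     Running   0          5m
"""
print(running_pods(SAMPLE))
```

On a live cluster, `len(running_pods(fetch_controller_pods())) == 1` would correspond to the healthy case described above.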

One of our colleagues had a similar problem on the latest macOS Docker version (v4.27.1). It was fixed by going to Docker Desktop > Settings > Resources > Network and unticking "Use kernel networking for UDP".
I don't think this setting exists in the Linux version though, but you could check whether port 53 is already bound to another process: `sudo netstat -pna | grep 53`
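A plain `grep 53` also matches unrelated lines, such as port 53212 or a PID containing 53. As a rough sketch, the netstat output could be filtered on the local port instead. `sockets_on_port` is a hypothetical helper, it assumes the column layout of `netstat -pna`, and the sample lines are invented for illustration.

```python
def sockets_on_port(netstat_output, port=53):
    """Return (proto, local_address, owner) for sockets whose local port matches."""
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only tcp/udp rows (tcp, tcp6, udp, udp6); skip headers and unix sockets.
        if len(fields) < 5 or not fields[0].startswith(("tcp", "udp")):
            continue
        local_address = fields[3]
        local_port = local_address.rsplit(":", 1)[-1]
        if local_port == str(port):
            # With -p the last column is PID/Program name ("-" when not run as root).
            matches.append((fields[0], local_address, fields[-1]))
    return matches


# Hypothetical netstat output, for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      812/systemd-resolve
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      953/sshd
udp        0      0 127.0.0.53:53           0.0.0.0:*                           812/systemd-resolve
tcp        0      0 192.168.1.5:53212       140.82.112.3:443        ESTABLISHED 5301/firefox
"""
for proto, address, owner in sockets_on_port(SAMPLE):
    print(proto, address, owner)
```

On Ubuntu 22.04, systemd-resolved normally listens on 127.0.0.53:53, so an entry like that one is expected and not necessarily a conflict.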
