TFJob failed to run behind proxy with IOError: Not a gzipped file
#182
Comments
Thank you for reporting your feedback to us! The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-6042.
So I have inspected the TFJob image and the code it uses. First of all, there is a higher tag for that image, but it is the same code with Python 2.7, which still does not work. So I dug into the code and the library, and basically the important code, the part that downloads the dataset to the image and causes the problem, is this piece of code.
So I created a sleeping image behind the proxy with Python 2.7, SSHed in, and reran just that code with the proxy env variables set. To compare, I ran a simple curl from the same pod to fetch the same file. You won't believe it, but both the Python code and the curl succeed, yet the files are different sizes. The Python code gets just 3 KB of data.
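A quick way to tell which of the two downloaded files is a real archive is to check for the gzip magic number (the bytes 0x1f 0x8b at the start of the file). The sketch below is only illustrative: the helper names `looks_like_gzip` and `describe_download` are hypothetical and not part of the TFJob image or the library discussed above.

```python
import os

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def describe_download(path):
    """Report the size in bytes and whether the file looks like gzip."""
    return {"size": os.path.getsize(path), "gzip": looks_like_gzip(path)}
```

Running `describe_download` on both the file fetched by the Python code and the one fetched by curl would show the size difference and whether the smaller file would trigger "Not a gzipped file" when opened.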
To overcome this problem, the TFJob command now curls all the datasets into the image first, before running the code, so the code does not use urlretrieve_with_retry.
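The pre-fetch step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual fix from the linked PR: the function names, the `data_dir` layout, and the dataset URLs are hypothetical, and it assumes the training code skips its own download when the files are already present, as the comment above implies. curl picks up the proxy environment variables on its own.

```python
import os
import subprocess


def curl_command(url, dest):
    """Build a curl invocation that writes `url` to `dest`.

    -f fails on HTTP errors, -sS stays quiet but still prints errors,
    -L follows redirects, -o sets the output file.
    """
    return ["curl", "-fsSL", "-o", dest, url]


def prefetch(urls, data_dir, run=subprocess.check_call):
    """Download each URL into data_dir with curl unless the file exists.

    `run` is injectable so the function can be exercised without network
    access; by default it raises if curl exits with a non-zero status.
    Returns the list of paths that were actually fetched.
    """
    if not os.path.isdir(data_dir):
        os.makedirs(data_dir)
    fetched = []
    for url in urls:
        dest = os.path.join(data_dir, url.rsplit("/", 1)[-1])
        if not os.path.exists(dest):
            run(curl_command(url, dest))
            fetched.append(dest)
    return fetched
```

Calling `prefetch` with the dataset URLs and the directory the training script reads from, before starting the training script, mirrors the "curl first, then run the code" order described above.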
Fixed here: canonical/charmed-kubeflow-uats#105
Bug Description
The training operator UATs failed in a CKF deployment behind a proxy, with the TFJob in Failed status.
To Reproduce
Environment
microk8s 1.29-strict/stable
juju 3.4.4
Relevant Log Output
Additional Context
No response