blob.download_to_filename fails with google-cloud-storage==1.3.0 #3736

Closed
sonlac opened this issue Aug 7, 2017 · 30 comments
Labels
api: storage Issues related to the Cloud Storage API. status: investigating The issue is under investigation, which is determined to be non-trivial.

Comments

@sonlac
Contributor

sonlac commented Aug 7, 2017

Because of the significant implementation changes in google-cloud-storage 1.3.0, blob.download_to_filename fails with every google-cloud release from 0.24.0 to 0.27.0 (the latest).

Our production server, which uses google-cloud==0.24.0, has been broken since last Saturday. I analyzed the error a bit: the problem comes from the upgraded package google-cloud-storage==1.3.0. We had been using google-cloud-storage==1.2.0.

I ran several tests, which revealed the following problems:

Problem 1: blob.download_to_filename fails when using google-cloud==0.27.0 (which installs google-cloud-storage==1.3.0 by default, as defined in setup.py). Here is part of the stack trace:

blob.download_to_filename(blob_local_path)
  File "/root/.local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 482, in download_to_filename
    self.download_to_file(file_obj, client=client)
  File "/root/.local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 464, in download_to_file
    self._do_download(transport, file_obj, download_url, headers)
  File "/root/.local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 418, in _do_download
    download.consume(transport)
  File "/root/.local/lib/python2.7/site-packages/google/resumable_media/requests/download.py", line 101, in consume
    self._write_to_stream(result)
  File "/root/.local/lib/python2.7/site-packages/google/resumable_media/requests/download.py", line 62, in _write_to_stream
    with response:
AttributeError: __exit__

Problem 2: blob.download_to_filename fails when using google-cloud==0.24.0 (which also installs google-cloud-storage==1.3.0 by default, as defined in setup.py). Here is part of the stack trace:

File "dev_env/python_venv/local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 482, in download_to_filename
    self.download_to_file(file_obj, client=client)
  File "dev_env/python_venv/local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 464, in download_to_file
    self._do_download(transport, file_obj, download_url, headers)
  File "dev_env/python_venv/local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 418, in _do_download
    download.consume(transport)
  File "dev_env/python_venv/local/lib/python2.7/site-packages/google/resumable_media/requests/download.py", line 96, in consume
    transport, method, url, **request_kwargs)
  File "dev_env/python_venv/local/lib/python2.7/site-packages/google/resumable_media/requests/_helpers.py", line 101, in http_request
    func, RequestsMixin._get_status_code, retry_strategy)
  File "dev_env/python_venv/local/lib/python2.7/site-packages/google/resumable_media/_helpers.py", line 146, in wait_and_retry
    response = func()
  File "dev_env/python_venv/local/lib/python2.7/site-packages/google_auth_httplib2.py", line 198, in request
    uri, method, body=body, headers=request_headers, **kwargs)
TypeError: request() got an unexpected keyword argument 'data'
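
The last two frames suggest a signature mismatch: the resumable-media helper passes a requests-style `data=` keyword, while the transport in use forwards it to an httplib2-style `request()` that expects `body=`. A minimal sketch of that mismatch follows. The classes `Httplib2StyleTransport` and `RequestsStyleTransport` and the helper `send` are hypothetical stand-ins for illustration, not library APIs.

```python
class Httplib2StyleTransport:
    """Hypothetical stand-in: takes the payload as `body`, like httplib2."""

    def request(self, uri, method="GET", body=None, headers=None):
        return ("httplib2-style", method, uri, body)


class RequestsStyleTransport:
    """Hypothetical stand-in: takes the payload as `data`, like requests."""

    def request(self, method, url, data=None, headers=None):
        return ("requests-style", method, url, data)


def send(transport, method, url, payload=None):
    """Call the transport the way a requests-oriented helper would."""
    return transport.request(method, url, data=payload, headers={})


# Works: this transport understands the `data` keyword.
ok = send(RequestsStyleTransport(), "GET", "https://example.invalid/object")

# Fails: the httplib2-style signature has no `data` parameter.
try:
    send(Httplib2StyleTransport(), "GET", "https://example.invalid/object")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If this is what is happening, the fix is to make sure the client ends up with a requests-based transport rather than to change the call site.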

So why did this happen? I traced it to the REQUIREMENTS of the google-cloud package, which are defined in setup.py. For example, for google-cloud==0.27.0: https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/setup.py#L67

If we only pin google-cloud==[0.24.0 to 0.27.0], "pip install" always tries to install google-cloud-storage 1.3.0, which requires google-cloud-core ~= 0.26. That's why the bug happens.

Solution:

  • Downgrade to google-cloud < 0.27.0 and force the requirement google-cloud-storage < 1.3.0
  • The problem still seems to be there with google-cloud==0.27.0 and google-cloud-storage==1.3.0 (to be checked, but at least it is broken in my case; see Problem 1 above)

Suggestion: better manage the deps for google-cloud in setup.py.

Thanks.

@dhermes
Contributor

dhermes commented Aug 7, 2017

@sonlac It looks like you're using a custom transport. When constructing Client, are you passing a custom _http argument? (Note that the leading underscore in _http is a "user-beware" signal.)

For reference, see the release notes:

@dhermes dhermes added the api: storage Issues related to the Cloud Storage API. label Aug 7, 2017
@dhermes dhermes self-assigned this Aug 7, 2017
@sonlac
Contributor Author

sonlac commented Aug 7, 2017

@dhermes Thank you for your quick reply. No, I didn't pass any custom argument. I am testing with the latest code, and all arguments are the defaults.
Could this change https://github.com/GoogleCloudPlatform/google-cloud-python/pull/3705/files have broken google-cloud-storage 1.3.0?
For my part, I am trying to get the logs for more details...

@dhermes
Contributor

dhermes commented Aug 7, 2017

Before making a request, can you print client._http?

@dhermes
Contributor

dhermes commented Aug 7, 2017

RE: #3705, notice the code in _make_transport and compare it to the _http property. You'll see they are the same. Also, using the default _http, our system tests are passing.

@lukesneeringer
Contributor

lukesneeringer commented Aug 7, 2017

@sonlac Can you give us the output of pip freeze? In particular, I would like to see what version of requests you have installed. I am thinking our lower bound might be too low.

If it is a semi-old one, can you also try updating requests to the newest version and tell us if that fixes your issue?

@lukesneeringer lukesneeringer added status: acknowledged status: investigating The issue is under investigation, which is determined to be non-trivial. and removed status: acknowledged labels Aug 7, 2017
@sonlac
Contributor Author

sonlac commented Aug 7, 2017

Thank you @dhermes @lukesneeringer for the comments. So Problem 1 is about "user-beware"; it was not a bug. For Problem 2, I could not reproduce it in a context "out-of-box". I've just tested a simple script using download_to_filename, and it worked with the latest versions: google-cloud 0.27.0 and google-cloud-storage 1.3.0. In fact, I ran the failing code inside ML Engine, so it will take time to get more logs... I'll get back to you with more information.

pip freeze

cachetools==2.0.0
certifi==2017.7.27.1
chardet==3.0.4
dill==0.2.7.1
enum34==1.1.6
future==0.16.0
futures==3.1.1
gapic-google-cloud-datastore-v1==0.15.3
gapic-google-cloud-error-reporting-v1beta1==0.15.3
gapic-google-cloud-logging-v2==0.91.3
gapic-google-cloud-pubsub-v1==0.15.4
gapic-google-cloud-spanner-admin-database-v1==0.15.3
gapic-google-cloud-spanner-admin-instance-v1==0.15.3
gapic-google-cloud-spanner-v1==0.15.3
google-auth==1.0.2
google-cloud==0.27.0
google-cloud-bigquery==0.26.0
google-cloud-bigtable==0.26.0
google-cloud-core==0.26.0
google-cloud-datastore==1.2.0
google-cloud-dns==0.26.0
google-cloud-error-reporting==0.26.0
google-cloud-language==0.27.0
google-cloud-logging==1.2.0
google-cloud-monitoring==0.26.0
google-cloud-pubsub==0.27.0
google-cloud-resource-manager==0.26.0
google-cloud-runtimeconfig==0.26.0
google-cloud-spanner==0.26.0
google-cloud-speech==0.28.0
google-cloud-storage==1.3.0
google-cloud-translate==1.1.0
google-cloud-videointelligence==0.25.0
google-cloud-vision==0.26.0
google-gax==0.15.13
google-resumable-media==0.2.2
googleapis-common-protos==1.5.2
grpc-google-iam-v1==0.11.1
grpcio==1.4.0
httplib2==0.10.3
idna==2.5
monotonic==1.3
oauth2client==3.0.0
ply==3.8
proto-google-cloud-datastore-v1==0.90.4
proto-google-cloud-error-reporting-v1beta1==0.15.3
proto-google-cloud-logging-v2==0.91.3
proto-google-cloud-pubsub-v1==0.15.4
proto-google-cloud-spanner-admin-database-v1==0.15.3
proto-google-cloud-spanner-admin-instance-v1==0.15.3
proto-google-cloud-spanner-v1==0.15.3
protobuf==3.3.0
pyasn1==0.3.2
pyasn1-modules==0.0.11
requests==2.18.3
rsa==3.4.2
six==1.10.0
tenacity==4.4.0
urllib3==1.22

@dhermes
Contributor

dhermes commented Aug 7, 2017

@sonlac I am going to preemptively close this based on

I could not reproduce it in a context "out-of-box"

We're happy to keep discussing and re-open if there does turn out to be some issue.

@guy-shahine

guy-shahine commented Aug 14, 2017

Hey guys, I faced the same issue here. The code ran fine on my local Mac and on an Ubuntu VM in Compute Engine, but failed on a Debian VM in Compute Engine that was deployed through Salt.

pip freeze showed requests==2.7.0 on the Debian VM, whereas it is requests==2.18.3 on my machine. So I ran pip install --upgrade requests, and the AttributeError: __exit__ exception that broke the flow is gone.
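
This matches the traceback: `with response:` only works if the response object implements the context-manager protocol, which `requests.Response` gained in 2.18.0. Below is a minimal sketch of the failure mode and of the `contextlib.closing` wrapper that code targeting older versions can use. `LegacyResponse`, `consume` and `consume_compat` are hypothetical names for illustration; `LegacyResponse` is not the real `requests` class.

```python
import contextlib


class LegacyResponse:
    """Hypothetical stand-in for a response with close() but no __enter__/__exit__."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def consume(response):
    """Use the response in a bare `with`, as the failing line in the traceback does."""
    with response:
        return "consumed"


def consume_compat(response):
    """Fall back to contextlib.closing when the object is not a context manager."""
    cls = type(response)
    is_context_manager = hasattr(cls, "__enter__") and hasattr(cls, "__exit__")
    manager = response if is_context_manager else contextlib.closing(response)
    with manager:
        return "consumed"


legacy = LegacyResponse()
try:
    consume(legacy)
    bare_with_failed = False
except (AttributeError, TypeError):
    # AttributeError on older interpreters, TypeError on Python 3.11+.
    bare_with_failed = True

result = consume_compat(legacy)
print(bare_with_failed, result, legacy.closed)
```

Upgrading requests remains the simpler fix; the wrapper only illustrates why the older version fails.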

@dhermes
Contributor

dhermes commented Aug 14, 2017

@lukesneeringer @jonparrott We should probably do a release of storage that enforces that lower bound (code is already in master)?

@theacodes
Contributor

Seems reasonable

dhermes added a commit to dhermes/google-cloud-python that referenced this issue Aug 14, 2017
Did this to update the lower bound on the `requests` dependency
(see googleapis#3736).
dhermes added a commit to dhermes/google-cloud-python that referenced this issue Aug 15, 2017
Did this to update the lower bound on the `requests` dependency
(see googleapis#3736).
dhermes added a commit that referenced this issue Aug 15, 2017
* Update storage to 1.3.2.

Did this to update the lower bound on the `requests` dependency
(see #3736).

* Updating minimum bound on `requests` in storage.
@ashwathnrajan

ashwathnrajan commented Aug 23, 2017

Hello, I'm having the same problem here. It's really easy for me to reproduce, and I'm using a Google Cloud Compute VM running Ubuntu 16.04. I don't think it's just a cloud-storage/requests versioning issue, as I have the latest version of both installed, as well as the latest google-cloud.

I can reproduce the bug really simply.

from google.cloud import storage

client = storage.Client()
# Replace the quoted placeholders with real values.
bucket = client.get_bucket("<STORAGE_BUCKET>")
blob = bucket.get_blob("<PATH_TO_VIDEO>")
blob.download_to_filename("<LOCAL_PATH_TO_VIDEO>")

will raise

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/google/cloud/storage/blob.py", line 482, in download_to_filename
    self.download_to_file(file_obj, client=client)
  File "/usr/local/lib/python3.5/dist-packages/google/cloud/storage/blob.py", line 464, in download_to_file
    self._do_download(transport, file_obj, download_url, headers)
  File "/usr/local/lib/python3.5/dist-packages/google/cloud/storage/blob.py", line 418, in _do_download
    download.consume(transport)
  File "/usr/local/lib/python3.5/dist-packages/google/resumable_media/requests/download.py", line 101, in consume
    self._write_to_stream(result)
  File "/usr/local/lib/python3.5/dist-packages/google/resumable_media/requests/download.py", line 62, in _write_to_stream
    with response:
AttributeError: __exit__

Finally, here is my pip freeze (pip3 freeze):
bleach==1.5.0
blinker==1.3
boto==2.38.0
cachetools==2.0.1
certifi==2017.7.27.1
chardet==2.3.0
cloud-init==0.7.9
command-not-found==0.3
configobj==5.0.6
cryptography==1.2.3
dill==0.2.7.1
future==0.16.0
gapic-google-cloud-datastore-v1==0.15.3
gapic-google-cloud-error-reporting-v1beta1==0.15.3
gapic-google-cloud-logging-v2==0.91.3
gapic-google-cloud-pubsub-v1==0.15.4
gapic-google-cloud-spanner-admin-database-v1==0.15.3
gapic-google-cloud-spanner-admin-instance-v1==0.15.3
gapic-google-cloud-spanner-v1==0.15.3
google-auth==1.0.2
google-cloud==0.27.0
google-cloud-bigquery==0.26.0
google-cloud-bigtable==0.26.0
google-cloud-core==0.26.0
google-cloud-datastore==1.2.0
google-cloud-dns==0.26.0
google-cloud-error-reporting==0.26.0
google-cloud-language==0.27.0
google-cloud-logging==1.2.0
google-cloud-monitoring==0.26.0
google-cloud-pubsub==0.27.0
google-cloud-resource-manager==0.26.0
google-cloud-runtimeconfig==0.26.0
google-cloud-spanner==0.26.0
google-cloud-speech==0.28.0
google-cloud-storage==1.3.2
google-cloud-translate==1.1.0
google-cloud-videointelligence==0.25.0
google-cloud-vision==0.26.0
google-compute-engine==2.4.1
google-gax==0.15.14
google-resumable-media==0.2.3
googleapis-common-protos==1.5.2
grpc-google-iam-v1==0.11.1
grpcio==1.4.0
html5lib==0.9999999
httplib2==0.10.3
idna==2.0
imutils==0.4.3
Jinja2==2.8
jsonpatch==1.10
jsonpointer==1.9
language-selector==0.1
Markdown==2.6.9
MarkupSafe==0.23
monotonic==1.3
numpy==1.13.1
oauth2client==3.0.0
oauthlib==1.0.3
ply==3.8
prettytable==0.7.2
proto-google-cloud-datastore-v1==0.90.4
proto-google-cloud-error-reporting-v1beta1==0.15.3
proto-google-cloud-logging-v2==0.91.3
proto-google-cloud-pubsub-v1==0.15.4
proto-google-cloud-spanner-admin-database-v1==0.15.3
proto-google-cloud-spanner-admin-instance-v1==0.15.3
proto-google-cloud-spanner-v1==0.15.3
protobuf==3.4.0
pyasn1==0.3.2
pyasn1-modules==0.0.11
pycurl==7.43.0
pygobject==3.20.0
PyJWT==1.3.0
pyserial==3.0.1
python-apt==1.1.0b1
python-debian==0.1.27
python-systemd==231
PyYAML==3.11
requests==2.9.1
rsa==3.4.2
six==1.10.0
ssh-import-id==5.5
tenacity==4.4.0
tensorflow==1.3.0
tensorflow-gpu==1.3.0
tensorflow-tensorboard==0.1.4
ufw==0.35
unattended-upgrades==0.1
urllib3==1.13.1
Werkzeug==0.12.2

@dhermes
Contributor

dhermes commented Aug 23, 2017

@ashwathnrajan You have requests==2.9.1, we need >= 2.18.0.
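
The comparison is easy to misread because version numbers are compared component by component, not as decimals or strings: 2.9.1 is older than 2.18.0. Here is a small sketch of a numeric check, assuming plain dotted numeric versions. The helpers `version_tuple` and `meets_minimum` are hypothetical; real projects often use a dedicated version-parsing library instead.

```python
def version_tuple(version):
    """Turn '2.18.0' into (2, 18, 0); assumes purely numeric dotted parts."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum):
    """Return True when `installed` is at least `minimum`, compared numerically."""
    return version_tuple(installed) >= version_tuple(minimum)


MINIMUM_REQUESTS = "2.18.0"  # lower bound quoted in this thread

for candidate in ("2.7.0", "2.9.1", "2.18.3", "2.18.4"):
    status = "ok" if meets_minimum(candidate, MINIMUM_REQUESTS) else "too old"
    print(candidate, status)

# A plain string comparison gets this wrong, because '9' sorts after '1'.
string_compare_says_new_enough = "2.9.1" >= "2.18.0"
```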

@ashwathnrajan

meh. one day i'll learn to compare numbers. thanks, i'm sure this will fix it

@naoko

naoko commented Oct 10, 2017

I'm getting this error with the following versions:

requests==2.18.4
google-cloud-storage==1.3.2

this is my entire pip freeze output

+ pip freeze
alembic==0.9.5
arrow==0.8.0
beautifulsoup4==4.6.0
boto==2.48.0
bz2file==0.98
cachetools==2.0.1
certifi==2017.7.27.1
chardet==3.0.4
colander==1.3.3
configparser==3.5.0
cornice==2.4.0
cymem==1.31.2
cytoolz==0.8.2
datastuffpy==0.4.5
dill==0.2.7.1
docopt==0.6.2
elasticsearch==5.4.0
elasticsearch-dsl==5.3.0
email-reply-parser==0.5.9
fire==0.1.2
future==0.16.0
gapic-google-cloud-datastore-v1==0.15.3
gapic-google-cloud-error-reporting-v1beta1==0.15.3
gapic-google-cloud-logging-v2==0.91.3
gapic-google-cloud-pubsub-v1==0.15.4
gapic-google-cloud-spanner-admin-database-v1==0.15.3
gapic-google-cloud-spanner-admin-instance-v1==0.15.3
gapic-google-cloud-spanner-v1==0.15.3
gensim==3.0.0
google-api-python-client==1.6.4
google-auth==1.1.1
google-cloud==0.27.0
google-cloud-bigquery==0.26.0
google-cloud-bigtable==0.26.0
google-cloud-core==0.26.0
google-cloud-datastore==1.2.0
google-cloud-dns==0.26.0
google-cloud-error-reporting==0.26.0
google-cloud-language==0.27.0
google-cloud-logging==1.2.0
google-cloud-monitoring==0.26.0
google-cloud-pubsub==0.27.0
google-cloud-resource-manager==0.26.0
google-cloud-runtimeconfig==0.26.0
google-cloud-spanner==0.26.0
google-cloud-speech==0.28.0
google-cloud-storage==1.3.2
google-cloud-translate==1.1.0
google-cloud-videointelligence==0.25.0
google-cloud-vision==0.26.0
google-gax==0.15.15
google-resumable-media==0.2.3
googleapis-common-protos==1.5.3
grpc-google-iam-v1==0.11.4
grpcio==1.6.3
httplib2==0.10.3
hupper==1.0
idna==2.6
iso8601==0.1.11
lightgbm==2.0.7
Mako==1.0.7
MarkupSafe==1.0
marshmallow==2.13.6
monotonic==1.3
murmurhash==0.26.4
nextiva.nlp==0.2.91
nltk==3.2.5
numpy==1.13.3
oauth2client==3.0.0
pandas==0.20.3
PasteDeploy==1.5.2
pathlib==1.0.1
pkg-resources==0.0.0
plac==0.9.6
ply==3.8
preshed==1.0.0
proto-google-cloud-datastore-v1==0.90.4
proto-google-cloud-error-reporting-v1beta1==0.15.3
proto-google-cloud-logging-v2==0.91.3
proto-google-cloud-pubsub-v1==0.15.4
proto-google-cloud-spanner-admin-database-v1==0.15.3
proto-google-cloud-spanner-admin-instance-v1==0.15.3
proto-google-cloud-spanner-v1==0.15.3
protobuf==3.4.0
psycopg2==2.7.3.1
pyasn1==0.3.7
pyasn1-modules==0.1.4
pyramid==1.8.3
python-dateutil==2.6.1
python-editor==1.0.3
pytz==2017.2
raven==6.2.1
regex==2017.4.5
repoze.lru==0.7
requests==2.18.4
rsa==3.4.2
scikit-learn==0.18
scipy==0.19.1
simplejson==3.11.1
six==1.11.0
sklearn==0.0
smart-open==1.5.3
sox==1.3.0
spacy==1.7.5
SQLAlchemy==1.1.14
tenacity==4.4.0
termcolor==1.1.0
thinc==6.5.2
tinydb==3.6.0
toolz==0.8.2
tqdm==4.19.2
translationstring==1.3
ujson==1.35
uritemplate==3.0.0
urllib3==1.22
venusian==1.1.0
waitress==1.0.2
WebOb==1.7.3
wrapt==1.10.11
zope.deprecation==4.3.0
zope.interface==4.4.3

An almost identical virtualenv on my Mac does not raise this error.

The error I'm getting is on Debian:

.env/lib/python3.5/site-packages/datastuffpy/storages/google/bucket.py:88: in download
    blob.download_to_file(file_obj)
.env/lib/python3.5/site-packages/google/cloud/storage/blob.py:464: in download_to_file
    self._do_download(transport, file_obj, download_url, headers)
.env/lib/python3.5/site-packages/google/cloud/storage/blob.py:418: in _do_download
    download.consume(transport)
.env/lib/python3.5/site-packages/google/resumable_media/requests/download.py:101: in consume
    self._write_to_stream(result)
.env/lib/python3.5/site-packages/google/resumable_media/requests/download.py:62: in _write_to_stream
    with response:
E   AttributeError: __exit__

Any advice would be appreciated.

@dhermes
Contributor

dhermes commented Oct 10, 2017

@naoko Run pip show requests to make sure that is the version of requests you think it is.

@naoko

naoko commented Oct 10, 2017

@dhermes, thank you for your quick response. It seems that the version matches what pip freeze shows... :(

+ pip show requests
Name: requests
Version: 2.18.4
Summary: Python HTTP for Humans.
Home-page: http://python-requests.org
Author: Kenneth Reitz
Author-email: [email protected]
License: Apache 2.0
Location: /opt/jenkins/workspace/ML-API-Test/.env/lib/python3.5/site-packages
Requires: urllib3, idna, chardet, certifi

@dhermes
Contributor

dhermes commented Oct 10, 2017

Is the Location (/opt/jenkins/workspace/ML-API-Test/.env/lib/python3.5/site-packages) the same as .env/lib/python3.5/site-packages in your stack trace? (The one in the stack trace was a relative path, which is why I ask.)

On the breaking machine, can you do the following:

$ /opt/jenkins/workspace/ML-API-Test/.env/bin/python3.5
>>> import requests
>>> requests.__file__
'???'
>>> requests.__version__
'???'

@naoko

naoko commented Oct 11, 2017

For some reason, running that in Jenkins was really hard (it looks like it strips single quotes...)

$ command=( python3.5 -c $'import requests\nprint(requests.__file__)\nprint(requests.__version__)' )
$ "${command[@]}"

But the goal is to show whether the version of requests in the actual running environment is really >= 2.18.0. So right before running the tests, I ran pip install -U requests, which installs into .env/lib/python3.5/site-packages, and the exception is raised from .env/lib/python3.5/site-packages/google/resumable_media/requests/download.py, so I have to assume the latest version is there...

But I understand that I can only replicate this in one environment, and I should not waste any more of your time. Thank you @dhermes for your time. I will think of another way to run the Jenkins job.

+ pip install -U requests
Requirement already up-to-date: requests in ./.env/lib/python3.5/site-packages
Requirement already up-to-date: idna<2.7,>=2.5 in ./.env/lib/python3.5/site-packages (from requests)
Requirement already up-to-date: certifi>=2017.4.17 in ./.env/lib/python3.5/site-packages (from requests)
Requirement already up-to-date: urllib3<1.23,>=1.21.1 in ./.env/lib/python3.5/site-packages (from requests)
Requirement already up-to-date: chardet<3.1.0,>=3.0.2 in ./.env/lib/python3.5/site-packages (from requests)
.env/lib/python3.5/site-packages/google/resumable_media/requests/download.py:62: in _write_to_stream
    with response:
E   AttributeError: __exit__

@id0Sch

id0Sch commented Nov 1, 2017

I don't know if it will help, but I was able to solve the second issue (TypeError: request() got an unexpected keyword argument 'data') by initializing the client with my project name, like this:

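from google.cloud import storage

def download(bucket_name, filename):  # illustrative wrapper so that `return` is valid
    client = storage.Client(PROJECT_NAME)  # without PROJECT_NAME it breaks
    bucket = client.get_bucket(bucket_name)
    blob = bucket.get_blob(filename)
    return blob.download_as_string()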

@anthony-chaudhary

anthony-chaudhary commented Feb 22, 2018

Hey there, I'm experiencing this issue too.
google-cloud-storage == 1.6
requests == 2.18.4
Error

google\resumable_media\requests\download.py", line 117, in _write_to_stream
    with response:
AttributeError: __exit__

Removing the with statement on line 117 appears to work around it.

@theacodes
Contributor

@swirlingsand are you sure you have requests 2.18.4? Can you verify with import requests; print(requests.__version__)?

@anthony-chaudhary

My apologies, I think this was something with my conda setup.
pip show requests reported 2.18,
but running that print statement showed 2.14.
Thanks for the help @jonparrott
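
This kind of mismatch can arise when `pip` and the running interpreter look at different environments: `pip show` reports the metadata it can see, while `import` loads whatever comes first on `sys.path`. Below is a sketch for spotting it from inside the interpreter, assuming Python 3.8+ for `importlib.metadata`. `version_report` is a hypothetical helper, not part of any library.

```python
import importlib
import sys
from importlib import metadata


def version_report(module_name, distribution_name=None):
    """Compare the version in installed metadata with the one the import reports."""
    distribution_name = distribution_name or module_name
    module = importlib.import_module(module_name)
    try:
        metadata_version = metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        metadata_version = None  # no distribution with that name is visible here
    imported_version = getattr(module, "__version__", None)
    return {
        "interpreter": sys.executable,
        "module_file": getattr(module, "__file__", None),
        "metadata_version": metadata_version,
        "imported_version": imported_version,
        # Only call it consistent when metadata exists and agrees with the import.
        "consistent": metadata_version is not None
        and metadata_version == imported_version,
    }


# Demonstrated on a stdlib module so the sketch runs without third-party packages;
# on an affected machine, call version_report("requests") instead.
report = version_report("json", distribution_name="no-such-distribution-for-demo")
print(report)
```

When `consistent` is false, the `interpreter` and `module_file` entries show which environment the import actually came from.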

@theacodes
Contributor

theacodes commented Feb 23, 2018 via email

@kparaju

kparaju commented Mar 7, 2018

I had to uninstall requests 2.18.4 (pip uninstall requests) and install 2.18.0 (pip install requests==2.18.0) to make it work

@mmas

mmas commented Mar 27, 2018

I downgraded google-cloud-storage from 1.8.0 to 1.6.0 too. So

google-cloud-storage==1.6.0
requests==2.18.0

@jocieA

jocieA commented Aug 20, 2018

Hi all, any idea how to make this work on Cloud Composer? I am migrating DAGs from our current environment to Cloud Composer; however, I get this error even when I install google-cloud-storage==1.6.0 and requests==2.18.0 as PyPI packages.

[2018-08-20 09:32:09,128] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,127] {bash_operator.py:101} INFO - from google.cloud import bigquery
[2018-08-20 09:32:09,129] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,129] {bash_operator.py:101} INFO - File "/usr/local/lib/python2.7/site-packages/google/cloud/bigquery/__init__.py", line 32, in
[2018-08-20 09:32:09,130] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,130] {bash_operator.py:101} INFO - from google.cloud.bigquery.client import Client
[2018-08-20 09:32:09,131] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,131] {bash_operator.py:101} INFO - File "/usr/local/lib/python2.7/site-packages/google/cloud/bigquery/client.py", line 20, in
[2018-08-20 09:32:09,132] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,132] {bash_operator.py:101} INFO - from google.cloud.bigquery.dataset import Dataset
[2018-08-20 09:32:09,133] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,133] {bash_operator.py:101} INFO - File "/usr/local/lib/python2.7/site-packages/google/cloud/bigquery/dataset.py", line 20, in
[2018-08-20 09:32:09,134] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,134] {bash_operator.py:101} INFO - from google.cloud.bigquery.table import Table
[2018-08-20 09:32:09,140] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,137] {bash_operator.py:101} INFO - File "/usr/local/lib/python2.7/site-packages/google/cloud/bigquery/table.py", line 27, in
[2018-08-20 09:32:09,141] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,138] {bash_operator.py:101} INFO - from google.cloud.exceptions import make_exception
[2018-08-20 09:32:09,142] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,138] {bash_operator.py:101} INFO - ImportError: cannot import name make_exception
[2018-08-20 09:32:09,733] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,730] {bash_operator.py:105} INFO - Command exited with return code 1

@sonlac
Contributor Author

sonlac commented Aug 21, 2018

Hi @jocieA, based on your logs, I assume your problem is not in google-cloud-storage==1.6.0 but rather in google-cloud-bigquery. You can run pip freeze to see which bigquery version is installed. I don't work with the Cloud Composer service, but I am very familiar with Airflow DAGs, and your task (using the Airflow BashOperator) should work correctly "out-of-box". It should also work from the command line, outside the Airflow/Cloud Composer context. You just need to debug your own Airflow task/program.

Concretely, your logs show that the error ImportError: cannot import name make_exception is raised at line 27 of google/cloud/bigquery/table.py. Could your installed bigquery module be incompatible?

Hope this helps. Thanks.

@sjungwirth

For anyone else running into this issue with Google Cloud Composer, I ran into it after adding apache-beam[gcp]==2.6.0 to my Composer dependencies.

The issue here was that apache-beam is installing google-cloud-bigquery==0.25.0, which causes a bunch of other packages to be downgraded.

The fix for me was to explicitly list google-cloud-core>=0.28.0 and google-cloud-bigquery>=1.5.0 AFTER apache-beam[gcp]==2.6.0 in the requirements.txt file used to specify Composer dependencies:
https://cloud.google.com/sdk/gcloud/reference/composer/environments/update

gcloud composer environments update ENVIRONMENT --location=LOCATION \
    --update-pypi-packages-from-file requirements.txt

@jocieA

jocieA commented Sep 14, 2018

Hi @sjungwirth and @sonlac ,

Thanks for getting back to me on my post; your replies have been very useful.
I recently tried running my pipeline on the latest version of Cloud Composer (1.1.0), and it worked without specifying versions of google-cloud-core, BigQuery, Cloud Storage, requests, etc. I think some improvements have been made to address this issue. I'm now looking forward to upgrading my existing environment to the new version.

@harinuk224469

Is this issue resolved? What is the solution?
