s3 credentials in the repo level dvc config are not recognized #5141

Closed
johnnychen94 opened this issue Dec 21, 2020 · 1 comment

@johnnychen94
Contributor

johnnychen94 commented Dec 21, 2020

Bug Report

Description

I use MinIO as my private S3 storage. Here is my DVC config:

['remote "lflab"']
    url = s3://lflab/dataset-registry
    endpointurl = https://object-storage.example.com
    access_key_id = my_account
    secret_access_key = my_password
[core]
    remote = lflab

If I put this into repo_root/.dvc/config or repo_root/.dvc/config.local, url and core are correctly recognized, but endpointurl, access_key_id, and secret_access_key are not.

If I put this into /etc/xdg/dvc/config or ~/.config/dvc/config, things are working as expected.

Error message
root@7fe37b378882:~/dataset-registry# cp .dvc/config ~/.config/dvc/
root@7fe37b378882:~/dataset-registry# dvc list https://gitlab.lflab.cn/lflab/dataset-registry Denoise/SIDD -v
2020-12-21 03:37:45,811 DEBUG: Creating external repo https://gitlab.lflab.cn/lflab/dataset-registry@None
2020-12-21 03:37:45,811 DEBUG: erepo: git clone 'https://gitlab.lflab.cn/lflab/dataset-registry' to a temporary dir
2020-12-21 03:37:46,122 DEBUG: cache '/tmp/tmpwd7mwurldvc-cache/a5/0fff28e5ee2ad64cf073c950e25a15.dir' expected 'HashInfo(name='md5', value='a50fff28e5ee2ad64cf073c950e25a15.dir', dir_info=None, size=623951007235, nfiles=28054)' actual 'None'
2020-12-21 03:37:46,182 DEBUG: Preparing to download data from 's3://lflab/dataset-registry'
2020-12-21 03:37:46,183 DEBUG: Preparing to collect status from s3://lflab/dataset-registry
2020-12-21 03:37:46,183 DEBUG: Collecting information from local cache...
2020-12-21 03:37:46,184 DEBUG: cache '/tmp/tmpwd7mwurldvc-cache/a5/0fff28e5ee2ad64cf073c950e25a15.dir' expected 'HashInfo(name='md5', value='a50fff28e5ee2ad64cf073c950e25a15.dir', dir_info=None, size=None, nfiles=None)' actual 'None'
2020-12-21 03:37:46,185 DEBUG: Collecting information from remote cache...
2020-12-21 03:37:46,186 DEBUG: Matched '0' indexed hashes
2020-12-21 03:37:46,186 DEBUG: Querying 1 hashes via object_exists
2020-12-21 03:37:46,380 DEBUG: Downloading 's3://lflab/dataset-registry/a5/0fff28e5ee2ad64cf073c950e25a15.dir' to '../../tmp/tmpwd7mwurldvc-cache/a5/0fff28e5ee2ad64cf073c950e25a15.dir'
2020-12-21 03:37:46,997 DEBUG: cache '/tmp/tmpwd7mwurldvc-cache/a5/0fff28e5ee2ad64cf073c950e25a15.dir' expected 'HashInfo(name='md5', value='a50fff28e5ee2ad64cf073c950e25a15.dir', dir_info=<dvc.dir_info.DirInfo object at 0x7fa25c446898>, size=623951007235, nfiles=28054)' actual 'HashInfo(name='md5', value='a50fff28e5ee2ad64cf073c950e25a15', dir_info=None, size=2948570, nfiles=None)'
2020-12-21 03:37:47,287 DEBUG: Assuming '/tmp/tmpwd7mwurldvc-cache/a5/0fff28e5ee2ad64cf073c950e25a15.dir' is unchanged since it is read-only
2020-12-21 03:37:47,290 DEBUG: Assuming '/tmp/tmpwd7mwurldvc-cache/a5/0fff28e5ee2ad64cf073c950e25a15.dir' is unchanged since it is read-only
2020-12-21 03:37:47,291 DEBUG: Assuming '/tmp/tmpwd7mwurldvc-cache/a5/0fff28e5ee2ad64cf073c950e25a15.dir' is unchanged since it is read-only
full
medium
small
2020-12-21 03:37:47,295 DEBUG: Analytics is enabled.
2020-12-21 03:37:47,359 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmp2ip541hm']'
2020-12-21 03:37:47,361 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmp2ip541hm']'

If I remove ~/.config/dvc/config, it no longer works, even though repo_root/.dvc/config still exists:

root@7fe37b378882:~/dataset-registry# rm ~/.config/dvc/config
root@7fe37b378882:~/dataset-registry# dvc list https://gitlab.lflab.cn/lflab/dataset-registry Denoise/SIDD -v

2020-12-21 03:35:45,595 DEBUG: Creating external repo https://gitlab.lflab.cn/lflab/dataset-registry@None
2020-12-21 03:35:45,595 DEBUG: erepo: git clone 'https://gitlab.lflab.cn/lflab/dataset-registry' to a temporary dir
2020-12-21 03:35:46,019 DEBUG: cache '/tmp/tmp2ccxoexcdvc-cache/a5/0fff28e5ee2ad64cf073c950e25a15.dir' expected 'HashInfo(name='md5', value='a50fff28e5ee2ad64cf073c950e25a15.dir', dir_info=None, size=623951007235, nfiles=28054)' actual 'None'
2020-12-21 03:35:46,078 DEBUG: Preparing to download data from 's3://lflab/dataset-registry'
2020-12-21 03:35:46,078 DEBUG: Preparing to collect status from s3://lflab/dataset-registry
2020-12-21 03:35:46,079 DEBUG: Collecting information from local cache...
2020-12-21 03:35:46,080 DEBUG: cache '/tmp/tmp2ccxoexcdvc-cache/a5/0fff28e5ee2ad64cf073c950e25a15.dir' expected 'HashInfo(name='md5', value='a50fff28e5ee2ad64cf073c950e25a15.dir', dir_info=None, size=None, nfiles=None)' actual 'None'
2020-12-21 03:35:46,081 DEBUG: Collecting information from remote cache...
2020-12-21 03:35:46,081 DEBUG: Matched '0' indexed hashes
2020-12-21 03:35:46,081 DEBUG: Querying 1 hashes via object_exists
2020-12-21 03:35:48,230 ERROR: failed to list 'https://gitlab.lflab.cn/lflab/dataset-registry' - Unable to find AWS credentials. <https://error.dvc.org/no-credentials>: Unable to locate credentials
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/dvc/tree/s3.py", line 91, in _get_s3
    yield self.s3
  File "/usr/local/lib/python3.6/site-packages/dvc/tree/s3.py", line 108, in _get_bucket
    yield s3.Bucket(bucket)
  File "/usr/local/lib/python3.6/site-packages/dvc/tree/s3.py", line 207, in _list_paths
    for obj_summary in bucket.objects.filter(**kwargs):
  File "/usr/local/lib/python3.6/site-packages/boto3/resources/collection.py", line 83, in __iter__
    for page in self.pages():
  File "/usr/local/lib/python3.6/site-packages/boto3/resources/collection.py", line 166, in pages
    for page in pages:
  File "/usr/local/lib/python3.6/site-packages/botocore/paginate.py", line 255, in __iter__
    response = self._make_request(current_kwargs)
  File "/usr/local/lib/python3.6/site-packages/botocore/paginate.py", line 332, in _make_request
    return self._method(**current_kwargs)
  File "/usr/local/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.6/site-packages/botocore/client.py", line 663, in _make_api_call
    operation_model, request_dict, request_context)
  File "/usr/local/lib/python3.6/site-packages/botocore/client.py", line 682, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/usr/local/lib/python3.6/site-packages/botocore/endpoint.py", line 102, in make_request
    return self._send_request(request_dict, operation_model)
  File "/usr/local/lib/python3.6/site-packages/botocore/endpoint.py", line 132, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "/usr/local/lib/python3.6/site-packages/botocore/endpoint.py", line 116, in create_request
    operation_name=operation_model.name)
  File "/usr/local/lib/python3.6/site-packages/botocore/hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "/usr/local/lib/python3.6/site-packages/botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "/usr/local/lib/python3.6/site-packages/botocore/signers.py", line 90, in handler
    return self.sign(operation_name, request)
  File "/usr/local/lib/python3.6/site-packages/botocore/signers.py", line 162, in sign
    auth.add_auth(request)
  File "/usr/local/lib/python3.6/site-packages/botocore/auth.py", line 357, in add_auth
    raise NoCredentialsError
botocore.exceptions.NoCredentialsError: Unable to locate credentials

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/dvc/command/ls/__init__.py", line 35, in run
    dvc_only=self.args.dvc_only,
  File "/usr/local/lib/python3.6/site-packages/dvc/repo/ls.py", line 40, in ls
    ret = _ls(repo.repo_tree, path_info, recursive, dvc_only)
  File "/usr/local/lib/python3.6/site-packages/dvc/repo/ls.py", line 60, in _ls
    path_info.fspath, onerror=onerror, dvcfiles=True
  File "/usr/local/lib/python3.6/site-packages/dvc/tree/repo.py", line 353, in walk
    yield from dvc_tree.walk(top, topdown=topdown, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/dvc/tree/dvc.py", line 218, in walk
    self._add_dir(top, trie, out, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/dvc/tree/dvc.py", line 163, in _add_dir
    self._fetch_dir(out, filter_info=top, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/dvc/tree/dvc.py", line 150, in _fetch_dir
    out.get_dir_cache(**kwargs)
  File "/usr/local/lib/python3.6/site-packages/dvc/output/base.py", line 411, in get_dir_cache
    **kwargs,
  File "/usr/local/lib/python3.6/site-packages/dvc/data_cloud.py", line 91, in pull
    show_checksums=show_checksums,
  File "/usr/local/lib/python3.6/site-packages/dvc/remote/base.py", line 56, in wrapper
    return f(obj, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/dvc/remote/base.py", line 447, in pull
    download=True,
  File "/usr/local/lib/python3.6/site-packages/dvc/remote/base.py", line 332, in _process
    download=download,
  File "/usr/local/lib/python3.6/site-packages/dvc/remote/base.py", line 176, in _status
    md5s, jobs=jobs, name=str(self.tree.path_info)
  File "/usr/local/lib/python3.6/site-packages/dvc/remote/base.py", line 132, in hashes_exist
    return indexed_hashes + self.cache.hashes_exist(list(hashes), **kwargs)
  File "/usr/local/lib/python3.6/site-packages/dvc/cache/base.py", line 905, in hashes_exist
    remote_hashes = self.list_hashes_exists(hashes, jobs, name)
  File "/usr/local/lib/python3.6/site-packages/dvc/cache/base.py", line 863, in list_hashes_exists
    ret = list(itertools.compress(hashes, in_remote))
  File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.6/site-packages/dvc/cache/base.py", line 854, in exists_with_progress
    ret = self.tree.exists(path_info)
  File "/usr/local/lib/python3.6/site-packages/dvc/tree/s3.py", line 169, in exists
    return self.isfile(path_info) or self.isdir(path_info)
  File "/usr/local/lib/python3.6/site-packages/dvc/tree/s3.py", line 199, in isfile
    return path_info.path in self._list_paths(path_info)
  File "/usr/local/lib/python3.6/site-packages/dvc/tree/s3.py", line 208, in _list_paths
    yield obj_summary.key
  File "/usr/local/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/dvc/tree/s3.py", line 113, in _get_bucket
    ) from exc
  File "/usr/local/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/dvc/tree/s3.py", line 96, in _get_s3
    ) from exc
dvc.exceptions.DvcException: Unable to find AWS credentials. <https://error.dvc.org/no-credentials>
------------------------------------------------------------
2020-12-21 03:35:48,242 DEBUG: Analytics is enabled.
2020-12-21 03:35:48,339 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpbl7t18ip']'
2020-12-21 03:35:48,341 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpbl7t18ip']'

Environment information

Output of dvc version:

$ dvc version

DVC version: 1.11.8 (pip)
---------------------------------
Platform: Python 3.6.12 on Linux-5.4.0-58-generic-x86_64-with-debian-10.6
Supports: http, https, s3
Cache types: hardlink, symlink
Caches: local
Remotes: s3
Repo: dvc, git

I found this, and can reproduce it, while setting up CI for our dataset registry. I use the python:3.6 Docker image and install DVC via pip install dvc[s3].
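
For a CI job like this one, the workaround shown above (copying the repo config to ~/.config/dvc/config) could be scripted roughly as follows. This is only a sketch: the helper name install_global_dvc_config and its home parameter are made up for illustration, the destination path is the Linux one used in this report, and any existing global config would be overwritten.

```python
from pathlib import Path
import shutil


def install_global_dvc_config(repo_root, home=None):
    """Copy <repo_root>/.dvc/config to <home>/.config/dvc/config.

    Hypothetical helper mirroring `cp .dvc/config ~/.config/dvc/`.
    `home` defaults to the current user's home directory.
    """
    home = Path(home) if home is not None else Path.home()
    src = Path(repo_root) / ".dvc" / "config"
    dst = home / ".config" / "dvc" / "config"
    # Create ~/.config/dvc if the CI image does not have it yet.
    dst.parent.mkdir(parents=True, exist_ok=True)
    # Overwrites any existing global config at the destination.
    shutil.copyfile(src, dst)
    return dst


# Example (run from the repo root inside the CI container):
# install_global_dvc_config(".")
```

Since the copied file contains credentials, it is probably best to do this only inside a throwaway CI container.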

@pmrowla
Contributor

pmrowla commented Dec 21, 2020

Closing as this is a duplicate of #4604.

The issue is not related to S3 credentials in the local config; those credentials would be used by commands like dvc push and dvc pull. The problem is that certain commands (like dvc list) currently do not read anything from the local config at all.
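
As a rough illustration of that point, here is a toy model of layered config lookup. It is not DVC's code: merge_levels, LEVELS, and the dict layout are made up for this sketch, and it assumes that later levels override earlier ones. It only shows how keys defined at a skipped level go missing; it does not model where dvc list still gets the remote url from.

```python
# Toy model of layered config lookup; NOT DVC's actual implementation.
# Level names loosely follow the config files mentioned in this report.
LEVELS = ("system", "global", "repo", "local")  # later levels override earlier ones


def merge_levels(configs, levels=LEVELS):
    """Merge per-level {section: {key: value}} dicts in precedence order."""
    merged = {}
    for level in levels:
        for section, options in configs.get(level, {}).items():
            merged.setdefault(section, {}).update(options)
    return merged


# Only the repo-level config defines the remote, as in the report.
configs = {
    "repo": {
        'remote "lflab"': {
            "url": "s3://lflab/dataset-registry",
            "endpointurl": "https://object-storage.example.com",
            "access_key_id": "my_account",
            "secret_access_key": "my_password",
        },
        "core": {"remote": "lflab"},
    },
}

# A command that consults every level sees the credentials.
full = merge_levels(configs)

# A hypothetical command that skips the repo and local levels does not.
partial = merge_levels(configs, levels=("system", "global"))
```

In this model, putting the same sections under "global" makes them visible to both calls, which matches the observation that ~/.config/dvc/config works.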

pmrowla closed this as completed Dec 21, 2020