Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dvc.api.open: Will ignore remote option if a default remote is set in config. #7341

Closed
PietrassykFP opened this issue Feb 3, 2022 · 1 comment · Fixed by #9306
Closed
Assignees
Labels
A: api Related to the dvc.api bug Did we break something? regression Ohh, we broke something :-(

Comments

@PietrassykFP
Copy link

Bug Report

dvc.api.open: Will ignore remote option if a default remote is set in config.

Description

In some cases, when passing the remote parameter. The remote specification will get lost during the dvc pull execution and will be overwritten by the default remote from the dvc config.

Reproduce

  1. Create new git Repo
  2. Init dvc
  3. Add two different non local remotes (perferably s3)
  4. Make one of the remotes the default
  5. Push the same data to both remotes
  6. Push git Repo
  7. From somewhere outside a dvc project run:
with dvc.api.open(path="myfile", remote='non-default-remote') 
  1. Track the actually used remote with a Debugger. Or make sure both remotes require different Authentification and just provide the non-defaut-remote authentification during the api call. You will run into Permission errors.

Expected

If the remote argument is passed this remote should allways be used, when opening files.

Environment information

Output of dvc doctor:

DVC version: 2.9.3 (pip)
---------------------------------
Platform: Python 3.8.12 on Linux-5.10.47-linuxkit-x86_64-with-glibc2.2.5
Supports:
        hdfs (fsspec = 2022.1.0, pyarrow = 1.0.1),
        webhdfs (fsspec = 2022.1.0),
        http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        s3 (s3fs = 2022.1.0, boto3 = 1.20.24)
root@112f43f4f86b:/workspaces/centerline-identification-job# 
@pared pared added the bug Did we break something? label Feb 3, 2022
@pared
Copy link
Contributor

pared commented Feb 3, 2022

Initial discussion:
https://discord.com/channels/485586884165107732/485596304961962003/933674011982983219

@PietrassykFP has been setting up the credentials via environmental variables, so if the remote gets resolved properly, it should not raise issues. I created a local test aiming to verify if remote gets resolved properly and it seem to be working fine:

def test_open_from_proper_remote(tmp_dir, erepo_dir, make_tmp_dir):
    default_remote = make_tmp_dir("default_storage")
    other_remote = make_tmp_dir("other_storage")

    with erepo_dir.chdir():
        with erepo_dir.dvc.config.edit() as conf:
            conf["remote"]["default"] = {"url": str(default_remote)}
            conf["core"]["remote"] = "default"
            conf["remote"]["other"] = {"url": str(other_remote)}

        erepo_dir.dvc_gen({"foo": "foo content"}, commit="create file")
        erepo_dir.dvc.push(remote="other")

        (erepo_dir / "foo").unlink()
        shutil.rmtree((erepo_dir / ".dvc" / "cache"))

    assert not (
        default_remote / "6d" / "bda444875c24ec1bbdb433456be11f"
    ).is_file()
    assert (other_remote / "6d" / "bda444875c24ec1bbdb433456be11f").is_file()

    with api.open("foo", repo=os.fspath(erepo_dir)) as fd:
        result = fd.read()

    assert result == "foo content"

It would be good to reproduce the problem and create a test, though that needs some more investigation.

@dtrifiro dtrifiro added the A: api Related to the dvc.api label Feb 3, 2022
@efiop efiop assigned efiop and unassigned efiop Feb 8, 2022
@skshetry skshetry added the regression Ohh, we broke something :-( label Mar 15, 2022
skshetry added a commit to skshetry/dvc that referenced this issue Apr 21, 2022
skshetry added a commit that referenced this issue Apr 21, 2022
@efiop efiop self-assigned this Apr 4, 2023
@efiop efiop added this to DVC Apr 4, 2023
@efiop efiop moved this from Backlog to In Progress in DVC Apr 4, 2023
@github-project-automation github-project-automation bot moved this to Backlog in DVC Apr 4, 2023
@efiop efiop moved this from In Progress to Review In Progress in DVC Apr 4, 2023
@github-project-automation github-project-automation bot moved this from Review In Progress to Done in DVC Apr 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: api Related to the dvc.api bug Did we break something? regression Ohh, we broke something :-(
Projects
No open projects
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants