Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

external: refactor external_repo() and its users #3190

Merged
merged 5 commits into from
Jan 20, 2020

Conversation

Suor
Copy link
Contributor

@Suor Suor commented Jan 18, 2020

Some things are fixed along the way:

  • no cache shared between incompatible funcs
  • some incosistent exceptions are now consistent
  • import updates for git files now works both for dvc and git repos
  • "auto-generated-upstream already exists" fixed
  • api.open()/read() works with git repos now
  • api funcs allow specifying rev for local repos now

Fixes #3124, fixes #2976, fixes #3179, and fixes #2824. Closes #3172.

@Suor Suor requested a review from efiop January 18, 2020 15:44
@efiop efiop requested a review from skshetry January 19, 2020 06:35
dvc/api.py Show resolved Hide resolved
@efiop
Copy link
Contributor

efiop commented Jan 19, 2020

@Suor Please rebase.

@Suor
Copy link
Contributor Author

Suor commented Jan 19, 2020

@efiop rebased. Moved no remote/cache present fix to new code, left a TODO there because issues tend to get lost. I will fix it in a subsequent PR, doesn't look small enough to fit here.

@efiop
Copy link
Contributor

efiop commented Jan 19, 2020

@Suor TODOs get lost more, please remove it and create an issue. Add it to the next sprint if you want to handle it later and not get lost.

@efiop
Copy link
Contributor

efiop commented Jan 19, 2020

@Suor there is some problem with an author of your commits, what is going on?

@Suor
Copy link
Contributor Author

Suor commented Jan 19, 2020

We have lots of dead issues and no lost TODOs )

Suor added 4 commits January 19, 2020 21:38
Some things are fixed along the way:
- no cache shared between incompatible funcs
- some incosistent exceptions are now consistent
- import updates for git files now works both for dvc and git repos
- "auto-generated-upstream already exists" fixed
- dvc.api.open()/read() works with git repos now

Fixes iterative#3124, iterative#2976. Closes iterative#3172.
@Suor
Copy link
Contributor Author

Suor commented Jan 19, 2020

rebased

@Suor Suor mentioned this pull request Jan 19, 2020
3 tasks
Copy link
Contributor

@efiop efiop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Comment on lines +112 to 115
if rev is None and (not repo_url or os.path.exists(repo_url)):
yield Repo(repo_url)
else:
with external_repo(url=repo_url, rev=rev) as repo:
Copy link
Contributor

@jorgeorpinel jorgeorpinel Mar 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is external_repo() used if there's a rev? What if I want to access a file in the current DVC repo but from a previous commit? I'm trying to test this but I'm getting AttributeError: module 'dvc' has no attribute 'api'

@Suor @skshetry @shcheklein

This comment was marked as resolved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorgeorpinel, it should just clone the current repo and checkout to given rev and access that file.

Copy link
Contributor

@jorgeorpinel jorgeorpinel Mar 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing this, I see it works anyway. But is it using external_repo() to clone the current repo in a temporary folder and then use that instead of just the current Git repo's tree?

Copy link
Contributor

@jorgeorpinel jorgeorpinel Mar 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p.s. my test script:

import dvc.api

print(dvc.api.read('data', rev='HEAD^'))

data should be a DVC-tracked file with 2 different consecutive versions (in HEAD^ and in HEAD).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is I think Repo() also accepts rev, just wondering why it's not passed, and whether that would be a better implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repo does not accept rev. Repo is all about current workspace, so, you cannot just checkout the user's workspace and modify/read. So, even though it's still is a Repo underneath (if it's a DVC repo), it's cloned, checked out to a rev, put into a tmp cache directory and operated inside of it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I was just confused by the name external_repo then. I guess that's the only way to get a previous rev from the current repo unfortunately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants