Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get/import: --rev BRANCH: shallow clone #3585

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion dvc/external_repo.py
Original file line number Diff line number Diff line change
Expand Up @@ -248,7 +248,7 @@ def _clone_default_branch(url, rev):
else:
logger.debug("erepo: git clone %s to a temporary dir", url)
clone_path = tempfile.mkdtemp("dvc-clone")
git = Git.clone(url, clone_path)
git = Git.clone(url, clone_path, rev=rev)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks clones cache below. As it presumes it has default branch.

CLONES[url] = clone_path
finally:
if git:
Expand Down
32 changes: 24 additions & 8 deletions dvc/scm/git/__init__.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
"""Manages Git."""

from functools import partial
import logging
import os
import yaml
Expand Down Expand Up @@ -71,16 +72,31 @@ def clone(url, to_path, rev=None):
# [1] https://github.com/gitpython-developers/GitPython/issues/924
env[ld_key] = ""

clone_from = partial(
git.Repo.clone_from,
url,
to_path,
env=env, # needed before we can fix it in __init__
no_single_branch=True,
)

try:
tmp_repo = git.Repo.clone_from(
url,
to_path,
env=env, # needed before we can fix it in __init__
no_single_branch=True,
)
tmp_repo.close()
if rev:
logger.debug("attempting shallow clone of branch %s", rev)
tmp_repo = clone_from(branch=rev, depth=1)
else:
logger.debug("full clone")
tmp_repo = clone_from()
Comment on lines +84 to +89
Copy link
Contributor

@Suor Suor Apr 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very dangerous code, which will cause us grief. We shouldn't hide that we are sometimes make shallow code and sometimes not from outside. Outside may rely on clone being full as it currently does in external_repo.py.

I mean if we do want to make shallow or single branch clone we should request it in a call signature.

BTW, the retry below doesn't look good either, can we do it some other way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're going to have to fix "outside" (i.e. external_repo.py) first

except git.exc.GitCommandError as exc:
raise CloneError(url, to_path) from exc
if not rev or ("Remote branch %s not found" % rev) not in str(exc):
raise CloneError(url, to_path) from exc
try:
logger.debug("not a branch - performing full clone")
tmp_repo = clone_from()
except git.exc.GitCommandError as exc:
raise CloneError(url, to_path) from exc

tmp_repo.close()

# NOTE: using our wrapper to make sure that env is fixed in __init__
repo = Git(to_path)
Expand Down