Really mirror the upstream repository + refactor Repo API #223

Merged
merged 3 commits into from
Feb 18, 2015

Conversation

pv
Collaborator

@pv pv commented Feb 15, 2015

This PR does the following refactoring:

  • Remove the concept of "currently checked out revision" from the Repo API. Checkouts of working trees are now always done to a specified directory at an explicit commit hash (whether the result is a clone or not is an implementation detail for the Repo plugin).
  • Make Git mirror the remote repository, so that remote branches are available in all asv commands. Mercurial works like this already.
  • Don't check out a working tree for the main mirror repository, only do it inside environments
  • Replace checkout_parent with get_hash_from_parent
  • Always use get_hash_from_master for getting master branch, to be nicer to Mercurial
  • Make sure to call hg pull on environment-specific repos, as --shared is not used for hg
  • Rename _tag -> _name in Repo methods where appropriate
  • Allow specifying base branch to compare against in asv continuous

The main advantages of this are making all remote branches available to asv (which is what the user will likely expect) and not checking out unnecessary working trees. It also removes the concept of sub-Repo objects in the environment-specific clones, which is slightly simpler and may be better for Subversion.
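As a rough illustration of the API shape described in the list above, here is a minimal in-memory sketch. `ToyRepo`, `get_hash_from_name`, the `checkout(path, commit_hash)` signature, and the `VERSION` file are assumptions made for this example; only the names `get_hash_from_parent` and `get_hash_from_master` come from the description, and their real signatures in asv may differ.

```python
import os
import tempfile


class ToyRepo:
    """In-memory stand-in for a Repo plugin (hypothetical, not asv's class)."""

    def __init__(self, commits, branches):
        # commits: hash -> (parent hash or None, file contents)
        # branches: branch name -> hash
        self._commits = commits
        self._branches = branches

    def get_hash_from_name(self, name):
        # Accept either a branch name or a hash already known to the repo.
        if name in self._branches:
            return self._branches[name]
        if name in self._commits:
            return name
        raise KeyError(f"unknown name: {name}")

    def get_hash_from_master(self):
        return self.get_hash_from_name("master")

    def get_hash_from_parent(self, name):
        parent, _ = self._commits[self.get_hash_from_name(name)]
        if parent is None:
            raise ValueError(f"{name} has no parent")
        return parent

    def checkout(self, path, commit_hash):
        # No "currently checked out revision" is stored on the object:
        # the caller always names the target directory and the hash.
        os.makedirs(path, exist_ok=True)
        _, contents = self._commits[commit_hash]
        with open(os.path.join(path, "VERSION"), "w") as f:
            f.write(contents)


def demo():
    repo = ToyRepo(
        commits={"aaa": (None, "v1"), "bbb": ("aaa", "v2")},
        branches={"master": "bbb"},
    )
    with tempfile.TemporaryDirectory() as env_dir:
        head = repo.get_hash_from_master()
        # Resolve the parent to a hash first, then check that hash out.
        repo.checkout(env_dir, repo.get_hash_from_parent(head))
        with open(os.path.join(env_dir, "VERSION")) as f:
            return head, f.read()
```

The point of the sketch is that resolving a name to a hash and materialising a working tree are separate steps, so the repo object itself carries no checkout state.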

@mdboom
Collaborator

mdboom commented Feb 17, 2015

This PR has lots of little moving parts, so some questions below, mainly just to double check:

Remove the concept of "currently checked out revision" from the Repo API. Checkouts of working trees are now always done to a specified directory at an explicit commit hash (whether the result is a clone or not is an implementation detail for the Repo plugin).

So, the git plugin now always does a clone --shared to the environment (which because it's --shared should be very lightweight), and then checkout $HASH. For Mercurial, though, isn't this a regression, since there the clone is in fact a copy of the entire history? Should we either (a) find a way to do the equivalent of git clone --shared on Mercurial, or (b) revert to the "clone once per environment" model for Mercurial? I understand there is a Mercurial plugin to do (a). As a non-Mercurial expert, I just didn't have a chance to go down the rabbit hole of how to ship and enable the plugin for asv's local use, and I didn't want that to hold up Mercurial support in general.
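For reference, the two-step git sequence described here (a shared clone into the environment, then a checkout of an explicit hash) could be sketched as follows. The function names and path arguments are hypothetical, and this is not the plugin's actual code; it only uses the standard `git clone --shared`, `git -C`, and `git checkout` invocations.

```python
import subprocess


def env_checkout_commands(mirror_path, env_path, commit_hash):
    """Return the git invocations for one environment checkout."""
    return [
        # --shared borrows objects from the mirror instead of copying them.
        ["git", "clone", "--shared", mirror_path, env_path],
        # Check out the explicit hash inside the new clone (detached HEAD).
        ["git", "-C", env_path, "checkout", commit_hash],
    ]


def run_env_checkout(mirror_path, env_path, commit_hash):
    # Execute the commands in order; raises CalledProcessError on failure.
    for cmd in env_checkout_commands(mirror_path, env_path, commit_hash):
        subprocess.check_call(cmd)
```

Keeping the command construction separate from execution makes the sequence easy to inspect without needing a real repository.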

Make Git mirror the remote repository, so that remote branches are available in all asv commands.

Isn't that already the case? If I clone a git repository, I can access origin's branches by name (they don't exist locally, so you get a "detached head", but for checking it out and getting a log between two branches etc., it works). I'm not opposed to using --mirror at all, I'm just trying to understand the use case because I'm probably missing something.

Don't check out a working tree for the main mirror repository, only do it inside environments

Seems reasonable.

Replace checkout_parent with get_hash_from_parent

Makes sense.

Always use get_hash_from_master for getting master branch, to be nicer to Mercurial

Yep.

Make sure to call hg pull on environment-specific repos, as --shared is not used for hg

Sure -- though maybe we can do something like --shared there.

Rename _tag -> _name in Repo methods where appropriate

Good point. Those were originally used for tags, but obviously they are more general.

Allow specifying base branch to compare against in asv continuous

Seems useful in some cases. That reminds me that "continuous" is kind of a poor name choice for that command. Maybe "pairwise" would be better? Doesn't have to be part of this PR -- I'll open a new issue.

The main advantages of this are making all remote branches available to asv (which is what the user will likely expect)

Here, I think I'm just missing something...

and not checking out unnecessary working trees.

This makes sense.

It also removes the concept of sub-Repo objects in the environment-specific clones, which is slightly simpler and may be better for Subversion.

Yeah -- if we ever wanted to do Subversion support, we could probably put a subversion repository (i.e. the whole history) at the top level and checkout revisions into the environments from there (rather than remotely), which would certainly speed things up.

@pv
Collaborator Author

pv commented Feb 17, 2015

So, the git plugin now always does a clone --shared to the environment (which because it's --shared should be very lightweight), and then checkout $HASH. For Mercurial, though, isn't this a regression, since there the clone is in fact a copy of the entire history? Should we either (a) find a way to do the equivalent of git clone --shared on Mercurial, or (b) revert to the "clone once per environment" model for Mercurial? I understand there is a Mercurial plugin to do (a). As a non-Mercurial expert, I just didn't have a chance to go down the rabbit hole of how to ship and enable the plugin for asv's local use, and I didn't want that to hold up Mercurial support in general.

Both Git and Mercurial use hard links for cloning repository data locally, so in practice the environment clones don't hurt performance or disk usage, even without --shared. Windows may be an exception here, though. EDIT: NTFS is OK, but FAT is not.

For mercurial, explicit --shared probably could be done with some hackery but it probably won't be very important on Linux/OSX. (Note also that this PR doesn't change the behavior of the Mercurial backend, I think it worked like this previously too.)
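To make the hard-link argument concrete, the following small sketch shows what a hard link means at the filesystem level: two directory entries that point at the same data, so no extra copy is stored. The file names are made up, and this only demonstrates the OS mechanism, not what git or hg do internally. It assumes a filesystem that supports hard links.

```python
import os
import tempfile


def hard_link_demo():
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "object.pack")
        linked = os.path.join(tmp, "object-clone.pack")
        with open(original, "wb") as f:
            f.write(b"x" * 4096)
        # Create a second name for the same file data; nothing is copied.
        os.link(original, linked)
        a, b = os.stat(original), os.stat(linked)
        # Same inode means both names refer to one stored file;
        # st_nlink counts how many names point at it.
        return a.st_ino == b.st_ino, a.st_nlink
```

On a filesystem without hard-link support, `os.link` raises an error, which is the case where a local clone would have to fall back to copying.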

Make Git mirror the remote repository, so that remote branches are available in all asv commands.

Isn't that already the case? If I clone a git repository, I can access origin's branches by name (they don't exist locally, so you get a "detached head", but for checking it out and getting a log between two branches etc., it works). I'm not opposed to using --mirror at all, I'm just trying to understand the use case because I'm probably missing something.

Ok, indeed the origin branches can be accessed as "origin/foo", but not as "foo".
(Except in the continuous command which works differently.)

It seems I was thinking about the use case I had with Scipy, where the clone is a clone of the local repository (the benchmarks are inside the main repo). In this case, if you clone the repository without --mirror, you lose the original origin branches and end up having only branches you have available locally.

But I think even without this use case, --mirror is the better bet, since the origin/ prefix is redundant as there are no local branches, and it's then more consistent with Mercurial which apparently doesn't have the concept of remote branches.
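The difference between the two clone modes comes down to the fetch refspec: a default clone uses `+refs/heads/*:refs/remotes/origin/*`, while `--mirror` uses `+refs/*:refs/*`. The sketch below models that mapping with a hypothetical helper, `local_ref`, which is written for this example and is not part of git or asv. Tags are handled separately by git and are left out here.

```python
def local_ref(source_ref, mirror):
    """Map a ref in the cloned-from repo to its name in the new clone.

    Returns None when the ref is not covered by the branch refspec.
    """
    if mirror:
        # +refs/*:refs/* copies every ref under its original name.
        return source_ref
    prefix = "refs/heads/"
    if source_ref.startswith(prefix):
        # +refs/heads/*:refs/remotes/origin/* renames branches.
        return "refs/remotes/origin/" + source_ref[len(prefix):]
    # Anything else, including the source's own remote-tracking refs,
    # is not matched by the default branch refspec.
    return None
```

This also covers the Scipy case mentioned above: when the source is itself a local clone, its `refs/remotes/origin/foo` is not carried over by a default clone (the helper returns `None`), whereas a mirror keeps it.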

@mdboom
Collaborator

mdboom commented Feb 17, 2015

It seems I was thinking about the use case I had with Scipy, where the clone is a clone of the local repository (the benchmarks are inside the main repo). In this case, if you clone the repository without --mirror, you lose the original origin branches and end up having only branches you have available locally.

Ah, indeed that's the case for that use case. For the simpler case of cloning a remote repository, git checkout foo does work (where foo is a branch on origin).

This all makes sense then. Thanks for clarifying.

@mdboom
Collaborator

mdboom commented Feb 17, 2015

There are a few other known issues on Windows (mostly to do with timing itself), and I've never claimed support for it, so I have no problem with other Windows shortcomings unless someone comes along who cares enough to help maintain Windows support.

@mdboom
Collaborator

mdboom commented Feb 18, 2015

Thanks. I think this just needs a rebase and then I'm happy to merge.

…repository

Make the Repo object act transparently as a mirror of the remote
repository.  Also rework the logic so that a checkout of the working
tree is done only in the environment-specific clones.

This makes remote branches available for all commands without special
prefixes, and removes the need for Repo.check_remote_branch.

Also rename some of the commands, so that branch/tag/commit is called
"name", and replace checkout_parent with get_hash_from_parent.
mdboom added a commit that referenced this pull request Feb 18, 2015
Really mirror the upstream repository + refactor Repo API
@mdboom mdboom merged commit a2318b6 into airspeed-velocity:master Feb 18, 2015
@pv pv deleted the clone-refactor branch April 12, 2015 11:30