Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Making spec the argument of install #147

Closed
ghost opened this issue Apr 24, 2019 · 24 comments
Closed

Making spec the argument of install #147

ghost opened this issue Apr 24, 2019 · 24 comments

Comments

@ghost
Copy link

ghost commented Apr 24, 2019

In #144, @raxod502 mentioned

I am quite confused why the package name I give pipx wouldn't normally be passed to pip install. I don't see any explanation of what the alternative is.

Why do we separately specify the package name and the requirement-specifier? For example, the API I wonder about is just copying pip's interface:

$ pip install --help  

Usage:   
  pip install [options] <requirement specifier> [package-index-options] ...
  pip install [options] -r <requirements file> [package-index-options] ...
  pip install [options] [-e] <vcs project url> ...
  pip install [options] [-e] <local project path> ...
  pip install [options] <archive url/path> ...

The advantages are:

  • A distribution name can be specified, or a url or a path.
  • Any number of requirements can be specified.
  • I don't have to learn a new syntax, I can just use pip's which I already know.
@clbarnes
Copy link
Contributor

clbarnes commented Sep 24, 2019

I believe it is because pipx needs to make a virtualenv based on the package name, which is not always clear from the spec (e.g. if it's a wheel file or github repo). Requiring a package as well as a spec means you don't have to reach into the file system or internet to find out the package name. Agreed that it would be nice if it weren't necessary; possibly if some of pip's internal plumbing could be used to get the package name from the spec, even if it meant calling out to an external resource.

@tekumara
Copy link

tekumara commented Dec 4, 2019

I agree duplicating the pip interface would provide a much more intuitive and easier first user experience.

pip

pip install git+https://github.com/seek-oss/aec.git

vs pipx

pipx install --spec git+https://github.com/seek-oss/aec.git aec

@uranusjr
Copy link
Member

uranusjr commented Dec 4, 2019

The problem is that you cannot know the package name (which you propose for determining the venv’s name) without running setup first, but pipx needs a venv to run setup. So duplicating pip’s interface would mean pipx needs to create two venvs and throw one away just to discover the package name. Not a good trade off IMO.

@tekumara
Copy link

tekumara commented Dec 4, 2019

If a package name is not provided, could it be inferred from the URL?

@uranusjr
Copy link
Member

uranusjr commented Dec 4, 2019

It can in some cases (and in some cases even reliably), but there is no guarantee a Git repo is named the same as the package. It’s all about edge cases, unfortunately.

@tekumara
Copy link

tekumara commented Dec 4, 2019

In what ways would it matter if an inferred package name was different from the actual package name?

@uranusjr
Copy link
Member

uranusjr commented Dec 4, 2019

As mentioned above, pipx names the venv after the package, so a wrongly inferred package name would result in a wrongly named venv. This means that pipx would be unable to correctly identify those packages, and in turn fail to upgrade or uninstall a package correctly.

@raxod502
Copy link

raxod502 commented Dec 4, 2019

duplicating pip’s interface would mean pipx needs to create two venvs and throw one away just to discover the package name

Ok, but consider this: somebody is going to have to discover the package name. No reasonable user is going to realize to begin with that the package name is different from the Git repo name. So what's going to happen is they will try to install it with pipx, hit this error, try using the Git repo as the package name after finding this issue, get another error when the package name doesn't match, and then install a second time with the correct package name that they eventually figure out from looking on GitHub most likely. You still end up with two virtualenvs that were created, and it's a heck of a lot slower because it's the user doing the troubleshooting, instead of pipx. Why don't we assume that the package name is the same as the Git repo, with a slower fallback, as part of the effort to discourage package authors from making poor packaging decisions?

@cs01
Copy link
Member

cs01 commented Dec 6, 2019

I agree that the install target as the only argument would be a nicer API.

Let me provide some history to show how the API became what it was.

  • I make a new CLI application called pipx that is designed only to run things, like npx. pipx cowsay moo will install cowsay from PyPI and then run it.
  • Sometimes, the app name doesn't match the package name, so I add a --spec (specification) argument to declare the PyPI package name. Now pipx has a workaround, and something like pipx someapp --spec some_package works.
  • A while later pipx decides it wants to install things. So pipx run and pipx install are introduced. Now if someone wants to install from vcs (git, hg, etc.), I don't know how determine the package name from the url (or path on the filesystem, which is also valid), so spec can be used. Here, pipx will abstract away the fact that you need to add #egg=PACKAGE to the git url. (pipsi didn't do this and it was confusing to users. See Allow installation from source mitsuhiko/pipsi#174.)

Then I decided to make pipx also install applications to venvs. For this to work, pipx needs to know the name of the package to create the venv, as @clbarnes and @uranusjr mentioned. I was not (and am not) aware of how to do this in general, so I just left --spec as the canonical way to do it.

The point of all this history is that the pipx API was made from the "bottom up", solving problems as I encountered them. From the beginning I would have preferred to only have to provide one argument (the install target) if possible, but didn't know how to do it. The addition of persistent metadata will probably be helpful with this as well, so the name can be determined once and then stored in the metadata (#222).

Based on the above comments, it sounds like it might be technically possible:

The problem is that you cannot know the package name (which you propose for determining the venv’s name) without running setup first, but pipx needs a venv to run setup. So duplicating pip’s interface would mean pipx needs to create two venvs and throw one away just to discover the package name. Not a good trade off IMO.

So if this is possible, I would be all for it. A slight delay in determining the package name would be worth the usability improvement, IMO. Additionally, a warning could be printed that says

Determining package name, this may take a while. To skip this step, you can provide the package name yourself with --name NAME.

Of course this would be a big API change and would require a major revision in version numbers, but I'm fine with that if it's a better end user experience.

cc @itsayellow since this might impact the package metadata tracking

@clbarnes
Copy link
Contributor

clbarnes commented Dec 6, 2019

would mean pipx needs to create two venvs and throw one away just to discover the package name

This is less ridiculous than it sounds - pip already creates one-off disposable virtualenvs when building packages with build systems specified in pyproject.toml.

@itsayellow
Copy link
Contributor

The problem is that you cannot know the package name (which you propose for determining the venv’s name) without running setup first, but pipx needs a venv to run setup. So duplicating pip’s interface would mean pipx needs to create two venvs and throw one away just to discover the package name. Not a good trade off IMO.

So if this is possible, I would be all for it. A slight delay in determining the package name would be worth the usability improvement, IMO. Additionally, a warning could be printed that says

Determining package name, this may take a while. To skip this step, you can provide the package name yourself with --name NAME.

Of course this would be a big API change and would require a major revision in version numbers, but I'm fine with that if it's a better end user experience.

cc @itsayellow since this might impact the package metadata tracking

Actually this is already implemented. Figuring out the package name is exactly what I had to implement to make pipx run work correctly without forcing the user to specify: the spec, the package name, the app name. (The current API only asks for the spec and app name.)

The code already exists in Venv.install_package(). If you pass Venv.install_package() a valid package_or_url argument, and give it argument package=None, then it will determine the package name and properly update the saved metadata with the proper name for the main package.

The only wrinkle that would make it different for pipx install vs. pipx run is that pipx run uses a hashed name for the venv directory, whereas pipx install and various other code wants the venv directory to be named for the main package. So even before Venv.install_package() is run we need to determine the main package name to properly specify the venv name in Venv.create_venv().

So as in the discussion above, we'd have to install into a throw-away venv, discover the main package name, and then go to create the real venv properly named after that package.

@itsayellow
Copy link
Contributor

itsayellow commented Dec 6, 2019

And this can be faster than a regular install of the package, because it works by installing using the --no-dependencies option to pip. So for package name determination you would only wait just for the main package to install and not for any of its dependencies to install.

@itsayellow
Copy link
Contributor

I suppose another option to waste less time would be:

  1. pick a random name for the venv to install to
  2. install the main package and determine its name
  3. then rename the venv.

That way you wouldn't have "wasted" an install of the main package. Not sure if renaming a venv would create problems.

@uranusjr
Copy link
Member

uranusjr commented Dec 7, 2019

IIRC console scripts rely on the absolute path of the venv, so renaming is unlikely to work.

@itsayellow
Copy link
Contributor

IIRC console scripts rely on the absolute path of the venv, so renaming is unlikely to work.

Hmm, good point. The shebang line in the CLI app has the absolute path to the python in the venv. It could be rewritten, but at that point we'd have to decide how messy we want things for a small speedup.

@clbarnes
Copy link
Contributor

clbarnes commented Dec 7, 2019

The slowest thing about package installation, generally, is downloading and building dependencies. Installing the same thing twice in quick succession will skip these thanks to pip's cache, so it's pretty much dependency resolution which isn't too bad.

@uranusjr
Copy link
Member

uranusjr commented Dec 7, 2019

I have another idea. Maybe we can just… don’t rename it? With #222 we can track the package-venv relation in the metadata, so the directory name doesn’t really need to match the package, and is technically only cosmetic. As long as pipx implements correct package duplication checks (so after pipx install foo it would raise an error on pipx install foo[bar]) the venv name can be an implementation detail.

@itsayellow
Copy link
Contributor

I think making non-package venv names is definitely possible. It is kind of nice to see human-readable names when peering into the pipx/venv directory, but I could get over that if I had to lose it.

Since we already have some API changes in the master branch, and probably a major version change in the future, I would love to see this CLI simplification land in time for that, regardless of the implementation. I wouldn't mind if it started with something quicker to implement and get it included in the next major version, and then we could refine/streamline the implementation later.

@cs01
Copy link
Member

cs01 commented Dec 23, 2019

I think making non-package venv names is definitely possible. It is kind of nice to see human-readable names when peering into the pipx/venv directory, but I could get over that if I had to lose it.

Agreed. Let's first implement it and keep the venv names, and if the install time is unbearable, we can drop the readable venv names and go with the single install suggestion.

Does anyone want to take a stab at this?

@cs01
Copy link
Member

cs01 commented Dec 23, 2019

I see your implementation, @itsayellow

                old_package_set = self.list_installed_packages()
                cmd = ["install"] + pip_args + ["--no-dependencies"] + [package_or_url]
                self._run_pip(cmd)
                installed_packages = self.list_installed_packages() - old_package_set
                if len(installed_packages) == 1:
                    package = installed_packages.pop()
                    logging.info(f"Determined package name: '{package}'")
                else:
                    package = None

I was thinking there was some magical python setup.py command that worked for different package formats, but this is so much simpler and works with any package format that pip recognizes. Very cool 👍 .

@itsayellow
Copy link
Contributor

I was already being sorely tempted to implement this and then you go and flatter me. 😄

I can take a stab at it if nobody else wants dibs.

@cs01
Copy link
Member

cs01 commented Dec 23, 2019

haha! I actually do the exact same thing with https://github.com/cs01/pythonloc, so I don't know why it didn't occur to me to do it here. Tbh I would love to implement this but I'm about to do some travelling for the holidays, so it's all yours if you have the time.

@itsayellow
Copy link
Contributor

Ok I'm on it.

I have some spare alone time now, but will be in full Christmas-mode soon and may need to leave it for a bit.

If you are itching to try and implement it before I'm done, no problem, just let me know! But I'll see if I can get it running.

@itsayellow
Copy link
Contributor

Fixed with #302

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants