Create a supported "high level" programmatic API for pip #3121
Comments
I'd be hesitant to suggest an API for anything that needs an interpreter restart, because running code inline seems hardly more useful if you have to exit and restart your interpreter anyway. If we were going to do something anyway (perhaps even starting with operations that wouldn't warrant an interpreter restart, like querying package information), I'd suggest putting them under a facade under a Also, you'd want to be careful not to rely unconditionally on setuptools functionality for anything (without a very good reason) and, even then, to avoid duplicating whatever API setuptools already provides. |
If the only blessed API is pip.main, then there is no good reason to do it in-process, since it won't have any advantage over just calling pip as a subprocess |
I'm not sure if pip should have an API or not, but certainly parts of what you'd use an API for will be shuffled over to |
I think there are enough people talking about using pip from within Python (either prompts or scripts) that we should provide something. Even if it's very limited, at least that will get people thinking about what exactly they need, and we'll be able to firm up the details as we go along. @dstufft for lower-level APIs based around the standards PEPs, we should absolutely be pointing people at @RonnyPfannschmidt you have a good point. It's not at all unreasonable to tell people that to call pip they should be using
If that's not sufficient for them (for example, the format of stdout isn't useful for parsing) we can discuss options (for example, adding a On the other hand, I think the big issue is that people expect to be able to do
from within (say) an Idle session. This is actually no different in principle from running pip in a separate command window and not restarting your Idle session (and indeed it's exactly the same if you use I think I'll write up a new section for the pip documentation on the basis of |
I was just tinkering with this kind of thing, and the API that I reached for was:
If there's a dedicated entrypoint like that, it could even choose whether to use subprocess or to do the work itself. Using Is there somewhere a summary of why it's important to call pip via subprocess rather than directly? |
@Rosuav The short version is that Python, setuptools, and pip all have various process-level caches for different pieces of information that are affected by installing, upgrading, and uninstalling projects. Forcing pip to be called in a fresh process each time ensures that pip always sees a consistent view of the world, instead of a possibly partially cached view in which it might get confused about the state of the system. It is likely possible to go through and make sure that these caches get invalidated or otherwise removed, so that pip will always see a consistent state and can live longer than one process per invocation.

The other issue is that pip sort of purposely has no real public API that can be easily called from Python code. That is again because of the various caches that make it easy for things to go wrong. For example:

    import foo.bar  # Assume foo 1.0 is installed, and Python will load and cache "foo" and "foo.bar" from 1.0 into sys.modules
    import pip
    pip.install("foo", upgrade=True)  # Pretend foo is now upgraded on disk to foo 2.0
    import foo.other  # Python will return "foo" 1.0 from the sys.modules cache, and then will load and import "foo.other" from foo 2.0 and cache that in sys.modules

You now have a process that is in what I think most people would call a broken state. You have different modules from different parts of the system in the same process coming from different versions of foo. This situation is of course completely possible today, you just replace the |
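The mixed-version state described above can be reproduced without pip at all. The following is only a minimal sketch: `demo_pkg` and `write_package` are hypothetical names invented for the illustration, and rewriting files in a temporary directory stands in for "foo is upgraded on disk".

```python
import importlib
import sys
import tempfile
from pathlib import Path


def write_package(root: Path, version: str) -> None:
    """Write a throwaway package 'demo_pkg' whose modules all report `version`."""
    pkg = root / "demo_pkg"
    pkg.mkdir(exist_ok=True)
    for module in ("__init__", "bar", "other"):
        (pkg / f"{module}.py").write_text(f"VERSION = {version!r}\n")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sys.path.insert(0, str(root))
    try:
        # "demo_pkg 1.0 is installed": importing demo_pkg.bar caches both
        # demo_pkg and demo_pkg.bar in sys.modules.
        write_package(root, "1.0")
        bar = importlib.import_module("demo_pkg.bar")

        # Stand-in for an upgrade: the files on disk now belong to 2.0.
        write_package(root, "2.0")
        importlib.invalidate_caches()

        # demo_pkg comes from the sys.modules cache (1.0), while
        # demo_pkg.other is loaded fresh from disk (2.0).
        other = importlib.import_module("demo_pkg.other")

        versions = {
            "demo_pkg": sys.modules["demo_pkg"].VERSION,
            "demo_pkg.bar": bar.VERSION,
            "demo_pkg.other": other.VERSION,
        }
    finally:
        # Leave the interpreter as we found it.
        sys.path.remove(str(root))
        for name in [n for n in sys.modules if n == "demo_pkg" or n.startswith("demo_pkg.")]:
            del sys.modules[name]

print(versions)
```

In this sketch the package and its first submodule keep reporting 1.0 while the later import reports 2.0, which is the inconsistent state the comment warns about.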
@Rosuav To simplify even further, you should use subprocess because there's no documented or supported in-process pip API. (For why we don't provide a supported API, see @dstufft's answer). And note that even calling pip via subprocess has risks that may bite you unless you're sure you know what you're doing (again as @dstufft explained) |
To be more explicit, I meant to say (emphases on addition):
|
The problem you describe would be the same regardless of how pip is invoked, though, wouldn't it? Whether you call pip.main as it currently is, or fire up a completely independent process in a separate terminal, upgrading and then importing can create exactly this problem. But if pip has its own internal caching, then sure, let it fire off a subprocess. Even on Windows, the cost of starting a process won't be as much as the cost of downloading new software. It'd still be nice to be able to say |
@dstufft Fair enough re changing installed packages. TBH the only use-case I'm really looking at here is:
That would be the least risky operation. If anything beyond that is declared to be unsupported, fine; and if it's implemented by spawning a subprocess, no big deal. It would still be extremely convenient. |
Yes, but by making an official top level API we signal to people that this is a supported thing they can do programmatically from within their own code. I'm of the opinion (and other pip developers may feel differently) that we should not go out of our way to provide APIs that are hard to use in a way that it is actually "safe" (e.g. not broken) to use. Especially when safe usage requires knowing if/how every single module you're importing uses that same API. An easy thing to do would be for someone (doesn't even have to be a pip developer!) to provide a As always, I'm just a single pip developer and the other developers may feel differently and if they wanted to do this anyways, I wouldn't block them. |
You could of course write a helper function for that yourself.

    # Warning, untested code ahead!
    import importlib
    import subprocess
    import sys

    def conditional_install(name, project_name=None):
        try:
            mod = importlib.import_module(name)
        except ImportError:
            # Install with the interpreter that is running this code, then retry
            subprocess.call([sys.executable, '-m', 'pip', 'install', project_name or name])
            mod = importlib.import_module(name)
        return mod

    # and call it like this
    foo = conditional_install('foo')

Whoops, just saw @dstufft's response. I basically agree with him: we don't want to deal with people getting into trouble using such an API, so supporting it within pip isn't something I'm keen on (in spite of me having raised this issue - I've since been persuaded it isn't a good idea). But a 3rd party module for this would be OK with me, it'd help us get a feel for what the support costs are like ;-) |
More and more I'm thinking that I need to provide all students with a little utilities module that loads stuff up on startup :) It's all very well to slap a personal function into a personal installation, but when I'm trying to walk someone through something, s/he won't have my toolbox handy. (Anyway, if I want to pip-install something, I'll just hit Ctrl-Alt-T to open up a new terminal, and run pip from there. Job done.) |
Can't the students do something like...
|
Teaching them project packaging best practices (ie using a requirements file) seems like a valuable lesson in its own right :-) |
Indeed - teaching them to install packages while working in an interactive interpreter seems to be leading them towards potential issues. "Hey, I went to do the import foo thing, remembered I hadn't done the install thing so I did that and now my import foo still won't work"... |
This is for early tinkering. I'm still debating with some of the other course writers about some points of best-practice, but for the really early tinkering (in interactive Python), there's no requirements.txt. But if we have to say |
@pfmoore Why would that still not work? If you attempt to import and it fails, wouldn't pip-installing the package make that then succeed? |
Well, I over-simplified - I was thinking of something like "import foo.bar" / "import foo.baz" as per @dstufft's example above. You may be OK for your specific use cases, but I'm concerned you've taught users that it's OK to install stuff "behind the back of" a running interpreter, and this will later come back to bite them. But once again, this is more a case of "not supported" than "won't work". You asked above for advice, and basically the advice is "don't do this". But you know your requirements better than we do - if you feel you can make things work in a way that's beneficial to your students, that's fine. |
Fair enough. It's not that I'm teaching people that it's a good idea to do this, more that I'm just trying to help people get started without too many hoops to jump through. Once they get a bit of the basics down, I can start explaining how to properly structure a project, how to get their SSH keys set up so they don't need to use passwords, how to refactor code to improve readability, how to migrate a database... and how to install/upgrade packages without causing problems. The real solution is probably just to provide a ready-to-use system that has a bunch of packages preinstalled for them. Hence the |
I think the simple case of:

    import pip
    try:
        import requests
    except ImportError:
        pip.install("requests")
        import requests

will correctly work, but only because CPython doesn't cache failed imports in

    import thing_that_can_optionally_use_requests
    try:
        thing_that_can_optionally_use_requests.method_that_requires_requests()
    except thing_that_can_optionally_use_requests.NeedsRequestsError:
        import pip
        pip.install("requests")
        thing_that_can_optionally_use_requests.method_that_requires_requests()

doesn't work when the first snippet does if the

    try:
        import requests
        HAS_REQUESTS = True
    except ImportError:
        HAS_REQUESTS = False

    def method_that_requires_requests():
        if not HAS_REQUESTS:
            raise NeedsRequestsError

It might be helpful to you to know that pip can (unless I am remembering incorrectly) access a requirements file that is located on a remote server. I don't know if the only setup you need is installing packages with pip or if there's more beyond that, but you could do something like:

    $ pip install -r https://awesomeclass.example.com/course-intro-requirements.txt

I think pip will first download that file, then parse it and treat it as if it were a local requirements file. Of course if you have more steps you want to do than just install Python packages then a |
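The difference between the snippets above comes down to a flag that is computed once, when the module is first imported. Here is a rough sketch of that effect, with assumptions stated up front: `demo_consumer` and `demo_optional_dep` are hypothetical modules, and writing a file into a temporary directory stands in for installing the dependency with pip.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A module that records, at import time, whether its optional dependency exists.
CONSUMER_SOURCE = """\
try:
    import demo_optional_dep
    HAS_DEP = True
except ImportError:
    HAS_DEP = False
"""

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sys.path.insert(0, str(root))
    try:
        (root / "demo_consumer.py").write_text(CONSUMER_SOURCE)
        consumer = importlib.import_module("demo_consumer")
        flag_before_install = consumer.HAS_DEP

        # Stand-in for "pip install": the dependency appears on disk.
        (root / "demo_optional_dep.py").write_text("VALUE = 42\n")
        importlib.invalidate_caches()

        # A direct import now succeeds, because the failed import was not cached...
        importlib.import_module("demo_optional_dep")
        # ...but the flag computed during the first import is unchanged.
        flag_after_install = consumer.HAS_DEP

        # Re-executing the module body is one way to refresh the flag.
        importlib.reload(consumer)
        flag_after_reload = consumer.HAS_DEP
    finally:
        # Leave the interpreter as we found it.
        sys.path.remove(str(root))
        for name in ("demo_consumer", "demo_optional_dep"):
            sys.modules.pop(name, None)

print(flag_before_install, flag_after_install, flag_after_reload)
```

Reloading works for this toy module, but in general it is not a safe fix: other modules may already hold references to objects from the first import, which is one more reason the thread recommends restarting the interpreter.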
Ooh, I didn't know that! I did have a few other things in mind (setting up PostgreSQL with autostarting, and others could be added), but if a single copy/paste of a single command can install a bunch of stuff, that might be sufficient! Thanks for the tip. |
Hmm. If the recommended interface is to be the subprocess module, I hope that |
Agreed, check_call is over-simplified. A more robust wrapper is probably useful, but I'm not 100% convinced there's an "obvious" API for it. For example, with IMO, the obvious answer is to have a 3rd-party project that provides an API to run pip in a subprocess. Do the design in that project (which can be documented as experimental, unlike pip which has to consider backward compatibility) and once the project settles on a stable, widely useful API, then propose that API for inclusion into pip. |
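As one possible starting point for such a third-party wrapper, here is a minimal sketch. It is not an API that pip provides: `PipResult`, `pip_command`, and `run_pip` are hypothetical names, and the only thing assumed about pip itself is that it can be run as `python -m pip`.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class PipResult:
    """Outcome of one pip invocation (hypothetical helper type)."""
    returncode: int
    stdout: str
    stderr: str

    @property
    def ok(self) -> bool:
        return self.returncode == 0


def pip_command(*args: str) -> List[str]:
    """Build the argument list that runs pip under the current interpreter."""
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args: str) -> PipResult:
    """Run pip in a fresh process and capture its output instead of raising."""
    completed = subprocess.run(pip_command(*args), capture_output=True, text=True)
    return PipResult(completed.returncode, completed.stdout, completed.stderr)


# Example use (not executed here, since it needs pip and possibly a network):
#     result = run_pip("install", "--upgrade", "foo")
#     if not result.ok:
#         print(result.stderr)
```

Returning a result object rather than raising, as `check_call` does, is just one design choice; settling questions like that is exactly what an experimental outside project could do before anything is proposed for pip.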
A question that comes up from time to time is "why can't buildout use pip instead of setuptools for installing packages". The answer has always been "because pip doesn't have an api and we won't ever call it on the command line because that would be very weird". You can debate whether setuptools has a proper api, but that's what buildout has been using till now. Question: so, from pip's viewpoint, it is perfectly OK to (I'm leaving aside the question whether |
Sure. IIRC it might still have an issue that it always needs a valid stdout (and stderr?) to run, but apart from that, that's just another way of scripting the running of pip instead of typing in an invocation by hand. |
I am experimenting with bringing pip support to buildout, probably not in core, but via a buildout recipe. I understand from the above why everyone is hesitant on creating and/or blessing an api. So a But in this comment I simply want to present what currently works for me:
See my buildout pip branch though that file contains two conflicting ideas. This is nicer than So for me, if an api would consist of those three imports ( If someone now shouts "yes, let's bless these as api" then that would be great. But I won't hold my breath. And I understand the reasons. |
I suspect |
Okay, thank you for the warning. |
I'm going to close this issue. There's not really anything actionable on it besides making a decision one way or the other about a public API, and our de facto decision has been to not add one. I don't think holding the issue open any longer is of value, even if we revisit that in the future. |
The pip API is unsupported and subject to change; see pypa/pip#3121. The recommended way to programmatically install and import a module is to use subprocess and importlib.
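A sketch of what "subprocess and importlib" can look like in practice follows. `ensure_module` is a hypothetical helper, not part of pip or the standard library, and it only covers the low-risk case discussed earlier in the thread: the module has never been imported in this process.

```python
import importlib
import subprocess
import sys


def ensure_module(module_name, project_name=None):
    """Import module_name, installing it with pip in a subprocess if missing.

    project_name is the name to pass to pip when it differs from the
    importable module name.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        pass
    # Run pip under the same interpreter; check_call raises if pip fails.
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", project_name or module_name]
    )
    # Files were added behind the import system's back, so drop its finder caches.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# A module that is already importable is returned without invoking pip.
json_module = ensure_module("json")
```

This does nothing for upgrades or uninstalls of packages that are already imported; for those, the advice in this thread still applies and the interpreter should be restarted.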
If I understand correctly, pip-shims ("a set of compatibilty access shims to the pip internal API") now exists to work around the current lack of a high-level API for programmatic access to pip internals. |
That is appropriate, as long as folks using it don't expect it to "just work". Further, as noted in the README of that package:
|
This has come up a few times recently now that pip is available in a standard Python install.
Maybe we should formally support a programmatic API for pip that allows the high-level command line operations pip supports to be run from within a Python interpreter? That may simply mean blessing (and documenting) pip.main() as a supported API.
There may be some odd corner cases to take care with, if we're considering people running pip.main() from within a persistent interpreter (e.g. IPython or Idle). I'm thinking of cases where sys.modules caches something you're upgrading, for example. But that may just need documentation saying that those things need an interpreter restart, at least as a starting point.