Remove "twine register" reference in distributing.rst #271
Is anybody maintaining this thing? This is not a complicated fix.
At least in my case, it's the need to investigate the discrepancy between the proposed fix here and the "if necessary" caveat in the twine README that kept me from merging the change: https://github.com/pypa/twine/#usage Looking at pypa/twine#200 it seems the fact it's outright failing for you (rather than being harmlessly redundant) is a bug in Warehouse rather than an error in the documentation, as anyone using the old implementation at |
@ncoghlan I don't really understand the difference between warehouse vs pypi.python.org. How would a user select between the two? Is it not the case that most newcomers will encounter the same problem I did?
The difference relates to the configuration settings mentioned here: https://packaging.python.org/distributing/#create-an-account The However, not everyone is going to have that new setting - many will have a config file that still uses
@ncoghlan I'm happy to make the requested change, but how does one check whether they need to do it?
@jni Basically you check by making sure you aren't using the new endpoint in your |
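A minimal sketch of how such a check could be automated. It assumes the configuration file in question is an INI-style `~/.pypirc` with a `repository` key per index section, and that the legacy service is the one hosted at pypi.python.org, as referenced elsewhere in this thread. The function name `sections_using_legacy`, the sample usernames, and the test-index URL are hypothetical and for illustration only.

```python
import configparser
from urllib.parse import urlparse

# Assumption: host of the legacy PyPI service discussed in this thread.
LEGACY_HOST = "pypi.python.org"


def sections_using_legacy(pypirc_text):
    """Return the names of sections whose explicit `repository` URL
    points at LEGACY_HOST. Sections without a `repository` key are
    skipped, because the default depends on the tool and its version."""
    parser = configparser.RawConfigParser()
    parser.read_string(pypirc_text)
    hits = []
    for name in parser.sections():
        if parser.has_option(name, "repository"):
            url = parser.get(name, "repository")
            if urlparse(url).hostname == LEGACY_HOST:
                hits.append(name)
    return hits


# Illustrative config file contents (not taken from any real account).
example = """
[distutils]
index-servers =
    pypi
    testpypi

[pypi]
repository = https://pypi.python.org/pypi
username = example-user

[testpypi]
repository = https://test.pypi.org/legacy/
username = example-user
"""

print(sections_using_legacy(example))
```

To run it against a real file, the text could be read with `open(os.path.expanduser("~/.pypirc")).read()` and passed to the same function.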
@brettcannon I honestly don't know what to do with this PR (and the many others referencing it). I'm confused about the multiple endpoints, whether they will continue to exist, and which endpoint supports new or old APIs. My two cents is that this documentation is intended for newcomers to Python packaging, who are unlikely to have a I've allowed edits to my branch from maintainers, so do with it what you will. |
@jni If people don't have a |
@dstufft It would really be nice to just have the API regression fixed in Warehouse, so this whole problem goes away :)
@ncoghlan fix or no fix, if you look at the discussion in pypi/warehouse#1627, the preferred approach is to upload directly.
At least that's my reading of it.
@jni Thanks for the pointer - I've chimed in over there as well :)
Just for my understanding - where are we at on this change? Does it still need to be made?
@jonparrott The current status is that the instructions are still self-contradictory as @jni pointed out:
We just didn't originally notice the contradiction, since the folks encountering the upload problems with the legacy service necessarily already had their projects registered. One option that would allow this to be resolved independently of pypi/warehouse#1627 would be to restructure this section to cover two different paths:
This could be a good improvement anyway, since the first section can be completely opinionated (Use the Warehouse API, use |
I'd prefer to take a strong stance instead of being wishy-washy. I personally think we should go for implicit registration and just add a note that it's also possible to do it explicitly and perhaps link off somewhere else if someone wants to go that route. WDYT? |
The only problem with that is legacy PyPI needs an explicit registration once (but only once). I'm not sure that matters though because Python, twine, setuptools all default to using Warehouse now, so only people on older Pythons/twine/setuptools will upload to legacy PyPI unless they have a In general though I'm +1 on focusing on implicit registration. |
I have a I'm not aware that the More generally, if I were reading this section of the guide, things I'd want to see (as in, I'd be expecting to find) are:
It's quite possible that (3) and (4) are not in scope for this guide, but I'd like at a minimum a "see more" pointer to the full documentation. One other thought - is testpypi going to still be available after Warehouse goes live? Is there a warehouse-based testpypi at the moment (I'd have guessed it would be at https://testpypi.org/pypi, but that doesn't exist)? Would it actually be better to suggest to people that they use a local |
I think that (3) can be solved without the plaintext using keyring support that exists in twine now, although I think it's not very user friendly yet until pypa/twine#216 is solved. (4) There is currently a test warehouse, it is at test.pypi.org/pypi, although long term I want to shut it down and separate the idea of pushing a release to real PyPI and publishing it to be generally available apart which is useful both for testing releases (you can just cancel the temporary upload, or have it auto delete after a week or so) but also for people who want to build up a number of release artifacts across different systems and test the uploaded bits before publishing. That is tracked in pypi/warehouse#726. |
@pfmoore The |
@jwodder My apologies - thank you for pointing this out. That must either have changed since I looked (which was a long time ago, admittedly) or I looked in the wrong place.
As @pfmoore notes, the idea of only recommending an implicit registration based workflow seems strange to me, as it's inherently prone to race conditions - there may be a period of days or weeks where you've committed to a particular name, and are actively developing the code using that name (whether in private or in the open), but don't have anything worth publishing to PyPI yet. Implicit registration mainly seems to be useful in cases where people already have a project that has been around for a while, and decide "Oh, I should probably publish this to PyPI, let me see if the name is still available". And even then, it seems weird to only be able to register the name after writing your setup.py, building an sdist, and having it ready to upload, rather than going:
It feels akin to only being able to register a domain name at the time you first publish the associated site, rather than having "register the domain name" and "publish a site update" being clearly distinct activities. Perhaps that's the underlying problem here? "Register your project" and "Upload your first release" should really be covered as distinct steps, but that's currently obscured by the fact that even with the legacy PyPI API you still need at least a minimal sdist for the registration step due to the way the client tooling works (it just doesn't need to be a releasable one). |
Honestly, I'm not too fond of the idea of "claiming" a name anyways. AFAIK the other popular languages don't really allow it (RubyGems doesn't allow it implicitly by only allowing you to upload things, npm explicitly has a policy against it saying the name belongs to the first person who publishes an actual project). As a user of pip and PyPI, it is frustrating to me when I find a project whose name appears to do what I want, but whose page is a placeholder waiting for some code. More often than not this has been a placeholder for some period of time because the person came up with a great name, claimed it, then never ended up actually publishing anything under it. There is obviously no technical means we can employ that prevents people from squatting names: if we require a tarball, then people will just upload a tarball; if we require it to be updated frequently, then people will just regularly update it. Since there is no "cost" associated with taking ownership of a name on PyPI, people are incentivized to do so greedily in case they might ever use such a name, rather than when they actually need it. However, just because we can't prevent it doesn't mean we really need to continue to keep around workflows that primarily exist to benefit it. Longer term I would love to provide a policy similar to what npm has here, which is "produce working code for your name, or get out of the way for someone who does". On the surface the comparison with DNS names seems like a fair one, but it is different in a few key places. For one, DNS names cost money, so there is an incentive to keep only names you are still planning on using. In addition to that, DNS names need to be renewed, so if you bought a name that you were planning on using, but then decided not to do so, typically you'll stop renewing that name, it will expire and be released back into the pool of available names.
Finally, the way people discover a domain name and the way people decide if a domain name is taken are distinct mechanisms, whereas for PyPI it is the same mechanism, so simply hiding claimed but unused packages from search (or deprioritizing them) doesn't work, because either you make it appear like the name is available when it's not (by hiding it from search results, making a better UX for people looking for a thing to use) or you make it obvious something by that name exists (by showing it in the search results, making a better UX for people looking to name their project).
If squatting is your issue, we could presumably allow pre-registering, then expire the registration if no upload were made within (say) 3 months. That allows people to decide on a name, then have sufficient time to develop something before being forced to upload. Honestly, I don't see how "doesn't do anything, just a placeholder" uploads are better than registered projects with no files available. And at least the latter are easier to locate (and delete, if we decide to). The baseline for me is that PyPI has this feature, so Warehouse should too. Whether we do it via twine or via a web form is a matter of UI, but pre-registration of names is an existing feature of PyPI, so if the proposal is to remove it, then that needs to be agreed. Maybe the question should be raised on distutils-sig? If the majority there has the view that name squatting is a sufficiently significant issue to warrant enforcing a (IMO) more clumsy project creation workflow, then I'm OK going with the consensus. But I can't say I like it. |
@pfmoore I don't think the workflow is more clumsy? Whenever you're ready to publish you just upload. It removes a required step, thus streamlining the workflow. If you're not ready to publish, then you probably shouldn't be grabbing the name to begin with. Like I said though, trying to layer on technical solutions to this tends to just be an arms race: if we require some activity within 3 months, people will just make sure they do that when squatting. I'm not super interested in trying to layer in a bunch of technical solutions here because it just makes things a bit harder for everyone else. What I'm somewhat opposed (but not entirely opposed) to is throwing in API endpoints whose main purpose is squatting names.
What I meant was that I have to choose a name, then build my code with no guarantee that by the time I'm ready to publish, I won't have to change that name. Or alternatively, I create a dummy project with a Maybe I'm being unnecessarily paranoid that someone might grab my cool name, that's certainly possible. But OTOH, not allowing pre-registration is just as much a technical solution to the squatting problem here, as squatters willing to re-register would be just as willing to upload a dummy project. My main argument is that we should have a better justification for removing an existing feature. Anyhow, I've made my point, I'll leave it to others to make the decision. |
Yea, you want to squat the name with the promise that at some point you're going to upload something to that name :) Maybe you're actually going to upload something relatively soon and thus the impact is small, maybe you're going to get bored of the project and never upload something, in which case that name is now being held and others with their own equally cool ideas can't use that name. The difference with not allowing pre-registration vs trying to require an upload in X amount of time or something, is that pre-registration doesn't have much in the way of actual use cases other than squatting, but more importantly I don't think not implementing it is going to solve anything, I just don't think it's worth implementing it and maintaining it since its primary purpose is something I'm not really wanting to incentivize anyways. Since it's a brand new code base, implementing an existing feature takes more work than "removing" (or really, not implementing) that existing feature, so up front I looked at features we had and I just didn't implement ones that I wasn't super interested in supporting any longer. Beyond just not wanting to take the time to implement a feature I wasn't really thrilled about continuing to support, there is some evidence that the register API leads to confusion sometimes. I've seen more than one case where projects would register a release with PyPI, but forget to upload it. When they went to PyPI they saw the release there and didn't notice it had no files (since there's no "HEY THERES NO FILES" warning, just the absence of files which is hard to notice) and then got confused when Finally there is just the cognitive burden: having two things to understand the difference between is inherently harder to understand than one thing to understand (and from a UX perspective, having a user presented with the option to upload and register is a lot worse than the option to just upload).
We can look at this issue itself and how different folks are trying to propose different ways to cope with the additional complexity of mentioning the explicit registration at all without making it much more complicated for a new user to understand (and ultimately all of them fall short of the simplicity of "upload when you're ready, that's it").
Note that I'm fine with having "register-on-first-release" being the default approach recommended for small individual projects. It just doesn't always work so well in the corporate open source context, where naming things often gets a lot more complicated, there may be trademark lawyers involved, and the minimum requirements for getting to an initial release may be higher. The main pre-registration use cases that I'm talking about are the ones like https://pypi.python.org/pypi/leappto where we could upload the current sdist as In our kind of situation, "Registered, but no releases yet" is a more accurate reflection of the project's current state than "Registered, with only a dummy release" (hence why the metadata also includes the "Pre-alpha" classifier). Rather than DNS, a better parallel for PyPI is probably GitHub, where creating a repo is a distinct step from pushing useful content to it. I think the GitHub name squatting policy gets this balance right, by making it clear that names cannot be held indefinitely for future use, even though the platform allows you to register arbitrary names if you want to: https://help.github.com/articles/name-squatting-policy/ Making PyPI's explicit policy be "Names registered without making any releases are deemed provisional, and may be automatically relinquished after a period of time" would be relatively straightforward to eventually automate server-side: without the need to judge whether or not a single solitary release on a project is a "real release", it becomes feasible to add a check that means that if a project doesn't make a release within a certain number of days, the provisional registration will lapse, and the name will go back into the generally available pool. 
Given such an automated garbage collector, truly malicious actors would take the additional step of either uploading a dummy release or setting up a counter-bot to automate re-registration, but it would be sufficient to handle the benign cases of folks that forgot to deregister a name they reconsidered and decided not to use, as well as those that simply didn't realise it wasn't OK to reserve names indefinitely for possible future use. |
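The lapse check described above could be sketched roughly as follows. This is only an illustration of the proposed policy, not anything Warehouse or PyPI implements: the `Project` record, the `lapsed_registrations` function, and the 90-day default are all hypothetical, since the thread only says "a certain number of days".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Project:
    """Hypothetical record of a registered name (not PyPI's data model)."""
    name: str
    registered_at: datetime
    first_release_at: Optional[datetime] = None  # None means no release yet


def lapsed_registrations(projects: List[Project], now: datetime,
                         grace_days: int = 90) -> List[str]:
    """Return names whose provisional registration has lapsed: no release
    has been made and more than `grace_days` have passed since registration."""
    grace = timedelta(days=grace_days)
    return [
        p.name
        for p in projects
        if p.first_release_at is None and now - p.registered_at > grace
    ]


now = datetime(2017, 6, 1)
projects = [
    # Registered long ago, never released: lapses.
    Project("forgotten-name", datetime(2017, 1, 1)),
    # Registered recently, still inside the grace period: kept.
    Project("new-idea", datetime(2017, 5, 1)),
    # Registered long ago but has a release: kept.
    Project("real-project", datetime(2016, 1, 1), datetime(2016, 2, 1)),
]
print(lapsed_registrations(projects, now))
```

Because the check only asks whether any release exists, it does not need to judge whether a single release is a "real" one, which is the property the comment above relies on.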
@ncoghlan I think doing that sanely would require changes to the PyPI data model. You currently cannot get pages like https://pypi.python.org/pypi/leappto without giving a version number, so you have to pick a version number of some sorts and you're essentially (as far as PyPI is concerned) making a release. Thus if that is a use case we genuinely want to support (and I'm not sure that it is, but I'm not opposed to it), we probably want to bake that in as part of the data model itself, rather than hijacking the concept of making a release. Possibly with a name like |
@dstufft I think the current registration API is deeply flawed (specifically due to the ability to make the last registered metadata differ from the last actual release), and am a fan of it ultimately going away. The specific aspects of the current approach to reaching that goal that I don't like are:
Of those two, it's really the first one that's at issue in this particular thread - it's a clear barrier to the Warehouse migration, because it breaks things, starting with the User Guide. Making it a silently successful operation, rather than the current noisy failure, would be sufficient to resolve that. For the latter point, pre-registration already has two relatively easy workarounds in either uploading a |
I am massively -1 on an API that silently does nothing. If someone is relying on that behavior for something beyond the typical Probably the most useful thing to do for (1) is to update legacy so it doesn't ever require a |
In this particular case, the only known use case for register-without-upload is name squatting (hence why it's OK to drop the API in the first place), and I'm entirely OK with silently breaking automated name squatting scripts rather than noisily breaking them :) Non-automated name squatting will have a human checking PyPI and going "Hey, why didn't my name get registered?" and people will hopefully eventually find this PR and the Warehouse issue at pypi/warehouse#1627 and figure out what is going on. Adding support for implicit registration to the legacy PyPI service would certainly be an acceptable alternative approach, but it seems like a lot of work for the sake of providing an easier to debug error for a use case that we don't really care all that much about breaking (at least in its current form). |
In pypi/warehouse#1627 (comment) I noted that even if the Warehouse server were to change to claim that everything is fine when accessing the legacy registration API, the |
This conversation is really useful, but I want to bring something actionable back to this particular PR. Based on my testing in a fresh debian install and the latest twine (1.8.1)
Twine automatically uses the new |
@ncoghlan @dstufft @brettcannon @pfmoore this PR has been updated. Please take a look. I'll file bugs for remaining issues that occurred in the conversation above once this is merged.
(FYI: I took the "minimally invasive" approach here. I tried not to "remove" information such as the gpg stuff which I think deserves a separate tutorial. I really want to revisit and streamline this doc, but I want to get all of the outstanding PRs merged/closed and bugs triaged before I start making big changes)
LGTM. And I agree with your approach of getting the basics sorted out first, then working on any refinements in subsequent PRs.
Merged with @pfmoore's approval. If anyone else has concerns, I'm happy to address them in follow-up PRs. :) (Also, if anyone disagrees with my methodologies in terms of PRs and bugs, holler. I'm adjustable, just trying to keep things moving.)
Fixes #263.