-
Notifications
You must be signed in to change notification settings - Fork 308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds a type check to ensure the description type provided is valid #473
Conversation
twine/commands/check.py
Outdated
@@ -88,6 +89,12 @@ def check(dists, output_stream=sys.stdout): | |||
) | |||
description_content_type = 'text/x-rst' | |||
|
|||
if description_content_type not in _supported_readme_types: | |||
output_stream.write( | |||
'warning; `long_description_content_type` invalid.\n' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on the team's thoughts I could make a change to make this an outright failure.
The problem is if this type is wrong pypi will outright reject the package with a 400 error.
I know that technically this tool is checking if the readme will render so perhaps the type is a separate concern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IN the back and forth on this PR I ended up just making this an error
Looks like I have some lint errors to fix
|
64c401c
to
76efb49
Compare
Codecov Report
@@ Coverage Diff @@
## master #473 +/- ##
========================================
+ Coverage 82.93% 83% +0.06%
========================================
Files 14 14
Lines 750 753 +3
Branches 108 110 +2
========================================
+ Hits 622 625 +3
Misses 92 92
Partials 36 36
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #473 +/- ##
=========================================
+ Coverage 82.93% 84.5% +1.57%
=========================================
Files 14 14
Lines 750 781 +31
Branches 108 116 +8
=========================================
+ Hits 622 660 +38
+ Misses 92 84 -8
- Partials 36 37 +1
Continue to review full report at Codecov.
|
Ok, all fixed now :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a useful addition to check
. I'm not a maintainer, just a new contributor, so I can't speak to the UX questions. However, I do have some style suggestions.
tests/test_check.py
Outdated
}) | ||
|
||
monkeypatch.setattr(check, "_find_dists", lambda a: ["dist/dist.tar.gz"]) | ||
monkeypatch.setattr( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see how this patching makes sense with the structure of check.check()
, but it feels a bit awkward. What do you think about extracting a function from check()
? Something like:
def check(dists, output_stream=sys.stdout):
# ...
for filename in uploads:
output_stream.write("Checking distribution %s: " % filename)
package = PackageFile.from_filename(filename, comment=None)
failure |= check_package(package, output_stream)
That way, you could unit-test check_package
with a PackageFile
stub directly, and maybe not have to monkeypatch anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me take a look. I was mostly just following the existing structure trying to keep the changes pretty minimal.
But I guess I can give it a wider look and just revert back if the maintainer prefers the more conservative approach
tests/test_check.py
Outdated
|
||
output_stream = check.StringIO() | ||
check.check("dist/*", output_stream=output_stream) | ||
assert output_stream.getvalue() == ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this assert
be a little narrower? I.E., just check for the expected error content, rather than the whole message? I received similar feedback on a recent PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean maybe? I could basically check for the existence of warning;
long_description_content_type invalid
to ensure I hit the line. But it would be behaving pretty differently then the other tests of the file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gotten the impression that there's leeway to deviate from the existing test structures, with this sort of thing being an example. I think your proposed check would be sufficient.
Co-Authored-By: Brian Rutledge <[email protected]>
Co-Authored-By: Brian Rutledge <[email protected]>
Restructured the check logic a bunch. I think it's cleaner now and potentially prepared to take on other functionality if there are thoughts. Though I do worry I have greatly expanded the initial scope of the PR. That being said im game to keep working with it. If people like the bigger change and have further comments ill keep going. Or if they like the idea but would prefer a smaller change I could revert back. Either way. (the original PR was just this initial commit 76efb49) Thanks @bhrutledge for the feedback so far |
twine/commands/check.py
Outdated
output_stream | ||
) | ||
description = _get_description(metadata, output_stream) | ||
failure = _description_will_not_render( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now a render failure is the only thing that triggers a a failure. I personally think the description type being provided but invalid should be a failure as well but I wanted to wait for feedback before I did that. See #473 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to agree, but maybe there are some other considerations that we don't know about.
With that in mind, I think it will be easier to adjust the failure modes if the utility functions above were inlined here, as they were in check()
. It would also be less of a structural change.
Related to that, the _get_*
functions feel like they're doing two things: retrieving a value, and checking its validity. If they don't get inlined, I think it'd be more flexible if they were just validators (possibly with the warning output), similar to _description_will_not_render
.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im not sure. They warn you if something is potentially strange but its basically just a log line. The check_package
function does not really depend on any of that that other than just getting the value. That's why I see them as getters.
If the validation was actually relevant to the check_package function I could see either separating it out or inlining that validation to check_package
actually saying that makes me wanna go ahead and make the type check a proper validation. Since unlike the other two warnings. It actually does matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and inlined the getters. What pushed me over the edge was realizing the order which output stream gets written to does matter. Given this I did not want to be passing around output_stream and having functions return both their values and any warnings they wanted to emit felt like a bridge too far.
tests/test_check.py
Outdated
|
||
output_stream = check.StringIO() | ||
check.check("dist/*", output_stream=output_stream) | ||
assert output_stream.getvalue() == ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gotten the impression that there's leeway to deviate from the existing test structures, with this sort of thing being an example. I think your proposed check would be sufficient.
tests/test_check.py
Outdated
failed = check_package(package, output_stream) | ||
assert not failed # Invalid Description type is currently just a warning | ||
assert output_stream.getvalue() == ( | ||
'warning; `long_description_content_type` invalid.\n' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'warning; `long_description_content_type` invalid.\n' | |
'warning: `long_description_content_type` invalid.\n' |
twine/commands/check.py
Outdated
output_stream | ||
) | ||
description = _get_description(metadata, output_stream) | ||
failure = _description_will_not_render( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to agree, but maybe there are some other considerations that we don't know about.
With that in mind, I think it will be easier to adjust the failure modes if the utility functions above were inlined here, as they were in check()
. It would also be less of a structural change.
Related to that, the _get_*
functions feel like they're doing two things: retrieving a value, and checking its validity. If they don't get inlined, I think it'd be more flexible if they were just validators (possibly with the warning output), similar to _description_will_not_render
.
What do you think?
So years ago, PyPI didn't accept markdown and only in the last 2-3 years did they start. There's nothing to prevent PyPI potentially adding support for other doc formats (asciidoc, orgmode, etc.) except for someone implementing it and there being a sufficient use-case to support it. Provided that, I don't think a strict error here (without some sort of escape hatch) is necessary. I fully trust users to know what they're doing and if a user is stuck to some older version of twine because that's what their operating system provides and they can't install a newer version (or don't want to), should we really prevent them from uploading their package to that future PyPI? |
That's a good point. Would you be interested in downgrading this to a warning? Or should I close this? |
Oh wait, am I remembering correctly that this is automagically ran on |
I dont think upload runs check. I only found this command after googling around trying to find out why my package was not uploading. |
Yeah, here is a test I ran (using my branch. I can re run on master if you would like) I introduced a syntax error into a readme on a project of mine. Verified the error with Then I tried to upload a version (the version already existed so I expected it to be rejected but it was rejected due to the description instead of the expected reason) If check was running as part of upload I would not have expected a request to have been made at all
|
FWIW, I think the "right" way to go about this would be to do what I suggested in #430: add the ability to validate metadata in I don't think |
I think that makes a lot of sense. I am not super informed in the warehouse/packaging/pip development structures but from what I can tell adding full metadata validation seems to be a larger problem that may take some time. Though it looks like its being worked on based on pypa/packaging#147 and pypa/packaging-problems#264 . Reading though all that its probably something best left to the core team given their experience. Given that, I could close this, I think the only potential reason to keep it is that right now If you put in an invalid type the tool will just go use |
if description in {None, 'UNKNOWN\n\n\n'}: | ||
output_stream.write('warning: `long_description` missing.\n') | ||
description = None | ||
failures = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize the future of this PR is in question, but I'm curious about the decision to inline some validation, and put others in helper functions. FWIW, I'd find this easier to follow if they were all in check_package
, and given the uncertainty around what should be a warning or a failure, I think it'd be easier to change in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thinking here is that warning being emitted is not relevant to the final result of check_package
. It does not feel like a validation so much as a log line since a package cannot fail due to this
Howdy!
Thanks for this check tool! While using it I made an error that I would have liked to see this tool catch so I thought I would take a stab at adding it myself. Let me know if you have any thoughts!
Thank you!