[PoC] Add Core Metadata API (as a dataclass) #498
The tests in 3.6 fail as expected (due to the lack of support for dataclasses).
As mentioned in the review for pypa#498, conditions in tests should not be based on the code being tested; otherwise they might end up hiding other problems. The solution is to pass flags in the test parameters themselves.
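A minimal sketch of that idea, where each test case carries its own expectation as an explicit flag. The `parse_dynamic` helper and the case table are hypothetical stand-ins for illustration, not code from this PR; only the `has_dynamic_fields` flag name is taken from the test data in the PR.

```python
# Hypothetical stand-in for the code under test: it collects the values of
# "Dynamic" entries from a list of (field, value) pairs.
def parse_dynamic(fields):
    return [value for name, value in fields if name == "Dynamic"]


# Each case states its own expectation through an explicit flag, so the test
# never asks the code under test what the answer should be.
CASES = {
    "1.1": {
        "has_dynamic_fields": False,
        "fields": [("Name", "pkg")],
    },
    "2.2": {
        "has_dynamic_fields": True,
        "fields": [("Name", "pkg"), ("Dynamic", "version")],
    },
}


def check_case(case):
    dynamic = parse_dynamic(case["fields"])
    # The expectation comes from the parameters, not from the parsed result.
    assert bool(dynamic) is case["has_dynamic_fields"]


for case in CASES.values():
    check_case(case)
```

With a test runner, the same table could be fed through its parametrization mechanism instead of the plain loop used here.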
Thank you very much for the kind review and improvement suggestions @brettcannon. I tried to address many of the points you mentioned, but some parts still seem fuzzy to me, so I left some questions in my comments; I hope you don't mind. The PR definitely has rough edges, so any feedback is more than welcome.
(Not supported in old versions of Python)
Co-authored-by: Brett Cannon <[email protected]>
This reverts commit 215c172.
6f655bd to f1625d3
packaging/metadata.py
    # 2.2
    dynamic: Collection[NormalizedDynamicFields] = ()

    def __post_init__(self) -> None:
I don't know if this method is necessary. If you mess up on what you assign then that's on you; language of consenting adults and all. Otherwise you should be running a type checker to make sure you are keeping your types consistent with what people expect.
Thank you very much for the comment @brettcannon. Here I tried to follow the principle of being loose with the input but strict with the output. But I completely understand your point.
Something that I am also thinking now is: since `packaging` aims to be a collection of "reusable core utilities for various Python Packaging interoperability specifications", it would be nice if it could handle some of the tedious and repetitive processing that would be required to use this class (e.g. converting strings to `Requirement` or `SpecifierSet` objects) and are currently implemented in `__post_init__`.
So what if I remove this method as you suggested, but add another method that would handle this use case? For example a `from_unstructured_dict` class method (or any other name really, please feel free to bikeshed) with proper and explicit type annotations?
(Please note that my intention here is not to force the types to be correct, but instead absorb the repetitive tasks that would have to be implemented by any backend using such metadata API)
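A rough sketch of what such a class method could look like, under stated assumptions. The class `SketchMetadata` and the type `FakeSpecifier` are hypothetical, dependency-free stand-ins (the latter for a parsed type such as `SpecifierSet`); none of this is the PR's actual implementation.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Tuple


@dataclass(frozen=True)
class FakeSpecifier:
    """Hypothetical stand-in for a parsed type such as SpecifierSet."""

    text: str


@dataclass(frozen=True)
class SketchMetadata:
    """Hypothetical, heavily reduced metadata class (not the PR's class)."""

    name: str
    requires_python: FakeSpecifier
    keywords: Tuple[str, ...] = ()

    @classmethod
    def from_unstructured_dict(cls, data: Mapping[str, Any]) -> "SketchMetadata":
        # Do the repetitive string -> object conversions in one place, so the
        # regular constructor can expect already-typed arguments.
        spec = data.get("requires_python", "")
        if isinstance(spec, str):
            spec = FakeSpecifier(spec.strip())
        return cls(
            name=str(data["name"]),
            requires_python=spec,
            keywords=tuple(data.get("keywords", ())),
        )


meta = SketchMetadata.from_unstructured_dict(
    {"name": "example", "requires_python": " >=3.7 ", "keywords": ["a", "b"]}
)
```

The design point is that the loose input handling lives in one explicitly named entry point, while the dataclass itself performs no coercion in `__post_init__`.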
While "accept loose, output strict" is a good idea in most cases, I don't really see in what situations a user of this API would throw an unstructured dictionary at it.
The only case in which I think we'd want to accept a dictionary would be if we added PEP 621 ingestion (I'll keep that discussion in #383, since it's a "scope" question rather than an implementation question).
Every other user should be able to just pass the right arguments to construct this object.
Thank you very much for the comment @pradyunsg! I will bring this discussion to #383 then 😄
I agree with @pradyunsg: the structure of the data that's going to be coming in is very much going to be from known formats. The only one that might be different is someone trying to convert `setup.cfg`/`setup.py`, but that doesn't need to live here (i.e. we don't have to be all things to all people).
I think such a method would be useful for implementing `from_pyproject`, so I thought that making it public would allow people to make more use of the class even before `from_pyproject` gets implemented.
But I have no problem with the proposed approach. I will trust your intuition here.
@abravalheri it's more paranoia. 😅 It's much harder to take something away than to introduce something to an API. As such, we prefer to keep the APIs small and for scenarios we know exist rather than guess at future needs. "Premature optimization" applies to API design as well. 😉
packaging/metadata.py
    first_line = lines[0].lstrip()
    text = textwrap.dedent("\n".join(lines[1:]))
    other_lines = (line.lstrip("|") for line in text.splitlines())
    return "\n".join(chain([first_line], other_lines))
Use `inspect.cleandoc` here.
Nice tip! I didn't know this function. Thank you very much, I will probably add a commit either tonight or during the weekend!
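For reference, a small illustration of what `inspect.cleandoc` does with text whose continuation lines carry code indentation. The sample string is made up for this sketch. Note that `cleandoc` has no counterpart to the `lstrip("|")` step in the snippet under review, so that step would have to stay separate if it is still needed.

```python
import inspect

# Made-up sample: first line flush left, continuation lines indented the way
# they would be inside a function body, plus a whitespace-only last line.
raw = (
    "Summary on the first line.\n"
    "        Continuation lines carry the indentation\n"
    "          of the surrounding code.\n"
    "        "
)

# cleandoc strips leading whitespace from the first line, removes the common
# indentation of the remaining lines, and drops leading/trailing blank lines.
cleaned = inspect.cleandoc(raw)
print(cleaned)
```

Relative indentation inside the continuation lines (the two extra spaces on the third line) is preserved.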
See also #332, which was an earlier half-complete attempt at adding metadata validation.
Thank you very much @brettcannon and @pradyunsg for the latest reviews.
In my latest commit I have removed the …
Since …
For the time being I am leaving …
Please let me know if you want to collapse …
@pradyunsg I had a look at #332, and I think most of the things implemented there are also addressed here. Sorry for failing to see this open PR. There are a few elements, though, that I am currently omitting on purpose: …
No need to apologise. I mentioned that PR just to make sure that you're aware that it exists, and so that you could check that there was something there that you could reuse. I apologise for not being clear enough about that. :)
That's perfectly fine, we can defer that for a follow up. I'll take a look at this closer to the weekend -- if I don't, @-mention me sometime next week! :)
It actually might not. When I was worrying about …
Oh right! I should have read the spec more carefully. In this PR the … I hope to fix this validation later this week.
I didn't even know that detail and I helped introduce …
As discussed in pypa#383, instead of having 2 separate sets of methods (one for `PKG-INFO` files in sdists and one for `METADATA` files in wheels), we can have a single pair of to/from functions with an `allow_unfilled_dynamic` keyword argument.
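A hedged sketch of how a single serializer with such a keyword argument might behave. Everything except the `allow_unfilled_dynamic` name is hypothetical: the function name, the error type, the signature, and in particular the assumption that `allow_unfilled_dynamic=False` means "reject metadata that still lists dynamic fields". The PR's actual semantics may differ.

```python
from typing import Dict, Sequence


class UnfilledDynamicError(ValueError):
    """Hypothetical error type used only in this sketch."""


def to_file_contents(
    fields: Dict[str, str],
    dynamic: Sequence[str] = (),
    *,
    allow_unfilled_dynamic: bool = True,
) -> str:
    # Assumption: when unfilled dynamic fields are not allowed, any remaining
    # "Dynamic" entry is treated as an error instead of being written out.
    if dynamic and not allow_unfilled_dynamic:
        raise UnfilledDynamicError(
            f"unresolved dynamic fields: {sorted(dynamic)}"
        )
    lines = [f"{key}: {value}" for key, value in fields.items()]
    lines += [f"Dynamic: {name}" for name in dynamic]
    return "\n".join(lines) + "\n"


# One function serves both cases; only the keyword argument changes.
lenient = to_file_contents({"Name": "example"}, ["version"])
strict = to_file_contents({"Name": "example"}, allow_unfilled_dynamic=False)
```

A matching `from_...` function would take the same keyword argument, which is what replaces the two separate sets of methods.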
I think I managed to fix the …
I also went ahead and made 2 changes in the last 2 commits based on the previous discussions: …
If …
packaging/metadata.py
B = TypeVar("B")

if sys.version_info[:2] >= (3, 8) and TYPE_CHECKING:  # pragma: no cover
    from typing import Literal
Rather than have things type check differently on different Python versions, you can just use:
if TYPE_CHECKING:
    from typing_extensions import Literal
You don't need to actually install typing_extensions for this to work, since a) it's in the TYPE_CHECKING block, b) type checkers treat typing_extensions as part of the stdlib, so they always know about its existence.
Hi @hauntsaninja, thank you for the tip! That sounds very nice... However, when I try to run `mypy` with Python 3.7, I get a series of errors, including:
packaging/metadata.py:41: error: Module "typing" has no attribute "Literal"; maybe "_Literal"?
Are you sure about type checkers treating `typing_extensions` as part of the stdlib?
Yes, I'm quite sure (I help maintain typeshed and mypy). That error looks like you might not have changed the import to use `typing_extensions` instead of `typing`?
The following works for me (where `emptyenv` does not have `typing_extensions` installed in it):
python3.7 -m mypy -c 'from typing_extensions import Literal' --python-version 3.7 --python-executable emptyenv/bin/python
To be explicit, the options are:
1)
# does not require runtime dependency on typing_extensions
# works because all type checkers special case typing_extensions
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from typing_extensions import Literal
    NormalizedDynamicFields = Literal[...]
else:
    NormalizedDynamicFields = str
2)
# does require runtime dependency on typing_extensions
from typing_extensions import Literal

NormalizedDynamicFields = Literal[...]
Anyway, the only effect of the PR as it stands would be slightly looser type checking for users of packaging who use Python 3.7, so this is mostly a nit :-)
Oh! I completely misinterpreted your comment and then, when reading your example, my confirmation bias took over and I could have sworn you were still using `from typing import Literal`... Sorry for that 😅
That sounds like a very good approach to me, I will go ahead and implement it.
If the repository owners prefer to keep the previous approach, please let me know and I will "revert" the commit.
As indicated in the [code review](pypa#498 (comment)), type checkers consider `typing_extensions` as part of the stdlib, so it does not need to be explicitly installed.
    "1.1": {
        "has_dynamic_fields": False,
        "is_final_metadata": True,
        "file_contents": """\
Do we want to embed the file contents, or should we have them as files in the test suite that we load as appropriate?
Now that c533201 has landed, I'm closing this PR. @abravalheri thanks for taking a chance on writing something!
Thank you very much @brettcannon, looking forward to the …
So am I. 😉 That's my next planned step (unless someone beats me to it).
No pressure. 😅
This work is a proof-of-concept for the implementation of the Core Metadata API based on the discussion in #383. Please feel free to ignore/close or select just the interesting parts of this PR, my main intention is just to have something concrete that can be used to advance the discussion.
I started with the API proposed in #383 (comment), but changed it to use a frozen dataclass (which simplifies the design: we don't have to worry about questions such as: "should `dynamic` be automatically updated when the target field is edited?"). Please note that users can still use `dataclasses.replace` to modify the data structure freely.
Another divergence from the proposed API is the usage of `Collection` instead of `Iterable` (which also provides the handy `__contains__` and `__len__` methods...).
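To make the two design choices concrete, here is a minimal sketch using a made-up, heavily reduced class (`FrozenMeta` and its fields are illustrative assumptions, not the PR's actual class). It shows that a frozen dataclass rejects in-place edits, that `dataclasses.replace` produces a modified copy, and that a `Collection` annotation lets callers rely on `in` and `len()`.

```python
import dataclasses
from typing import Collection


@dataclasses.dataclass(frozen=True)
class FrozenMeta:
    """Illustrative stand-in, not the metadata class from this PR."""

    name: str
    version: str
    dynamic: Collection[str] = ()


meta = FrozenMeta(name="example", version="1.0", dynamic=("license",))

# Frozen: direct assignment raises dataclasses.FrozenInstanceError.
try:
    meta.version = "2.0"
    mutated = True
except dataclasses.FrozenInstanceError:
    mutated = False

# dataclasses.replace returns a new object; the original stays untouched.
updated = dataclasses.replace(meta, version="2.0", dynamic=())

# Collection guarantees __contains__ and __len__, unlike a bare Iterable.
has_license = "license" in meta.dynamic
count = len(meta.dynamic)
```

Because every change goes through `dataclasses.replace`, the caller decides explicitly whether `dynamic` changes along with another field, which sidesteps the "should it update automatically?" question.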