-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add get port functionality #25
Conversation
Are you still working on this? If yes, change the title to have [WIP] at the front and when you're ready for review, just remove the [WIP]. |
@birdsarah I believe everything is ready |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to take a slightly different approach to the one you've built here in order to reuse code and keep the behavior of the library, and API, consistent across methods.
See the following code from get_stripped_url:
purl = urlparse(url)
_scheme = purl.scheme
# To handle the case where we have no scheme, but we have a port
# we have the following heuristic. Does scheme have a . in it
# which is stdlib behavior when not recognizing a netloc due to
# lack of //. If TLDExtract, can find a suffix in the _scheme
# then it's probably a domain without an http.
if '.' in _scheme:
# From the docs: "urlparse recognizes a netloc only
# if it is properly introduced by ‘//’". So we
# prepend to get results we expect.
if extractor(_scheme).suffix != '' or is_ip_address(_scheme):
url = '//{url}'.format(url=url)
This could be a private method, something like, _update_url_to_handle_port_and_no_scheme
that returns the url either modified or not.
This new private method can be used by both get_port
and get_stripped_url
.
This will have the nice benefit of making get_stripped_url
a little cleaner.
Get port should then have an easy time returning the port.
I would also like to omit the strict
flag unless you have a strong reason for it. If you do, then that reasoning should also be incorporated into the get_stripped_url function. So I'd rather open an issue to discuss the flag and then implement in both places.
I'm thrilled to see lots of tests, but we don't need to overdo it. My general rule is that I don't test imported libraries, especially not stdlib. So you only need sufficient tests to test the logic that you're implementing. If that doesn't make sense let me know.
You may also be interested in the additional comment notes I wrote here: https://github.com/mozilla/domain_utils/pull/22/files#diff-01e7661674408a06f910cdb3bd537536L200-L210 |
I've proceeded as you mentioned abstracting this behave in a shared function. I've found the case of urls like My intention with those tests is to ensure that it behaves as I expect and not to test urllib. If you don't see the need for those they can be removed. About the |
That sounds great. It's a bit of an odd library. Lots of what seem like redundant test cases but we need to make sure we've tested the edge cases we know of. Just wanted to set expectations. |
@@ -219,7 +231,6 @@ def get_stripped_url(url, scheme=False, drop_non_http=False, use_netloc=True, ex | |||
else: | |||
return url | |||
|
|||
purl = urlparse(url) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch.
Thanks @yabirgb! |
No description provided.