-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redact basic authentication passwords from log messages #5773
Conversation
src/pip/_internal/utils/misc.py
Outdated
@@ -880,15 +880,29 @@ def split_auth_from_netloc(netloc): | |||
return netloc, user_pass | |||
|
|||
|
|||
def redact_password_from_url(url): | |||
def _redact_netloc(netloc): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would pull this function out to module level so it can be tested separately. You can use the same parametrized test cases as test_split_auth_from_netloc()
. This will combinatorially cut down the number of test cases you need to properly test redact_password_from_url()
, since redact_password_from_url()
will be a simple combination of redact_netloc()
and transform_url()
.
src/pip/_internal/utils/misc.py
Outdated
|
||
def transform_url(url, transform_netlock): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this function to be above redact_password_from_url()
and remove_auth_from_url()
(and after redact_netloc()
), since it is used in both of them, and it is nice to have a progression so that functions within a module depend only on functions that precede them (so it can be read from top to bottom).
src/pip/_internal/utils/misc.py
Outdated
|
||
def transform_url(url, transform_netlock): | ||
purl = urllib_parse.urlsplit(url) | ||
netloc = transform_netlock(purl.netloc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spelling: transform_netloc()
tests/unit/test_utils.py
Outdated
('https://user:[email protected]/project/tags/v0.2', | ||
'https://domain.tld/project/tags/v0.2'), | ||
('https://user:[email protected]/svn/project/trunk@8181', | ||
'https://domain.tld/svn/project/trunk@8181'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave these test cases alone in the same order, especially if you're not changing them. Otherwise, it makes it needlessly harder to tell whether the tests have been changed or not.
tests/unit/test_utils.py
Outdated
('git+ssh://[email protected]/something', | ||
'git+ssh://[email protected]/something'), | ||
('git+https://user:[email protected]/something', | ||
'git+https://user:****@pypi.org/something'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like more test cases than you need. If you break out redact_netloc()
like I suggested above and test it separately, you should be able to reduce the number of test cases you need for combinatorial reasons. You can focus on testing redact_netloc()
well (which is a simpler function). And then test_redact_password_from_url()
can be limited to making sure that redact_password_from_url()
is "wired up" correctly from redact_netloc()
and transform_url()
.
Thanks for the prompt review @cjerdonek. I've made the changes you requested. Regarding testing, I just did a |
src/pip/_internal/utils/misc.py
Outdated
# and are not recognized in the url. | ||
def redact_netloc(netloc): | ||
netloc, (user, passw) = split_auth_from_netloc(netloc) | ||
if not user: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should check for None it seems.
Thanks. Looks a lot better. I’m not a huge fan of mocks, especially when not needed. I think it would be just as good or better to do one or two test cases to confirm that it’s being called (e.g. one with and one without password). Also, one reason the mock test isn’t so great is that it’s not checking the return value, which is the most important thing to be checking. |
tests/unit/test_utils.py
Outdated
# Test the password containing a : symbol. | ||
('user:pass:[email protected]', 'user:****@example.com'), | ||
]) | ||
def test_redact_netloc(netloc, expected): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this test to be after the split-auth test (so same order as in the module).
tests/unit/test_index.py
Outdated
@@ -153,4 +153,4 @@ def test_get_formatted_locations_basic_auth(): | |||
finder = PackageFinder([], index_urls, session=[]) | |||
|
|||
result = finder.get_formatted_locations() | |||
assert 'user' not in result and 'pass' not in result | |||
assert 'user' in result and 'pass' not in result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about also checking that “****” is in the result to check / make it more obvious that redaction should be happening?
Hello! I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the |
Thanks for the review @cjerdonek, I've made the changes as requested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more comments.
tests/unit/test_utils.py
Outdated
|
||
|
||
@patch('pip._internal.utils.misc.transform_url') | ||
def test_redact_password_from_url(mocked_transform_url): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like trying to mock here just obscures the test.
tests/unit/test_utils.py
Outdated
|
||
@patch('pip._internal.utils.misc.transform_url') | ||
def test_redact_password_from_url(mocked_transform_url): | ||
redact_password_from_url('[email protected]') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"[email protected]" isn't a full URL. I would test with an actual URL.
tests/unit/test_utils.py
Outdated
@@ -373,8 +374,6 @@ class Test_normalize_path(object): | |||
# it's easiest just to skip this test on Windows altogether. | |||
@pytest.mark.skipif("sys.platform == 'win32'") | |||
def test_resolve_symlinks(self, tmpdir): | |||
print(type(tmpdir)) | |||
print(dir(tmpdir)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't related to this PR, so I would leave it alone.
src/pip/_internal/utils/misc.py
Outdated
@@ -907,6 +911,18 @@ def remove_auth_from_url(url): | |||
return surl | |||
|
|||
|
|||
def redact_password_from_url(url): | |||
"""Replace the password in a given url with ****""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need a period at the end of the sentence.
src/pip/_internal/utils/misc.py
Outdated
@@ -907,6 +911,18 @@ def remove_auth_from_url(url): | |||
return surl | |||
|
|||
|
|||
def redact_password_from_url(url): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this function to be after remove_auth_from_url()
(since it is a "fancier" version).
src/pip/_internal/utils/misc.py
Outdated
# username/pass params are passed to subversion through flags | ||
# and are not recognized in the url. | ||
def redact_netloc(netloc): | ||
netloc, (user, passw) = split_auth_from_netloc(netloc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just write out "password" since it's not that much longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more.
src/pip/_internal/utils/misc.py
Outdated
if user is None: | ||
return netloc | ||
password = '' if password is None else ':****' | ||
return '{user}{passw}@{netloc}'.format(user=user, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"passw" -> "password"
tests/unit/test_utils.py
Outdated
|
||
|
||
def test_redact_password_from_url(): | ||
with patch('pip._internal.utils.misc.transform_url') as mocked_transform_url: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was saying before that I think mocking obscures the test. Why can't you just pass in a few real URL's like the other tests (e.g. one with a user-pass, one with no user-pass, and one with a user but no pass).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah good idea, I dislike mocks as well. Done!
Looking a lot better. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some final(?) minor comments.
news/4746.bugfix
Outdated
@@ -0,0 +1 @@ | |||
Redact password from the URL in various log messages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Redact the password"
src/pip/_internal/utils/misc.py
Outdated
|
||
def transform_url(url, transform_netloc): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would mark this private: _transform_url
.
src/pip/_internal/utils/misc.py
Outdated
# Return a copy of url with 'username:password@' removed. | ||
# username/pass params are passed to subversion through flags | ||
# and are not recognized in the url. | ||
return transform_url(url, lambda netloc: split_auth_from_netloc(netloc)[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would define the lambda on its own line:
transform_netloc = lambda netloc: split_auth_from_netloc(netloc)[0]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The linter dislikes this, and I also try not to do this. I think it's fine as it is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the linter not like? I’m not a fan of cramming a lot into one line. You can also define the function using def syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lambdas are supposed to be anonymous - if you bind them to a name it kind of defeats the point. The linter points this out and tells you to use def
.
I'll do whatever you suggest (I don't care that much), but a lambda seems perfect here. A whole named function for this seems overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just think the code will be easier to understand if you give the function a name (whether that be a lambda or def syntax). It will also have a more parallel / symmetric look with redact_password_from_url
below.
tests/unit/test_utils.py
Outdated
('user:pass:[email protected]', 'user:****@example.com'), | ||
]) | ||
def test_redact_netloc(netloc, expected): | ||
result = redact_netloc(netloc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"result" -> "actual" to match the earlier test.
tests/unit/test_utils.py
Outdated
('https://example.com', 'https://example.com') | ||
]) | ||
def test_redact_password_from_url(auth_url, expected_url): | ||
result = redact_password_from_url(auth_url) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"result" -> "actual" (or "url") to match previous tests.
tests/unit/test_utils.py
Outdated
@@ -681,3 +701,14 @@ def test_split_auth_from_netloc(netloc, expected): | |||
def test_remove_auth_from_url(auth_url, expected_url): | |||
url = remove_auth_from_url(auth_url) | |||
assert url == expected_url | |||
|
|||
|
|||
@pytest.mark.parametrize('auth_url,expected_url', [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add space to match previous test: auth_url, expected_url
Closing / opening to trigger CI to re-run. |
Thanks a lot for all your work and follow-through on this, @orf. |
No problem, thank you for the prompt and detailed reviews @cjerdonek 🎉 |
Does anyone know when this fix gets released? |
It'll be a part of 19.0, scheduled early next year. pip has a predictable release cadence, which @benoit-pierre linked to. |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
This PR #5590 had merge conflicts, so I made a new one with the changes suggested. Fixes #4746