-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Masks secrets in traces. #1797
Masks secrets in traces. #1797
Conversation
✅ Deploy Preview for dlt-hub-docs canceled.
|
9d9b8df
to
374cecc
Compare
374cecc
to
6ae763c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what we need is this test
DESTINATION__POSTGRES__CREDENTIALS="postgresql://loader:loader@localhost:5432/dlt_data" pytest tests/cli/ -k test_deploy_command
to pass. it is checking if secrets were printed out
we'll need a version of this test (use parametrize) that deletes secrets.toml and you placeholder string are present
# this time script will run
venv.run_script("debug_pipeline.py")
with echo.always_choose(False, always_choose_value=True):
with io.StringIO() as buf, contextlib.redirect_stdout(buf):
deploy_command.deploy_command(
"debug_pipeline.py",
deployment_method,
deploy_command.COMMAND_DEPLOY_REPO_LOCATION,
**deployment_args
)
_out = buf.getvalue()
destroy secrets.toml just after venv.run_script("debug_pipeline.py")
is run ie by injecting mock config providers like this fixture does
def toml_providers() -> Iterator[ConfigProvidersContext]:
dlt/pipeline/trace.py
Outdated
@@ -280,8 +284,8 @@ def end_trace_step( | |||
resolved_values = map( | |||
lambda v: SerializableResolvedValueTrace( | |||
v.key, | |||
v.value, | |||
v.default_value, | |||
_mask_secret(v.value) if is_secret_hint(v.hint) else v.value, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not mask it in dump. just set None on secrets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it could be helpful to be able to distinguish between (i) None
, meaning that no value is set, and (ii) "*****" meaning that a secret is present?
dlt/pipeline/trace.py
Outdated
@@ -298,6 +302,13 @@ def end_trace_step( | |||
return trace | |||
|
|||
|
|||
def _mask_secret(trace_item_value: Any) -> Any: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should definitely have one common method to display masked secrets... but my take is that nothing should be revealed at all. for now we cna remove this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I removed it.
I like revealing the first or last or both chars because I often debugged authorization errors by checking if the correct secret is present in the first place. When breakpoints are not practical or possible (e.g. github actions) I often print the first few chars.
Would it be therefore a helpful to reveal the first & last char by default or would it be better to not compromise security and reveal nothing? The developer can then do logging tricks as described above.
Sure, we can change it for the rest_api source too. What do you think, @burnash?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rudolfix I agree with @willi-mueller I think it makes it more practical to display first and last char (when the length of the secret string allows). Could you elaborate why you'd want to mask everything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added deploy test. LGTM!
Description
Related Issues
dlt pipeline -v <pipeline> trace
source password not redacted #1687Note
TODO before merging
trace.asstr(verbosity=1)
which is called by the dlt pipeline -v trace commandMASKED_SECRET="*****"
so that we can use it in bothdlt/sources/rest_api/__init__.py
as well asdlt/pipeline/trace.py