Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUF035 (unsafe-markup-use) can be noisy and would benefit from a whitelist #14523

Closed
ThiefMaster opened this issue Nov 22, 2024 · 7 comments · Fixed by #15076
Closed

RUF035 (unsafe-markup-use) can be noisy and would benefit from a whitelist #14523

ThiefMaster opened this issue Nov 22, 2024 · 7 comments · Fixed by #15076
Labels
preview Related to preview mode features rule Implementing or modifying a lint rule

Comments

@ThiefMaster
Copy link
Contributor

There are a few cases where it does not really make sense to get the warning, since it's pretty clearly intentional to avoid escaping, but adding noqa comments in every place where it's used would be very noisy/verbose:

  • Markup(render_template(...))
  • Markup(_('Have a look at <strong>this</strong> translated string!'))
  • msg = _('Have a look at <strong>this</strong>!'); Markup(msg)

So ideally I'd like to have a setting where I can add function names (ideally as import strings, but just names would also be OK most of the time) to be considered safe to have their return value passed to Markdown - either directly or via a variable assignment.

In the above case, I'd expect all 3 warnings to disappear by whitelisting render_template and _.

@MichaReiser
Copy link
Member

@Daverball what are your thoughts on this. I remember that you intentionally decided not to exclude gettext.

@MichaReiser MichaReiser added rule Implementing or modifying a lint rule preview Related to preview mode features labels Nov 22, 2024
@Daverball
Copy link
Contributor

Daverball commented Nov 22, 2024

Generally my recommendation in these cases it to encapsulate the behavior that looks unsafe into a new function and use that instead of the original one, that way you only have to ignore errors in those few helper functions.

i.e. define a render_template_markup and translated_markup (you can configure i18n tooling to pick up additional function names for picking up i18n strings automatically) for those two cases and use those where appropriate. I think that's also good documentation.

Allowing gettext and other translation functions to pass through unchecked still has some risk, since the translations themselves could be compromised and that's a lot harder to spot than a compromised string literal. So I would try to avoid including markup in translation strings. That being said, I realize it can be inconvenient to split translations into chunks just for the sake of safety, so I have included markup in translations myself in some occasions.

In either case we could perhaps add an additional setting to define a number of functions that are considered safe to be wrapped in Markup, things like sanitization functions and template rendering functions come to mind. But I still think it's generally better to define wrapper functions, since then you don't always have to manually wrap the result. It's also less likely that you will forget to wrap the result this way.

@Daverball
Copy link
Contributor

Daverball commented Nov 22, 2024

Another thing I have done in the past is define a function like this:

@overload
def my_gettext(text: str) -> str: ...

@overload
def my_gettext(text:str, *, markup: Literal[True]) -> Markup: ...

def my_gettext(text: str, *, markup: bool = False) -> str:
    return Markup(gettext(text)) if markup else gettext(text)

And then import that as _ instead of gettext.

(Of course you would then also want to add that function to the list of functions that should be treated like Markup for RUF035)

@Daverball
Copy link
Contributor

Daverball commented Nov 22, 2024

All in all I'm +0 on adding a whitelist. It can genuinely make sense for some cases and it puts the power to make the decision where to draw the line for safety into the user's hands (you may e.g. decide that you take proper care to secure your translation data, so it's fine to include markup in translations).

But it also comes with the risk of making it too easy to accidentally ignore a large set of patterns that still have some unsafe corner cases and this is a security rule, so a higher level of churn is to be expected. Prompting people to restructure their code a bit, to make it less likely to introduce safety bugs seems like a net-positive to me, but that could also be resolved by recommending against using that whitelist as anything other than a temporary stop-gap measure for functions other than sanitization functions (like bleach.clean).

@ThiefMaster
Copy link
Contributor Author

I think w/o the whitelist the most likely "solution" people come up with is just ignoring the whole rule... which is not great :)

@Daverball
Copy link
Contributor

You're not wrong, but giving people a false sense of security isn't great either. A lot of bandit rules are even noisier than this one and don't even try to exclude some easily detectable false positives, so by that measure this rule is actually quite forgiving.

I'm not sure what the correct trade-off is going to be. But I'm certainly not strongly opposed to adding a whitelist, just cautioning that it may weaken the rule's effect.

@MichaReiser
Copy link
Member

I like your suggestion of a wrapper function for new code. But I can see how existing code can't easily be rewritten to use such a wrapper function, which is why an allow-list probably makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
preview Related to preview mode features rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants