[pydocstyle] Escaped docstring in docstring (D301) #12192
Changes from 2 commits
@@ -69,8 +69,39 @@ pub(crate) fn backslashes(checker: &mut Checker, docstring: &Docstring) {
    // Docstring contains at least one backslash.
    let body = docstring.body();
    let bytes = body.as_bytes();
    let mut backslash_index = 0;
    let double_quote_docstring_backslashes_pattern = b"\"\\\"\\\"";
    let single_quote_docstring_backslashes_pattern = b"\'\\\'\\\'";
    if memchr_iter(b'\\', bytes).any(|position| {
        let escaped_char = bytes.get(position.saturating_add(1));
        // Allow escaped docstring.
        if matches!(escaped_char, Some(b'"' | b'\'')) {
            // If the next three characters are equal to """, it indicates an escaped docstring pattern.
            let escaped_triple_quotes =
                &bytes[position.saturating_add(1)..position.saturating_add(4)];
            if escaped_triple_quotes == b"\"\"\"" || escaped_triple_quotes == b"\'\'\'" {
                return false;
            }

            // For the `"\"\"` pattern, each iteration advances by 2 characters.
            // For example, the sequence progresses from `"\"\"` to `"\"` and then to `"`.
I don't think this assumption is correct, and this might actually be a bug in the existing implementation. For example, the function passed to …

What I understand is that you want to track whether you're at the beginning of an escape sequence. This is not fully fleshed out, but I think we may have to rewrite the entire loop:

let mut offset = 0;
while let Some(relative_position) = memchr::memchr(b'\\', &bytes[offset..]) {
    // `memchr` searches the slice starting at `offset`, so convert the match
    // back to an index into `bytes`.
    let position = offset + relative_position;
    let after_escape = &body[position + 1..];
    let Some(escaped_char) = after_escape.chars().next() else {
        break;
    };
    if matches!(escaped_char, '"' | '\'') {
        let is_escaped_triple =
            after_escape.starts_with("\"\"\"") || after_escape.starts_with("\'\'\'");
        if is_escaped_triple {
            // don't add a diagnostic
        }
        if position != 0 && offset == position {
            // An escape sequence, e.g. `\a\b`
        }
    }
    // Continue after the backslash and the character it escapes.
    offset = position + 1 + escaped_char.len_utf8();
}

Thank you. This helps a lot!
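The loop sketched in the review comment can be turned into a small standalone function for experimentation. The version below is only an illustration under stated assumptions: `first_disallowed_backslash` is a hypothetical name that does not exist in ruff, the standard library's `Iterator::position` stands in for `memchr`, and the set of allowed sequences (escaped triple quotes, three individually escaped quotes, line continuations, and `\u`/`\U`/`\N`) is one reading of what the diff intends rather than the behavior that was eventually merged.

```rust
/// Hypothetical helper (not part of ruff): returns the byte index of the first
/// backslash that does not start an allowed sequence, or `None` if every
/// backslash in `body` is allowed.
fn first_disallowed_backslash(body: &str) -> Option<usize> {
    let bytes = body.as_bytes();
    let mut offset = 0;

    // Standard-library stand-in for `memchr`; the match is relative to `offset`.
    while let Some(relative) = bytes[offset..].iter().position(|&b| b == b'\\') {
        let position = offset + relative;
        // A backslash is one ASCII byte, so `position + 1` is a char boundary.
        let after_escape = &body[position + 1..];
        let Some(escaped_char) = after_escape.chars().next() else {
            // A trailing backslash with nothing after it is not allowed.
            return Some(position);
        };

        if matches!(escaped_char, '"' | '\'') {
            if after_escape.starts_with("\"\"\"") || after_escape.starts_with("'''") {
                // `\"""`: skip the backslash and the three quotes.
                offset = position + 4;
                continue;
            }
            let rest = &body[position..];
            if rest.starts_with("\\\"\\\"\\\"") || rest.starts_with("\\'\\'\\'") {
                // `\"\"\"`: each quote escaped on its own; skip all six bytes.
                offset = position + 6;
                continue;
            }
            // A lone escaped quote is reported.
            return Some(position);
        }

        // Assumption: continuations and Unicode escapes stay allowed, as in the diff.
        if matches!(escaped_char, '\r' | '\n' | 'u' | 'U' | 'N') {
            offset = position + 1 + escaped_char.len_utf8();
            continue;
        }
        return Some(position);
    }
    None
}
```

Tracking `offset` explicitly lets a multi-character sequence be consumed in one step, so no separate counter such as `backslash_index` is needed, and `starts_with` never indexes past the end of the string.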
            // Therefore, we utilize an index to keep track of the remaining characters.
            let escaped_quotes_backslashes = &bytes
                [position.saturating_add(1)..position.saturating_add(6 - backslash_index * 2)];
            if escaped_quotes_backslashes
                == &double_quote_docstring_backslashes_pattern[backslash_index * 2..]
                || escaped_quotes_backslashes
                    == &single_quote_docstring_backslashes_pattern[backslash_index * 2..]
            {
                backslash_index += 1;
                // Reset to avoid overflow.
                if backslash_index > 2 {
                    backslash_index = 0;
                }
                return false;
            }
            return true;
        }
        // Allow continuations (backslashes followed by newlines) and Unicode escapes.
        !matches!(escaped_char, Some(b'\r' | b'\n' | b'u' | b'U' | b'N'))
    }) {
This will panic if what comes after the `\` is shorter than 3 characters. I would rewrite this to something like
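
To illustrate the panic described in this comment: indexing a slice with a range that runs past its end aborts at runtime, whereas `slice::get` with the same range returns `None`. The snippet below is a minimal sketch of that bounds-checked variant and is not the reviewer's suggested rewrite, which does not appear here. `is_escaped_triple_quote` is a hypothetical helper name introduced only for this example.

```rust
/// Hypothetical helper: returns true if the backslash at `position` is
/// immediately followed by three double quotes or three single quotes.
fn is_escaped_triple_quote(bytes: &[u8], position: usize) -> bool {
    // `get` with a range yields `None` when fewer than three bytes follow the
    // backslash, where `&bytes[position + 1..position + 4]` would panic.
    match bytes.get(position + 1..position + 4) {
        Some(next_three) => next_three == b"\"\"\"" || next_three == b"'''",
        None => false,
    }
}
```

With input such as `\"` at the very end of the docstring body, the function simply returns `false`, so the caller can fall through to its other checks.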