fix: use raw strings for regex patterns #3029

Merged: 1 commit, May 16, 2024

@scanny (Collaborator) commented May 15, 2024

Summary
Avoid `SyntaxWarning` and/or `SyntaxError` messages when importing `unstructured.nlp.patterns` by using raw strings (`r` prefix) for regex patterns. These patterns may contain `\x`-style sequences (a backslash followed by a character) that the Python parser does not recognize as escapes in normal strings.

Fixes: #2495
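
For readers who want to reproduce the symptom outside the package, here is a minimal sketch. The names `NORMAL_SRC`, `RAW_SRC`, and `escape_problems` and the sample pattern are hypothetical and not taken from `unstructured.nlp.patterns`. It assumes CPython behavior where an unrecognized escape in a normal string is reported at compile time (as `DeprecationWarning` on 3.6 through 3.11 and as `SyntaxWarning` from 3.12), and it also allows for interpreters that reject the source outright.

```python
import re
import warnings

# Hypothetical one-line modules, held as text so they can be compiled on demand.
# The doubled backslashes only build the text; the text being compiled is:
#   PATTERN = "\d+\.\d+"     (normal string)
#   PATTERN = r"\d+\.\d+"    (raw string)
NORMAL_SRC = 'PATTERN = "\\d+\\.\\d+"'
RAW_SRC = 'PATTERN = r"\\d+\\.\\d+"'


def escape_problems(source):
    """Compile `source` and return the names of warning/error categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record warnings instead of hiding them
        try:
            compile(source, "<example>", "exec")
        except SyntaxError:
            # Assumption: some interpreters or settings reject the source outright.
            return ["SyntaxError"]
    return [w.category.__name__ for w in caught]


print(escape_problems(NORMAL_SRC))  # non-empty: the parser flags \d and \.
print(escape_problems(RAW_SRC))     # empty: raw strings keep backslashes as-is

# Once parsed, the raw string equals the fully escaped normal string,
# so the regex itself behaves the same.
assert r"\d+\.\d+" == "\\d+\\.\\d+"
assert re.fullmatch(r"\d+\.\d+", "3.11") is not None
```

The raw-string form leaves the regex unchanged and only alters how the parser reads the literal, which is why the fix can be applied mechanically to every pattern.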

@@ -27,6 +27,7 @@ lint.select = [
     "UP018", # -- Unnecessary {literal_type} call like `str("abc")`. (rewrite as a literal) --
     "UP032", # -- Use f-string instead of `.format()` call --
     "UP034", # -- Avoid extraneous parentheses --
+    "W", # -- Warnings, including invalid escape-sequence --
Collaborator (review comment):
Can you also make this change in .pre-commit-config.yaml and the Makefile?

Collaborator (PR author):
We moved the redundant specification of ruff configuration out of those two a while back. So this pyproject.toml is the authoritative source of ruff config now, a dividend of DRY :)
94535e3

@Coniferish (Collaborator) left a comment:

LGTM once comment is addressed. Approving in advance

@scanny force-pushed the scanny/fix-syntax-error branch from f81aa4a to 1894467 on May 16, 2024 16:23
@scanny enabled auto-merge on May 16, 2024 16:41
@scanny added this pull request to the merge queue on May 16, 2024
Merged via the queue into main with commit 0de9215 on May 16, 2024
42 checks passed
@scanny deleted the scanny/fix-syntax-error branch on May 16, 2024 17:17
Linked issue: bug/syntaxerror with anaconda and python 3.11
2 participants