utils.str: Handle \0 in perlReToReplacer #1536

Open
wants to merge 1 commit into
base: master


@tatokis (Contributor) commented May 12, 2023

\g<0> is required here because Python otherwise interprets \0 in the replacement as a NUL byte rather than as the whole match.

Previous behaviour:

<User> @re "s/hello/\0 world/" "hello"
<Bot> '\x00 world'
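
The same result can be reproduced outside the bot with plain `re.sub`. This is a minimal sketch that assumes only the standard library; the pattern and input mirror the example above, and the variable names are illustrative.

```python
import re

# In a replacement template, Python parses \0 as an octal escape (NUL),
# not as a reference to the whole match.
nul_result = re.sub(r'hello', r'\0 world', 'hello')

# \g<0> is the explicit spelling for "the whole match".
match_result = re.sub(r'hello', r'\g<0> world', 'hello')

print(repr(nul_result))    # '\x00 world'
print(repr(match_result))  # 'hello world'
```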

@jlu5 (Collaborator) commented May 14, 2023

I did some testing, and it seems that \0 is not a particularly standard construct among regex implementations. Some treat it as NUL, some treat it as a backreference to the whole match, and others reject it as an error: https://regex101.com/r/SCfTW4/1

I think in these cases it's better to preserve the Python behaviour to avoid introducing inconsistencies, and use \g<0> explicitly in your regexps as needed.

<jlu5_> re "s/o/\0\0/" "hello"
-bitmonster- 'hell\x00\x00'
<jlu5_> re "s/o/\g<0>\g<0>/" "hello"
-bitmonster- helloo

Interestingly, it seems Perl also treats \0 as NUL:

$ perl -p -e 's/(o)/\0/g' <<< foobar | hexdump -c
0000000   f  \0  \0   b   a   r  \n                                    
0000007

@@ -298,7 +298,8 @@ def perlReToReplacer(s):
     regexp = regexp.replace('\x08', r'\b')
     replace = replace.replace('\\'+sep, sep)
     for i in range(10):
-        replace = replace.replace(chr(i), r'\%s' % i)
+        replace = replace.replace(chr(i), r'\g<%s>' % i)
+    replace = replace.replace(r'\0', r'\g<0>')
Owner commented on this line:

this line looks redundant
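
To make the effect of the changed loop concrete, here is a rough sketch that isolates it. It assumes that code before this hunk (not shown in the diff) has already turned `\0` through `\9` in the replacement into the characters `chr(0)` through `chr(9)`; `old_rewrite` and `new_rewrite` are hypothetical names for illustration, not functions in the codebase.

```python
import re

def old_rewrite(replace):
    # Hypothetical stand-in for the loop before the patch: chr(i) -> \i
    for i in range(10):
        replace = replace.replace(chr(i), r'\%s' % i)
    return replace

def new_rewrite(replace):
    # Hypothetical stand-in for the loop after the patch: chr(i) -> \g<i>
    for i in range(10):
        replace = replace.replace(chr(i), r'\g<%s>' % i)
    return replace

# Assumed input: "\0 world" after an earlier escape-decoding step.
template = chr(0) + ' world'

old_template = old_rewrite(template)  # backslash, zero, ' world'
new_template = new_rewrite(template)  # \g<0> followed by ' world'

old_result = re.sub('hello', old_template, 'hello')  # NUL byte + ' world'
new_result = re.sub('hello', new_template, 'hello')  # 'hello world'

# Numbered groups keep working with the \g<N> form.
doubled = re.sub('(o)', new_rewrite(chr(1) + chr(1)), 'hello')  # 'helloo'
```

Under that assumption the template never contains a literal backslash followed by `0` when the loop runs, so a separate `.replace(r'\0', r'\g<0>')` would have nothing left to match, which may be what the remark above refers to.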

@progval (Owner) commented May 14, 2023

> Some will treat it as NUL

this one doesn't really make sense on IRC

@jlu5 (Collaborator) commented May 14, 2023

> > Some will treat it as NUL
>
> this one doesn't really make sense on IRC

Given the nested command support, though, not all command output is necessarily meant to be displayed.

@progval progval changed the base branch from testing to master May 5, 2024 15:52

3 participants