Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserving the backslash character #37

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Preserving the backslash character #37

wants to merge 4 commits into from

Conversation

jpnjfk
Copy link

@jpnjfk jpnjfk commented Jul 16, 2019

This change leaves the escape character backslash '' when the backslash isn't precede one of " :=fnrt" that need escaping. Currently, any backslash that isn't precede one of the escaped characters is simply dropped.
We don't know how each property value will be processed by the callers.
So, it would be better that the package 'properties' doesn't make modifications as possible because the backslash may be put on purpose.

@jpnjfk jpnjfk changed the title leaveing the backslash character Preserving the backslash character Jul 16, 2019
@magiconair
Copy link
Owner

Hmm, I get it but this would be a breaking change since this behavior has been in the library since 2014. If you need this then this should be configurable.

@jpnjfk
Copy link
Author

jpnjfk commented Jul 16, 2019

Thank you for your comment.
Does configurable means using an environment variable?
Or callers should be able to control turning on/off the function?
Of course, the default value is off. :)
Is a global variable okay? I don't want to mess the existing code.

@magiconair
Copy link
Owner

magiconair commented Jul 17, 2019

I would expect a flag in the Properties struct which enables this, e.g. KeepBackslash, RetainBackslash or something similar which defaults to false (i.e. off).

Global variable is not ok, unfortunately. This needs to be properly integrated and there need to be tests for this.

@jpnjfk
Copy link
Author

jpnjfk commented Jul 17, 2019

Thank you for your suggestion.
I made a new commit whose comment is "To avoid a breaking change, the keepBackslash is false by default".
That works as expected though I messed up the source codes including the tab alignments the Visual Source Code automatically did.
Could you review my changes again?

@jpnjfk
Copy link
Author

jpnjfk commented Jul 19, 2019

In order to propagate the flag Properties.KeepBackslash to the lexer, I had to make changes in the other structure and the argument of lex().

@jpnjfk
Copy link
Author

jpnjfk commented Jul 21, 2019

I gets back the two test cases I commented out in load_test.go.

@jpnjfk
Copy link
Author

jpnjfk commented Jul 30, 2019

I think that my changes are ready to go.
Could you review them?

@magiconair
Copy link
Owner

I've thought about this a bit more. The Java properties implementation silently drops single backslash characters to avoid tripping over invalid escape sequences. The Go version emulates that behavior.

The method does not treat a backslash character, , before a non-valid escape character as an error; the backslash is silently dropped. For example, in a Java string the sequence "\z" would cause a compile time error. In contrast, this method silently drops the backslash. Therefore, this method treats the two character sequence "\b" as equivalent to the single character 'b'.

(https://docs.oracle.com/javase/7/docs/api/java/util/Properties.html#load(java.io.Reader))

However, if you escape the backslash it works as expected. What am I missing?

func TestBackslash(t *testing.T) {
	input := "k\\\\ey=va\\\\lue"
	p := mustParse(t, input)
	assert.Equal(t, p.MustGet("k\\ey"), "va\\lue")
}

@jpnjfk
Copy link
Author

jpnjfk commented Aug 7, 2019

Thank you for explaining the behavior.
Yes, the emulation point of view, the behavior is correct by default.
As I wrote before, we don't know how each property value will be processed by the callers.
So, it would be nice that the package 'properties' doesn't make silent modifications as possible because the backslash may be put on purpose.
Since this is an optional behavior, the change would give the developers simplification of escaping characters in text.
How do you think?

@jpnjfk
Copy link
Author

jpnjfk commented Sep 3, 2019

Hi,
Is it possible to add the option?
What I want is an option that doesn't make silent modifications as possible.
I agree that the workaround you wrote works. However, it would be error prone for translators who aren't programmers. It would be nice to have the package as simple as possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants