feat: Cyrillic and Greek support #100

lsimeonov · 2024-07-19T15:24:33Z

Added support for Greek and Cyrillic Unicode characters.

This is done by tracking the byte count of each character so that multibyte strings are handled correctly.

Added a simple test for this.
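
To illustrate the byte-count idea, here is a minimal sketch, not the code from this PR. `readName` is a hypothetical helper, and treating "letter or digit" as the set of name characters is an assumption made for the example. It decodes one rune at a time and advances the position by that rune's byte width rather than by 1:

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// readName is a hypothetical helper, not the library's API. Starting at byte
// offset start, it consumes letters and digits and returns the name together
// with the byte offset just past it.
func readName(text string, start int) (string, int) {
	pos := start
	for pos < len(text) {
		// Decode one whole rune and learn how many bytes it occupies.
		r, width := utf8.DecodeRuneInString(text[pos:])
		if !unicode.IsLetter(r) && !unicode.IsDigit(r) {
			break
		}
		pos += width // advance by the rune's byte width, not by 1
	}
	return text[start:pos], pos
}

func main() {
	name, end := readName("//книга/title", 2)
	fmt.Println(name, end) // prints: книга 12
}
```

Each Cyrillic letter in the sample takes two bytes in UTF-8, so the five-letter name ends at byte offset 12 rather than 7. A scanner that advanced one byte per character would cut the name in the middle of a rune.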

zhengchun · 2024-07-22T13:49:43Z

parse.go

-		(unicode.Is(first, r) || unicode.Is(second, r) || string(r) == "*")
+		(unicode.Is(first, r) ||
+			unicode.Is(second, r) ||
+			unicode.Is(cyrillic, r) ||


Why do we need to add this new condition for Cyrillic and Greek? The existing checks already recognize non-English letters.

zhengchun · 2024-07-22T13:50:14Z

Thanks for your work. I have a question about this code:

unicode.Is(cyrillic, r) || unicode.Is(greek, r)

Why do we need this condition? The code still works after removing it. I tested it, and it now supports non-English text such as Chinese and other languages.

Thanks again.

lsimeonov · 2024-07-23T13:00:45Z

Good catch. That was a leftover test from removed code. The idea was to allow certain charsets, but it’s better left to the user to decide post-parsing.

I removed the change.
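
As a rough sketch of what deciding post-parsing could look like on the caller's side: `nameAllowed` and `allowedScripts` below are hypothetical names invented for this example and are not part of the library. The allow-list of Latin, Cyrillic, and Greek is likewise only an assumption.

```go
package main

import (
	"fmt"
	"unicode"
)

// allowedScripts is an illustrative allow-list chosen by the caller.
var allowedScripts = []*unicode.RangeTable{unicode.Latin, unicode.Cyrillic, unicode.Greek}

// nameAllowed is a hypothetical caller-side check, not part of the library.
// It reports whether every letter in name belongs to an allowed script.
// Non-letters such as digits, '-' and '_' are not restricted here.
func nameAllowed(name string) bool {
	for _, r := range name {
		if unicode.IsLetter(r) && !unicode.In(r, allowedScripts...) {
			return false
		}
	}
	return true
}

func main() {
	for _, name := range []string{"title", "книга", "αβγ", "书"} {
		fmt.Println(name, nameAllowed(name))
	}
	// prints:
	// title true
	// книга true
	// αβγ true
	// 书 false
}
```

Keeping this filter outside the parser means the parser accepts any letters, and each application can apply its own policy to the names it receives.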

feat: Cyrillic and greek support

b38b781

lsimeonov mentioned this pull request Jul 19, 2024

Is it possible to create an xpath expression with non-Latin letters? #99

Closed

zhengchun reviewed Jul 22, 2024

fix: remove not needed unicode checks

f8678ab

zhengchun merged commit 4286dab into antchfx:master Jul 24, 2024
