Carriage return (\r) cannot be considered as new line #79

nbigaouette · 2021-09-14T15:42:39Z

Thanks for this project!

I'm trying to parse a format in which a new line can be delimited not only by a "line feed" (\n) but also by:

- a "carriage return" (\r)
- a combination of both:
  - "carriage return" + "line feed" (\r\n)
  - "line feed" + "carriage return" (\n\r)

I know it's a mess :(

Right now, nom_locate v4.0.0 only treats \n as a new line delimiter. This means that I can use neither location_line() nor get_column()/get_utf8_column() reliably.
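
To make the requirement concrete, here is a minimal sketch of the line/column computation such a format would need. It uses only the standard library and is not nom_locate code; the function name `line_and_column` is hypothetical, and the greedy pairing rule (a break byte immediately followed by the *other* break byte counts as one break) is an assumption about how the four delimiters should be disambiguated.

```rust
/// Hypothetical helper, not part of nom_locate.
///
/// Returns the 1-based (line, column) of byte `offset` in `input`,
/// treating `\n`, `\r`, `\r\n` and `\n\r` each as ONE line break.
/// The column is counted in bytes. An offset that points into the
/// middle of a two-byte break is reported as column 1 of the next line.
fn line_and_column(input: &[u8], offset: usize) -> (u32, usize) {
    let end = offset.min(input.len());
    let mut line = 1u32;
    let mut line_start = 0usize; // index of the first byte of the current line
    let mut i = 0usize;

    while i < end {
        let b = input[i];
        if b == b'\n' || b == b'\r' {
            // Assumption: a *different* break byte right after this one
            // belongs to the same line break (\r\n or \n\r).
            let other = if b == b'\n' { b'\r' } else { b'\n' };
            let len = if input.get(i + 1) == Some(&other) { 2 } else { 1 };
            i += len;
            line += 1;
            line_start = i;
        } else {
            i += 1;
        }
    }

    (line, end.saturating_sub(line_start) + 1)
}
```

With this rule, two identical break bytes in a row (for example \r\r) count as two breaks, so an empty line is still counted.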

Having access to these methods would certainly be useful, but I don't want to expose something in my crate that cannot be used reliably.

Some potential approaches to the problem:

- Adapt nom_locate so that it can recognize those bytes, or combinations of them, as new line delimiters.
- Have a way to hide the methods in cases where they would lead to confusion.
- Split LocatedSpan so that one part can be used for &[u8] and the other for &str.
- In my crate, wrap LocatedSpan in a newtype, forwarding everything except those methods.
- Keep things as they are, hoping location_line()/get_column()/get_utf8_column() will not get used.
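
The newtype approach could look roughly like the sketch below. To keep it compilable without the crate, `Span` is a stand-in for `nom_locate::LocatedSpan<&[u8]>` that mirrors only a few of its accessors; `ByteSpan` and its constructor are hypothetical names. A real wrapper would presumably also have to forward the nom input traits that LocatedSpan implements, which is omitted here.

```rust
/// Stand-in for `nom_locate::LocatedSpan<&[u8]>`, so the sketch compiles
/// on its own. Only the accessors relevant to the example are mirrored.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug)]
struct Span<'a> {
    offset: usize,
    line: u32,
    fragment: &'a [u8],
}

#[allow(dead_code)]
impl<'a> Span<'a> {
    fn location_offset(&self) -> usize {
        self.offset
    }

    /// Counts only `\n` in the real crate, hence unreliable for this format.
    fn location_line(&self) -> u32 {
        self.line
    }

    fn fragment(&self) -> &&'a [u8] {
        &self.fragment
    }
}

/// Hypothetical newtype. The inner span is a private field, so users of
/// the crate cannot reach `location_line()` or the column methods.
#[derive(Clone, Copy, Debug)]
pub struct ByteSpan<'a>(Span<'a>);

impl<'a> ByteSpan<'a> {
    pub fn new(input: &'a [u8]) -> Self {
        ByteSpan(Span { offset: 0, line: 1, fragment: input })
    }

    /// Forwarded: byte offsets stay meaningful whatever the delimiter is.
    pub fn location_offset(&self) -> usize {
        self.0.location_offset()
    }

    /// Forwarded: the remaining input.
    pub fn fragment(&self) -> &'a [u8] {
        *self.0.fragment()
    }

    // Deliberately NOT forwarded: location_line(), get_column(),
    // get_utf8_column().
}
```

The trade-off is boilerplate: every method and trait that should remain available has to be forwarded by hand.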

I think part of my confusion comes from the fact that LocatedSpan can be used for both text (&str) and bytes (&[u8]). The same problem arises when the data to parse is binary (&[u8]): lines and columns do not make sense in that context, so maybe exposing these methods is not useful there?
