Generic parser over input #187

Open
alrz opened this issue Dec 22, 2024 · 6 comments

Comments

@alrz

alrz commented Dec 22, 2024

Right now, the Parse method only accepts a string. Accepting a generic ReadOnlySpan<T> would enable defining a parser over a non-string sequence (current text-based parsers would then be defined on a ReadOnlySpan<char> input).

(note: accepting ReadOnlySpan<T> may be too big of a change because it's a ref struct, but I think any container type should work as long as it's a sequence, like an array)
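
A minimal sketch of what "generic over the input" could look like, assuming the input is a buffered ReadOnlyMemory<T> rather than a ref struct. `TakeWhile` is a hypothetical helper made up for illustration; it is not part of the library's API.

```csharp
using System;

// Hypothetical helper (not the library's API): consume elements while `predicate`
// holds, returning the matched slice and the position right after it.
static (ReadOnlyMemory<T> Match, int Next) TakeWhile<T>(
    ReadOnlyMemory<T> input, int start, Func<T, bool> predicate)
{
    var span = input.Span;
    var pos = start;
    while (pos < span.Length && predicate(span[pos]))
    {
        pos++;
    }
    return (input.Slice(start, pos - start), pos);
}

// The same helper over text...
var (digits, afterDigits) = TakeWhile("123abc".AsMemory(), 0, char.IsDigit);

// ...and over a non-text sequence (an array converts implicitly to ReadOnlyMemory<T>).
int[] tokens = { 2, 4, 6, 7, 8 };
var (evens, afterEvens) = TakeWhile<int>(tokens, 0, t => t % 2 == 0);

Console.WriteLine($"{digits} (next: {afterDigits})");
Console.WriteLine($"{evens.Length} even tokens (next: {afterEvens})");
```

Text parsing becomes the T = char case of the same code, which is the point of the proposal.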

@sebastienros
Owner

Agreed.

There is also a very old PR that I kept for reference, which uses a streamable source (I don't recall whether it was based on streams or pipes). It concerned me because it made the code more complex and slower, though it might be the best thing to do in the end.

@alrz
Author

alrz commented Jan 8, 2025

Support for a streamable source is most definitely useful, but this issue is only about accepting a generic (buffered) sequence of T as the input instead of a string. I don't think this support alone would be too distant from the existing implementation.

@alrz
Author

alrz commented Jan 9, 2025

OK, I just experimented with this to see how much work it would be. I'm still not quite sure, but I think it would make more sense to start by only accepting ReadOnlySpan<char>, to remove all the Substring calls, and then maybe go from there. This will make a few types ref structs and may have a greater impact further down the line. If that sounds good to you, I may take a stab at it in the future if someone else doesn't get there first.
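
For illustration, a small sketch of the difference between Substring and span slicing, using only standard .NET calls. The input and variable names are made up for the example and are not taken from the library.

```csharp
using System;

string input = "key=value";
int eq = input.IndexOf('=');

// Substring allocates a new string for every slice that is taken.
string keyCopy = input.Substring(0, eq);

// AsSpan/Slice only create views over the original characters; nothing is copied.
ReadOnlySpan<char> span = input.AsSpan();
ReadOnlySpan<char> key = span.Slice(0, eq);
ReadOnlySpan<char> value = span.Slice(eq + 1);

Console.WriteLine(key.SequenceEqual(keyCopy.AsSpan()));
Console.WriteLine(value.ToString());
```

The cost is that ReadOnlySpan<char> is a ref struct, so any type that stores one in a field has to become a ref struct as well.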

@sebastienros
Owner

I also tried, and the TextSpan class is the main issue. I first removed the Buffer to make it implicit, but it's actually necessary to distinguish spans coming from the input buffer from those coming from a concrete string. An example is the Decode method, which tries to evaluate char escapes. I think TextSpan could become some kind of abstraction, maybe even handling composites if necessary. Somewhat like ReadOnlySpan, but not a ref struct.

@alrz
Author

alrz commented Jan 10, 2025

> Also tried and the TextSpan class is the main issue.

If that's because it's being used as a type arg, would it make sense to use ReadOnlyMemory<char> instead (either within TextSpan or even directly)?

@sebastienros
Owner

ROM (ReadOnlyMemory<char>) could be the solution. It could even be used directly, since it might be what I have modeled without realizing.
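
A rough sketch of why ReadOnlyMemory<char> might fit, assuming a value is either a slice of the input or a freshly decoded string. The sample input and the Replace-based "decoding" are stand-ins for illustration only, not the library's Decode method.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

string source = @"name = a\tb";
ReadOnlyMemory<char> input = source.AsMemory();

// A slice of the input shares the original buffer; nothing is copied.
ReadOnlyMemory<char> name = input.Slice(0, 4);

// A decoded value had its escape evaluated, so it lives in a new string,
// yet it is still represented by the same ReadOnlyMemory<char> type.
ReadOnlyMemory<char> decoded = input.Slice(7).ToString().Replace(@"\t", "\t").AsMemory();

// Unlike ReadOnlySpan<char>, it can be used as a generic type argument.
var values = new List<ReadOnlyMemory<char>> { name, decoded };

// MemoryMarshal.TryGetString reveals which string backs a given slice.
MemoryMarshal.TryGetString(name, out var nameOwner, out _, out _);
MemoryMarshal.TryGetString(decoded, out var decodedOwner, out _, out _);

Console.WriteLine(object.ReferenceEquals(nameOwner, source));
Console.WriteLine(object.ReferenceEquals(decodedOwner, source));
```

Both kinds of value share one type, and the backing string can still be recovered when the distinction matters.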

If you are looking into it locally: I made Scanner a ref struct and extracted it from ParseContext. Anything that took a ParseContext would also take a Scanner.
