Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: iterator is extended over end of string if no match occurs with dfa_match #77

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

moritz-geier
Copy link

@moritz-geier moritz-geier commented Apr 22, 2024

Hello,

I got an error on Windows 11 and MSVC 19.39 where the parser fails. This happens if the dfa_match returns uninitialized16 and we try to add it to the iterator. Thus here my proposed fix.

To test the behavior create the following parser

    ///////////////////////////////////////////////////////////////////////////
    static constexpr ctpg::nterm<bool> conditionExpression{"condition"};
    static constexpr ctpg::char_term trueLiteral{'t'};
    static constexpr ctpg::char_term falseLiteral{'f'};

    ///////////////////////////////////////////////////////////////////////////
    static constexpr auto root = conditionExpression;

    ///////////////////////////////////////////////////////////////////////////
    static constexpr auto terminals = ctpg::terms(
        trueLiteral,
        falseLiteral
    );

    ///////////////////////////////////////////////////////////////////////////
    static constexpr auto nonTerminals = ctpg::nterms(
        conditionExpression
    );

    ///////////////////////////////////////////////////////////////////////////
    static constexpr ctpg::parser parser{
        root,
        terminals,
        nonTerminals,
        rules(
            conditionExpression(trueLiteral) >=
                [](const auto&) { return true; },
            conditionExpression(falseLiteral) >=
                [](const auto&) { return false; }
        )
    };

and try to parse

parser.parse(ctpg::buffers::cstring_buffer{"tr"});

@moritz-geier
Copy link
Author

This is a dirty fix, in my opinion we should use std::distance within dfa_match to initiate rt.len , thus eliminating the need for the ifs afer calling the function. This would mean we need to enforce the iterators of each buffer to provide a operator- implementation.

@peter-winter
Copy link
Owner

peter-winter commented Apr 26, 2024

First of all I am reluctant to commit any new work until I merge my term groups feature into master.
The only thing left to do (as always) is msvc compilation, which I just don't have time to do at the moment.

I would gladly accept help regarding this topic.

The feature is very nice itself, it allows much smaller parsers and compile times when grouping similar terminal symbols. C language for instance has 11 assignment operators which from parser point of view are the same.

Second thing, after I'm done with the grouped terms feature, we can think of a better solution (you already mentioned a nice one) not this dirty one.

Would you be willing to work on making the term-groups branch work on msvc?

@moritz-geier
Copy link
Author

Yes I can try and get it working.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants