-
This is somewhat related to #58: what is the proper lexy'ish way to capture the matched range (especially when optionals are involved) without producing values for intermediate productions? Consider e.g. a floating point value as in the json example:
struct float_value : lexy::token_production {
struct integer : lexy::transparent_production {
static constexpr auto rule
= dsl::minus_sign + dsl::integer<std::int64_t>(dsl::digits<>.no_leading_zero());
static constexpr auto value = lexy::as_integer<std::int64_t>;
};
struct fraction : lexy::transparent_production {
static constexpr auto rule = dsl::lit_c<'.'> >> dsl::capture(dsl::digits<>);
static constexpr auto value = lexy::as_string<std::string>;
};
struct exponent : lexy::transparent_production {
static constexpr auto rule = [] {
auto exp_char = dsl::lit_c<'e'> | dsl::lit_c<'E'>;
return exp_char >> dsl::sign + dsl::integer<std::int16_t>;
}();
static constexpr auto value = lexy::as_integer<std::int16_t>;
};
static constexpr auto rule =
dsl::peek(dsl::lit_c<'-'> / dsl::digit<>) >>
dsl::p<integer> +
dsl::opt(dsl::p<fraction>) +
dsl::opt(dsl::p<exponent>);
// value omitted
};
Here, we'd essentially end up with 3 values:
If I just had the positions of the start and end of the match, then things like
struct float_value : lexy::token_production {
static constexpr auto rule = [] {
auto integer = dsl::if_(dsl::lit_c<'-'>) + dsl::digits<>.no_leading_zero();
auto fraction = dsl::lit_c<'.'> >> dsl::digits<>;
auto exp_char = dsl::lit_c<'e'> | dsl::lit_c<'E'>;
auto exponent = exp_char >> (dsl::lit_c<'+'> | dsl::lit_c<'-'>) + dsl::digits<>;
return dsl::peek(dsl::lit_c<'-'> / dsl::digit<>) >>
integer +
dsl::if_(fraction) +
dsl::if_(exponent);
}();
};
How I'd capture the
struct float_value : lexy::token_production {
static constexpr auto rule = [] {
auto integer = dsl::if_(dsl::lit_c<'-'>) + dsl::digits<>.no_leading_zero();
auto fraction = dsl::lit_c<'.'> >> dsl::digits<>;
auto exp_char = dsl::lit_c<'e'> | dsl::lit_c<'E'>;
auto exponent = exp_char >> (dsl::lit_c<'+'> | dsl::lit_c<'-'>) + dsl::digits<>;
return dsl::peek(dsl::lit_c<'-'> / dsl::digit<>) >>
dsl::position +
integer +
dsl::if_(fraction) +
dsl::if_(exponent) +
dsl::position;
}();
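static float atof(const char* first, const char* last) {
// std::from_chars(const char*, const char*, float&) is only
// available from libc++ starting from LLVM 14 :(
// Not constexpr: ::atof is a runtime function. It also ignores `last`,
// so the input must be null-terminated.
(void)(last);
return static_cast<float>(::atof(first));
}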
static constexpr auto value = lexy::callback<float>(
[](const char *first, const char *last) { return atof(first, last); }
);
};
But maybe there is a better way here?
PS: Maybe there is a way to have something like |
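For what it's worth, the atof helper above ignores last, so it relies on the input being null-terminated and on the character after the match not extending the number. Below is a minimal standard-library sketch that respects the range instead. The name parse_float_range is hypothetical and not part of lexy, and the helper copies the range into a std::string (which may allocate for long inputs) so that std::strtof sees a terminator.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not a lexy API): converts the character range
// [first, last) to float without reading past `last`.
inline float parse_float_range(const char* first, const char* last) {
    assert(first <= last);
    // Copy the range so that std::strtof sees a null-terminated buffer.
    const std::string buffer(first, last);
    return std::strtof(buffer.c_str(), nullptr);
}
```

For example, parse_float_range(p, p + 2) on the text 1234 yields 12.0f, whereas ::atof(p) would read all four digits.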
-
I plan on adding a callback that does that for you at some point in the future. However, this is non-trivial and will take some time; just wanted to let you know.
I'm not sure what you mean, lexy never does memory allocation? But yeah, the performance difference should be negligible - but it only works if you're parsing a float that matches the format. E.g. a German floating point number
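To illustrate the format caveat, here is a small standard-library sketch (no lexy involved). It assumes the German-style number uses a decimal comma, with 3,14 as a hypothetical input, and that the program runs in the default "C" locale. The helper name consumed_by_strtod is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: returns how many characters of `text` std::strtod
// accepts as a number under the current C locale.
inline std::size_t consumed_by_strtod(const char* text) {
    assert(text != nullptr);
    char* end = nullptr;
    std::strtod(text, &end); // the value is discarded; only `end` matters here
    return static_cast<std::size_t>(end - text);
}
```

In the default "C" locale, consumed_by_strtod("3.14") returns 4, while consumed_by_strtod("3,14") returns 1, because parsing stops at the comma. So a grammar that matches more than the converter understands would silently lose the fraction.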
Not at the moment. The issue with whitespace from #58 still persists. Maybe I can add
There is no overload that accepts two iterators as produced by |
-
Indeed. but
Funny enough, in my case I'd not bother about whitespace as no extra whitespace is allowed :)
Right. So far I've ended up in some cases with a custom callback or a pair of delimiters, and in other cases (e.g. when I need to capture exactly 2 characters) with |
-
I have added support for dsl::capture(dsl::p<token_production>), as well as iterator range support to lexy::as_string. The latter allows lexy::as_string<std::string_view> from two dsl::position (in C++20 where it has the range constructor).
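As a rough picture of what two positions amount to on the standard-library side, here is a sketch that does not use lexy at all. It assumes the positions are plain const char* pointers into a contiguous buffer, and view_between is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: builds a non-owning view over [first, last), i.e. the
// two positions that bracket the match. Nothing is copied or allocated.
inline std::string_view view_between(const char* first, const char* last) {
    assert(first <= last);
    // Pointer + length works since C++17; C++20 also accepts
    // std::string_view(first, last) directly.
    return std::string_view(first, static_cast<std::size_t>(last - first));
}
```

The view stays valid only as long as the underlying input buffer does, so a callback using it should convert to its final value before the input goes away.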