Allow case insensitive matching of literals #34

Closed
dmajda opened this issue Aug 14, 2011 · 7 comments

@dmajda
Contributor

dmajda commented Aug 14, 2011

Right now, matching literals case-insensitively is hard and ugly. For example, the only way to match "select" case-insensitively is:

select = [Ss][Ee][Ll][Ee][Cc][Tt]

Having one global flag for case-insensitivity would create problems when some parts of a language are case-sensitive and others are case-insensitive. It would also make combining languages (a feature I am thinking about for later) harder. A better way would be to signify case insensitivity for each literal separately, e.g. like this:

select = "select"i
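
To make the size of the workaround concrete, here is a minimal sketch in JavaScript that expands a keyword into the character-class form shown above. The helper name `expandCaseInsensitive` is hypothetical and not part of PEG.js. It assumes ASCII letters only, and any other character is emitted as a quoted literal. The parts are separated by spaces, which differs from the hand-written version only in whitespace.

```javascript
// Hypothetical helper, not part of PEG.js: expands a keyword into the
// character-class workaround, one class per ASCII letter.
function expandCaseInsensitive(keyword) {
  return Array.from(keyword)
    .map((ch) =>
      /[A-Za-z]/.test(ch)
        ? "[" + ch.toUpperCase() + ch.toLowerCase() + "]" // e.g. "s" becomes [Ss]
        : JSON.stringify(ch) // non-letters are kept as quoted literals
    )
    .join(" ");
}

// Build the workaround rule for the "select" keyword.
console.log("select = " + expandCaseInsensitive("select"));
```

With the proposed suffix, all of this would collapse to the single literal `"select"i`.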
@ghost ghost assigned dmajda Aug 14, 2011
@izuzak

izuzak commented Aug 15, 2011

i really like this proposal ("select"i). however, limiting the flag to just literals will prove problematic in large grammar files which are used for case insensitive matching - adding "i" to the end of each literal is very error-prone, imo.

could the "i" flag work in a way that it may be applied not just to literals but to any expression? for example:

newRule = select#
select = "select1" / "select2"

where "#" is the flag for case insensitive matching and in this case it is applied to everything matched by the select rule (both "select1" and "select2"). of course, the "#" flag could also be applied to literals themselves, e.g. "select1"#.

@dmajda
Contributor Author

dmajda commented Sep 30, 2011

I implemented the i suffix proposal for literals and also for character classes.

I chose this solution because any approach that marks whole parts of the grammar as case-insensitive would have problems with recursive rules (probably requiring that the flag be forbidden in such cases) and would generally introduce complexity and non-local effects. This is something I'd like to avoid.

An easy way to avoid forgetting i when writing literals is to extract all case-insensitive keywords into separate rules and group them together like this:

select = "select"i
from   = "from"i
where  = "where"i
...
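
For larger keyword sets, the grouped rules could also be generated rather than typed by hand. The following is only a sketch: `keywordRules` is a hypothetical helper, not part of PEG.js, and it assumes every keyword is also acceptable as a rule name.

```javascript
// Hypothetical helper, not part of PEG.js: builds one case-insensitive
// rule per keyword, padding the names so the "=" signs line up.
function keywordRules(keywords) {
  const width = Math.max(...keywords.map((k) => k.length));
  return keywords
    .map((k) => k.padEnd(width) + " = " + JSON.stringify(k) + "i")
    .join("\n");
}

// Prints the three rules from the comment above.
console.log(keywordRules(["select", "from", "where"]));
```

Generating the rules keeps the `i` suffix in one place, so it cannot be forgotten on an individual literal.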

@dmajda dmajda closed this as completed Sep 30, 2011
@s3u

s3u commented Dec 6, 2011

Has this fix ever made it into the online site or npm modules?

@dmajda
Contributor Author

dmajda commented Feb 2, 2012

@s3u Not yet. Will be included in PEG.js 0.7.0.

@JanSemorad

DOWNLOAD
  = "DOWNLOAD"
  / D O W N L O A D { return "DOWNLOAD" }

D = [Dd]
O = [Oo]
W = [Ww]
N = [Nn]
L = [Ll]
A = [Aa]

@jtenner

jtenner commented Jul 20, 2017

None of this change is documented. I've been using peg.js for years now and I had no idea this feature exists.

@futagoza
Member

@JanSemorad Is there a point to that post?

@jtenner Thanks for pointing this out. I've been using PEG.js for 4 years now, and for some reason I always knew the feature was there but somehow never realised it wasn't documented 😆 Opened a new issue for this, see #518
