Regular expressions can be used to generate text strings in the following situations:
- Writing test data for web forms.
- Writing test data for databases.
- Generating test data for regular expressions.
##Example

```php
use ReverseRegex\Lexer;
use ReverseRegex\Random\SimpleRandom;
use ReverseRegex\Parser;
use ReverseRegex\Generator\Scope;

# load composer
require "vendor/autoload.php";

$lexer = new Lexer('[a-z]{10}');
$gen = new SimpleRandom(10007);
$result = '';

$parser = new Parser($lexer, new Scope(), new Scope());
$parser->parse()->getResult()->generate($result, $gen);

echo $result;
```

Produces output such as:

```
jmceohykoa
aclohnotga
jqegzuklcv
ixdbpbgpkl
kcyrxqqfyw
jcxsjrtrqb
kvaczmawlz
itwrowxfxh
auinmymonl
dujyzuhoag
vaygybwkfm
```

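
To produce a batch of test strings, the steps from the example can be wrapped in a loop. The following is a minimal sketch, not part of the library: `generateSamples` is a hypothetical helper name, it uses only the calls shown in the example, and it assumes Composer's autoloader has already been loaded. It also assumes that one `SimpleRandom` instance can be reused across calls; whether successive strings differ depends on how the generator advances its state.

```php
<?php
use ReverseRegex\Lexer;
use ReverseRegex\Parser;
use ReverseRegex\Generator\Scope;
use ReverseRegex\Random\SimpleRandom;

/**
 * Hypothetical helper: generate $count strings for $pattern.
 * Assumes vendor/autoload.php has been required beforehand.
 */
function generateSamples(string $pattern, int $count, int $seed): array
{
    // Assumption: a single generator can be shared by every sample.
    $gen = new SimpleRandom($seed);
    $samples = [];

    for ($i = 0; $i < $count; $i++) {
        // generate() receives $result by reference, so reset it each time.
        $result = '';

        // Build a fresh lexer and parser per sample, exactly as in the example.
        $parser = new Parser(new Lexer($pattern), new Scope(), new Scope());
        $parser->parse()->getResult()->generate($result, $gen);

        $samples[] = $result;
    }

    return $samples;
}
```

A call such as `generateSamples('[a-z]{10}', 5, 10007)` would then return an array of five strings that can be fed to a form or database fixture.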
##Installing
To install, use Composer. Add the package to the `require` section of your `composer.json`:

```json
{
    "require" : {
        "icomefromthenet/reverse-regex" : "dev-master"
    }
}
```

##Writing a Regex
- Escape all meta-characters, i.e. if you would need to escape the character in a regex, you need to escape it here.
- Not all meta-characters are supported; see the list below.
- Use `\X{####}` to specify a Unicode value and `[\X{####}-\X{####}]` to specify a range.
- Unicode `\p` is not supported. I could not find a port of the UCD to PHP; support may be added in the future.
- Quantifiers are applied to the group, literal or character class immediately to their left.
- Beware of the `+` and `*` quantifiers: they allow a maximum number of occurrences up to `PHP_INT_MAX`.

Example | Description | Resulting String |
---|---|---|
`(abcf)` | Literals are supported; this generates the literal string | `abcf` |
`\((abcf)\)` | Escape meta-characters as you normally would in a regex | `(abcf)` |
`[a-z]` | Character classes are supported | `a` |
`a{5}` | Quantifiers are supported; they always apply to the last group, literal or character class | `aaaaa` |
`a{1,5}` | Range quantifiers are supported | `aa` |
`a\|b\|c` | Alternation is supported; picks one of the three at random | `b` |
`a\|(y\|d){5}` | Groups are supported with alternation and quantifiers | `ddddd` or `a` or `yyyyy` |
`\d` | Digit shorthand, equivalent to `[0-9]` | `1` |
`\w` | Word character shorthand, equivalent to `[a-zA-Z0-9_]` | `j` |
`\W` | Non-word character shorthand, equivalent to `[^a-zA-Z0-9_]` | `!` |
`\s` | Whitespace shorthand, ASCII only | ` ` |
`\S` | Non-whitespace shorthand, ASCII only | `i` |
`.` | Dot matches any ASCII character | `$` |
`*` `+` `?` | Shorthand quantifiers; using them is not recommended | |
`\X{00FF}[\X{00FF}-\X{00FF}]` | Unicode ranges | |
`\xFF[\xFF-\xFF]` | Hex ranges | |
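
A quick way to sanity-check generated strings is to match them back against the pattern they came from. The sketch below does not use the library at all: it takes a few pattern and result pairs from the table and checks them with PHP's built-in `preg_match`. The helper name `matchesWholeString` is made up for this illustration.

```php
<?php
// A few (pattern => example result) pairs taken from the table above.
$samples = [
    '(abcf)'     => 'abcf',
    '\((abcf)\)' => '(abcf)',
    'a{5}'       => 'aaaaa',
    'a{1,5}'     => 'aa',
    'a|(y|d){5}' => 'yyyyy',
    '\d'         => '1',
];

/**
 * Hypothetical helper: true when $candidate matches $pattern from start to end.
 * Assumes $pattern contains no "/" character, since "/" is used as the delimiter.
 */
function matchesWholeString(string $pattern, string $candidate): bool
{
    // The non-capturing group keeps alternation inside the \A ... \z anchors.
    return preg_match('/\A(?:' . $pattern . ')\z/', $candidate) === 1;
}

foreach ($samples as $pattern => $candidate) {
    $status = matchesWholeString($pattern, $candidate) ? 'ok' : 'MISMATCH';
    echo $pattern, ' => ', $candidate, ': ', $status, PHP_EOL;
}
```

The same check can be run on strings produced by the generator, which is useful when the generated data is itself meant to exercise a regex.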