Add support for custom tokens #1

Merged
merged 11 commits into from
Aug 20, 2023
4 changes: 2 additions & 2 deletions .eslintrc.json
@@ -36,7 +36,7 @@
],
"@typescript-eslint/array-type": "error",
"@typescript-eslint/consistent-type-assertions": "error",
"@typescript-eslint/consistent-type-definitions": "error",
"@typescript-eslint/consistent-type-definitions": "off",
"@typescript-eslint/explicit-function-return-type": "off",
"@typescript-eslint/no-explicit-any": "off",
"@typescript-eslint/no-parameter-properties": "off",
@@ -83,7 +83,7 @@
"sort-imports": "off",
"sort-imports-es6-autofix/sort-imports-es6": "warn",
"spaced-comment": ["error", "always", { "markers": ["/"] }],
"@typescript-eslint/no-unused-vars": ["warn", { "argsIgnorePattern": "^_" }],
"@typescript-eslint/no-unused-vars": ["warn", { "argsIgnorePattern": "^_", "varsIgnorePattern": "^_" }],
"tsdoc/syntax": "warn"
}
}
139 changes: 139 additions & 0 deletions README.md
@@ -34,6 +34,7 @@ Create readable Regular Expressions with concise and flexible syntax.
- [Quantifiers](#quantifiers)
- [Groups](#groups)
- [Misc](#misc)
- [Custom Tokens](#custom-tokens)

## Installation

@@ -490,3 +491,141 @@ const coordinates = oneOrMore.digit
.toRegExp(Flag.Global);
console.log(coordinates.exec('[1,2] [3,4]')); // expect 2 matches
```

### Custom Tokens

Apart from extracting reusable expressions into variables, you can also define custom tokens directly and then use them
as if they were part of the readable-regexp package.

There are 3 types of custom tokens:

- **Constant**: tokens that modify the expression without needing parameters. These tokens are not callable.
- **Dynamic**: tokens that take parameters and return different expressions depending on the parameters given. These tokens must be called to provide them with parameters.
- **Mixed**: tokens with optional parameters. These tokens can be called or accessed directly.

Rules for custom tokens:

- The token name must be a valid JavaScript identifier.
- The token name must not conflict with any existing properties of `RegExpToken`.
- All custom tokens should be defined before **any** tokens are used to build regular expressions.
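
The first two rules can be checked mechanically. The sketch below is only an illustration: `checkTokenName` and `existingTokens` are hypothetical names that are not part of readable-regexp, and the identifier pattern is a simplified, ASCII-only approximation of the JavaScript grammar. The conflict check uses the `in` operator, which is the same kind of check `defineToken` performs against the builder prototype.

```typescript
// Hypothetical helper, not part of readable-regexp: checks the first two rules for a token name.
// The identifier check is a simplified ASCII-only approximation of the JavaScript grammar.
const ASCII_IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

function checkTokenName(name: string, existing: object): string | null {
  if (!ASCII_IDENTIFIER.test(name)) return `"${name}" is not a valid identifier`;
  // The `in` operator also sees inherited properties, so names on the prototype chain count as conflicts.
  if (name in existing) return `"${name}" conflicts with an existing property`;
  return null;
}

// A stand-in for the real token object, with two existing token names.
const existingTokens = { digit: true, word: true };

console.log(checkTokenName('severity', existingTokens)); // null (accepted)
console.log(checkTokenName('digit', existingTokens)); // conflict message
console.log(checkTokenName('not-valid', existingTokens)); // identifier message
```

Because `in` walks the prototype chain, inherited names such as `toString` are rejected as conflicts as well.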

Defining custom tokens is a 3-step process that requires minimal effort and maintains strong typing.

#### Step 1 - Extend the `RegExpToken` interface

**Only required for TypeScript users**

To maintain strong typing on custom tokens, you should extend the built-in `RegExpToken` interface with the type of your
custom token.

```ts
// Import the interface and helper types (as needed)
import { RegExpToken, LiteralFunction, GenericFunction, IncompleteToken } from 'readable-regexp';

// Extend the interface with a declaration
declare module 'readable-regexp' {
  interface RegExpToken {

    // ===== CONSTANT tokens =====

    severity: RegExpToken;
    matchAll: RegExpToken;

    // ===== DYNAMIC tokens =====
    // Dynamic tokens must intersect the IncompleteToken type to signify that parameters are required

    // Use the LiteralFunction type for tokens that take a single string parameter
    notExactly: LiteralFunction & IncompleteToken;
    // Use the GenericFunction type for all other dynamic tokens
    // Use a union for function overloading
    exactValue: GenericFunction<[num: number] | [bool: boolean], RegExpToken> & IncompleteToken;

    // ===== MIXED tokens =====

    alpha: GenericFunction<[upper: boolean], RegExpToken> & RegExpToken;

  }
}
```

#### Step 2 - Implement the tokens

Use the `defineToken` function to implement the tokens. This function takes the name of the token and its
implementation and returns the implemented token.

**For CONSTANT tokens:**

- Implement the `constant` function
- `this` in the function is a `RegExpToken` that contains the expression preceding the custom token
- The token can append to, wrap around, or modify `this` in any way

```ts
// Import the helpers used by the implementations in this step
import { RegExpToken, defineToken, exactly, lineStart } from 'readable-regexp';

const severity = defineToken('severity', {
  constant(this: RegExpToken) {
    // Append a constant expression
    return this.oneOf`error``warning``info``debug`;
  },
});

const matchAll = defineToken('matchAll', {
  constant(this: RegExpToken) {
    // Wrap around the existing expression
    return lineStart.match(this).lineEnd;
  },
});
```

**For DYNAMIC tokens:**

- Implement the `dynamic` function
- `this` in the function is a `RegExpToken` that contains the expression preceding the custom token
- Token arguments are passed to the `dynamic` function as arguments
- Template string arguments are converted to ordinary strings automatically

```ts
const notExactly = defineToken('notExactly', {
  // Tagged template literals are converted to ordinary strings in the "value" argument
  dynamic(this: RegExpToken, value: string) {
    return this.notAhead(exactly(value)).repeat(value.length).notCharIn``;
  },
});

const exactValue = defineToken('exactValue', {
  // Implementation of function overloads
  dynamic(this: RegExpToken, num: number | boolean) {
    return this.exactly(String(num));
  },
});
```
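
To illustrate how a template string argument can end up as an ordinary string, here is a minimal standalone sketch. `toLiteralArgs` and `show` are hypothetical names, not the library's internal helpers, and the use of `String.raw` to join the pieces is an assumption about one reasonable way to do the conversion.

```typescript
// Hypothetical helper: normalizes arguments so that fn`foo` and fn('foo') look the same.
function toLiteralArgs(args: unknown[]): unknown[] {
  const first = args[0];
  // A tagged template call passes a strings array with a `raw` property as the first argument.
  if (Array.isArray(first) && 'raw' in first) {
    const strings = first as unknown as TemplateStringsArray;
    const substitutions = args.slice(1);
    // String.raw re-joins the pieces and substitutions without processing escape sequences.
    return [String.raw(strings, ...substitutions)];
  }
  // Ordinary calls are passed through unchanged.
  return args;
}

// Accepts both call styles and reports the normalized arguments.
function show(...args: unknown[]): string {
  return JSON.stringify(toLiteralArgs(args));
}

console.log(show`foo`); // ["foo"]
console.log(show('foo')); // ["foo"]
console.log(show(42, true)); // [42,true]
```

With this kind of normalization, a `dynamic` implementation only has to handle a plain `string` parameter, regardless of how the caller wrote the argument.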

**For MIXED tokens:**

- Implement both the `constant` and `dynamic` functions
- Same rules apply for both functions
- If the custom token is called, the `dynamic` function will handle the call. Otherwise, the `constant` function will be used.

```ts
const alpha = defineToken('alpha', {
  constant(this: RegExpToken) {
    return this.charIn`a-zA-Z`;
  },
  dynamic(this: RegExpToken, upper: boolean) {
    return upper ? this.charIn`A-Z` : this.charIn`a-z`;
  },
});
```
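
One way to picture a token that can be either accessed or called is a getter that returns a function carrying the properties of the constant result. The sketch below uses a toy builder to show the idea; `ToyBuilder`, `MixedToken`, and `defineMixed` are hypothetical names for illustration and are much simpler than the library's own builder.

```typescript
// A toy builder, not the real RegExpBuilder: it only accumulates a pattern string.
class ToyBuilder {
  constructor(readonly pattern: string = '') {}
  append(part: string): ToyBuilder {
    return new ToyBuilder(this.pattern + part);
  }
}

// A mixed token is callable (dynamic) and also exposes the constant result.
type MixedToken = ((upper: boolean) => ToyBuilder) & { pattern: string };

// Declaration merging so that TypeScript knows about the new property.
interface ToyBuilder {
  alpha: MixedToken;
}

// Hypothetical helper: installs a getter that serves both access styles.
function defineMixed(
  name: string,
  constant: (self: ToyBuilder) => ToyBuilder,
  dynamic: (self: ToyBuilder, upper: boolean) => ToyBuilder
): void {
  Object.defineProperty(ToyBuilder.prototype, name, {
    get(): MixedToken {
      const self: ToyBuilder = this;
      // Calling the token runs the dynamic implementation.
      const call = (upper: boolean) => dynamic(self, upper);
      // Plain access sees the pattern produced by the constant implementation.
      return Object.assign(call, { pattern: constant(self).pattern });
    },
    enumerable: true,
    configurable: true,
  });
}

defineMixed(
  'alpha',
  self => self.append('[a-zA-Z]'),
  (self, upper) => self.append(upper ? '[A-Z]' : '[a-z]')
);

const start = new ToyBuilder('^');
console.log(start.alpha.pattern); // ^[a-zA-Z]
console.log(start.alpha(true).pattern); // ^[A-Z]
```

The getter runs on every access, so each use of the token sees the expression that precedes it in the chain.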

#### Step 3 - Use the token

Custom tokens are integrated into readable-regexp, so you can use them just like built-in tokens.

```ts
// Start an expression with a custom token returned by defineToken
const expr1 = notExactly`foo`.exactly`bar`.toRegExp(); // /(?!foo)[^]{3}bar/

// Use custom tokens as part of an expression chain
const expr2 = capture.severity.matchAll.toRegExp(); // /^(error|warning|info|debug)$/

// Use custom tokens from the `r` shorthand
const expr3 = r.alpha(false).toRegExp(); // /[a-z]/
```
9 changes: 9 additions & 0 deletions jest.config.ts
@@ -192,4 +192,13 @@ export default {

// Whether to use watchman for file crawling
// watchman: true,

transform: {
'^.+\\.tsx?$': [
'ts-jest',
{
isolatedModules: true,
},
],
},
};
157 changes: 157 additions & 0 deletions src/expression.ts
@@ -3,7 +3,9 @@ import {
CharClassFunction,
FlagUnion,
FlagsString,
GenericFunction,
GroupFunction,
IncompleteToken,
LiteralFunction,
RegExpLiteral,
RegExpModifier,
@@ -1876,3 +1878,158 @@ export const oneOf = r.oneOf;
* ```
*/
export const match = r.match;

// a list of tokens with RegExpBuilder assigned to a function
const funcTokens = [
not,
maybe,
maybeLazily,
zeroOrMore,
zeroOrMoreLazily,
oneOrMore,
oneOrMoreLazily,
capture,
group,
ahead,
behind,
notAhead,
notBehind,
];

/**
* Checks if a token intersects either the {@link RegExpToken} or the {@link IncompleteToken} interface.
*/
type IncompleteTokenCheck<TokenType> = TokenType extends RegExpToken
? true
: TokenType extends IncompleteToken
? true
: false;

/**
* Transforms template string arguments to string literals while leaving other arguments unchanged.
*/
type TransformStringLiteralArgs<Args> = Args extends [infer U, ...infer Rest]
? [
U extends TemplateStringsArray ? string : U, // replace template string with ordinary string
...(Rest extends unknown[] ? (unknown[] extends Rest ? [] : Rest) : Rest), // Remove the rest parameter if the argument is a string literal
]
: Args;

/**
* Specifies the configurations required for a given token type.
*/
type CustomTokenConfig<TokenType> = (TokenType extends RegExpToken
? {
constant: (this: RegExpToken) => RegExpToken;
}
: {}) &
(TokenType extends GenericFunction<infer Args, infer ReturnType>
? { dynamic: (this: RegExpToken, ...args: TransformStringLiteralArgs<Args>) => ReturnType }
: {});

const invalidReturnMessage = (val: unknown) =>
`Invalid return value from a constant token: ${val}.\n` +
'If you want to return any other values (which are non-chainable), ' +
'you should implement a dynamic token without parameters to make the chain termination explicit.';

function ensureTokenReturned<T extends object>(value: T): T {
if ((typeof value !== 'object' && typeof value !== 'function') || value === null)
throw new Error(invalidReturnMessage(value));
if ('toRegExp' in value && 'toString' in value) return value;
throw new Error(invalidReturnMessage(value));
}

/**
* Define a custom token that can be used in conjunction with other tokens.
* For a detailed guide on custom tokens, please read https://github.com/hlysine/readable-regexp#custom-tokens
*
* Notes:
*
* - TypeScript users should extend the {@link RegExpToken} interface to add their own custom tokens before calling this function.
* - The token name must be a valid JavaScript identifier.
* - The token name must not conflict with any existing properties of {@link RegExpToken}.
* - All custom tokens should be defined before **any** tokens are used to build regular expressions.
*
* @param tokenName - The name of the custom token. In TypeScript, it needs to be defined in the {@link RegExpToken} interface.
* @param config - The configuration for the custom token. Implement the `constant` method to return a constant token, or the `dynamic` method for a token that accepts arguments. Implement both for a mixed token.
* @returns The custom token
*
* @example
* Create a constant token
*
* Extend the RegExpToken interface to add a new token:
*
* ```ts
* import { RegExpToken } from 'readable-regexp';
*
* declare module 'readable-regexp' {
* interface RegExpToken {
* severity: RegExpToken;
* }
* }
* ```
*
* Implement the custom token:
*
* ```ts
* const severity = defineToken('severity', {
* constant(this: RegExpToken) {
* return this.oneOf`error` `warning` `info` `debug`;
* },
* });
* ```
*
* Use the custom token:
*
* ```ts
* // Referencing the token returned by the defineToken function
* console.log(severity.toString()); // (?:error|warning|info|debug)
*
* // Referencing the token in an expression
* console.log(lineStart.severity.lineEnd.toString()); // ^(?:error|warning|info|debug)$
* ```
*/
export function defineToken<Name extends keyof RegExpToken, Check = IncompleteTokenCheck<RegExpToken[Name]>>(
tokenName: Name,
config: Check extends true
? CustomTokenConfig<RegExpToken[Name]>
: {
error: 'Invalid token type: tokens should intersect the RegExpToken type if they are constant, or the IncompleteToken type if they are dynamic.';
}
): Check extends true ? RegExpToken[Name] : never {
if (tokenName in RegExpBuilder.prototype) throw new Error(`Token ${tokenName} already exists`);
Object.defineProperty(RegExpBuilder.prototype, tokenName, {
get() {
function configure(
this: RegExpBuilder,
...configArgs: RegExpToken[Name] extends (...args: infer Args) => any ? Args : never
): RegExpToken[Name] extends (...args: any) => infer Ret ? Ret : never {
if ('dynamic' in config) {
const value = isLiteralArgument(configArgs) ? [getLiteralString(configArgs)] : configArgs;
return (config.dynamic as (this: RegExpToken, ...args: typeof value) => ReturnType<typeof configure>).apply(
this,
value
);
} else {
throw new Error('Invalid arguments for ' + tokenName + '. This is probably a bug.');
}
}
if (`constant` in config && !('dynamic' in config)) {
return ensureTokenReturned(config.constant.apply(this));
} else if (!(`constant` in config) && 'dynamic' in config) {
return bindAsIncomplete(configure, this, tokenName);
} else if (`constant` in config && 'dynamic' in config) {
return assign(configure.bind(this), ensureTokenReturned(config.constant.apply(this)), false);
} else {
throw new Error(`The custom token ${tokenName} does not have any valid configurations.`);
}
},
enumerable: true,
configurable: true,
});
funcTokens.forEach(token => {
if (!('toRegExp' in token)) return;
Object.defineProperty(token, tokenName, Object.getOwnPropertyDescriptor(RegExpBuilder.prototype, tokenName)!);
});
return r[tokenName] as ReturnType<typeof defineToken>;
}
5 changes: 3 additions & 2 deletions src/helper.ts
@@ -74,10 +74,11 @@ export function bindAsIncomplete<T extends Function, U>(
* Copy all properties from source to target, including those in the prototype chain of source.
* @param target - The target object to which the properties will be added.
* @param source - The source object from which the properties will be copied.
* @param bindFunc - Whether to bind target to source.
* @returns The target object.
*/
export function assign<T extends Function, U>(target: T, source: U): T & U {
target = target.bind(source);
export function assign<T extends Function, U>(target: T, source: U, bindFunc = true): T & U {
if (bindFunc) target = target.bind(source);

const props: string[] = [];
do {
18 changes: 17 additions & 1 deletion src/index.ts
@@ -1,2 +1,18 @@
export * from './expression';
export { Flag } from './types';
export {
Flag,
/* These types are exported for the convenience of custom extensions */
type RegExpToken,
type LiteralFunction,
type GenericFunction,
type NumberFunction,
type TokenFunction,
type MultiTokenFunction,
type GroupFunction,
type AlternationFunction,
type NamedCaptureFunction,
type RepeatFunction,
type LimitFunction,
type CharClassFunction,
type IncompleteToken,
} from './types';