-
Notifications
You must be signed in to change notification settings - Fork 887
Introduce AbstractWalker for performance #2093
Conversation
Thanks for working on this. Excited to see the performance gains. But I think there's still one too many classes. You have: class Rule {
...
public apply(sourceFile: ts.SourceFile): Lint.RuleFailure[] {
return this.applyWithWalker(new NullWalker(sourceFile, this.getOptions()));
}
}
class NullWalker extends Lint.AbstractWalker {
public walk(sourceFile: ts.SourceFile) {
const cb = (node: ts.Node): void => {
if (node.kind === ts.SyntaxKind.NullKeyword && !Lint.someAncestor(node.parent!, isTypeNode)) {
return this.addFailureAtNode(node, Rule.FAILURE_STRING);
}
return ts.forEachChild(node, cb);
};
return ts.forEachChild(sourceFile, cb);
}
} But the second class only implements one method. An abstract class with a single method is a Java-esque workaround for a lack of ordinary functions. This is JavaScript, we just use a function! All we need is an ordinary function that adds failures to a context and returns nothing. class Rule {
...
public apply(sourceFile: ts.SourceFile) {
return this.applyWithWalk(walk);
}
}
function walk(ctx: Lint.WalkContext): void {
ts.forEachChild(ctx.sourceFile, recur);
function cb(node: ts.Node) {
if (node.kind === ts.SyntaxKind.NullKeyword && !Lint.someAncestor(node.parent!, isTypeNode)) {
ctx.addFailureAtNode(node, Rule.FAILURE_STRING);
}
ts.forEachChild(node, cb);
}
} (Note that a |
src/language/walker/walker.ts
Outdated
return this.failures; | ||
} | ||
|
||
public hasOption(option: any) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should encourage parsing options up-front rather than calling hasOption
(potentially deep within a loop). Since options are extraneous to this class' responsibility of adding failures, couldn't this all (the options
property and hasOptions
) be removed from here? Options could simply be passed to walk
as an ordinary argument -- or parsed first, then passed as structured data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that idea. I have the following in mind:
class AbstractWalker<T> {
...
constructor(public readonly sourceFile: ts.SourceFile, public readonly options: T) {
...
}
}
This way you have the choice to parse the options in Rule
or inside the constructor of your specific walker implementation ... or you can opt out completely by setting the type parameter to any
;
src/language/walker/walker.ts
Outdated
|
||
public abstract walk(sourceFile: ts.SourceFile): void; | ||
|
||
public getSourceFile() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just expose the sourceFile
property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sourceFile
is exposed as public property. This methods only needs to be there to implement the IWalker
interface for compatibility with RuleWalker
@andy-hanson Regarding the Or do you mean replacing the Walker class with a simple function only where no instance methods are needed? // can be used as the context for a walk function
class WalkContext {
addFailure() {...}
getFailures() {...}
... // everything else that is on AbstractWalker except walk and getSourceFile
}
// will be used as Walker if the logic is more complex and involves instance methods in the concrete implementation
abstract class AbstractWalker extends WalkContext implements IWalker {
abstract walk(sourceFile);
getSourceFile() {...} // so satisfy the interface
} |
I don't think using a class for walking is necessary for any rule. A class where all methods but one are private is just as suspect as an interface with only one method. Private methods can be changed to just function declarations inside the scope of I put some sample code at andy-hanson#1. |
@andy-hanson I understand your point, but I would not recommend going that route (in this particular case). First of all this class should and will be extended to migrate things like Second, I wanted this to be somewhat compatible with Third, performance: closures are not at all your friend in the hot path.
I'm following the performance work done in node.js very closely. They try to avoid closures by wrapping callbacks and the variables it would close over in a class. As of node v7.x they even use standalone function declarations and use |
It wouldn't be too hard to write an |
I don't have a strong opinion whether one should prefer classes over closures or the other way around. At the end of the day it's only personal preference which style you choose. That's not my decision ... I'm just a random guy on the internet 😆 I made some changes based on your feedback. Now you can do both. Note that |
super.visitNode(node); | ||
} | ||
}; | ||
return ts.forEachChild(sourceFile, cb); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ts.forEachChild
actually does return a value (assuming the callback does), so this was confusing at first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's why the callback function here has the return type annotation : void
.
The return is actually there on purpose to serve V8's tail call optimizer. AND once ES6 proper tail calls get shipped (currently only available behind --harmony_tailcalls
flag) we get the ability to handle arbitrarily deep nested trees for free.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a return value, but why return it from walk
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All return statements in this method return void
. As explained above, it is much faster to use return for calls in tail position when it does not return anything.
src/language/rule/rule.ts
Outdated
this.limit = sourceFile.getFullWidth(); | ||
} | ||
|
||
public getFailures() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the point of making it private then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The failures array could just be declared as public readonly failures: RuleFailure[] = []
and let the AbstractWalker implement the getFailures
method.
Other than that code looks good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look really good. I can't wait to start using the lean AbstractWalker on the rules. I hope the changes make it in.
src/language/rule/rule.ts
Outdated
this.limit = sourceFile.getFullWidth(); | ||
} | ||
|
||
public getFailures() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The failures array could just be declared as public readonly failures: RuleFailure[] = []
and let the AbstractWalker implement the getFailures
method.
Also made some changes to simplify AbstractRule and added LanguageService to WalkContext
@andy-hanson @jmlopez-rod I moved I also made some changes in
export class WalkContext<T> {
+ public readonly languageService: ts.LanguageService;
constructor(public readonly sourceFile: ts.SourceFile,
public readonly ruleName: string,
public readonly options: T,
- public readonly languageService?: ts.LanguageService) {
+ languageService?: ts.LanguageService) {
+ if (languageService !== undefined) {
+ this.languageService = languageService;
+ } |
Does the solution you presented above actually work? if we don't pass the language service then its property won't ever get initialized. I would think typescript would complain about something like that. I think it is fine though, we can make it into a runtime error if we need the languageService and it wasn't provided. Otherwise if this is very popular (not to me right now) we could create a |
TypeScript doesn't check for uninitialized properties. Maybe you want a lint rule 😃. |
Ok, removed the languageService parameter property for now. As @andy-hanson suggested there should be a Seems like this is ready for now, awaiting final approval. I would rather wait some time to document this in case changes are required shortly after landing. Short summary why those changes were made:
|
src/language/walker/walker.ts
Outdated
export abstract class AbstractWalker implements IWalker { | ||
public readonly options: ReadonlyArray<any>; | ||
public readonly ruleName: string; | ||
private limit: number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fileLength
?
src/language/rule/abstractRule.ts
Outdated
} | ||
|
||
public abstract apply(sourceFile: ts.SourceFile, languageService: ts.LanguageService): RuleFailure[]; | ||
|
||
public applyWithWalker(walker: IWalker): RuleFailure[] { | ||
protected applyWithWalker(walker: IWalker): RuleFailure[] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a breaking change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make this public?
src/language/rule/abstractRule.ts
Outdated
public isEnabled(): boolean { | ||
return AbstractRule.isRuleEnabled(this.value); | ||
protected applyWithFunction(sourceFile: ts.SourceFile, walkFn: (ctx: WalkContext<void>) => void): RuleFailure[]; | ||
protected applyWithFunction<T>(sourceFile: ts.SourceFile, walkFn: (ctx: WalkContext<T>) => void, options: T): RuleFailure[]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, I don't think the generic option helps all that much since it's not constrained and only used by the user
src/language/rule/rule.ts
Outdated
|
||
export class WalkContext<T> { | ||
public readonly failures: RuleFailure[]; | ||
private limit: number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename to sourceLastIndex
or similar
src/language/rule/rule.ts
Outdated
@@ -265,3 +278,33 @@ export class RuleFailure { | |||
return new RuleFailurePosition(position, lineAndCharacter); | |||
} | |||
} | |||
|
|||
export class WalkContext<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
belongs in separate file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you put this class in a separate file?
super.visitNode(node); | ||
} | ||
}; | ||
return ts.forEachChild(sourceFile, cb); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a return value, but why return it from walk
?
src/language/rule/rule.ts
Outdated
this.addFailure(start, start + width, failure, fix); | ||
} | ||
|
||
public addFailure(start: number, end: number, failure: string, fix?: Fix) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should use addFailureFromStartToEnd
so it mirrors the methods in RuleWalker
? I know it's a bit of an awkward name... I see why you named it as you did though, since this is the new "base" method that all the other methods call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to leave it as it is now. This new implementation does not need to keep the mistakes made in RuleWalker
.
When migrating an existing rule, you will have some conflicts anyway because of the other missing methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming all rules will eventually be migrated over to the new style, I'm in agreement. The type system will find incorrect usages though, and the new naming is nicer.
If some rules are going to stay as extensions of RuleWalker
though, then it'll get confusing to have the different methods existing for perpetuity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, we should try to migrate all existing rules, even ScopeAwareRuleWalker
and subclasses thereof. At least that's my naive plan.
I don't know if or when we can eventually deprecate and remove RuleWalker
for implementers of custom rules. In the end it's your decision if you want to keep it for backwards compatibility.
Regarding the method name I don't have a strong opinion. I will change it if you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You wrote the code and have thought about it more than me - happy to defer to your preference and leave it as is on this one.
const cb = (node: ts.Node): void => { | ||
if (node.kind === ts.SyntaxKind.NumericLiteral) { | ||
this.checkNumericLiteral(node, (node as ts.NumericLiteral).text); | ||
} else if (node.kind === ts.SyntaxKind.PrefixUnaryExpression && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unimportant, but I do think this is a nice use case for a type guard function so that the upcoming type assertion can be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but I don't think it's a good idea to scatter the declarations of typeguard functions through the whole codebase. Currently it is done this way which leads to many duplicated small functions. Maybe we should add them to a new module src/language/typeguard.ts
as needed.
But that's not directly related to this PR and can be considered when migrating other exisiting rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, no need to change for this PR. Also saw the discriminated union issue you created, that would be a neat solution as well.
@@ -63,44 +61,29 @@ export class Rule extends Lint.Rules.AbstractRule { | |||
public static DEFAULT_ALLOWED = [ -1, 0, 1 ]; | |||
|
|||
public apply(sourceFile: ts.SourceFile): Lint.RuleFailure[] { | |||
return this.applyWithWalker(new NoMagicNumbersWalker(sourceFile, this.getOptions())); | |||
const allowedNumbers = this.ruleArguments.length > 0 ? this.ruleArguments : Rule.DEFAULT_ALLOWED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this concept of parsing rule options as part of the rule instead of part of the rule implementation - seems like a better separation of concerns to me.
Nice work @ajafff and everyone involved in reviewing! I really like the performance results we're seeing from these changes, should be a major improvement to lint times for all users. |
Reverted accidental breaking change. Move WalkContext to its own file Removed limit property and made it a local variable.
avoid unnecessary code churn
@nchen63 Your latest review comments were on an outdated diff. They are already fixed in a more recent commit. As of now this PR is only waiting for a final decision whether to rename |
@ajafff thanks! |
This is awesome @ajafff , can't wait till I can start using this. |
^ yeah, this is awesome. Really excited for the perf boost! |
of course, we'll need to convert the rules to really start seeing the benefits of this 😄 @nchen63 might be worth making a wiki page or issue with a list of rules that need to be converted in order of priority? |
Also might be worth addressing this issue first before doing a lot of conversion: #2093 (comment) |
@jkillian It would be nice to see this list of rules that are up to date so that we may refer to them as examples. A guide to writing and updating the rules would be welcomed as this could make developers mindful of possible performance pitfalls. |
PR checklist
What changes did you make?
Implement
AbstractWalker
as announced in #1900 (comment)As always: all names are available for discussion.
When extending
AbstractWalker
one has to implement thewalk
method to only handle the nodes needed for this rule.This is done to avoid the poor performance of
SyntaxWalker
. For exampleSyntaxWalker.visitNode
currently accounts for 25% of the total runtime - that means the method itself takes that long, not including the methods it calls along the way.I took the time to rewrite most of the rather simple rules to use the new AbstractWalker. That's not included in this PR, but can be viewed in https://github.com/ajafff/tslint/commit/e6754c998071eeda4afc061cc8c437a16aaf976f (plus some minor unrelated changes)
If you approve this PR, I will submit separate PRs for the rules I changed for easier review.
Perf numbers for linting
tslint
's sources:Total time of the actual linting (time spent inside
Linter.applyRule
) excluding any IO.master: ~9.0 sec
https://github.com/ajafff/tslint/commit/e6754c998071eeda4afc061cc8c437a16aaf976f: ~5.5 sec
//cc @jkillian @andy-hanson @jmlopez-rod @JoshuaKGoldberg as they seem to care for performance, too