Skip to content
This repository has been archived by the owner on Mar 25, 2021. It is now read-only.

Introduce AbstractWalker for performance #2093

Merged
merged 9 commits into from
Feb 3, 2017
Merged

Introduce AbstractWalker for performance #2093

merged 9 commits into from
Feb 3, 2017

Conversation

ajafff
Copy link
Contributor

@ajafff ajafff commented Jan 21, 2017

PR checklist

What changes did you make?

Implement AbstractWalker as announced in #1900 (comment)
As always: all names are available for discussion.
When extending AbstractWalker one has to implement the walk method to only handle the nodes needed for this rule.
This is done to avoid the poor performance of SyntaxWalker. For example SyntaxWalker.visitNode currently accounts for 25% of the total runtime - that means the method itself takes that long, not including the methods it calls along the way.

I took the time to rewrite most of the rather simple rules to use the new AbstractWalker. That's not included in this PR, but can be viewed in https://github.com/ajafff/tslint/commit/e6754c998071eeda4afc061cc8c437a16aaf976f (plus some minor unrelated changes)
If you approve this PR, I will submit separate PRs for the rules I changed for easier review.

Perf numbers for linting tslint's sources:
Total time of the actual linting (time spent inside Linter.applyRule) excluding any IO.
master: ~9.0 sec
https://github.com/ajafff/tslint/commit/e6754c998071eeda4afc061cc8c437a16aaf976f: ~5.5 sec

//cc @jkillian @andy-hanson @jmlopez-rod @JoshuaKGoldberg as they seem to care for performance, too

@andy-hanson
Copy link
Contributor

andy-hanson commented Jan 21, 2017

Thanks for working on this. Excited to see the performance gains. But I think there's still one too many classes.

You have:

class Rule {
    ...
    public apply(sourceFile: ts.SourceFile): Lint.RuleFailure[] {
         return this.applyWithWalker(new NullWalker(sourceFile, this.getOptions()));
    }
}

class NullWalker extends Lint.AbstractWalker {
    public walk(sourceFile: ts.SourceFile) {
        const cb = (node: ts.Node): void => {
            if (node.kind === ts.SyntaxKind.NullKeyword && !Lint.someAncestor(node.parent!, isTypeNode)) {
                return this.addFailureAtNode(node, Rule.FAILURE_STRING);
            }
            return ts.forEachChild(node, cb);
        };
        return ts.forEachChild(sourceFile, cb);
     }
}

But the second class only implements one method. An abstract class with a single method is a Java-esque workaround for a lack of ordinary functions. This is JavaScript, we just use a function! All we need is an ordinary function that adds failures to a context and returns nothing.

class Rule {
    ...
    public apply(sourceFile: ts.SourceFile) {
        return this.applyWithWalk(walk);
    }
}

function walk(ctx: Lint.WalkContext): void {
    ts.forEachChild(ctx.sourceFile, recur);

    function cb(node: ts.Node) {
        if (node.kind === ts.SyntaxKind.NullKeyword && !Lint.someAncestor(node.parent!, isTypeNode)) {
            ctx.addFailureAtNode(node, Rule.FAILURE_STRING);
        }
        ts.forEachChild(node, cb);
    }
}

(Note that a WalkContext would just be an AbstractWalker with the abstract method taken out to make it a concrete class.)

return this.failures;
}

public hasOption(option: any) {
Copy link
Contributor

@andy-hanson andy-hanson Jan 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should encourage parsing options up-front rather than calling hasOption (potentially deep within a loop). Since options are extraneous to this class' responsibility of adding failures, couldn't this all (the options property and hasOptions) be removed from here? Options could simply be passed to walk as an ordinary argument -- or parsed first, then passed as structured data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that idea. I have the following in mind:

class AbstractWalker<T> {
    ...
    constructor(public readonly sourceFile: ts.SourceFile, public readonly options: T) {
        ...
    }
}

This way you have the choice to parse the options in Rule or inside the constructor of your specific walker implementation ... or you can opt out completely by setting the type parameter to any;


public abstract walk(sourceFile: ts.SourceFile): void;

public getSourceFile() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just expose the sourceFile property?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sourceFile is exposed as public property. This methods only needs to be there to implement the IWalker interface for compatibility with RuleWalker

@ajafff
Copy link
Contributor Author

ajafff commented Jan 21, 2017

@andy-hanson Regarding the WalkContext thing:
Are you suggesting to remove all Walker classes and implement everything as functions, even if there are other methods now? That would be very awkward to pass the context to every function that could basically just be an instance method and access this.

Or do you mean replacing the Walker class with a simple function only where no instance methods are needed?

// can be used as the context for a walk function
class WalkContext {
    addFailure() {...}
    getFailures() {...}
    ... // everything else that is on AbstractWalker except walk and getSourceFile
}

// will be used as Walker if the logic is more complex and involves instance methods in the concrete implementation
abstract class AbstractWalker extends WalkContext implements IWalker {
    abstract walk(sourceFile);
    getSourceFile() {...} // so satisfy the interface
}

@andy-hanson
Copy link
Contributor

andy-hanson commented Jan 22, 2017

I don't think using a class for walking is necessary for any rule. A class where all methods but one are private is just as suspect as an interface with only one method. Private methods can be changed to just function declarations inside the scope of walk. No need for this. all over.

I put some sample code at andy-hanson#1.

@ajafff
Copy link
Contributor Author

ajafff commented Jan 22, 2017

Private methods can be changed to just function declarations inside the scope of walk. No need for this. all over.

@andy-hanson I understand your point, but I would not recommend going that route (in this particular case).

First of all this class should and will be extended to migrate things like ScopeAwareRuleWalker. Currently there are already some rules extending the walker of another rule. Sure, that's all possible with closures, but not very maintainable IMO.

Second, I wanted this to be somewhat compatible with RuleWalker to make migration easier. I'm not a fan of refactoring just for the sake of it.

Third, performance: closures are not at all your friend in the hot path.
Some interesting reads on this matter:

I'm following the performance work done in node.js very closely. They try to avoid closures by wrapping callbacks and the variables it would close over in a class. As of node v7.x they even use standalone function declarations and use .bind() to provide context to the function, because this is slightly faster in recent V8 (v5.4 IIRC).

@andy-hanson
Copy link
Contributor

andy-hanson commented Jan 22, 2017

It wouldn't be too hard to write an AbstractWalker in terms of a WalkContext (just delegate or inherit from it), so why not support both?
Updated my perf tester to use both styles. They're very close in performance to each other and to plain recursion.
Note that the first link recommends to use a function on the prototype instead of adding a closure property per object, i.e. to avoid this.doSomething = () => { ... } and to prefer public doSomething() { ... }; not to replace every single closure with a class.

@ajafff
Copy link
Contributor Author

ajafff commented Jan 23, 2017

I don't have a strong opinion whether one should prefer classes over closures or the other way around. At the end of the day it's only personal preference which style you choose. That's not my decision ... I'm just a random guy on the internet 😆

I made some changes based on your feedback. Now you can do both. Note that hasOption is also gone and a parsed options object should be passed to AbstractWalker, WalkContext or Rule.applyWithFunction.
To showcase this I refactored two existing rules to either extend AbstractWalker or use a closure with WalkContext.
Looking forward for your feedback.

super.visitNode(node);
}
};
return ts.forEachChild(sourceFile, cb);
Copy link
Contributor

@andy-hanson andy-hanson Jan 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ts.forEachChild actually does return a value (assuming the callback does), so this was confusing at first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's why the callback function here has the return type annotation : void.

The return is actually there on purpose to serve V8's tail call optimizer. AND once ES6 proper tail calls get shipped (currently only available behind --harmony_tailcalls flag) we get the ability to handle arbitrarily deep nested trees for free.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a return value, but why return it from walk?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All return statements in this method return void. As explained above, it is much faster to use return for calls in tail position when it does not return anything.

this.limit = sourceFile.getFullWidth();
}

public getFailures() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of making it private then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failures array could just be declared as public readonly failures: RuleFailure[] = [] and let the AbstractWalker implement the getFailures method.

@andy-hanson
Copy link
Contributor

Other than that code looks good.

Copy link
Contributor

@jmlopez-rod jmlopez-rod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look really good. I can't wait to start using the lean AbstractWalker on the rules. I hope the changes make it in.

this.limit = sourceFile.getFullWidth();
}

public getFailures() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failures array could just be declared as public readonly failures: RuleFailure[] = [] and let the AbstractWalker implement the getFailures method.

Also made some changes to simplify AbstractRule and added LanguageService to WalkContext
@ajafff
Copy link
Contributor Author

ajafff commented Jan 24, 2017

@andy-hanson @jmlopez-rod I moved getFailures() to AbstractWalker and made failures public readonly in WalkContext based on your feedback.

I also made some changes in AbstractRule to get the ruleName and ruleArguments easier.

WalkContext and therefore AbstractWalker now have a fourth parameter languageService which is optional. This enables users of Rule.applyWithFunction to use the language service, program and type checker.
Because it is optional one always needs to check for undefined or use a non-null assertion (!) to use it. I currently have no idea how to circumvent that.
One possible solution is to make the parameter required, although I don't think it is actually needed very often.
Another very hacky solution would be the following:

 export class WalkContext<T> {
+    public readonly languageService: ts.LanguageService;
     constructor(public readonly sourceFile: ts.SourceFile,
                 public readonly ruleName: string,
                 public readonly options: T,
-                public readonly languageService?: ts.LanguageService) {
+                languageService?: ts.LanguageService) {
+        if (languageService !== undefined) {
+            this.languageService = languageService;
+        }

@jmlopez-rod
Copy link
Contributor

Does the solution you presented above actually work? if we don't pass the language service then its property won't ever get initialized. I would think typescript would complain about something like that. I think it is fine though, we can make it into a runtime error if we need the languageService and it wasn't provided. Otherwise if this is very popular (not to me right now) we could create a WalkContextWithLanguageService<T> class that extends WalkContext<T>?

@andy-hanson
Copy link
Contributor

TypeScript doesn't check for uninitialized properties. Maybe you want a lint rule 😃.
Anyway, agree that a WalkContextWithLanguageService would be preferrable rather than to have it optional on WalkContext. Since languageService is mostly used for the typeChecker, there should be a TypedContext providing typeChecker directly as a readonly property. But that all can wait.

@ajafff
Copy link
Contributor Author

ajafff commented Jan 25, 2017

Ok, removed the languageService parameter property for now. As @andy-hanson suggested there should be a TypedContext added in a followup PR.

Seems like this is ready for now, awaiting final approval.

I would rather wait some time to document this in case changes are required shortly after landing.

Short summary why those changes were made:

  • Add WalkContext and Rule.applyWithFunction: for simple rules you no longer need to declare a new class extending one of the base walkers. Instead you only declare a single function. (For more involved logic I would still recommend a class to keep it maintainable and avoid possible performance penalty)
  • Make options generic: encourage implementers to preparse their options to avoid constantly calling hasOption()
  • Implement AbstractWalker on top of WalkContext:
    • abstract method walk forces implementers to do the walking as needed for this rule, which yields a significant performance gain
    • can be used more or less as drop in replacement for RuleWalker
    • omits all those unnecessary class methods like createReplacement and friends: those are now static methods in Replacement

export abstract class AbstractWalker implements IWalker {
public readonly options: ReadonlyArray<any>;
public readonly ruleName: string;
private limit: number;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fileLength?

}

public abstract apply(sourceFile: ts.SourceFile, languageService: ts.LanguageService): RuleFailure[];

public applyWithWalker(walker: IWalker): RuleFailure[] {
protected applyWithWalker(walker: IWalker): RuleFailure[] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a breaking change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make this public?

public isEnabled(): boolean {
return AbstractRule.isRuleEnabled(this.value);
protected applyWithFunction(sourceFile: ts.SourceFile, walkFn: (ctx: WalkContext<void>) => void): RuleFailure[];
protected applyWithFunction<T>(sourceFile: ts.SourceFile, walkFn: (ctx: WalkContext<T>) => void, options: T): RuleFailure[];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, I don't think the generic option helps all that much since it's not constrained and only used by the user


export class WalkContext<T> {
public readonly failures: RuleFailure[];
private limit: number;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to sourceLastIndex or similar

@@ -265,3 +278,33 @@ export class RuleFailure {
return new RuleFailurePosition(position, lineAndCharacter);
}
}

export class WalkContext<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

belongs in separate file

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you put this class in a separate file?

super.visitNode(node);
}
};
return ts.forEachChild(sourceFile, cb);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a return value, but why return it from walk?

this.addFailure(start, start + width, failure, fix);
}

public addFailure(start: number, end: number, failure: string, fix?: Fix) {
Copy link
Contributor

@jkillian jkillian Jan 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should use addFailureFromStartToEnd so it mirrors the methods in RuleWalker? I know it's a bit of an awkward name... I see why you named it as you did though, since this is the new "base" method that all the other methods call

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to leave it as it is now. This new implementation does not need to keep the mistakes made in RuleWalker.
When migrating an existing rule, you will have some conflicts anyway because of the other missing methods

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming all rules will eventually be migrated over to the new style, I'm in agreement. The type system will find incorrect usages though, and the new naming is nicer.

If some rules are going to stay as extensions of RuleWalker though, then it'll get confusing to have the different methods existing for perpetuity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we should try to migrate all existing rules, even ScopeAwareRuleWalker and subclasses thereof. At least that's my naive plan.

I don't know if or when we can eventually deprecate and remove RuleWalker for implementers of custom rules. In the end it's your decision if you want to keep it for backwards compatibility.

Regarding the method name I don't have a strong opinion. I will change it if you want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You wrote the code and have thought about it more than me - happy to defer to your preference and leave it as is on this one.

const cb = (node: ts.Node): void => {
if (node.kind === ts.SyntaxKind.NumericLiteral) {
this.checkNumericLiteral(node, (node as ts.NumericLiteral).text);
} else if (node.kind === ts.SyntaxKind.PrefixUnaryExpression &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unimportant, but I do think this is a nice use case for a type guard function so that the upcoming type assertion can be removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but I don't think it's a good idea to scatter the declarations of typeguard functions through the whole codebase. Currently it is done this way which leads to many duplicated small functions. Maybe we should add them to a new module src/language/typeguard.ts as needed.

But that's not directly related to this PR and can be considered when migrating other exisiting rules.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, no need to change for this PR. Also saw the discriminated union issue you created, that would be a neat solution as well.

@@ -63,44 +61,29 @@ export class Rule extends Lint.Rules.AbstractRule {
public static DEFAULT_ALLOWED = [ -1, 0, 1 ];

public apply(sourceFile: ts.SourceFile): Lint.RuleFailure[] {
return this.applyWithWalker(new NoMagicNumbersWalker(sourceFile, this.getOptions()));
const allowedNumbers = this.ruleArguments.length > 0 ? this.ruleArguments : Rule.DEFAULT_ALLOWED;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this concept of parsing rule options as part of the rule instead of part of the rule implementation - seems like a better separation of concerns to me.

@jkillian
Copy link
Contributor

Nice work @ajafff and everyone involved in reviewing! I really like the performance results we're seeing from these changes, should be a major improvement to lint times for all users.

Reverted accidental breaking change.
Move WalkContext to its own file
Removed limit property and made it a local variable.
@ajafff
Copy link
Contributor Author

ajafff commented Feb 1, 2017

@nchen63 @jkillian I think I addressed all of your review comments.

avoid unnecessary code churn
@ajafff
Copy link
Contributor Author

ajafff commented Feb 3, 2017

@nchen63 Your latest review comments were on an outdated diff. They are already fixed in a more recent commit.

As of now this PR is only waiting for a final decision whether to rename AbstractWalker#addFailure to addFailureFromStartToEnd. #2093 (comment)

@nchen63 nchen63 merged commit 8a7cfbf into palantir:master Feb 3, 2017
@nchen63
Copy link
Contributor

nchen63 commented Feb 3, 2017

@ajafff thanks!

@jmlopez-rod
Copy link
Contributor

This is awesome @ajafff , can't wait till I can start using this.

@JoshuaKGoldberg
Copy link
Contributor

^ yeah, this is awesome. Really excited for the perf boost!

@jkillian
Copy link
Contributor

jkillian commented Feb 3, 2017

of course, we'll need to convert the rules to really start seeing the benefits of this 😄

@nchen63 might be worth making a wiki page or issue with a list of rules that need to be converted in order of priority?

@jkillian
Copy link
Contributor

jkillian commented Feb 3, 2017

Also might be worth addressing this issue first before doing a lot of conversion: #2093 (comment)

@jmlopez-rod
Copy link
Contributor

@jkillian It would be nice to see this list of rules that are up to date so that we may refer to them as examples. A guide to writing and updating the rules would be welcomed as this could make developers mindful of possible performance pitfalls.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants