Prevent duplicate invocations of per-compilation analyzer actions for… #3105

mavasani · 2015-05-26T23:49:34Z

… a given symbol/syntax node in IDE.

User scenario: An analyzer author writes a compilation end analyzer, which registers symbol and/or syntax node actions to maintain some state about them, and a compilation end action to report diagnostics based on the final sate. When running in the IDE, the analyzer will get duplicate callbacks to the symbol/syntax node actions for the same symbol/node in the compilation: once while computing active document diagnostics and once while computing project diagnostics. This can cause one of the following issues:

Crash the analyzer if it is using some map keyed on the callback symbol/node, and doesn't handle duplicate callbacks.
Corrupt the analyzer state by recording the state twice for a single symbol/node.
Possibly cause the analyzer to enter a race or deadlock.

Fix Description: Fix is to ensure that when computing the compilation end diagnostics for such stateful analyzers, we use a new instance of the analyzer. All the compilation wide actions are then invoked on this new instance of the analyzer, avoiding duplicate callbacks into the same instance.

Fixes #248

Testing: Added a regression test + existing tests.

/cc @ManishJayaswal @srivatsn This is for 1.0 (stable).

mavasani · 2015-05-26T23:50:24Z

@JohnHamby @heejaechang @tmeschter @jmarolf can you please review?

msJohnHamby · 2015-05-27T20:47:27Z

src/Features/Core/Diagnostics/EngineV1/DiagnosticAnalyzerDriver.cs

+
+                        analyzerExecutor = GetAnalyzerExecutor(analyzer, compilation, localDiagnostics.Add);
+                        analyzerActions = await this.GetAnalyzerActionsAsync(analyzer, analyzerExecutor).ConfigureAwait(false);
+                        analyzerToDispose = analyzer;


You might want to move this assignment to be immediately after the assignment to "analyzer" within the try above, to avoid risk of something that might throw an exception creeping in between that assignment and this one. (If that would not be correct, because ClearAnalyzerState requires the two statements above to execute first, then I suggest a comment to the effect that this is the earliest correct point for this assignment.)

You are right, I think it might be safer to just move the assignment ahead.

this might leak. since analyzer will hand out to outer system. assumption was analyzer either leave forever or all handed out analyzer will be cleaned up when reference is removed. which was okay since analyzer were shared.

now, newly created analyzer is handed out. for host analyzer, newly created analyzer might live very long time.

what is the reason we need to clone? analyzer itself is stateless right? the initialize method is the starting point of the state, why not just re-initialize those?

is this for the case where there might be a person who put host wide state? what happen if a user put things in static field, cloning own fix those?

@heejaechang

Regarding the leak: Cloned analyzer instance is only handed to the ReanalyzeDiagnosticGetter, and the instance is only used for core analyzer callbacks into the AnalyzerExecutor, so the host analyzer manager never has access to the cloned instance. One place where the current fix can cause leak is for analyzer exception diagnostics, as they will be reported for the cloned instance. I will tweak the current fix so that the exception handler for this case reports exception diagnostics against the original analyzer instance, and behaves as if everything happened on the original instance, including cleaning up the diagnostics. I don't see any other leak possibility, do you know of other caches where this code path can leak?

Regarding the reason for fix: We cannot assume anything about analyzer being stateless, i.e.

We cannot make duplicate callbacks to Initialize on same analyzer instance and

We cannot make duplicate callbacks to RegisterCompilationStartAction for same compilation and

We cannot make duplicate callbacks for same symbol or node or code block from a given compilation to an analyzer.

All of these would violate our analyzer execution semantics, and this change fixes the case where IDE driver violates this. A simple compilation analyzer that say has a map from declared symbol to some value (not on analyzer field, but on the scope of compilation start action), and adds a kvp on this map on every symbol action callback will cause it to crash due to this issue. Such an analyzer is completely correct and abides by our guaranteed execution semantics, so this change aims at avoiding such a scenario.

@JohnHamby @srivatsn can confirm if the analyzer semantics noted in #(2) above are indeed true, and if so, whether or not they should be altered at this point. I don't know if an analyzer would expect/handle being called repeatedly to Initialize.
Probably this doc comment and this document's first paragraph for Ordering of actions section needs more clarity.

If we feel its ok to invoke Initialize multiple times on the same analyzer instance, then the fix for this issue would be a simple one line fix here: Just clear all existing analyzer state on AnalyzerManager at this point and rest should just work by re-invoking Initialize/CompilationStartActions when asked for analyzer actions.

We have specified that Initialize will be called no more than once per analyzer instance. It's hard to predict what might break as a consequence of violating that. One might imagine an analyzer that acquires some resource in Initialize and frees it in a finalizer. Calling Initialize more than once for an instance of such an analyzer could leak the resource.

Such resource should be acquired in constructor.

heejae

From: John Hamby [mailto:[email protected]]
Sent: Thursday, May 28, 2015 12:38 PM
To: dotnet/roslyn
Cc: HeeJae Chang
Subject: Re: [roslyn] Prevent duplicate invocations of per-compilation analyzer actions for… (#3105)

In src/Features/Core/Diagnostics/EngineV1/DiagnosticAnalyzerDriver.cshttps://github.com//pull/3105#discussion_r31269579:

// Doing so on the original analyzer instance might cause duplicate callbacks into the analyzer.

// So we create a new instance of this analyzer and compute compilation diagnostics on that instance.

try

{

analyzer = Activator.CreateInstance(analyzer.GetType()) as DiagnosticAnalyzer;

}

catch

{

// Unable to created a new analyzer instance, bail out on reporting diagnostics.

return;

}

analyzerExecutor = GetAnalyzerExecutor(analyzer, compilation, localDiagnostics.Add);

analyzerActions = await this.GetAnalyzerActionsAsync(analyzer, analyzerExecutor).ConfigureAwait(false);

analyzerToDispose = analyzer;

We have specified that Initialize will be called no more than once per analyzer instance. It's hard to predict what might break as a consequence of violating that. One might imagine an analyzer that acquires some resource in Initialize and frees it in a finalizer. Calling Initialize more than once for an instance of such an analyzer could leak the resource.

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/3105/files#r31269579.

For analyzers in particular, acquiring an expensive resource might best be done in Initialize rather than in a constructor, because the instance might be constructed only to determine that the analyzer will not be used (due to all of its diagnostics being suppressed).

The point I was trying to make is that things might go wrong for some analyzers if we don't obey the specified contract. Debating whether a particular way things might go wrong could be avoided by changing the design of an analyzer is not going to resolve the issue.

I agree with @JohnHamby on this. Changing the semantics at this stage might break existing analyzers, who rely on our current semantics.

Btw, I am seeing this issue very often for a new statful code block analyzer that I am trying to write for #2949. I think we ought to fix this issue for 1.0, regardless of whether we clone analyzer or change Initialize semantics.

msJohnHamby · 2015-05-27T20:54:34Z

I sign off. This looks to me as if it solves the issue, and I do not see correctness risks. The fix does execute some actions twice (with separate state), and so there are arbitrary space and time implications depending on the behavior of an analyzer. The issue of more-than-strictly-necessary action execution is by itself sufficient to warrant another serious look at overhauling the incremental analyzer for V2, but this is a good change, and probably the best available, for V1.

mavasani · 2015-05-28T00:28:22Z

Thanks John. I agree with your analysis that this can be fixed in a better and correct way with v2, but this is probably the best, low risk fix for current engine.

… a given symbol/syntax node in IDE. User scenario: An analyzer author writes a compilation end analyzer, which registers symbol and/or syntax node actions to maintain some state about them, and a compilation end action to report diagnostics based on the final sate. When running in the IDE, the analyzer will get duplicate callbacks to the symbol/syntax node actions for the same symbol/node in the compilation: once while computing active document diagnostics and once while computing project diagnostics. This can cause one of the following issues: 1. Crash the analyzer if it is using some map keyed on the callback symbol/node, and doesn't handle duplicate callbacks. 2. Corrupt the analyzer state by recoring the state twice for a single symbol/node. 3. Possible cause the analyzer to enter a race or deadlock. Fix Description: Fix is to ensure that when computing the compilation end diagnostics for such stateful analyzers, we use a new instance of the analyzer. All the compilation wide actions are then invoked on this new instance of the analyzer, avoiding duplicate callbacks into the same instance. Fixes dotnet#248 Testing: Added a regression test + existing tests.

mavasani · 2015-05-29T10:11:50Z

Moving this back to shiproom as John/Heejae are fine with the proposed change.

MattGertz · 2015-05-29T21:07:22Z

Approved 5/29 by ML Shiproom.

Prevent duplicate invocations of per-compilation analyzer actions for a given symbol/syntax node in IDE. User scenario: An analyzer author writes a compilation end analyzer, which registers symbol and/or syntax node actions to maintain some state about them, and a compilation end action to report diagnostics based on the final sate. When running in the IDE, the analyzer will get duplicate callbacks to the symbol/syntax node actions for the same symbol/node in the compilation: once while computing active document diagnostics and once while computing project diagnostics. This can cause one of the following issues: 1. Crash the analyzer if it is using some map keyed on the callback symbol/node, and doesn't handle duplicate callbacks. 2. Corrupt the analyzer state by recording the state twice for a single symbol/node. 3. Possibly cause the analyzer to enter a race or deadlock. Fix Description: Fix is to ensure that when computing the compilation end diagnostics for such stateful analyzers, we use a new instance of the analyzer. All the compilation wide actions are then invoked on this new instance of the analyzer, avoiding duplicate callbacks into the same instance. Fixes #248 Testing: Added a regression test + existing tests. Approved 5/29 by ML Shiproom.

msJohnHamby · 2015-05-29T21:31:26Z

Dozens of thanks, Manish!!

mavasani added the Area-Analyzers label May 26, 2015

dnfclas added the cla-already-signed label May 26, 2015

mavasani assigned MattGertz May 26, 2015

srivatsn unassigned MattGertz May 27, 2015

msJohnHamby reviewed May 27, 2015
View reviewed changes

mavasani force-pushed the DuplicateAnalyzerCallbacks branch 2 times, most recently from d8b6c0f to 229d1fe Compare May 28, 2015 10:20

mavasani force-pushed the DuplicateAnalyzerCallbacks branch from 229d1fe to 7943a17 Compare May 28, 2015 10:24

mavasani assigned MattGertz May 29, 2015

ManishJayaswal added this to the 1.0 (stable) milestone May 29, 2015

MattGertz added the Approved to merge label May 29, 2015

mavasani merged commit 0154c37 into dotnet:stabilization May 29, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prevent duplicate invocations of per-compilation analyzer actions for… #3105

Prevent duplicate invocations of per-compilation analyzer actions for… #3105

mavasani commented May 26, 2015

mavasani commented May 26, 2015

msJohnHamby May 27, 2015

mavasani May 27, 2015

heejaechang May 28, 2015

heejaechang May 28, 2015

mavasani May 28, 2015

mavasani May 28, 2015

msJohnHamby May 28, 2015

heejaechang May 28, 2015

msJohnHamby May 28, 2015

mavasani May 28, 2015

msJohnHamby commented May 27, 2015

mavasani commented May 28, 2015

mavasani commented May 29, 2015

MattGertz commented May 29, 2015

msJohnHamby commented May 29, 2015

Prevent duplicate invocations of per-compilation analyzer actions for… #3105

Prevent duplicate invocations of per-compilation analyzer actions for… #3105

Conversation

mavasani commented May 26, 2015

mavasani commented May 26, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

msJohnHamby commented May 27, 2015

mavasani commented May 28, 2015

mavasani commented May 29, 2015

MattGertz commented May 29, 2015

msJohnHamby commented May 29, 2015