Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent duplicate invocations of per-compilation analyzer actions for… #3105

Merged
merged 1 commit into from
May 29, 2015

Conversation

mavasani
Copy link
Contributor

… a given symbol/syntax node in IDE.

User scenario: An analyzer author writes a compilation end analyzer, which registers symbol and/or syntax node actions to maintain some state about them, and a compilation end action to report diagnostics based on the final sate. When running in the IDE, the analyzer will get duplicate callbacks to the symbol/syntax node actions for the same symbol/node in the compilation: once while computing active document diagnostics and once while computing project diagnostics. This can cause one of the following issues:

  1. Crash the analyzer if it is using some map keyed on the callback symbol/node, and doesn't handle duplicate callbacks.
  2. Corrupt the analyzer state by recording the state twice for a single symbol/node.
  3. Possibly cause the analyzer to enter a race or deadlock.

Fix Description: Fix is to ensure that when computing the compilation end diagnostics for such stateful analyzers, we use a new instance of the analyzer. All the compilation wide actions are then invoked on this new instance of the analyzer, avoiding duplicate callbacks into the same instance.

Fixes #248

Testing: Added a regression test + existing tests.

/cc @ManishJayaswal @srivatsn This is for 1.0 (stable).

@mavasani
Copy link
Contributor Author

@JohnHamby @heejaechang @tmeschter @jmarolf can you please review?


analyzerExecutor = GetAnalyzerExecutor(analyzer, compilation, localDiagnostics.Add);
analyzerActions = await this.GetAnalyzerActionsAsync(analyzer, analyzerExecutor).ConfigureAwait(false);
analyzerToDispose = analyzer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to move this assignment to be immediately after the assignment to "analyzer" within the try above, to avoid risk of something that might throw an exception creeping in between that assignment and this one. (If that would not be correct, because ClearAnalyzerState requires the two statements above to execute first, then I suggest a comment to the effect that this is the earliest correct point for this assignment.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I think it might be safer to just move the assignment ahead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might leak. since analyzer will hand out to outer system. assumption was analyzer either leave forever or all handed out analyzer will be cleaned up when reference is removed. which was okay since analyzer were shared.

now, newly created analyzer is handed out. for host analyzer, newly created analyzer might live very long time.

what is the reason we need to clone? analyzer itself is stateless right? the initialize method is the starting point of the state, why not just re-initialize those?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this for the case where there might be a person who put host wide state? what happen if a user put things in static field, cloning own fix those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@heejaechang

  1. Regarding the leak: Cloned analyzer instance is only handed to the ReanalyzeDiagnosticGetter, and the instance is only used for core analyzer callbacks into the AnalyzerExecutor, so the host analyzer manager never has access to the cloned instance. One place where the current fix can cause leak is for analyzer exception diagnostics, as they will be reported for the cloned instance. I will tweak the current fix so that the exception handler for this case reports exception diagnostics against the original analyzer instance, and behaves as if everything happened on the original instance, including cleaning up the diagnostics. I don't see any other leak possibility, do you know of other caches where this code path can leak?

  2. Regarding the reason for fix: We cannot assume anything about analyzer being stateless, i.e.

    1. We cannot make duplicate callbacks to Initialize on same analyzer instance and
    2. We cannot make duplicate callbacks to RegisterCompilationStartAction for same compilation and
    3. We cannot make duplicate callbacks for same symbol or node or code block from a given compilation to an analyzer.

    All of these would violate our analyzer execution semantics, and this change fixes the case where IDE driver violates this. A simple compilation analyzer that say has a map from declared symbol to some value (not on analyzer field, but on the scope of compilation start action), and adds a kvp on this map on every symbol action callback will cause it to crash due to this issue. Such an analyzer is completely correct and abides by our guaranteed execution semantics, so this change aims at avoiding such a scenario.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JohnHamby @srivatsn can confirm if the analyzer semantics noted in #(2) above are indeed true, and if so, whether or not they should be altered at this point. I don't know if an analyzer would expect/handle being called repeatedly to Initialize.
Probably this doc comment and this document's first paragraph for Ordering of actions section needs more clarity.

If we feel its ok to invoke Initialize multiple times on the same analyzer instance, then the fix for this issue would be a simple one line fix here: Just clear all existing analyzer state on AnalyzerManager at this point and rest should just work by re-invoking Initialize/CompilationStartActions when asked for analyzer actions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have specified that Initialize will be called no more than once per analyzer instance. It's hard to predict what might break as a consequence of violating that. One might imagine an analyzer that acquires some resource in Initialize and frees it in a finalizer. Calling Initialize more than once for an instance of such an analyzer could leak the resource.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such resource should be acquired in constructor.

  •      heejae
    

From: John Hamby [mailto:[email protected]]
Sent: Thursday, May 28, 2015 12:38 PM
To: dotnet/roslyn
Cc: HeeJae Chang
Subject: Re: [roslyn] Prevent duplicate invocations of per-compilation analyzer actions for… (#3105)

In src/Features/Core/Diagnostics/EngineV1/DiagnosticAnalyzerDriver.cshttps://github.com//pull/3105#discussion_r31269579:

  •                    // Doing so on the original analyzer instance might cause duplicate callbacks into the analyzer.
    
  •                    // So we create a new instance of this analyzer and compute compilation diagnostics on that instance.
    
  •                    try
    
  •                    {
    
  •                        analyzer = Activator.CreateInstance(analyzer.GetType()) as DiagnosticAnalyzer;
    
  •                    }
    
  •                    catch
    
  •                    {
    
  •                        // Unable to created a new analyzer instance, bail out on reporting diagnostics.
    
  •                        return;
    
  •                    }
    
  •                    analyzerExecutor = GetAnalyzerExecutor(analyzer, compilation, localDiagnostics.Add);
    
  •                    analyzerActions = await this.GetAnalyzerActionsAsync(analyzer, analyzerExecutor).ConfigureAwait(false);
    
  •                    analyzerToDispose = analyzer;
    

We have specified that Initialize will be called no more than once per analyzer instance. It's hard to predict what might break as a consequence of violating that. One might imagine an analyzer that acquires some resource in Initialize and frees it in a finalizer. Calling Initialize more than once for an instance of such an analyzer could leak the resource.


Reply to this email directly or view it on GitHubhttps://github.com//pull/3105/files#r31269579.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For analyzers in particular, acquiring an expensive resource might best be done in Initialize rather than in a constructor, because the instance might be constructed only to determine that the analyzer will not be used (due to all of its diagnostics being suppressed).

The point I was trying to make is that things might go wrong for some analyzers if we don't obey the specified contract. Debating whether a particular way things might go wrong could be avoided by changing the design of an analyzer is not going to resolve the issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @JohnHamby on this. Changing the semantics at this stage might break existing analyzers, who rely on our current semantics.

Btw, I am seeing this issue very often for a new statful code block analyzer that I am trying to write for #2949. I think we ought to fix this issue for 1.0, regardless of whether we clone analyzer or change Initialize semantics.

@msJohnHamby
Copy link
Contributor

I sign off. This looks to me as if it solves the issue, and I do not see correctness risks. The fix does execute some actions twice (with separate state), and so there are arbitrary space and time implications depending on the behavior of an analyzer. The issue of more-than-strictly-necessary action execution is by itself sufficient to warrant another serious look at overhauling the incremental analyzer for V2, but this is a good change, and probably the best available, for V1.

@mavasani
Copy link
Contributor Author

Thanks John. I agree with your analysis that this can be fixed in a better and correct way with v2, but this is probably the best, low risk fix for current engine.

@mavasani mavasani force-pushed the DuplicateAnalyzerCallbacks branch 2 times, most recently from d8b6c0f to 229d1fe Compare May 28, 2015 10:20
… a given symbol/syntax node in IDE.

User scenario: An analyzer author writes a compilation end analyzer, which registers symbol and/or syntax node actions to maintain some state about them, and a compilation end action to report diagnostics based on the final sate. When running in the IDE, the analyzer will get duplicate callbacks to the symbol/syntax node actions for the same symbol/node in the compilation: once while computing active document diagnostics and once while computing project diagnostics. This can cause one of the following issues:

1. Crash the analyzer if it is using some map keyed on the callback symbol/node, and doesn't handle duplicate callbacks.

2. Corrupt the analyzer state by recoring the state twice for a single symbol/node.

3. Possible cause the analyzer to enter a race or deadlock.

Fix Description: Fix is to ensure that when computing the compilation end diagnostics for such stateful analyzers, we use a new instance of the analyzer. All the compilation wide actions are then invoked on this new instance of the analyzer, avoiding duplicate callbacks into the same instance.

Fixes dotnet#248

Testing: Added a regression test + existing tests.
@mavasani mavasani force-pushed the DuplicateAnalyzerCallbacks branch from 229d1fe to 7943a17 Compare May 28, 2015 10:24
@mavasani
Copy link
Contributor Author

Moving this back to shiproom as John/Heejae are fine with the proposed change.

@MattGertz
Copy link
Contributor

Approved 5/29 by ML Shiproom.

mavasani added a commit that referenced this pull request May 29, 2015
Prevent duplicate invocations of per-compilation analyzer actions for a given symbol/syntax node in IDE.

User scenario: An analyzer author writes a compilation end analyzer, which registers symbol and/or syntax node actions to maintain some state about them, and a compilation end action to report diagnostics based on the final sate. When running in the IDE, the analyzer will get duplicate callbacks to the symbol/syntax node actions for the same symbol/node in the compilation: once while computing active document diagnostics and once while computing project diagnostics. This can cause one of the following issues:

  1. Crash the analyzer if it is using some map keyed on the callback symbol/node, and doesn't handle duplicate callbacks.
  2. Corrupt the analyzer state by recording the state twice for a single symbol/node.
  3. Possibly cause the analyzer to enter a race or deadlock.

Fix Description: Fix is to ensure that when computing the compilation end diagnostics for such stateful analyzers, we use a new instance of the analyzer. All the compilation wide actions are then invoked on this new instance of the analyzer, avoiding duplicate callbacks into the same instance.

Fixes #248 

Testing: Added a regression test + existing tests.

Approved 5/29 by ML Shiproom.
@mavasani mavasani merged commit 0154c37 into dotnet:stabilization May 29, 2015
@msJohnHamby
Copy link
Contributor

Dozens of thanks, Manish!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants