Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race in GetProjectOptionsFromScript prevents getting referenced packages #12683

Open
auduchinok opened this issue Feb 2, 2022 · 7 comments
Open
Labels
Area-FSI Bug Impact-Medium (Internal MS Team use only) Describes an issue with moderate impact on existing code.
Milestone

Comments

@auduchinok
Copy link
Member

auduchinok commented Feb 2, 2022

Repro steps:

  • call GetProjectOptionsFromScript
  • start the resulting async as task and pass a cancellation token
  • request cancellation on the token source
  • call GetProjectOptionsFromScript again (probably from a different thread)
  • start the task with a new cancellation token

The GetProjectOptionsFromScript calls are made using lock application to prevent concurrent access.

Calling and cancelling it like this may lead to situation where two scrips closures are calculated concurrently, despite one of them has already been cancelled by the callee. Here're two screenshots for different threads, both made while staying on the same breakpoint:

Screenshot 2022-02-02 at 14 17 37

Screenshot 2022-02-02 at 14 17 16

This leads to a situation where both threads try to write a generated file during the package restore. If the file is locked then the restore fails and results with missing package references are added to cache inside DependencyProvider, so all later GetProjectOptionsFromScript invocations return failed results for the same compiler directive collection.

The build failure details

The cached script closure:
Screenshot 2022-02-02 at 09 44 52

Exception stacktrace:
Screenshot 2022-02-02 at 09 45 19

@auduchinok auduchinok added the Bug label Feb 2, 2022
@dsyme
Copy link
Contributor

dsyme commented Feb 2, 2022

I don't quite understand the likely nature of the bug from the above

Calling and cancelling it like this may lead to situation where two scrips closures are calculated concurrently, despite one of them has already been cancelled by the callee.

When you say "calculated concurrently" could you indicate which routine or code you think is being executed concurrently for the same script?

Also when you say "has been cancelled" do you mean that the cancellation hasn't been effective?

Or is it more that a bad failing result (related to cancellation) has been written to the cache?

@auduchinok
Copy link
Member Author

auduchinok commented Feb 2, 2022

When you say "calculated concurrently" could you indicate which routine or code you think is being executed concurrently for the same script?

@dsyme Yes, please check the stacks on the attached screenshots. 🙂

Also when you say "has been cancelled" do you mean that the cancellation hasn't been effective?

I think this is what is happening indeed, the cancellation tokens are also captured on the screenshots.

Or is it more that a bad failing result (related to cancellation) has been written to the cache?

The very last screenshot shows a cached failure to write to a locked file inside an MSBuild task.

@dsyme
Copy link
Contributor

dsyme commented Feb 2, 2022

Looks like FSharpDependencyManager is allowing multiple concurrent callers to attempt resolutions within its working directory at the same time

@dsyme
Copy link
Contributor

dsyme commented Feb 2, 2022

@KevinRansom Could you take a look at this? I think FSharpDependencyManager needs to be made concurrency safe - not its data structures, but rather its use of the working directory as a dedicated non-concurrent resource.

@KevinRansom
Copy link
Member

I shall take a look. For sure there is no real reason beyond debuggability that we reuse the working directory. Although, when trying to figure out what is going on, that is my go to.

@dsyme
Copy link
Contributor

dsyme commented Feb 2, 2022

I suppose a risk of using separate directories (ie fixing this bug) is that we will end up spawning hundreds of resolutions as the programmer is editing the references of a script, each typing of a character doing a new spawn. Right now the bug is probably gating the process of doing resolutions in some way.

Equally we've gradually moved away from have FSharp.Compiler.Service manage resources and act as a mini-operating-system (e.g. by removing the reactor thread), but it's something to be aware of and watch for.

@dsyme dsyme added Impact-Medium (Internal MS Team use only) Describes an issue with moderate impact on existing code. Area-FSI labels Mar 3, 2022
@vzarytovskii vzarytovskii moved this to Not Planned in F# Compiler and Tooling Jun 17, 2022
@auduchinok
Copy link
Member Author

I suppose a risk of using separate directories (ie fixing this bug) is that we will end up spawning hundreds of resolutions as the programmer is editing the references of a script, each typing of a character doing a new spawn.

It should only create new directories/files for unique dependency provider keys, so it won't be happening for each keystroke, unless changing the reference directives:

let key = (packageManager.Key, scriptExt, Seq.toArray packageManagerTextLines, executionTfm, executionRid, implicitIncludeDir, mainScriptName, fileName)

And when changing the directives, it currently may reuse the files, leading to hard-to-diagnose race issues.

@vzarytovskii vzarytovskii added this to the Backlog milestone Sep 21, 2022
auduchinok added a commit to JetBrains/fsharp that referenced this issue Mar 9, 2023
auduchinok added a commit to JetBrains/fsharp that referenced this issue Jul 7, 2023
auduchinok added a commit to auduchinok/fsharp that referenced this issue Jul 11, 2023
auduchinok added a commit to auduchinok/fsharp that referenced this issue Oct 24, 2023
DedSec256 pushed a commit to DedSec256/fsharp that referenced this issue Nov 24, 2023
auduchinok added a commit to auduchinok/fsharp that referenced this issue Jan 30, 2024
auduchinok added a commit to auduchinok/fsharp that referenced this issue Jan 30, 2024
auduchinok added a commit to JetBrains/fsharp that referenced this issue Feb 14, 2024
auduchinok added a commit to JetBrains/fsharp that referenced this issue Jun 17, 2024
auduchinok added a commit to JetBrains/fsharp that referenced this issue Sep 18, 2024
auduchinok added a commit to auduchinok/fsharp that referenced this issue Oct 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-FSI Bug Impact-Medium (Internal MS Team use only) Describes an issue with moderate impact on existing code.
Projects
Status: New
Development

No branches or pull requests

4 participants