Race when provisioning the EMSDK #52254

Closed
ViktorHofer opened this issue May 4, 2021 · 3 comments · Fixed by #52273
Labels: area-Infrastructure-mono, blocking-clean-ci (Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms')


@ViktorHofer
Member

eng/testing/tests.wasm.targets(138,5): error MSB3026: (NETCORE_ENGINEERING_TELEMETRY=Build) Could not copy "/usr/local/emscripten/emsdk/node/14.15.5_64bit/bin/npx" to "/__w/1/s/src/mono/wasm/emsdk/node/14.15.5_64bit/bin/npx". Beginning retry 1 in 1000ms. The process cannot access the file '/__w/1/s/src/mono/wasm/emsdk/node/14.15.5_64bit/bin/npx' because it is being used by another process. 

from https://dev.azure.com/dnceng/public/_build/results?buildId=1120261&view=logs&jobId=fddee5b2-3265-5a91-40c1-750d81e69f3f.

This happens when wasm test projects are bundled in parallel. Usually we try to avoid provisioning things in a context that might run in parallel and that doesn't reference the project in any way. In this case I believe the EMSDK provisioning should happen in the context of the sendtohelix project and not as part of a test project's bundle publishing.
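
The ordering problem can be illustrated with a minimal Python sketch. It is not the repository's MSBuild logic: `provision_emsdk`, `bundle_project`, `build_all`, and the project names are hypothetical stand-ins. It only shows that hoisting a shared provisioning step out of the parallel per-project work turns N concurrent copies into the same destination into a single copy that finishes before any parallel work starts.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# Hypothetical stand-ins: none of these names come from the dotnet/runtime build.
copy_count = 0
count_lock = threading.Lock()


def provision_emsdk():
    """Pretend to copy the EMSDK into the shared local directory."""
    global copy_count
    with count_lock:  # the lock only protects the counter, not a real copy
        copy_count += 1


def bundle_project(name, provision_per_project):
    """Pretend to bundle one wasm test project."""
    if provision_per_project:
        # Racy layout: every project copies into the same destination.
        provision_emsdk()
    return f"{name}: bundled"


def build_all(projects, provision_per_project):
    """Bundle all projects in parallel and return how many copies ran."""
    global copy_count
    copy_count = 0
    if not provision_per_project:
        # Hoisted layout: provision once, before any parallel work starts.
        provision_emsdk()
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(
            pool.map(lambda p: bundle_project(p, provision_per_project), projects)
        )
    assert len(results) == len(projects)
    return copy_count


projects = ["A.Tests", "B.Tests", "C.Tests"]
per_project_copies = build_all(projects, provision_per_project=True)
hoisted_copies = build_all(projects, provision_per_project=False)
print(per_project_copies, hoisted_copies)
```

With three projects, the per-project layout performs three copies that can overlap in time, while the hoisted layout performs one. In the real build the overlapping copies are what the copy task reports as "being used by another process".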

cc @steveisok @akoeplinger

@dotnet-issue-labeler dotnet-issue-labeler bot added area-Infrastructure-mono untriaged New issue has not been triaged by the area owner labels May 4, 2021
@ghost

ghost commented May 4, 2021

Tagging subscribers to this area: @directhex
See info in area-owners.md if you want to be subscribed.


@ViktorHofer ViktorHofer added blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' and removed untriaged New issue has not been triaged by the area owner labels May 4, 2021
@ViktorHofer ViktorHofer added this to the 6.0.0 milestone May 4, 2021
@ViktorHofer
Member Author

cc @safern just for awareness and additional thoughts.

@steveisok
Member

/cc @radical

@radical radical self-assigned this May 4, 2021
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label May 4, 2021
radical added a commit to radical/runtime that referenced this issue May 4, 2021
- For running AOT tests on helix, we need to send emsdk as a payload
- On the CI machines, emscripten is available in `/usr/local/emscripten/`
- but the helix tasks try to write a `.payload` file in that dir, which fails
  because of permissions

- So, we copy emsdk to local `src/mono/wasm/emsdk`
- But we were doing it as part of *every* test build!
    - this meant unnecessary copying, and races causing errors like:

```
/__w/1/s/eng/testing/tests.wasm.targets(138,5):
    error MSB3026: Could not copy "/usr/local/emscripten/emsdk/node/14.15.5_64bit/bin/node" to "/__w/1/s/src/mono/wasm/emsdk/node/14.15.5_64bit/bin/node".

    Beginning retry 1 in 1000ms. The process cannot access the file '/__w/1/s/src/mono/wasm/emsdk/node/14.15.5_64bit/bin/node' because it is being used by another process.  [/__w/1/s/src/libraries/System.Collections.Specialized/tests/System.Collections.Specialized.Tests.csproj]
```

- Instead, do that once when preparing work items for helix

Fixes dotnet#52254 .
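
As a rough illustration of the approach in the commit message above, the following Python sketch copies a read-only SDK directory into a writable local directory once, so that a marker file can be written next to the copy. The function name `prepare_emsdk_payload`, the marker content, and the temporary directories are assumptions made for the example. The actual change lives in the repository's MSBuild targets and is not reproduced here.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_emsdk_payload(system_dir: Path, local_dir: Path) -> Path:
    """Copy system_dir to local_dir once and return the local path.

    Hypothetical helper: intended to be called a single time while preparing
    work items, not from each test project's build.
    """
    if not local_dir.exists():
        shutil.copytree(system_dir, local_dir)
    # The writable copy can now hold the marker file that the
    # system-wide location would reject for lack of permissions.
    (local_dir / ".payload").write_text("marker\n")
    return local_dir


# Usage with throwaway directories standing in for the real paths.
with tempfile.TemporaryDirectory() as tmp:
    system = Path(tmp) / "system" / "emsdk"
    (system / "node" / "bin").mkdir(parents=True)
    (system / "node" / "bin" / "node").write_text("fake binary\n")

    local = Path(tmp) / "repo" / "emsdk"
    prepare_emsdk_payload(system, local)
    prepare_emsdk_payload(system, local)  # second call does not copy again

    copied = (local / "node" / "bin" / "node").read_text()
    has_marker = (local / ".payload").exists()
    system_untouched = not (system / ".payload").exists()
```

The existence check is not safe against concurrent callers on its own. The sketch relies on the helper being invoked from a single place, which mirrors the idea of doing the copy once when preparing work items.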
radical added a commit that referenced this issue May 5, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label May 5, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Jun 4, 2021