vmr '--test' is sometimes skipping test projects #4635

Closed
tmds opened this issue Sep 26, 2024 · 11 comments
Labels
area-testing Improvements in CI and testing


@tmds
Member

tmds commented Sep 26, 2024

In our CI, the two test projects covered by the vmr build's --test argument are no longer both being executed.

This happens on x64 as well as on s390x/ppc64le, though these seem to be two separate problems.

In the s390x/ppc64le case, the test command fails with a non-zero exit code and neither the scenario tests nor the smoke tests are run:

+ ./build.sh --with-system-libs +brotli++zlib+ --with-sdk /home/tester/dotnet/sdk --with-packages /home/tester/dotnet/packages --source-only /p:TargetRid=rhel.8-ppc64le --test
Using custom bootstrap SDK from '/home/tester/dotnet/sdk', version '9.0.100-rc.2.24470.1'
Found bootstrap versions: SDK 9.0.100-rc.2.24470.1, Arcade 9.0.0-beta.24466.2, NoTargets 3.7.0 and Traversal 3.4.0
MSBuild version 17.12.0-preview-24467-02+9fcd87631 for .NET

  Determining projects to restore...
  Restored /home/tester/dotnet/repo-projects/scenario-tests.proj (in 365 ms).
  Restored /home/tester/dotnet/test/tests.proj (in 365 ms).
  Restored /home/tester/dotnet/test/TestUtilities/TestUtilities.csproj (in 2.06 sec).
  Restored /home/tester/dotnet/test/Microsoft.DotNet.SourceBuild.SmokeTests/Microsoft.DotNet.SourceBuild.SmokeTests.csproj (in 2.5 sec).
  1 of 5 projects are up-to-date for restore.
  TestUtilities -> /home/tester/dotnet/artifacts/bin/TestUtilities/Release/TestUtilities.dll
  Microsoft.DotNet.SourceBuild.SmokeTests -> /home/tester/dotnet/artifacts/bin/Microsoft.DotNet.SourceBuild.SmokeTests/Release/Microsoft.DotNet.SourceBuild.SmokeTests.dll
  Building dependencies [source-build-reference-packages;arcade;command-line-api;source-build-externals] needed by 'scenario-tests'.
  Building dependencies [source-build-reference-packages] needed by 'arcade'.
  Building dependencies [source-build-reference-packages;arcade] needed by 'command-line-api'.
  Building dependencies [source-build-reference-packages;arcade] needed by 'source-build-externals'.
+ TEST_EXIT_CODE=1

In the x64 case, the test command exits with a success exit code, but it does not run the smoke tests:

+ ./build.sh --with-system-libs +brotli++libunwind++rapidjson++zlib+ --source-only --test
Found bootstrap versions: SDK 9.0.100-rc.1.24452.12, Arcade 9.0.0-beta.24408.2, NoTargets 3.7.0 and Traversal 3.4.0
MSBuild version 17.12.0-preview-24422-09+d17ec720d for .NET

  Determining projects to restore...
  Restored /home/tester/dotnet/test/tests.proj (in 144 ms).
  Restored /home/tester/dotnet/repo-projects/scenario-tests.proj (in 144 ms).
  Restored /home/tester/dotnet/test/TestUtilities/TestUtilities.csproj (in 2.83 sec).
  Restored /home/tester/dotnet/test/Microsoft.DotNet.SourceBuild.SmokeTests/Microsoft.DotNet.SourceBuild.SmokeTests.csproj (in 4.35 sec).
  1 of 5 projects are up-to-date for restore.
  TestUtilities -> /home/tester/dotnet/artifacts/bin/TestUtilities/Release/TestUtilities.dll
  Microsoft.DotNet.SourceBuild.SmokeTests -> /home/tester/dotnet/artifacts/bin/Microsoft.DotNet.SourceBuild.SmokeTests/Release/Microsoft.DotNet.SourceBuild.SmokeTests.dll
  Building dependencies [source-build-reference-packages;arcade;command-line-api;source-build-externals] needed by 'scenario-tests'.
  Building dependencies [source-build-reference-packages;arcade] needed by 'command-line-api'.
  Building dependencies [source-build-reference-packages] needed by 'arcade'.
  Building dependencies [source-build-reference-packages;arcade] needed by 'source-build-externals'.
  [05:03:28.12] Testing scenario-tests
  Running command:
    /home/tester/dotnet/src/scenario-tests/eng/common/build.sh --restore --test --ci --configuration Release /bl:artifacts/log/Release/Test.binlog
    With Environment Variables:
      TestRoot=/home/tester/dotnet/artifacts/scenario-tests/artifacts/
      DotNetRoot=/home/tester/dotnet/artifacts/obj/extracted-dotnet-sdk/
      TestSdkVersion=9.0.100-rc.2.24476.1
      AdditionalTestArgs=--xml /home/tester/dotnet/artifacts/TestResults/Release/scenario-tests/2024-09-26_05_03_28.xml --target-rid fedora.39-x64 --no-cleanup --no-traits Category=MultiTFM
      DotNetTool=/home/tester/dotnet/artifacts/obj/extracted-dotnet-sdk/dotnet
      _InitializeDotNetCli=/home/tester/dotnet/artifacts/obj/extracted-dotnet-sdk/
  ##vso[task.setvariable variable=Artifacts;isSecret=false;isOutput=true]/home/tester/dotnet/src/scenario-tests/artifacts
  ##vso[task.setvariable variable=Artifacts.Toolset;isSecret=false;isOutput=true]/home/tester/dotnet/src/scenario-tests/artifacts/toolset
  ##vso[task.setvariable variable=Artifacts.Log;isSecret=false;isOutput=true]/home/tester/dotnet/src/scenario-tests/artifacts/log/Release
  ##vso[task.setvariable variable=Temp;isSecret=false;isOutput=true]/home/tester/dotnet/src/scenario-tests/artifacts/tmp/Release
  ##vso[task.setvariable variable=TMP;isSecret=false;isOutput=true]/home/tester/dotnet/src/scenario-tests/artifacts/tmp/Release
  ##vso[task.setvariable variable=NUGET_PLUGIN_HANDSHAKE_TIMEOUT_IN_SECONDS;isSecret=false;isOutput=true]20
  ##vso[task.setvariable variable=NUGET_PLUGIN_REQUEST_TIMEOUT_IN_SECONDS;isSecret=false;isOutput=true]20
  
    Determining projects to restore...
    Restored /home/tester/dotnet/.packages/BootstrapPackages/microsoft.dotnet.arcade.sdk/9.0.0-beta.24408.2/tools/Tools.proj (in 5.11 sec).
    Determining projects to restore...
    Restored /home/tester/dotnet/src/scenario-tests/src/Microsoft.DotNet.ScenarioTests.SdkTemplateTests/Microsoft.DotNet.ScenarioTests.SdkTemplateTests.csproj (in 6.06 sec).
    Test environment:
      Dotnet Root: /home/tester/dotnet/artifacts/obj/extracted-dotnet-sdk/
      Test root: /home/tester/dotnet/artifacts/scenario-tests/artifacts/
      Target RID: fedora.39-x64
      Sdk Version: 9.0.100-rc.2.24476.1
      Platform: LINUX
    [SKIP] Microsoft.DotNet.ScenarioTests.SdkTemplateTests.SdkTemplateTests.VerifyAspireTemplate
    Finished Microsoft.DotNet.ScenarioTests.SdkTemplateTests, Version=9.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35
    
    Tests run: 31, Errors: 0, Failures: 0, Skipped: 1. Time: 844.630497s
  
  Build succeeded.
      0 Warning(s)
      0 Error(s)
  
  Time Elapsed 00:14:18.29
  [05:17:48.18] Testing scenario-tests...done
+ EXIT_CODE=0

As for the smoke tests not running, we unfortunately have a large gap in our CI coverage because we didn't notice the rename of --run-smoke-test to --test in May. The change in behavior may be related to that rename.

From the CI ppc64le logs, we see that scenario-tests still ran with vmr 1a65f6db1eec78dfd8ebff48dda59ce0a0c86a70 (Sep 7) and no longer ran with vmr fe7af57fd36dcbeb0c07d899b66c9ab36e2201de.

cc @MichaelSimons @omajid @Swapnali911

@mthalman
Member

Can you attach binlogs for both of these scenarios? Use /bl:<binlog-output-path> in your call to the build script.
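
A minimal sketch of such an invocation, assuming the /bl: switch is passed on the build script's command line as described above. The helper name `build_with_binlog`, the `VMR_BUILD_SCRIPT` variable, and the default log path are hypothetical; the `--source-only --test` flags are taken from the x64 command in the issue description.

```shell
# Hypothetical wrapper: run the VMR test pass and write an MSBuild binary log.
# VMR_BUILD_SCRIPT and the default binlog path are assumptions for illustration.
build_with_binlog() {
  binlog_path="${1:-artifacts/log/vmr-test.binlog}"
  "${VMR_BUILD_SCRIPT:-./build.sh}" --source-only --test "/bl:${binlog_path}"
}
```

For example, `build_with_binlog /tmp/x64-test.binlog` would leave a binlog at that path that can then be attached to the issue.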

@MichaelSimons MichaelSimons added area-testing Improvements in CI and testing and removed untriaged area-ci-testing labels Sep 26, 2024
@Swapnali911

@mthalman here are the binlogs for the ppc64le scenario: 4635_ppc64le.zip

@mthalman
Member

These are likely two separate issues. Given that we have a binlog for the ppc scenario, let's move forward with diagnosing that. The MSBuild task that is returning the failure is here: https://github.com/dotnet/dotnet/blob/ed57bf8d1fecae71c1c23b92180afd93043a8836/src/vstest/src/Microsoft.TestPlatform.Build/Tasks/VSTestTask.cs#L48

Reverse engineering that logic, it looks like it's making this call:

dotnet exec /root/jenkins-scripts/ci/dotnet/sdk/sdk/9.0.100-rc.2.24474.1/vstest.console.dll --framework:.NETCoreApp,Version=v9.0 --logger:console;verbosity=normal --logger:trx;LogFileName=Microsoft.DotNet.SourceBuild.SmokeTests.trx --resultsDirectory:/root/jenkins-scripts/ci/dotnet/artifacts/TestResults/Release/ /root/jenkins-scripts/ci/dotnet/artifacts/bin/Microsoft.DotNet.SourceBuild.SmokeTests/Release/Microsoft.DotNet.SourceBuild.SmokeTests.dll

Try running that locally to debug things further.
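
When pasting that command into a shell, note that both `--logger` arguments contain a semicolon, which an unquoted shell command line treats as a command separator, so they need quoting. Below is a sketch of a local reproduction that also prints the exit code. The function name, its two positional parameters (SDK directory and artifacts directory), and the `DOTNET` override are hypothetical; the arguments themselves are copied from the reconstructed call above.

```shell
# Hypothetical helper mirroring the reconstructed vstest call.
# $1 = versioned SDK directory, $2 = VMR artifacts directory (both assumptions;
# point them at your own checkout).
run_smoke_tests_directly() {
  sdk_dir="$1"
  artifacts_dir="$2"
  # Quote arguments containing ';' so the shell does not split the command there.
  "${DOTNET:-dotnet}" exec "${sdk_dir}/vstest.console.dll" \
    "--framework:.NETCoreApp,Version=v9.0" \
    "--logger:console;verbosity=normal" \
    "--logger:trx;LogFileName=Microsoft.DotNet.SourceBuild.SmokeTests.trx" \
    "--resultsDirectory:${artifacts_dir}/TestResults/Release/" \
    "${artifacts_dir}/bin/Microsoft.DotNet.SourceBuild.SmokeTests/Release/Microsoft.DotNet.SourceBuild.SmokeTests.dll"
  status=$?
  # Report the exit code explicitly, since the build log hid the failure.
  echo "vstest exit code: ${status}"
  return "${status}"
}
```

Running it outside of MSBuild should show the test host's own console output directly, which is what was missing from the CI log.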

@tmds
Member Author

tmds commented Oct 3, 2024

I looked into this today.

I found out that on x64, the smoke tests do in fact execute, but they don't print any output to stdout. I think something in msbuild/vstest is redirecting the output.

I assume that we're not seeing the error that happens when the smoke tests fail to start on s390x/ppc64le for the same reason.

This issue looks related: microsoft/vstest#10358.

cc @nohwnd @rainersigwald

@rainersigwald
Member

rainersigwald commented Oct 3, 2024

@tmds yes, probably that. Can you easily set MSBUILDENSURESTDOUTFORTASKPROCESSES=1 as an env var around that invocation to work around?

@tmds
Member Author

tmds commented Oct 3, 2024

> Can you easily set MSBUILDENSURESTDOUTFORTASKPROCESSES=1 as an env var around that invocation to work around?

I have tried setting that before running the vmr build.sh with --test but that did not cause the standard output to show for the SmokeTests.

The tests.proj of the vmr has references to the SmokeTests project and also to the scenario-tests project. For the latter project the output does get printed (regardless of MSBUILDENSURESTDOUTFORTASKPROCESSES being explicitly set).

Logging from the CI:

+ export MSBUILDENSURESTDOUTFORTASKPROCESSES=1
+ MSBUILDENSURESTDOUTFORTASKPROCESSES=1
+ ./build.sh -bl --with-system-libs +brotli++libunwind++rapidjson++zlib+ --source-only --test
Found bootstrap versions: SDK 9.0.100-rc.1.24452.12, Arcade 9.0.0-beta.24408.2, NoTargets 3.7.0 and Traversal 3.4.0
MSBuild version 17.12.0-preview-24422-09+d17ec720d for .NET

  Determining projects to restore...
  Restored /home/tester/dotnet/repo-projects/scenario-tests.proj (in 149 ms).
  Restored /home/tester/dotnet/test/tests.proj (in 149 ms).
  Restored /home/tester/dotnet/test/TestUtilities/TestUtilities.csproj (in 1.78 sec).
  Restored /home/tester/dotnet/test/Microsoft.DotNet.SourceBuild.SmokeTests/Microsoft.DotNet.SourceBuild.SmokeTests.csproj (in 2.81 sec).
  1 of 5 projects are up-to-date for restore.
  TestUtilities -> /home/tester/dotnet/artifacts/bin/TestUtilities/Release/TestUtilities.dll
  Microsoft.DotNet.SourceBuild.SmokeTests -> /home/tester/dotnet/artifacts/bin/Microsoft.DotNet.SourceBuild.SmokeTests/Release/Microsoft.DotNet.SourceBuild.SmokeTests.dll
  Building dependencies [source-build-reference-packages;arcade;command-line-api;source-build-externals] needed by 'scenario-tests'.
  Building dependencies [source-build-reference-packages;arcade] needed by 'source-build-externals'.
  Building dependencies [source-build-reference-packages] needed by 'arcade'.
  Building dependencies [source-build-reference-packages;arcade] needed by 'command-line-api'.
  [16:54:02.55] Testing scenario-tests
  Running command:
    /home/tester/dotnet/src/scenario-tests/eng/common/build.sh --restore --test --ci --configuration Release /bl:artifacts/log/Release/Test.binlog
    With Environment Variables:
      TestRoot=/home/tester/dotnet/artifacts/scenario-tests/artifacts/
      DotNetRoot=/home/tester/dotnet/artifacts/obj/extracted-dotnet-sdk/
      TestSdkVersion=9.0.100-rtm.24503.1
      AdditionalTestArgs=--xml /home/tester/dotnet/artifacts/TestResults/Release/scenario-tests/2024-10-03_16_54_02.xml --target-rid fedora.39-x64 --no-cleanup --no-traits Category=MultiTFM
      DotNetTool=/home/tester/dotnet/artifacts/obj/extracted-dotnet-sdk/dotnet
      _InitializeDotNetCli=/home/tester/dotnet/artifacts/obj/extracted-dotnet-sdk/
  ##vso[task.setvariable variable=Artifacts;isSecret=false;isOutput=true]/home/tester/dotnet/src/scenario-tests/artifacts
  ##vso[task.setvariable variable=Artifacts.Toolset;isSecret=false;isOutput=true]/home/tester/dotnet/src/scenario-tests/artifacts/toolset
  ##vso[task.setvariable variable=Artifacts.Log;isSecret=false;isOutput=true]/home/tester/dotnet/src/scenario-tests/artifacts/log/Release
  ##vso[task.setvariable variable=Temp;isSecret=false;isOutput=true]/home/tester/dotnet/src/scenario-tests/artifacts/tmp/Release
  ##vso[task.setvariable variable=TMP;isSecret=false;isOutput=true]/home/tester/dotnet/src/scenario-tests/artifacts/tmp/Release
  ##vso[task.setvariable variable=NUGET_PLUGIN_HANDSHAKE_TIMEOUT_IN_SECONDS;isSecret=false;isOutput=true]20
  ##vso[task.setvariable variable=NUGET_PLUGIN_REQUEST_TIMEOUT_IN_SECONDS;isSecret=false;isOutput=true]20
  
    Determining projects to restore...
    Retrying 'FindPackagesByIdAsync' for source 'https://pkgs.dev.azure.com/dnceng/9ee6d478-d288-47f7-aacc-f6e6d082ae6d/_packaging/4828dfac-e9f8-49bc-acb6-319be99331fc/nuget/v3/flat2/microsoft.dotnet.signtool/index.json'.
    Response status code does not indicate success: 503 (Service Unavailable).
    Restored /home/tester/dotnet/.packages/BootstrapPackages/microsoft.dotnet.arcade.sdk/9.0.0-beta.24408.2/tools/Tools.proj (in 4.19 sec).
    Determining projects to restore...
    Restored /home/tester/dotnet/src/scenario-tests/src/Microsoft.DotNet.ScenarioTests.SdkTemplateTests/Microsoft.DotNet.ScenarioTests.SdkTemplateTests.csproj (in 8.27 sec).
    Test environment:
      Dotnet Root: /home/tester/dotnet/artifacts/obj/extracted-dotnet-sdk/
      Test root: /home/tester/dotnet/artifacts/scenario-tests/artifacts/
      Target RID: fedora.39-x64
      Sdk Version: 9.0.100-rtm.24503.1
      Platform: LINUX
    [SKIP] Microsoft.DotNet.ScenarioTests.SdkTemplateTests.SdkTemplateTests.VerifyAspireTemplate
    Finished Microsoft.DotNet.ScenarioTests.SdkTemplateTests, Version=9.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35
    
    Tests run: 31, Errors: 0, Failures: 0, Skipped: 1. Time: 920.4452007s
  
  Build succeeded.
      0 Warning(s)
      0 Error(s)
  
  Time Elapsed 00:15:36.89
  [17:09:41.19] Testing scenario-tests...done
+ EXIT_CODE=0

@tmds
Member Author

tmds commented Oct 7, 2024

> The tests.proj of the vmr has references to the SmokeTests project and also to the scenario-tests project. For the latter project the output does get printed (regardless of MSBUILDENSURESTDOUTFORTASKPROCESSES being explicitly set).

Since one of test projects manages to produce output, I tried setting MSBUILDDISABLENODEREUSE=1.
With that, the smoke tests output also gets printed.
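
For reference, a sketch of how that workaround might be applied in a CI script. Only the `MSBUILDDISABLENODEREUSE=1` setting and the `--source-only --test` flags come from this thread; the function name and the `VMR_BUILD_SCRIPT` variable are hypothetical.

```shell
# Hypothetical CI helper: run the VMR tests with MSBuild node reuse disabled,
# which made the smoke test output appear on stdout in the run described above.
run_vmr_tests() {
  # Scope the variable to this one invocation rather than exporting it globally.
  MSBUILDDISABLENODEREUSE=1 "${VMR_BUILD_SCRIPT:-./build.sh}" --source-only --test "$@"
}
```

Extra arguments are passed through, so `run_vmr_tests -bl` would also request a binlog as in the earlier CI run.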

@rainersigwald it would seem that we are losing the smoke test output because it is being run on a node that has redirected stdout.

On ppc64le/s390x, the few jobs that have run so far didn't show the smoke test startup failure that we had observed earlier in the binlog. If I don't see an issue this week, I'll close this ticket.

@rainersigwald
Member

> @rainersigwald it would seem that we are losing the smoke test output because it is being run on a node that has redirected stdout.

That's what MSBUILDENSURESTDOUTFORTASKPROCESSES controls: when set, all worker nodes share stdout. I thought it also inherently disabled node reuse but I guess we did that at the dotnet test layer instead: https://github.com/dotnet/sdk/blob/c5dbaedae3a97f6dee69acf59d4af6454c218608/src/Cli/dotnet/commands/dotnet-test/Program.cs#L83-L84

@MichaelSimons
Member

@tmds, @mthalman - What work remains here or can this be closed?

@MichaelSimons
Member

Gentle ping @tmds - have you confirmed this fix on ppc64le/s390x?

@tmds
Member Author

tmds commented Oct 23, 2024

With the change we made, we now see the execution of both test suites. The issue on s390x/ppc64le is no longer reproducing. If it resurfaces, I think the error will now also show up in the log.
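
One way a CI job could catch a silent skip like this earlier is to check for the result files after the test run instead of relying on console output. This is only a sketch under assumptions: the default results directory and the file names are inferred from the vstest arguments and the scenario-tests `--xml` path quoted earlier in this thread, and the function name is hypothetical.

```shell
# Hypothetical CI guard: fail if an expected test result file is missing.
# The default directory and file names are assumptions based on the vstest
# arguments and the scenario-tests --xml path quoted earlier in this thread.
verify_test_results() {
  results_dir="${1:-artifacts/TestResults/Release}"
  missing=0
  # Smoke tests: non-empty trx file written by the vstest trx logger.
  if [ ! -s "${results_dir}/Microsoft.DotNet.SourceBuild.SmokeTests.trx" ]; then
    echo "missing smoke test results" >&2
    missing=1
  fi
  # Scenario tests: at least one timestamped xml file.
  if ! ls "${results_dir}"/scenario-tests/*.xml >/dev/null 2>&1; then
    echo "missing scenario test results" >&2
    missing=1
  fi
  return "${missing}"
}
```

Calling `verify_test_results` right after the `--test` invocation and failing the job on a non-zero return would flag a run in which either suite did not produce results.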

@tmds tmds closed this as completed Oct 23, 2024