Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Test Failure] EventSource_SendingRequest_PathAndQueryRedaction_LogsStartStop #106268

Closed
rzikm opened this issue Aug 12, 2024 · 4 comments · Fixed by #106294
Closed

[Test Failure] EventSource_SendingRequest_PathAndQueryRedaction_LogsStartStop #106268

rzikm opened this issue Aug 12, 2024 · 4 comments · Fixed by #106294
Labels
area-System.Net.Http blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' in-pr There is an active PR which will close this issue when it is merged Known Build Error Use this to report build issues in the .NET Helix tab
Milestone

Comments

@rzikm
Copy link
Member

rzikm commented Aug 12, 2024

Build Information

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=768382
Build error leg or test failing: System.Net.Http.Functional.Tests.TelemetryTest_Http11.EventSource_SendingRequest_PathAndQueryRedaction_LogsStartStop
Pull request:

Error Message

Fill the error message using step by step known issues guidance.

{
  "ErrorMessage": ["System.Net.Http.Functional.Tests.TelemetryTest.ValidateEventCounters", "Assert.Contains() Failure: Filter not matched in collection"],
  "ErrorPattern": "",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}

Example failure

System.Net.Http.Functional.Tests.TelemetryTest_Http11.EventSource_SendingRequest_PathAndQueryRedaction_LogsStartStop(uriTail: "/test/path?q1=a&q2=b", disableRedaction: True) [FAIL]
      Microsoft.DotNet.RemoteExecutor.RemoteExecutionException : Remote process failed with an unhandled exception.
      Stack Trace:

        Child exception:
          Xunit.Sdk.ContainsException: Assert.Contains() Failure: Filter not matched in collection
        Collection: [0, 0, 0, 0, 0, ···]
        /_/src/libraries/System.Net.Http/tests/FunctionalTests/TelemetryTest.cs(666,0): at System.Net.Http.Functional.Tests.TelemetryTest.ValidateEventCounters(ConcurrentQueue`1 events, Int32 requestCount, Boolean shouldHaveFailures, Int32 versionMajor, Boolean requestLeftQueue)
        /_/src/libraries/System.Net.Http/tests/FunctionalTests/TelemetryTest.cs(479,0): at System.Net.Http.Functional.Tests.TelemetryTest.<>c.<<EventSource_SendingRequest_PathAndQueryRedaction_LogsStartStop>b__7_0>d.MoveNext()
        --- End of stack trace from previous location ---
        /_/src/Microsoft.DotNet.RemoteExecutor/src/Program.cs(65,0): at Microsoft.DotNet.RemoteExecutor.Program.Main(String[] args)

        Child process:
          System.Net.Http.Functional.Tests, Version=9.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51 System.Net.Http.Functional.Tests.TelemetryTest+<>c System.Threading.Tasks.Task <EventSource_SendingRequest_PathAndQueryRedaction_LogsStartStop>b__7_0(System.String, System.String, System.String)

        Child arguments:
          1.1, /test/path?q1=a&q2=b, /test/path?q1=a&q2=b

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=768382
Error message validated: [System.Net.Http.Functional.Tests.TelemetryTest.ValidateEventCounters Assert.Contains() Failure: Filter not matched in collection]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 8/12/2024 1:44:58 PM UTC

Report

Build Definition Test Pull Request
768382 dotnet/runtime System.Net.Http.Functional.Tests.TelemetryTest_Http20.EventSource_SendingRequest_PathAndQueryRedaction_LogsStartStop

Summary

24-Hour Hit Count 7-Day Hit Count 1-Month Count
0 1 1
@rzikm rzikm added blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' Known Build Error Use this to report build issues in the .NET Helix tab labels Aug 12, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Aug 12, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

@rzikm
Copy link
Member Author

rzikm commented Aug 12, 2024

The check fails at

Assert.True(eventCounters.TryGetValue("requests-started-rate", out double[] requestRate));
Assert.Contains(requestRate, r => r > 0);

Which is weird because the check above it passes

Assert.True(eventCounters.TryGetValue("requests-started", out double[] requestsStarted));
Assert.Equal(requestCount, requestsStarted[^1]);

But both counters are using the same underlying field _startedRequests

// The cumulative number of HTTP requests started since the process started.
_startedRequestsCounter ??= new PollingCounter("requests-started", this, () => Interlocked.Read(ref _startedRequests))
{
DisplayName = "Requests Started",
};
// The number of HTTP requests started per second since the process started.
_startedRequestsPerSecondCounter ??= new IncrementingPollingCounter("requests-started-rate", this, () => Interlocked.Read(ref _startedRequests))
{
DisplayName = "Requests Started Rate",
DisplayRateTimeScale = TimeSpan.FromSeconds(1)
};

@MihaZupan
Copy link
Member

MihaZupan commented Aug 12, 2024

@noahfalk

#105548 introduces a new race condition where the first increment reported by an IncrementingPollingCounter can be inaccurate.
Because we're now calling UpdateMetric from the background timer thread instead of as part of the EventSource initialization, any changes to the counter value made between the EventSource creation and the timer thread first running will be dropped.

Consider the following repro project that always prints a 1 on .NET 8, but not on 9.
The setup is similar to how we use EventSources in the BCL.

Simplified repro project
using System.Diagnostics.Tracing;

using var listener = new Listener();

var log = MyEventSource.Log;

// If you add some delay, you will see the increment=1 on .NET 9 as well
// Thread.Sleep(1000);

log.SomeInterestingValue++;

Console.ReadKey();

class MyEventSource : EventSource
{
    public static MyEventSource Log { get; } = new();

    private IncrementingPollingCounter? _couter;

    public int SomeInterestingValue;

    private MyEventSource() : base(nameof(MyEventSource)) { }

    protected override void OnEventCommand(EventCommandEventArgs command)
    {
        if (command.Command == EventCommand.Enable)
        {
            _couter ??= new IncrementingPollingCounter("counter", this, () => SomeInterestingValue);
        }
    }
}

class Listener : EventListener
{
    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        if (eventSource.Name != nameof(MyEventSource)) return;

        EnableEvents(eventSource, EventLevel.Informational, EventKeywords.None,
            new Dictionary<string, string?> { { "EventCounterIntervalSec", 0.5f.ToString() } });
    }

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        if (eventData.EventSource.Name != nameof(MyEventSource)) return;

        var counters = (IDictionary<string, object>)eventData.Payload![0]!;

        Console.WriteLine($"Increment: {counters["Increment"]}");
    }
}

I'm assuming this wasn't intentional?

@noahfalk
Copy link
Member

I'm assuming this wasn't intentional?

While it wasn't an explicit goal to break such a test, it was intentional to make that callback asynchronous. All calls to EventListener.OnEventSourceCreated() are holding a lock and invoking an IncrementingEventCounter callback function while holding that lock can lead to a deadlock (the issue that #105548 was fixing).

More discussion about the tests for those counters here: dotnet/aspnetcore#57259

@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Aug 12, 2024
@dotnet-policy-service dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label Aug 13, 2024
@karelz karelz added this to the 9.0.0 milestone Sep 3, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Oct 4, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Net.Http blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' in-pr There is an active PR which will close this issue when it is merged Known Build Error Use this to report build issues in the .NET Helix tab
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants