Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PeriodicBatchingSink Sealed #74

Closed
dominicp1973 opened this issue Apr 10, 2024 · 12 comments
Closed

PeriodicBatchingSink Sealed #74

dominicp1973 opened this issue Apr 10, 2024 · 12 comments

Comments

@dominicp1973
Copy link

As mentioned in issue #69 class PeriodicBatchingSink is back to being Sealed. This caused regressions in the past and then was undone, and now it's Sealed again in 4.0.

@nblumhardt fixed the issue within dependent package Serilog.Sinks.Elasticsearch but @Verdurakh experienced the same problem with Serilog.Sinks.Elasticsearch.

And I now do with Serilog.Sinks.SumoLogic. Example - System.TypeLoadException: 'Could not load type 'Serilog.Sinks.SumoLogic.Sinks.SumoLogicSink' from assembly 'Serilog.Sinks.SumoLogic, Version=2.4.0.0, Culture=neutral, PublicKeyToken=null' because the parent type is sealed.'

So marking the class as Sealed impacts other packages such as Serilog.Sinks.SumoLogic and probably others.

Could we not make the base class NOT sealed? This would fix the issue across dependent packages. Previously, this was an accepted fix.

Thank you for your attention.

@nblumhardt
Copy link
Member

Hi! Thanks for the note.

Just to clarify the situation with the Elasticsearch sink - the fixed version shipped a few days after the user mentioned above commented on it.

The issue isn't only the sealing of the type - it's substantially redesigned, so there's a lot of gunk involved in trying to keep the old API working:

https://github.com/serilog/serilog-sinks-periodicbatching/pull/58/files

As it's quite a popular package, Serilog.Sinks.PeriodicBatching faces the usual "catch-22": either never break anyone, and leave the code to slowly decay, or update it and require some downstream changes.

We did choose the "never break anyone" path for a very, very long stretch of time, but in the end had to prioritize keeping the sink in good health for the benefit of the well-maintained consumers that use it.

The good news, though - the reason I think Serilog.Sinks.SumoLogic hasn't been updated in six years is because newer alternatives exist:

In addition to being maintained, these three follow the pattern recommended by Sumo of batching multiple events into each request, while Serilog.Sinks.SumoLogic sends a request per log event.

Let me know if the above helps. I'd also be happy to help you fork and update Serilog.Sinks.SumoLogic, either to publish separately or to contribute back to that project, if the maintainer is interested.

@dallasbeek
Copy link

sealed change also broke https://github.com/ThiagoBarradas/serilog-sinks-newrelic-logs

@dominicp1973
Copy link
Author

Thank you very much @nblumhardt for taking the time to research this! Package SumoLogic.Logging.Serilog does not support SeriLog version >= 3.0. However, serilog.sinks.http looks promising - thank you again. @dallasbeek found another broken dependency. I wasn't sure why the class is now marked as Sealed. Could it be valid for a a developer to want to derive from the class? If of course, if this is problematic (it leads to the wrong design), then a Sealed attribute makes total sense. I wasn't sure about the intent behind this change.

@nblumhardt
Copy link
Member

Thanks for the follow-up @dominicp1973.

If you're on a platform that supports .NET Standard 2 or better, SumoLogic.Logging.Serilog does support Serilog 3+:

image

Could it be valid for a a developer to want to derive from the class?

The virtual methods that the older projects used no longer exist to derive from, so there's more to it. There still might be room to work around this, but the number of affected sinks has become very small, so where PRs are being accepted I've just updated and refreshed them.

Keen to hear how you go with SumoLogic.Logging.Serilog, if you're able to get it running.

HTH,
Nick

@nblumhardt
Copy link
Member

@bartelink QQ, I've been thinking about this one some more, perhaps this battle isn't worth fighting for Serilog.Sinks.PeriodicBatching, if we were to move

  • its implementation,
  • IBatchedLogEventSink,
  • a new WriteTo.Batched(), and
  • an updated WriteTo.Async() sharing the same internal logic,

over to the Serilog package?

This would mean a dependency on System.Threading.Channels there, but its TFM support matrix is the same as Serilog's right now.

So much of the ecosystem uses Serilog.Sinks.PeriodicBatching and Async that for the cost of four or five classes, it could be worth considering simplifying things by just bundling them.

This would be done alongside a revert of the API change here, and (ideally) typeforwarding IBatchedLogEventSink over to Serilog.dll. Old, unmaintained packages could use the (deprecated, increasingly stagnant, but maintained) Serilog.Sinks.PeriodicBatching, while maintained/updated ones could drop the dependency and rely on Serilog only.

Apologies for the somewhat raw proposal, just a passing thought and seems to be the kind of thing you might spot some more "cons" in...

@bartelink
Copy link
Member

bartelink commented Apr 20, 2024

It feels like a good plan to me in general:

  • Channels is about as core as it gets
  • Abstract Sink types and packages cause lots of noise and busywork from a maintenance perspective
  • From the ecosystem perspective, being able to traverse these pieces in a single sln makes it much easier for someone less familiar with how everything fits together to map things

I'm less sure about bundling Async though. I get that in terms of code and/or deps, its more or less a no-brainer. I accept that it causes busywork and confusion for many consumers for it to be separated.

The per day download numbers (281.3K vs 32.2K) suggest:

  • they're an order of magnitude apart in terms of prevalence
  • (very arguably!) less people use Async than should (maybe because it means another dependency)

While I and others have made generic app-internal common wrapper libs that rely on Async, with only the real composition root per concrete apps doing the overall tree of sinks, I'm not sure it's a in that the temptation to abstract that is a good thing.

However, as alluded to in the other recent issue, I think having it separated is also very important:

  • the fact that is that the It's Just a Sink premise is an important part of the overall Serilog design; it's worth labouring the point of.
  • the fact that Channels and the behaviors it brings to a system overall in terms of memory and threading profile are non trivial - the sort of thing that in general you want to be able to mentally separate (e.g. you would be less inclined to inline it if it was tied to an impl that permanently grabs a thread like the current impl does)
  • the fact that there's lots of perf tests and other details that can be separated suggest that keeping it in a separate repo has value in terms of the perceived complexity of Serilog as a whole

So, while inlining this into core (keeping the same namespacing and general separation) makes sense:

  • having some common/abstract interfaces and/or helpers live in a Serilog.Sinks namespace might also be useful
  • some helpers that are pubternal (stuff that Async and/or forks of it might need) could perhaps live in a Serilog.Sinks.Internal namespace.
  • Async is not actually an abstract sink, even though it has a lot in common with infra like PeriodicBatching

I guess that's me using a lot of words to say that inlining Async is almost definitely a thing to defer for a release cycle or two. I wouldn't rule it out permanently, as there might be other wins to be had from moving it to core:

  • details like e.g. how you do buffer monitoring, might be solved better if they lived in core
  • stuff like auditing contracts can and will get solved better if Async is in your face as you thing through things like that
  • I know it causes busywork from the point of view of F# interactive/C# interactive to have a separated Async package in play
  • adding stuff to core is not something that can be taken easily in practice - obviously, as you know better than anyone, anything in there will have lots of forked libs and random things people in corners build anchored to every possible public affordance, regardless of whether it's labelled Sinks.Internal and/or any other such semi-formal pubternal contracts

I should point out some biases of mine though:

  • I have not spent too as much time recently times maintaining App-Common libs and/or the general sort of infra that people juggling microservices substrates do (lucky me!), so I might be missing some benefits of inlining Async might have
  • I recently did a similar set of changes in Equinox (putting abstract/common things into the core lib, even though it breaks the simplicity/architectural purity a bit, and foists an additional not-100% clearly Core to .NET (System.Runtime.Caching) dependency on something that once was very boring and inert. Experience there showed what one might expect:
    • lots of busywork of separation melting away, bringing opportunities to simplify by virtue of being colocated
    • I'm still glad that years of forced separation of common helpers brought a lot of good things in terms of making for an overall well balanced set of abstractions and separation of duties overall
    • having a more simple "concrete vs abstract" rule of thumb (because there's no "and 60% of the sinks use this common helper lib too" thing going on) is a win
    • BUT I'm still holding off on inlining one piece on pretty much the above basis (it's Just a Store. Yes, even if it's only tiny. Yes, even if in practice probably more systems should use it than do (you're probably not testing things well if your app does not at least have some areas where you test with an in-memory Store). (Though I can definitely see myself eventually relenting on this, as having something that would be a no-brainer to inline for most less-involved people be separated is more painful for new users than people who have spent days/weeks/months digging in)
    • There are also common helpers that are commonly used alongside a core lib, but would be distracting e.g this and this which it's easy to make the case for putting in the box, that have plenty subtlety and hence also benefit from living separate lives for as long as possible (that lib once depended on the core lib and fulfilled a role similar to Serilog.Sinks.PeriodicBatching in that the bulk of the concrete stores depended on it).
    • Concrete packages can inline forks or copy-and-paste renditions of helpers in the core - not everything has to be strictly DRY and hence public (e.g. one can Include things if you're in the same repo, without having to elevate things to full pubternal status. (Though you can also end up doing highly suspicious hacks if you give up on the non-circumventable discipline that separated repos bring!)

It seems the logical thing to do is to start a discussion issue in Core though; the chances are that there's plenty people about who have wrangled dependency diamonds wrt Serilog, Batching and Async and will have valuable concrete factors in mind (and won't be looking around here for such a discussion!)

@nblumhardt
Copy link
Member

Appreciate all the feedback, @bartelink!

RE Async, I think the interesting opportunity there is to line it up with Batching so that the two were variations of the same API and implementation. The new PeriodicBatchingSink manages to avoid tying up a background thread while waiting for work and while dispatching it - if we were able to make Async effectively PBS with a batch size of 1 (efficiently implemented) that might guide us towards exploring something like IAsyncLogEventSink as a way to release the worker thread while dispatching events there, too 🤔 ...

I agree that there's not as much immediate incentive to do anything with Async, but keen to think through it in case there are design considerations in how Batching comes across that would limit options or create inconsistencies bringing Async over later 👍

I'll raise a proposal ticket over in serilog/serilog. I'll also PR the API revert, here, since I can't afford more support time trying to wrangle unmaintained sinks over to the new API. At least as a result of this process, the updated sinks should be trivial to re-target to an in-the-box batching mechanism :-)

@bartelink
Copy link
Member

It does make sense when you put it like that. I guess if you have a pair of stacked spike PRs to illustrate the benefit of doing both, that'd help.

The existing buffer monitoring API in .Async should probably not come over as-is though (just coz I don't like what I did and can't imagine many people using it as it is!) - it would seem that a more general cleaned up thing that covers monitoring a PBS (including being able to report shedding of buffered data) should replace it.

The deciding factor might be your appetite to cover off all this stuff as a single span of work vs letting Async be function as a demo of a minimal NUll Object-like PBS and doing it separately. If feels like you have a good pictre of how it all might collapse nicely so don't let me talk you out of it if you have the bandwidth.

@dominicp1973
Copy link
Author

I was able to solve the incompatibility between PeriodicBatchingSink (class is now Sealed) and SumoLogic.Logging.Serilog (no longer maintained).

We can use Serilog.Sinks.Http instead, with the following:
Logger.Log = new LoggerConfiguration()
.ReadFrom.AppSettings()
.Destructure.ToMaximumCollectionCount(3)
.Destructure.ToMaximumStringLength(1024)
.Destructure.ToMaximumDepth(1)
.WriteTo.Http(sumoLogicEndpoint,
queueLimitBytes: null,
logEventLimitBytes: null,
logEventsInBatchLimit: 100,
batchSizeLimitBytes: null,
TimeSpan.FromSeconds(10),
textFormatter: new SumoLogicTextFormatter(),
batchFormatter: new SumoLogicBatchFormatter()

With those new classes:
// Sumo logic is configured to separate lines using ^[\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}.*
public class SumoLogicTextFormatter : MessageTemplateTextFormatter
{
private const string sumoLogicTemplate = "[{Timestamp:yyyy-MM-dd HH:mm:ss.fff zzz}] [{Level}] {Message}{NewLine}{Exception}";

    public SumoLogicTextFormatter() : base(sumoLogicTemplate) { }
}

// Sumo logic is configured to separate lines using ^\[\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}.*
public class SumoLogicBatchFormatter : IBatchFormatter
{
    public SumoLogicBatchFormatter() { }

    public void Format(IEnumerable<string> logEvents, TextWriter output)
    {
        // Data validation
        if (null == logEvents)
            return;
        if (null == output)
            return;

        // Write strings
        foreach (string logEvent in logEvents)
        {
            if (string.IsNullOrWhiteSpace(logEvent))
                continue;
            output.WriteLine(logEvent);
        }
    }
}

Assuming SumoLog has been configured to separate messages posted together using:
^[\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}.*

@nblumhardt
Copy link
Member

Reinstated in #75

@nblumhardt
Copy link
Member

Adding a <PackageReference Include="Serilog.Sinks.PeriodicBatching" Version="4.1.0" /> will now resolve this.

@dallasbeek
Copy link

confirming that #74 (comment) is now fixed....ty

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants