Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

[3.1 port] Fix LTTng build for build environments with older liblttng-ust-dev #27294

Merged
merged 1 commit into from
Oct 21, 2019

Conversation

sywhang
Copy link

@sywhang sywhang commented Oct 18, 2019

Description

This fixes #27024.

One-line summary:
Official builds of .NET Core SDKs have broken LTTng support due to an unexpected macro redefinition causing the compiler to optimize out the bodies of LTTng event sinks.

More detailed description:
For all of our event sinks for LTTng, we wrap the call to fire events with tracepoint_enabled, which is a macro defined in a header file liblttng-ust-dev. This macro checks whether a particular probe point is enabled before we fire an event, so that we're not unnecessarily making the call to fire events through LTTng when nobody is listening to it.

For environments that don't want to build using LTTng, we have a macro definition for tracepoint_enabled, which used to be defined as:

#define tracepoint_enabled(provider, name) TRUE

This meant that for custom CoreCLR builds that were produced without LTTng library enabled, the runtime will always try to fire all events only to find out nobody is listening. So this behavior was changed to

#define tracepoint_enabled(provider, name) FALSE

Unfortunately this caused the compiler to optimize out the bodies for all the event sinks when such thing happens. This would be fine for custom builds that didn't want LTTng since they become no-ops. However, some build environments that we still use, we use old enough distributions that have old LTTng libraries, specifically CentOS 6 which uses liblttng-ust-dev version 2.4. This particular version of the library doesn't have the definition of tracepoint_enabled in its header, so we hit this bad behavior in builds produced in CentOS 6 container.

Even more unfortunately, CentOS 6 is the distro we use for our official SDK build containers for Linux, so this caused LTTng support in CoreCLR to regress in the official builds.

This fix changes the macro definition of tracepoint_enabled to use XplatEventLogger::IsEventLoggingEnabled method, which checks if the environment variable we require for using LTTng tracing (COMPlus_EnableEventLog) is enabled.

Customer Impact

Tracing using LTTng has been impacted. Any customer using LTTng for tracing their application is impacted.

Regression

Yes. From 2.x -> 3.x.

Risk

Low - changes are small and have been tested across variety of Linux distros that are affected by the code change (it doesn't affect non-Linux runtimes).

Tests

Tests for using LTTng and verifying the result traces have been written and were provided to vendors as part of the weekly diagnostics tool tests. dotnet/diagnostics#571 was filed to track the work item for adding automated testing for LTTng.

…otnet#27273)

* Fix macro redefinition to use XplatEventLogger instead of simply writing FALSE

* Fix linker error

* undo newline changes

* Some changes to comment

* Move wrapper export from eventpipe.cpp to eventtrace.cpp
@sywhang sywhang added this to the 3.1 milestone Oct 18, 2019
@sywhang sywhang requested a review from MeiChin-Tsai October 18, 2019 20:13
@janvorli
Copy link
Member

@sywhang A small correction:

Even more unfortunately, CentOS 6 is the distro we use for our official SDK build containers for Linux

We don't, we use CentOS 7.

@MeiChin-Tsai
Copy link

Approved for 3.1

@sywhang sywhang merged commit 7a8e08c into dotnet:release/3.1 Oct 21, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants