Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] APM Correlations: Fix chart errors caused by inconsistent histogram range steps. #138259

Merged
merged 12 commits into from
Aug 11, 2022
Merged
Original file line number Diff line number Diff line change
Expand Up @@ -168,11 +168,16 @@ describe('useFailedTransactionsCorrelations', () => {
);

try {
// Each simulated request takes 100ms. After an inital 50ms
// we track the internal requests the hook is running and
// check the expected progress after these requests.
jest.advanceTimersByTime(50);
await waitFor(() => expect(result.current.progress.loaded).toBe(0));
jest.advanceTimersByTime(100);
await waitFor(() => expect(result.current.progress.loaded).toBe(0));
jest.advanceTimersByTime(100);
await waitFor(() => expect(result.current.progress.loaded).toBe(0));
qn895 marked this conversation as resolved.
Show resolved Hide resolved
jest.advanceTimersByTime(100);
await waitFor(() => expect(result.current.progress.loaded).toBe(0.05));

expect(result.current.progress).toEqual({
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -85,48 +85,54 @@ export function useFailedTransactionsCorrelations() {
fallbackResult: undefined,
};

const [overallHistogramResponse, errorHistogramRespone] =
await Promise.all([
// Initial call to fetch the overall distribution for the log-log plot.
callApmApi(
'POST /internal/apm/latency/overall_distribution/transactions',
{
signal: abortCtrl.current.signal,
params: {
body: {
...fetchParams,
percentileThreshold: DEFAULT_PERCENTILE_THRESHOLD,
chartType:
LatencyDistributionChartType.failedTransactionsCorrelations,
},
},
}
),
callApmApi(
'POST /internal/apm/latency/overall_distribution/transactions',
{
signal: abortCtrl.current.signal,
params: {
body: {
...fetchParams,
percentileThreshold: DEFAULT_PERCENTILE_THRESHOLD,
termFilters: [
{
fieldName: EVENT_OUTCOME,
fieldValue: EventOutcome.failure,
},
],
chartType:
LatencyDistributionChartType.failedTransactionsCorrelations,
// Initial call to fetch the overall distribution for the log-log plot.
const overallHistogramResponse = await callApmApi(
'POST /internal/apm/latency/overall_distribution/transactions',
{
signal: abortCtrl.current.signal,
params: {
body: {
...fetchParams,
percentileThreshold: DEFAULT_PERCENTILE_THRESHOLD,
chartType:
LatencyDistributionChartType.failedTransactionsCorrelations,
},
},
}
);

const {
overallHistogram,
totalDocCount,
percentileThresholdValue,
durationMin,
durationMax,
} = overallHistogramResponse;

const errorHistogramResponse = await callApmApi(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered not blocking this call? So we'd render the chart with the histogram and load the errors later? We could even add an indication that errors are still being fetched on the UI.

cc: @formgeist

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, I tweaked the loading behavior in 85b2ff2. After loading the overall histogram data we'll now already trigger an update that displays the chart with the available data and updates the progress bar. The updated jest tests also reflect the updated progress steps of the progress bar.

'POST /internal/apm/latency/overall_distribution/transactions',
{
signal: abortCtrl.current.signal,
params: {
body: {
...fetchParams,
percentileThreshold: DEFAULT_PERCENTILE_THRESHOLD,
termFilters: [
{
fieldName: EVENT_OUTCOME,
fieldValue: EventOutcome.failure,
},
},
}
),
]);
],
durationMin,
durationMax,
chartType:
LatencyDistributionChartType.failedTransactionsCorrelations,
},
},
}
);

const { overallHistogram, totalDocCount, percentileThresholdValue } =
overallHistogramResponse;
const { overallHistogram: errorHistogram } = errorHistogramRespone;
const { overallHistogram: errorHistogram } = errorHistogramResponse;

responseUpdate.errorHistogram = errorHistogram;
responseUpdate.overallHistogram = overallHistogram;
Expand Down Expand Up @@ -179,7 +185,12 @@ export function useFailedTransactionsCorrelations() {
{
signal: abortCtrl.current.signal,
params: {
body: { ...fetchParams, fieldCandidates: fieldCandidatesChunk },
body: {
...fetchParams,
fieldCandidates: fieldCandidatesChunk,
durationMin,
durationMax,
},
},
}
);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -86,20 +86,25 @@ export function useLatencyCorrelations() {
};

// Initial call to fetch the overall distribution for the log-log plot.
const { overallHistogram, totalDocCount, percentileThresholdValue } =
await callApmApi(
'POST /internal/apm/latency/overall_distribution/transactions',
{
signal: abortCtrl.current.signal,
params: {
body: {
...fetchParams,
percentileThreshold: DEFAULT_PERCENTILE_THRESHOLD,
chartType: LatencyDistributionChartType.latencyCorrelations,
},
const {
overallHistogram,
totalDocCount,
percentileThresholdValue,
durationMin,
durationMax,
} = await callApmApi(
'POST /internal/apm/latency/overall_distribution/transactions',
{
signal: abortCtrl.current.signal,
params: {
body: {
...fetchParams,
percentileThreshold: DEFAULT_PERCENTILE_THRESHOLD,
chartType: LatencyDistributionChartType.latencyCorrelations,
Comment on lines +89 to +103
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call is exactly the same as the one on x-pack/plugins/apm/public/components/app/correlations/use_failed_transactions_correlations.ts do you think it is worth extracting it and reusing it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we'd like to get this as a bugfix into 8.4 I'd like to avoid more refactoring, I added a comment to the meta issue in the tech debt section instead to follow up on it (Revisit hooks that fetch correlations results and possibly consolidate duplicate code like fetching the overall histogram.): #118009

},
}
);
},
}
);
responseUpdate.overallHistogram = overallHistogram;
responseUpdate.totalDocCount = totalDocCount;
responseUpdate.percentileThresholdValue = percentileThresholdValue;
Expand Down Expand Up @@ -192,7 +197,12 @@ export function useLatencyCorrelations() {
{
signal: abortCtrl.current.signal,
params: {
body: { ...fetchParams, fieldValuePairs: fieldValuePairChunk },
body: {
...fetchParams,
durationMin,
durationMax,
fieldValuePairs: fieldValuePairChunk,
},
},
}
);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -86,14 +86,18 @@ export const useTransactionDistributionChartData = () => {
params.serviceName &&
params.environment &&
params.start &&
params.end
params.end &&
overallLatencyData.durationMin &&
overallLatencyData.durationMax
) {
return callApmApi(
'POST /internal/apm/latency/overall_distribution/transactions',
{
params: {
body: {
...params,
durationMin: overallLatencyData.durationMin,
durationMax: overallLatencyData.durationMax,
percentileThreshold: DEFAULT_PERCENTILE_THRESHOLD,
termFilters: [
{
Expand All @@ -108,7 +112,7 @@ export const useTransactionDistributionChartData = () => {
);
}
},
[params]
[params, overallLatencyData.durationMin, overallLatencyData.durationMax]
);

useEffect(() => {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,11 +7,11 @@

import type { HistogramItem } from '../../../../../common/correlations/types';

import { replaceHistogramDotsWithBars } from '.';
import { replaceHistogramZerosWithMinimumDomainValue } from '.';

describe('TransactionDistributionChart', () => {
describe('replaceHistogramDotsWithBars', () => {
it('does the thing', () => {
describe('replaceHistogramZerosWithMinimumDomainValue', () => {
it('replaces zeroes', () => {
const mockHistogram = [
{ doc_count: 10 },
{ doc_count: 10 },
Expand All @@ -25,15 +25,17 @@ describe('TransactionDistributionChart', () => {
{ doc_count: 10 },
] as HistogramItem[];

expect(replaceHistogramDotsWithBars(mockHistogram)).toEqual([
expect(
replaceHistogramZerosWithMinimumDomainValue(mockHistogram)
).toEqual([
{ doc_count: 10 },
{ doc_count: 10 },
{ doc_count: 0.0001 },
{ doc_count: 0.0001 },
{ doc_count: 0.0001 },
{ doc_count: 0.5 },
{ doc_count: 0.5 },
{ doc_count: 0.5 },
{ doc_count: 10 },
{ doc_count: 10 },
{ doc_count: 0.0001 },
{ doc_count: 0.5 },
{ doc_count: 10 },
{ doc_count: 10 },
]);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -83,18 +83,17 @@ const getAnnotationsStyle = (color = 'gray'): LineAnnotationStyle => ({
},
});

// TODO Revisit this approach since it actually manipulates the numbers
// showing in the chart and its tooltips.
const CHART_PLACEHOLDER_VALUE = 0.0001;
// With a log axis, Elastic Charts would not draw a continuous line for values that are 0.
// By replacing the 0s with a minimum domain value of >0 the line will be drawn as intended.
// This is just to visually fix the line, for tooltips, that number will be again rounded down to 0.
const Y_AXIS_MIN_DOMAIN = 0.5;

// Elastic charts will show any lone bin (i.e. a populated bin followed by empty bin)
// as a circular marker instead of a bar
// This provides a workaround by making the next bin not empty
// TODO Find a way to get rid of this workaround since it alters original values of the data.
export const replaceHistogramDotsWithBars = (histogramItems: HistogramItem[]) =>
export const replaceHistogramZerosWithMinimumDomainValue = (
histogramItems: HistogramItem[]
) =>
histogramItems.reduce((histogramItem, _, i) => {
if (histogramItem[i].doc_count === 0) {
histogramItem[i].doc_count = CHART_PLACEHOLDER_VALUE;
histogramItem[i].doc_count = Y_AXIS_MIN_DOMAIN;
}
return histogramItem;
}, histogramItems);
Expand Down Expand Up @@ -141,7 +140,7 @@ export function DurationDistributionChart({
) ?? 0;
const yTicks = Math.max(1, Math.ceil(Math.log10(yMax)));
const yAxisDomain = {
min: 0.5,
min: Y_AXIS_MIN_DOMAIN,
max: Math.pow(10, yTicks),
};

Expand Down Expand Up @@ -178,6 +177,10 @@ export function DurationDistributionChart({
line: {
visible: true,
},
point: {
visible: false,
radius: 0,
},
},
axes: {
tickLine: {
Expand Down Expand Up @@ -266,26 +269,28 @@ export function DurationDistributionChart({
ticks={yTicks}
gridLine={{ visible: true }}
/>
{data.map((d, i) => (
{data.map((d) => (
<AreaSeries
key={d.id}
id={d.id}
xScaleType={ScaleType.Log}
yScaleType={ScaleType.Log}
data={replaceHistogramDotsWithBars(d.histogram)}
data={replaceHistogramZerosWithMinimumDomainValue(d.histogram)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think wrapping it on a useMemo would help? At least it wouldn't recalculate on every render.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Moved this to a useMemo() in 52cf77e.

curve={CurveType.CURVE_STEP_AFTER}
xAccessor="key"
yAccessors={['doc_count']}
color={d.areaSeriesColor}
fit="lookahead"
// To make the area appear without the orphaned points technique,
// we changed the original data to replace values of 0 with 0.0001.
fit="linear"
areaSeriesStyle={{
fit: {
line: { visible: true },
},
}}
// To make the area appear with a continuous line,
// we changed the original data to replace values of 0 with Y_AXIS_MIN_DOMAIN.
// To show the correct values again in tooltips, we use a custom tickFormat to round values.
// We can safely do this because all duration values above 0 are without decimal points anyway.
// An update for Elastic Charts is in the works to be able to customize the above "fit"
// attribute. Once that is available we can get rid of the full workaround.
// Elastic Charts issue: https://github.com/elastic/elastic-charts/issues/1489
tickFormat={(p) => `${Math.round(p)}`}
tickFormat={(p) => `${Math.floor(p)}`}
/>
))}
</Chart>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,14 +32,35 @@ export const fetchDurationHistogramRangeSteps = async ({
kuery,
query,
searchMetrics,
durationMinOverride,
durationMaxOverride,
}: CommonCorrelationsQueryParams & {
chartType: LatencyDistributionChartType;
setup: Setup;
searchMetrics: boolean;
}): Promise<number[]> => {
durationMinOverride?: number;
durationMaxOverride?: number;
}): Promise<{
durationMin?: number;
durationMax?: number;
rangeSteps: number[];
}> => {
const steps = 100;

if (durationMinOverride && durationMaxOverride) {
return {
durationMin: durationMinOverride,
durationMax: durationMaxOverride,
rangeSteps: getHistogramRangeSteps(
durationMinOverride,
durationMaxOverride,
steps
),
};
}

const { apmEventClient } = setup;

const steps = 100;
const durationField = getDurationField(chartType, searchMetrics);

// when using metrics data, ensure we filter by docs with the appropriate duration field
Expand Down Expand Up @@ -71,7 +92,7 @@ export const fetchDurationHistogramRangeSteps = async ({
);

if (resp.hits.total.value === 0) {
return getHistogramRangeSteps(0, 1, 100);
return { rangeSteps: getHistogramRangeSteps(0, 1, 100) };
}

if (
Expand All @@ -81,11 +102,15 @@ export const fetchDurationHistogramRangeSteps = async ({
isFiniteNumber(resp.aggregations.duration_max.value)
)
) {
return [];
return { rangeSteps: [] };
}

const min = resp.aggregations.duration_min.value;
const max = resp.aggregations.duration_max.value * 2;
const durationMin = resp.aggregations.duration_min.value;
const durationMax = resp.aggregations.duration_max.value * 2;

return getHistogramRangeSteps(min, max, steps);
return {
durationMin,
durationMax,
rangeSteps: getHistogramRangeSteps(durationMin, durationMax, steps),
};
};
Loading