Skip to content

Commit

Permalink
[ML] APM Correlations: Fix usage in load balancing/HA setups. (#115145)…
Browse files Browse the repository at this point in the history
… (#118004)

- The way we customized the use of search strategies caused issues with race conditions when multiple Kibana instances were used for load balancing. This PR migrates away from search strategies and uses regular APM API endpoints.
- The task that manages calling the sequence of queries to run the correlations analysis is now in a custom React hook (useFailedTransactionsCorrelations / useLatencyCorrelations) instead of a task on the Kibana server side. While they show up as new lines/files in the git diff, the code for the hooks is more or less a combination of the previous useSearchStrategy and the server side service files that managed queries and state.
- The consuming React UI components only needed minimal changes. The above mentioned hooks return the same data structure as the previously used useSearchStrategy. This also means functional UI tests didn't need any changes and should pass as is.
- API integration tests have been added for the individual new endpoints. The test files that were previously used for the search strategies are still there to simulate a full analysis run, the assertions for the resulting data have the same values, it's just the structure that had to be adapted.
- Previously all ES queries of the analysis were run sequentially. The new endpoints run ES queries in parallel where possible. Chunking is managed in the hooks on the client side.
- For now the endpoints use the standard current user's esClient. I tried to use the APM client, but it was missing a wrapper for the fieldCaps method and I ran into a problem when trying to construct a random_score query. Sticking to the esClient allowed to leave most of the functions that run the actual queries unchanged. If possible I'd like to pick this up in a follow up. All the endpoints still use withApmSpan() now though.
- The previous use of generators was also refactored away, as mentioned above, the queries are now run in parallel.
Because we might run up to hundreds of similar requests for correlation analysis, we don't want the analysis to fail if just a single query fails like we did in the previous search strategy based task. I created a util splitAllSettledPromises() to handle Promise.allSettled() and split the results and errors to make the handling easier. Better naming suggestions are welcome 😅 . A future improvement could be to not run individual queries but combine them into nested aggs or using msearch. That's out of scope for this PR though.
  • Loading branch information
walterra authored Nov 9, 2021
1 parent db0d42a commit 304911b
Show file tree
Hide file tree
Showing 97 changed files with 2,811 additions and 2,760 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -82,9 +82,5 @@ export const KS_TEST_THRESHOLD = 0.1;

export const ERROR_CORRELATION_THRESHOLD = 0.02;

/**
* Field stats/top values sampling constants
*/

export const SAMPLER_TOP_TERMS_THRESHOLD = 100000;
export const SAMPLER_TOP_TERMS_SHARD_SIZE = 5000;
export const DEFAULT_PERCENTILE_THRESHOLD = 95;
export const DEBOUNCE_INTERVAL = 100;
Original file line number Diff line number Diff line change
Expand Up @@ -24,12 +24,8 @@ export interface FailedTransactionsCorrelation extends FieldValuePair {
export type FailedTransactionsCorrelationsImpactThreshold =
typeof FAILED_TRANSACTIONS_IMPACT_THRESHOLD[keyof typeof FAILED_TRANSACTIONS_IMPACT_THRESHOLD];

export interface FailedTransactionsCorrelationsParams {
percentileThreshold: number;
}

export interface FailedTransactionsCorrelationsRawResponse {
log: string[];
export interface FailedTransactionsCorrelationsResponse {
ccsWarning: boolean;
failedTransactionsCorrelations?: FailedTransactionsCorrelation[];
percentileThresholdValue?: number;
overallHistogram?: HistogramItem[];
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,9 @@
*/

import { estypes } from '@elastic/elasticsearch';
import { SearchStrategyParams } from './types';
import { CorrelationsParams } from './types';

export interface FieldStatsCommonRequestParams extends SearchStrategyParams {
export interface FieldStatsCommonRequestParams extends CorrelationsParams {
samplerShardSize: number;
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,22 +14,8 @@ export interface LatencyCorrelation extends FieldValuePair {
ksTest: number;
}

export interface LatencyCorrelationSearchServiceProgress {
started: number;
loadedHistogramStepsize: number;
loadedOverallHistogram: number;
loadedFieldCandidates: number;
loadedFieldValuePairs: number;
loadedHistograms: number;
}

export interface LatencyCorrelationsParams {
percentileThreshold: number;
analyzeCorrelations: boolean;
}

export interface LatencyCorrelationsRawResponse {
log: string[];
export interface LatencyCorrelationsResponse {
ccsWarning: boolean;
overallHistogram?: HistogramItem[];
percentileThresholdValue?: number;
latencyCorrelations?: LatencyCorrelation[];
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,35 +26,20 @@ export interface ResponseHit {
_source: ResponseHitSource;
}

export interface RawResponseBase {
ccsWarning: boolean;
took: number;
}

export interface SearchStrategyClientParamsBase {
export interface CorrelationsClientParams {
environment: string;
kuery: string;
serviceName?: string;
transactionName?: string;
transactionType?: string;
}

export interface RawSearchStrategyClientParams
extends SearchStrategyClientParamsBase {
start?: string;
end?: string;
}

export interface SearchStrategyClientParams
extends SearchStrategyClientParamsBase {
start: number;
end: number;
}

export interface SearchStrategyServerParams {
export interface CorrelationsServerParams {
index: string;
includeFrozen?: boolean;
}

export type SearchStrategyParams = SearchStrategyClientParams &
SearchStrategyServerParams;
export type CorrelationsParams = CorrelationsClientParams &
CorrelationsServerParams;
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,9 @@
*/

import { FIELDS_TO_ADD_AS_CANDIDATE } from '../constants';
import { hasPrefixToInclude } from '../utils';
import { hasPrefixToInclude } from './has_prefix_to_include';

import type { FieldValuePair } from '../../../../common/search_strategies/types';
import type { FieldValuePair } from '../types';

export const getPrioritizedFieldValuePairs = (
fieldValuePairs: FieldValuePair[]
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,4 +5,5 @@
* 2.0.
*/

export { failedTransactionsCorrelationsSearchServiceProvider } from './failed_transactions_correlations_search_service';
export { getPrioritizedFieldValuePairs } from './get_prioritized_field_value_pairs';
export { hasPrefixToInclude } from './has_prefix_to_include';
15 changes: 0 additions & 15 deletions x-pack/plugins/apm/common/search_strategies/constants.ts

This file was deleted.

Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ import {
import React, { Fragment, useState } from 'react';
import { i18n } from '@kbn/i18n';
import { FormattedMessage } from '@kbn/i18n/react';
import { FieldStats } from '../../../../../common/search_strategies/field_stats_types';
import { FieldStats } from '../../../../../common/correlations/field_stats_types';
import { OnAddFilter, TopValues } from './top_values';
import { useTheme } from '../../../../hooks/use_theme';

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ import {
EuiToolTip,
} from '@elastic/eui';
import { i18n } from '@kbn/i18n';
import { FieldStats } from '../../../../../common/search_strategies/field_stats_types';
import { FieldStats } from '../../../../../common/correlations/field_stats_types';
import { asPercent } from '../../../../../common/utils/formatters';
import { useTheme } from '../../../../hooks/use_theme';

Expand Down

This file was deleted.

Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ import type { Criteria } from '@elastic/eui/src/components/basic_table/basic_tab
import { FETCH_STATUS } from '../../../hooks/use_fetcher';
import { useUiTracker } from '../../../../../observability/public';
import { useTheme } from '../../../hooks/use_theme';
import type { FieldValuePair } from '../../../../common/search_strategies/types';
import type { FieldValuePair } from '../../../../common/correlations/types';

const PAGINATION_SIZE_OPTIONS = [5, 10, 20, 50];

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -29,45 +29,38 @@ import type { Direction } from '@elastic/eui/src/services/sort/sort_direction';
import { i18n } from '@kbn/i18n';
import { FormattedMessage } from '@kbn/i18n/react';

import {
enableInspectEsQueries,
useUiTracker,
} from '../../../../../observability/public';
import { useUiTracker } from '../../../../../observability/public';

import { asPercent } from '../../../../common/utils/formatters';
import { FailedTransactionsCorrelation } from '../../../../common/search_strategies/failed_transactions_correlations/types';
import {
APM_SEARCH_STRATEGIES,
DEFAULT_PERCENTILE_THRESHOLD,
} from '../../../../common/search_strategies/constants';
import { FieldStats } from '../../../../common/search_strategies/field_stats_types';
import { FailedTransactionsCorrelation } from '../../../../common/correlations/failed_transactions_correlations/types';
import { DEFAULT_PERCENTILE_THRESHOLD } from '../../../../common/correlations/constants';
import { FieldStats } from '../../../../common/correlations/field_stats_types';

import { useApmPluginContext } from '../../../context/apm_plugin/use_apm_plugin_context';
import { useLocalStorage } from '../../../hooks/useLocalStorage';
import { FETCH_STATUS } from '../../../hooks/use_fetcher';
import { useSearchStrategy } from '../../../hooks/use_search_strategy';
import { useTheme } from '../../../hooks/use_theme';

import { ImpactBar } from '../../shared/ImpactBar';
import { push } from '../../shared/Links/url_helpers';

import { CorrelationsTable } from './correlations_table';
import { FailedTransactionsCorrelationsHelpPopover } from './failed_transactions_correlations_help_popover';
import { isErrorMessage } from './utils/is_error_message';
import { getFailedTransactionsCorrelationImpactLabel } from './utils/get_failed_transactions_correlation_impact_label';
import { getOverallHistogram } from './utils/get_overall_histogram';
import {
TransactionDistributionChart,
TransactionDistributionChartData,
} from '../../shared/charts/transaction_distribution_chart';
import { CorrelationsLog } from './correlations_log';
import { CorrelationsEmptyStatePrompt } from './empty_state_prompt';
import { CrossClusterSearchCompatibilityWarning } from './cross_cluster_search_warning';
import { CorrelationsProgressControls } from './progress_controls';
import { useTransactionColors } from './use_transaction_colors';
import { CorrelationsContextPopover } from './context_popover';
import { OnAddFilter } from './context_popover/top_values';

import { useFailedTransactionsCorrelations } from './use_failed_transactions_correlations';

export function FailedTransactionsCorrelations({
onFilter,
}: {
Expand All @@ -77,18 +70,12 @@ export function FailedTransactionsCorrelations({
const transactionColors = useTransactionColors();

const {
core: { notifications, uiSettings },
core: { notifications },
} = useApmPluginContext();
const trackApmEvent = useUiTracker({ app: 'apm' });

const inspectEnabled = uiSettings.get<boolean>(enableInspectEsQueries);

const { progress, response, startFetch, cancelFetch } = useSearchStrategy(
APM_SEARCH_STRATEGIES.APM_FAILED_TRANSACTIONS_CORRELATIONS,
{
percentileThreshold: DEFAULT_PERCENTILE_THRESHOLD,
}
);
const { progress, response, startFetch, cancelFetch } =
useFailedTransactionsCorrelations();

const fieldStats: Record<string, FieldStats> | undefined = useMemo(() => {
return response.fieldStats?.reduce((obj, field) => {
Expand All @@ -97,7 +84,6 @@ export function FailedTransactionsCorrelations({
}, {} as Record<string, FieldStats>);
}, [response?.fieldStats]);

const progressNormalized = progress.loaded / progress.total;
const { overallHistogram, hasData, status } = getOverallHistogram(
response,
progress.isRunning
Expand Down Expand Up @@ -368,7 +354,7 @@ export function FailedTransactionsCorrelations({
}, [fieldStats, onAddFilter, showStats]);

useEffect(() => {
if (isErrorMessage(progress.error)) {
if (progress.error) {
notifications.toasts.addDanger({
title: i18n.translate(
'xpack.apm.correlations.failedTransactions.errorTitle',
Expand All @@ -377,7 +363,7 @@ export function FailedTransactionsCorrelations({
'An error occurred performing correlations on failed transactions',
}
),
text: progress.error.toString(),
text: progress.error,
});
}
}, [progress.error, notifications.toasts]);
Expand Down Expand Up @@ -439,7 +425,7 @@ export function FailedTransactionsCorrelations({

const showCorrelationsEmptyStatePrompt =
correlationTerms.length < 1 &&
(progressNormalized === 1 || !progress.isRunning);
(progress.loaded === 1 || !progress.isRunning);

const transactionDistributionChartData: TransactionDistributionChartData[] =
[];
Expand All @@ -457,8 +443,8 @@ export function FailedTransactionsCorrelations({
if (Array.isArray(response.errorHistogram)) {
transactionDistributionChartData.push({
id: i18n.translate(
'xpack.apm.transactionDistribution.chart.allFailedTransactionsLabel',
{ defaultMessage: 'All failed transactions' }
'xpack.apm.transactionDistribution.chart.failedTransactionsLabel',
{ defaultMessage: 'Failed transactions' }
),
histogram: response.errorHistogram,
});
Expand Down Expand Up @@ -525,7 +511,7 @@ export function FailedTransactionsCorrelations({
<EuiText color="subdued" size="xs">
<FormattedMessage
id="xpack.apm.transactionDetails.tabs.failedTransactionsCorrelationsChartDescription"
defaultMessage="Log-log plot for latency (x) by transactions (y) with overlapping bands for {br}{allTransactions}, {allFailedTransactions} and {focusTransaction}."
defaultMessage="Log-log plot for latency (x) by transactions (y) with overlapping bands for {br}{allTransactions}, {failedTransactions} and {focusTransaction}."
values={{
br: <br />,
allTransactions: (
Expand All @@ -536,13 +522,13 @@ export function FailedTransactionsCorrelations({
/>
</span>
),
allFailedTransactions: (
failedTransactions: (
<span
style={{ color: transactionColors.ALL_FAILED_TRANSACTIONS }}
>
<FormattedMessage
id="xpack.apm.transactionDetails.tabs.failedTransactionsCorrelationsChartAllFailedTransactions"
defaultMessage="all failed transactions"
id="xpack.apm.transactionDetails.tabs.failedTransactionsCorrelationsChartFailedTransactions"
defaultMessage="failed transactions"
/>
</span>
),
Expand Down Expand Up @@ -621,7 +607,7 @@ export function FailedTransactionsCorrelations({
<EuiSpacer size="s" />

<CorrelationsProgressControls
progress={progressNormalized}
progress={progress.loaded}
isRunning={progress.isRunning}
onRefresh={startFetch}
onCancel={cancelFetch}
Expand Down Expand Up @@ -654,7 +640,6 @@ export function FailedTransactionsCorrelations({
)}
{showCorrelationsEmptyStatePrompt && <CorrelationsEmptyStatePrompt />}
</div>
{inspectEnabled && <CorrelationsLog logMessages={response.log ?? []} />}
</div>
);
}
Loading

0 comments on commit 304911b

Please sign in to comment.