diff --git a/docs-2.0/6.monitor-and-metrics/1.query-performance-metrics.md b/docs-2.0/6.monitor-and-metrics/1.query-performance-metrics.md index 58d29e71350..5ead80bc768 100644 --- a/docs-2.0/6.monitor-and-metrics/1.query-performance-metrics.md +++ b/docs-2.0/6.monitor-and-metrics/1.query-performance-metrics.md @@ -2,24 +2,26 @@ Nebula Graph supports querying the monitoring metrics through HTTP ports. -## Metrics +## Metrics structure Each metric of Nebula Graph consists of three fields: name, type, and time range. The fields are separated by periods, for example, `num_queries.sum.600`. Different Nebula Graph services (Graph, Storage, or Meta) support different metrics. The detailed description is as follows. |Field|Example|Description| |-|-|-| -|Metric name|`num_queries`|Indicates the function of the metric. For the detailed description of metrics, see [Metrics](../nebula-dashboard/6.monitor-parameter.md).| -|Metric type|`sum`|Indicates how the metrics are collected. Supported types are SUM, COUNT, AVG, RATE, and the P-th sample quantiles such as P75, P95, P99, and P99.9.| +|Metric name|`num_queries`|Indicates the function of the metric.| +|Metric type|`sum`|Indicates how the metrics are collected. Supported types are SUM, AVG, RATE, and the P-th sample quantiles such as P75, P95, P99, and P99.9.| |Time range|`600`|The time range in seconds for the metric collection. Supported values are 5, 60, 600, and 3600, representing the last 5 seconds, 1 minute, 10 minutes, and 1 hour.| ### Space-level metrics The Graph service supports a set of space-level metrics that record the information of different graph spaces separately. -The name of space-level metrics contains the corresponding graph space name in the form of `{space=space_name}`, for example, `query_latency_us{space=basketballplayer}.avg.3600`. - To enable space-level metrics, set the value of `enable_space_level_metrics` to `true` in the Graph service configuration file before starting Nebula Graph. For details about how to modify the configuration, see [Configuration Management](../5.configurations-and-logs/1.configurations/1.configurations.md). +!!! note + + Space-level metrics can be queried only by querying all metrics. For example, run `curl -G "http://192.168.8.40:19559/stats"` to show all metrics. The returned result contains the graph space name in the form of '{space=space_name}', such as `num_active_queries{space=basketballplayer}.sum.5=0`. + ## Query metrics over HTTP ### Syntax @@ -103,4 +105,116 @@ curl -G "http://:/stats?stats= [&format=json]" num_heartbeats.sum.60=40 num_heartbeats.sum.600=394 num_heartbeats.sum.3600=2364 + ... ``` + +## Metric description + +### Graph + +| Parameter | Description | +| ---------------------------------------------- | ------------------------------------------------------------ | +| `num_active_queries` | The number of queries currently being executed. | +| `num_active_sessions` | The number of currently active sessions. | +| `num_aggregate_executors` | The number of executions for the Aggregation operator. | +| `num_auth_failed_sessions_bad_username_password` | The number of sessions where authentication failed due to incorrect username and password. | +| `num_auth_failed_sessions_out_of_max_allowed` | The number of sessions that failed to authenticate logins because the value of the parameter `FLAG_OUT_OF_MAX_ALLOWED_CONNECTIONS` was exceeded.| +| `num_auth_failed_sessions` | The number of sessions in which login authentication failed. | +| `num_indexscan_executors` | The number of executions for index scan operators. | +| `num_killed_queries` | The number of killed queries. | +| `num_opened_sessions` | The number of sessions connected to the server. | +| `num_queries` | The number of queries. | +| `num_query_errors_leader_changes` | The number of the raft leader changes due to query errors. | +| `num_query_errors` | The number of query errors. | +| `num_reclaimed_expired_sessions` | The number of expired sessions actively reclaimed by the server. | +| `num_rpc_sent_to_metad_failed` | The number of failed RPC requests that the Graphd service sends to the Metad service. | +| `num_rpc_sent_to_metad` | The number of RPC requests that the Graphd service sent to the Metad service. | +| `num_rpc_sent_to_storaged_failed` | The number of failed RPC requests that the Graphd service sent to the Storaged service. | +| `num_rpc_sent_to_storaged` | The number of RPC requests that the Graphd service sent to the Storaged service. | +| `num_sentences` | The number of statements received by the Graphd service. | +| `num_slow_queries` | The number of slow queries. | +| `num_sort_executors` | The number of executions for the Sort operator. | +| `optimizer_latency_us` | The latency of executing optimizer statements. | +| `query_latency_us` | The average latency of queries. | +| `slow_query_latency_us` | The average latency of slow queries. | +| `num_queries_hit_memory_watermark` | The number of queries that reached the memory watermark. | + +### Meta + +| Parameter | Description | +| -------------------------- | ----------------------------------- | +| `commit_log_latency_us` | The latency of committing logs in Raft. | +| `commit_snapshot_latency_us` | The latency of committing snapshots in Raft. | +| `heartbeat_latency_us` | The latency of heartbeats. | +| `num_heartbeats` | The number of heartbeats. | +| `num_raft_votes` | The number of votes in Raft. | +| `transfer_leader_latency_us` | The latency of transferring the raft leader. | +| `num_agent_heartbeats` | The number of heartbeats for the AgentHBProcessor.| +| `agent_heartbeat_latency_us` | The average latency of the AgentHBProcessor.| + +### Storage + +| Parameter | Description | +| ---------------------------- | --------------------------------------------------- | +| `add_edges_atomic_latency_us` | The average latency of adding edge single. | +| `add_edges_latency_us` | The average latency of adding edges. | +| `add_vertices_latency_us` | The average latency of adding vertices. | +| `commit_log_latency_us` | The latency of committing logs in Raft. | +| `commit_snapshot_latency_us` | The latency of committing snapshots in Raft. | +| `delete_edges_latency_us` | The average latency of deleting edges. | +| `delete_vertices_latency_us` | The average latency of deleting vertices. | +| `get_neighbors_latency_us` | The average latency of querying neighbor vertices. | +| `num_get_prop` | The number of executions for the GetPropProcessor. | +| `num_get_neighbors_errors` | The number of execution errors for the GetNeighborsProcessor. | +| `get_prop_latency_us` | The average latency of executions for the GetPropProcessor.| +| `num_edges_deleted` | The number of deleted edges. | +| `num_edges_inserted` | The number of inserted edges. | +| `num_raft_votes` | The number of votes in Raft. | +| `num_rpc_sent_to_metad_failed` | The number of failed RPC requests that the Storage service sent to the Meta service. | +| `num_rpc_sent_to_metad` | The number of RPC requests that the Storaged service sent to the Metad service. | +| `num_tags_deleted` | The number of deleted tags. | +| `num_vertices_deleted` | The number of deleted vertices. | +| `num_vertices_inserted` | The number of inserted vertices. | +| `transfer_leader_latency_us` | The latency of transferring the raft leader. | +| `lookup_latency_us` | The average latency of executions for the LookupProcessor. | +| `num_lookup_errors` | The number of execution errors for the LookupProcessor.| +| `num_scan_vertex` | The number of executions for the ScanVertexProcessor.| +| `num_scan_vertex_errors` | The number of execution errors for the ScanVertexProcessor.| +| `update_edge_latency_us` | The average latency of executions for the UpdateEdgeProcessor.| +| `num_update_vertex` | The number of executions for the UpdateVertexProcessor.| +| `num_update_vertex_errors` | The number of execution errors for the UpdateVertexProcessor.| +| `kv_get_latency_us` | The average latency of executions for the Getprocessor.| +| `kv_put_latency_us` | The average latency of executions for the PutProcessor.| +| `kv_remove_latency_us` | The average latency of executions for the RemoveProcessor.| +| `num_kv_get_errors` | The number of execution errors for the GetProcessor.| +| `num_kv_get` | The number of executions for the GetProcessor.| +| `num_kv_put_errors` | The number of execution errors for the PutProcessor.| +| `num_kv_put` | The number of executions for the PutProcessor.| +| `num_kv_remove_errors` | The number of execution errors for the RemoveProcessor.| +| `num_kv_remove` | The number of executions for the RemoveProcessor.| +| `forward_tranx_latency_us` | The average latency of transmission.| + +### Space-level + +| Parameter | Description | +| ---------------------------------------------- | ----------------------------------------- | +| `num_active_queries` | The number of queries currently being executed. | +| `num_queries` | The number of queries. | +| `num_sentences` | The number of statements received by the Graphd service. | +| `optimizer_latency_us` | The latency of executing optimizer statements. | +| `query_latency_us` | The average latency of queries. | +| `num_slow_queries` | The number of slow queries. | +| `num_query_errors` | The number of query errors. | +| `num_query_errors_leader_changes` | The number of raft leader changes due to query errors. | +| `num_killed_queries` | The number of killed queries. | +| `num_aggregate_executors` | The number of executions for the Aggregation operator. | +| `num_sort_executors` | The number of executions for the Sort operator. | +| `num_indexscan_executors` | The number of executions for index scan operators. | +| `num_oom_queries` | The number of queries that caused memory to run out. | + + + + + + +