From f2cacd3f1875c97d9d14532e780aee62ce57ab7d Mon Sep 17 00:00:00 2001 From: Cheng Pan Date: Thu, 5 Dec 2024 13:59:29 +0800 Subject: [PATCH 1/3] [KYUUBI #6836] Ship `kafka-clients` in binary distribution tarball without compression libs ### Why are the changes needed? I'd like to include `kafka-clients` in the Kyuubi binary distribution tarball to enable the out-of-box support for sinking Kyuubi events to Kafka. - Kafka is an important component in modern data platforms, and is a defacto message queue implementation, especially in the big data domain - `kafka-clients` is released under Apache License V2, has no legal issue - `kafka-clients` is quite a light lib, has no third-party deps except for `slf4j-api` and a few optional compression libs - `kafka-clients` uses "none" compression as default, in practice, "gzip"(delegate to JDK gzip algorithm, no additional libs are required) already performs well for non-extreme cases Additionally, LOG4J2 has a built-in `KafkaAppender` that supports sinking logging to Kafka, which also requires `kafka-clients` in the classpath, I have some initial ideas to forward both Kyuubi server's and engine bootstrap processes log to Kafka in a structured format, will send another PR to add docs to guide users in configuring that. ### How was this patch tested? Review ### Was this patch authored or co-authored using generative AI tooling? No. Closes #6836 from pan3793/kafka-lib. Closes #6836 b069eb199 [Cheng Pan] Ship kafka-clients in binary distribution tarball without compression libs Authored-by: Cheng Pan Signed-off-by: Cheng Pan --- LICENSE-binary | 3 --- NOTICE-binary | 16 ---------------- dev/dependencyList | 3 --- pom.xml | 15 ++++++++++++++- 4 files changed, 14 insertions(+), 23 deletions(-) diff --git a/LICENSE-binary b/LICENSE-binary index e78047f5480..6b4c15fa772 100644 --- a/LICENSE-binary +++ b/LICENSE-binary @@ -306,8 +306,6 @@ io.vertx:vertx-grpc com.squareup.retrofit2:retrofit com.squareup.okhttp3:okhttp org.apache.kafka:kafka-clients -org.lz4:lz4-java -org.xerial.snappy:snappy-java org.xerial:sqlite-jdbc BSD @@ -319,7 +317,6 @@ jline:jline com.thoughtworks.paranamer:paranamer com.google.protobuf:protobuf-java-util com.google.protobuf:protobuf-java -com.github.luben:zstd-jni org.postgresql:postgresql Eclipse Distribution License - v 1.0 diff --git a/NOTICE-binary b/NOTICE-binary index b9bcdb21460..708624e3f39 100644 --- a/NOTICE-binary +++ b/NOTICE-binary @@ -1063,19 +1063,3 @@ which can be obtained at: * license/LICENSE.kafka.txt (Apache License 2.0) * HOMEPAGE: * https://github.com/apache/kafka - -This product optionally depends on 'snappy-java', Snappy compression and -decompression for Java, which can be obtained at: - - * LICENSE: - * license/LICENSE.snappy-java.txt (Apache License 2.0) - * HOMEPAGE: - * https://github.com/xerial/snappy-java - -This product optionally depends on 'lz4-java', Lz4 compression and -decompression for Java, which can be obtained at: - - * LICENSE: - * license/LICENSE.lz4-java.txt (Apache License 2.0) - * HOMEPAGE: - * https://github.com/lz4/lz4-java diff --git a/dev/dependencyList b/dev/dependencyList index 2d4e4fd34af..ab6a0499b7a 100644 --- a/dev/dependencyList +++ b/dev/dependencyList @@ -128,7 +128,6 @@ log4j-api/2.24.2//log4j-api-2.24.2.jar log4j-core/2.24.2//log4j-core-2.24.2.jar log4j-slf4j-impl/2.24.2//log4j-slf4j-impl-2.24.2.jar logging-interceptor/3.12.12//logging-interceptor-3.12.12.jar -lz4-java/1.8.0//lz4-java-1.8.0.jar metrics-core/4.2.26//metrics-core-4.2.26.jar metrics-jmx/4.2.26//metrics-jmx-4.2.26.jar metrics-json/4.2.26//metrics-json-4.2.26.jar @@ -173,7 +172,6 @@ simpleclient_tracer_otel_agent/0.16.0//simpleclient_tracer_otel_agent-0.16.0.jar slf4j-api/1.7.36//slf4j-api-1.7.36.jar snakeyaml-engine/2.7//snakeyaml-engine-2.7.jar snakeyaml/2.2//snakeyaml-2.2.jar -snappy-java/1.1.10.5//snappy-java-1.1.10.5.jar sqlite-jdbc/3.46.1.3//sqlite-jdbc-3.46.1.3.jar swagger-annotations/2.2.1//swagger-annotations-2.2.1.jar swagger-core/2.2.1//swagger-core-2.2.1.jar @@ -185,4 +183,3 @@ units/1.7//units-1.7.jar vertx-core/4.5.3//vertx-core-4.5.3.jar vertx-grpc/4.5.3//vertx-grpc-4.5.3.jar zjsonpatch/0.3.0//zjsonpatch-0.3.0.jar -zstd-jni/1.5.5-1//zstd-jni-1.5.5-1.jar diff --git a/pom.xml b/pom.xml index c6055b1b166..73146363243 100644 --- a/pom.xml +++ b/pom.xml @@ -1053,7 +1053,20 @@ org.apache.kafka kafka-clients ${kafka.version} - true + + + com.github.luben + zstd-jni + + + org.lz4 + lz4-java + + + org.xerial.snappy + snappy-java + + From 3167692732b7fe6ba3928db3ec89ca8419ed030d Mon Sep 17 00:00:00 2001 From: "Wang, Fei" Date: Thu, 5 Dec 2024 18:12:39 +0800 Subject: [PATCH 2/3] [KYUUBI #6829] Add metrics for batch pending max elapse time ### Why are the changes needed? 1. add metrics `kyuubi.operartion.batch_pending_max_elapse` for the batch pending max elapse time, which is helpful for batch health monitoring, and we can send alert if the batch pending elapse time too long 2. For `GET /api/v1/batches` api, limit the max time window for listing batches, which is helpful that, we want to reserve more metadata in kyuubi server end, for example: 90 days, but for list batches, we just want to allow user to search the last 7 days. It is optional. And if `create_time` is specified, order by `create_time` instead of `key_id`. https://github.com/apache/kyuubi/blob/68a6f48da53dd0ad2e20b450a41ca600b8c1e1d2/kyuubi-server/src/main/resources/sql/mysql/metadata-store-schema-1.8.0.mysql.sql#L32 ### How was this patch tested? GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #6829 from turboFei/batch_pending_time. Closes #6829 ee4f93125 [Wang, Fei] docs bf8169ad4 [Wang, Fei] comments f493a2af8 [Wang, Fei] new config ab7b6db65 [Wang, Fei] ut 168017587 [Wang, Fei] in memory session 510a30b6a [Wang, Fei] batchSearchWindow opt 1e93dd276 [Wang, Fei] save Authored-by: Wang, Fei Signed-off-by: Cheng Pan --- docs/configuration/settings.md | 37 +++---- docs/monitor/metrics.md | 101 +++++++++--------- .../org/apache/kyuubi/config/KyuubiConf.scala | 13 +++ .../kyuubi/metrics/MetricsConstants.scala | 1 + .../kyuubi/operation/BatchJobSubmission.scala | 8 ++ .../server/KyuubiRestFrontendService.scala | 15 ++- .../server/api/v1/BatchesResource.scala | 5 +- .../server/metadata/MetadataManager.scala | 9 +- .../server/metadata/MetadataStore.scala | 5 +- .../metadata/jdbc/JDBCMetadataStore.scala | 6 +- .../server/api/v1/BatchesResourceSuite.scala | 1 + 11 files changed, 125 insertions(+), 76 deletions(-) diff --git a/docs/configuration/settings.md b/docs/configuration/settings.md index fede369a842..acbc36197fa 100644 --- a/docs/configuration/settings.md +++ b/docs/configuration/settings.md @@ -377,24 +377,25 @@ You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.co ### Metadata -| Key | Default | Meaning | Type | Since | -|-------------------------------------------------|----------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------| -| kyuubi.metadata.cleaner.enabled | true | Whether to clean the metadata periodically. If it is enabled, Kyuubi will clean the metadata that is in the terminate state with max age limitation. | boolean | 1.6.0 | -| kyuubi.metadata.cleaner.interval | PT30M | The interval to check and clean expired metadata. | duration | 1.6.0 | -| kyuubi.metadata.max.age | PT72H | The maximum age of metadata, the metadata exceeding the age will be cleaned. | duration | 1.6.0 | -| kyuubi.metadata.recovery.threads | 10 | The number of threads for recovery from the metadata store when the Kyuubi server restarts. | int | 1.6.0 | -| kyuubi.metadata.request.async.retry.enabled | true | Whether to retry in async when metadata request failed. When true, return success response immediately even the metadata request failed, and schedule it in background until success, to tolerate long-time metadata store outages w/o blocking the submission request. | boolean | 1.7.0 | -| kyuubi.metadata.request.async.retry.queue.size | 65536 | The maximum queue size for buffering metadata requests in memory when the external metadata storage is down. Requests will be dropped if the queue exceeds. Only take affect when kyuubi.metadata.request.async.retry.enabled is `true`. | int | 1.6.0 | -| kyuubi.metadata.request.async.retry.threads | 10 | Number of threads in the metadata request async retry manager thread pool. Only take affect when kyuubi.metadata.request.async.retry.enabled is `true`. | int | 1.6.0 | -| kyuubi.metadata.request.retry.interval | PT5S | The interval to check and trigger the metadata request retry tasks. | duration | 1.6.0 | -| kyuubi.metadata.store.class | org.apache.kyuubi.server.metadata.jdbc.JDBCMetadataStore | Fully qualified class name for server metadata store. | string | 1.6.0 | -| kyuubi.metadata.store.jdbc.database.schema.init | true | Whether to init the JDBC metadata store database schema. | boolean | 1.6.0 | -| kyuubi.metadata.store.jdbc.database.type | SQLITE | The database type for server jdbc metadata store.
  • SQLITE: SQLite3, JDBC driver `org.sqlite.JDBC`.
  • MYSQL: MySQL, JDBC driver `com.mysql.cj.jdbc.Driver` (fallback `com.mysql.jdbc.Driver`).
  • POSTGRESQL: PostgreSQL, JDBC driver `org.postgresql.Driver`.
  • CUSTOM: User-defined database type, need to specify corresponding JDBC driver.
  • Note that: The JDBC datasource is powered by HiKariCP, for datasource properties, please specify them with the prefix: kyuubi.metadata.store.jdbc.datasource. For example, kyuubi.metadata.store.jdbc.datasource.connectionTimeout=10000. | string | 1.6.0 | -| kyuubi.metadata.store.jdbc.driver | <undefined> | JDBC driver class name for server jdbc metadata store. | string | 1.6.0 | -| kyuubi.metadata.store.jdbc.password || The password for server JDBC metadata store. | string | 1.6.0 | -| kyuubi.metadata.store.jdbc.priority.enabled | false | Whether to enable the priority scheduling for batch impl v2. When false, ignore kyuubi.batch.priority and use the FIFO ordering strategy for batch job scheduling. Note: this feature may cause significant performance issues when using MySQL 5.7 as the metastore backend due to the lack of support for mixed order index. See more details at KYUUBI #5329. | boolean | 1.8.0 | -| kyuubi.metadata.store.jdbc.url | jdbc:sqlite:<KYUUBI_HOME>/kyuubi_state_store.db | The JDBC url for server JDBC metadata store. By default, it is a SQLite database url, and the state information is not shared across Kyuubi instances. To enable high availability for multiple kyuubi instances, please specify a production JDBC url. Note: this value support the variables substitution: ``. | string | 1.6.0 | -| kyuubi.metadata.store.jdbc.user || The username for server JDBC metadata store. | string | 1.6.0 | +| Key | Default | Meaning | Type | Since | +|-------------------------------------------------|----------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|--------| +| kyuubi.metadata.cleaner.enabled | true | Whether to clean the metadata periodically. If it is enabled, Kyuubi will clean the metadata that is in the terminate state with max age limitation. | boolean | 1.6.0 | +| kyuubi.metadata.cleaner.interval | PT30M | The interval to check and clean expired metadata. | duration | 1.6.0 | +| kyuubi.metadata.max.age | PT72H | The maximum age of metadata, the metadata exceeding the age will be cleaned. | duration | 1.6.0 | +| kyuubi.metadata.recovery.threads | 10 | The number of threads for recovery from the metadata store when the Kyuubi server restarts. | int | 1.6.0 | +| kyuubi.metadata.request.async.retry.enabled | true | Whether to retry in async when metadata request failed. When true, return success response immediately even the metadata request failed, and schedule it in background until success, to tolerate long-time metadata store outages w/o blocking the submission request. | boolean | 1.7.0 | +| kyuubi.metadata.request.async.retry.queue.size | 65536 | The maximum queue size for buffering metadata requests in memory when the external metadata storage is down. Requests will be dropped if the queue exceeds. Only take affect when kyuubi.metadata.request.async.retry.enabled is `true`. | int | 1.6.0 | +| kyuubi.metadata.request.async.retry.threads | 10 | Number of threads in the metadata request async retry manager thread pool. Only take affect when kyuubi.metadata.request.async.retry.enabled is `true`. | int | 1.6.0 | +| kyuubi.metadata.request.retry.interval | PT5S | The interval to check and trigger the metadata request retry tasks. | duration | 1.6.0 | +| kyuubi.metadata.search.window | <undefined> | The time window to restrict user queries to metadata within a specific period, starting from the current time to the past. It only affects `GET /api/v1/batches` API. You may want to set this to short period to improve query performance and reduce load on the metadata store when administer want to reserve the metadata for long time. The side-effects is that, the metadata created outside the window will not be invisible to users. If it is undefined, all metadata will be visible for users. | duration | 1.10.1 | +| kyuubi.metadata.store.class | org.apache.kyuubi.server.metadata.jdbc.JDBCMetadataStore | Fully qualified class name for server metadata store. | string | 1.6.0 | +| kyuubi.metadata.store.jdbc.database.schema.init | true | Whether to init the JDBC metadata store database schema. | boolean | 1.6.0 | +| kyuubi.metadata.store.jdbc.database.type | SQLITE | The database type for server jdbc metadata store.
    • SQLITE: SQLite3, JDBC driver `org.sqlite.JDBC`.
    • MYSQL: MySQL, JDBC driver `com.mysql.cj.jdbc.Driver` (fallback `com.mysql.jdbc.Driver`).
    • POSTGRESQL: PostgreSQL, JDBC driver `org.postgresql.Driver`.
    • CUSTOM: User-defined database type, need to specify corresponding JDBC driver.
    • Note that: The JDBC datasource is powered by HiKariCP, for datasource properties, please specify them with the prefix: kyuubi.metadata.store.jdbc.datasource. For example, kyuubi.metadata.store.jdbc.datasource.connectionTimeout=10000. | string | 1.6.0 | +| kyuubi.metadata.store.jdbc.driver | <undefined> | JDBC driver class name for server jdbc metadata store. | string | 1.6.0 | +| kyuubi.metadata.store.jdbc.password || The password for server JDBC metadata store. | string | 1.6.0 | +| kyuubi.metadata.store.jdbc.priority.enabled | false | Whether to enable the priority scheduling for batch impl v2. When false, ignore kyuubi.batch.priority and use the FIFO ordering strategy for batch job scheduling. Note: this feature may cause significant performance issues when using MySQL 5.7 as the metastore backend due to the lack of support for mixed order index. See more details at KYUUBI #5329. | boolean | 1.8.0 | +| kyuubi.metadata.store.jdbc.url | jdbc:sqlite:<KYUUBI_HOME>/kyuubi_state_store.db | The JDBC url for server JDBC metadata store. By default, it is a SQLite database url, and the state information is not shared across Kyuubi instances. To enable high availability for multiple kyuubi instances, please specify a production JDBC url. Note: this value support the variables substitution: ``. | string | 1.6.0 | +| kyuubi.metadata.store.jdbc.user || The username for server JDBC metadata store. | string | 1.6.0 | ### Metrics diff --git a/docs/monitor/metrics.md b/docs/monitor/metrics.md index 561014c370c..8043fa08102 100644 --- a/docs/monitor/metrics.md +++ b/docs/monitor/metrics.md @@ -40,56 +40,57 @@ The metrics system is configured via `$KYUUBI_HOME/conf/kyuubi-defaults.conf`. These metrics include: -| Metrics Prefix | Metrics Suffix | Type | Since | Description | -|--------------------------------------------------|----------------------------------------|-----------|-------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `kyuubi.exec.pool.threads.alive` | | gauge | 1.2.0 |
      threads keepAlive in the backend executive thread pool
      | -| `kyuubi.exec.pool.threads.active` | | gauge | 1.2.0 |
      threads active in the backend executive thread pool
      | -| `kyuubi.exec.pool.work_queue.size` | | gauge | 1.7.0 |
      work queue size in the backend executive thread pool
      | -| `kyuubi.connection.total` | | counter | 1.2.0 |
      cumulative connection count
      | -| `kyuubi.connection.total` | `${sessionType}` | counter | 1.7.0 |
      cumulative connection count with session type `${sessionType}`
      | -| `kyuubi.connection.opened` | | gauge | 1.2.0 |
      current active connection count
      | -| `kyuubi.connection.opened` | `${user}` | counter | 1.2.0 |
      current active connections count requested by a `${user}`
      | -| `kyuubi.connection.opened` | `${user}`
      `${sessionType}` | counter | 1.7.0 |
      current active connections count requested by a `${user}` with session type `${sessionType}`
      | -| `kyuubi.connection.opened` | `${sessionType}` | counter | 1.7.0 |
      current active connections count with session type `${sessionType}`
      | -| `kyuubi.connection.failed` | | counter | 1.2.0 |
      cumulative failed connection count
      | -| `kyuubi.connection.failed` | `${user}` | counter | 1.2.0 |
      cumulative failed connections for a `${user}`
      | -| `kyuubi.connection.failed` | `${sessionType}` | counter | 1.7.0 |
      cumulative failed connection count with session type `${sessionType}`
      | -| `kyuubi.operation.total` | | counter | 1.5.0 |
      cumulative opened operation count
      | -| `kyuubi.operation.total` | `${operationType}` | counter | 1.5.0 |
      cumulative opened count for the operation `${operationType}`
      | -| `kyuubi.operation.opened` | | gauge | 1.5.0 |
      current opened operation count
      | -| `kyuubi.operation.opened` | `${operationType}` | counter | 1.5.0 |
      current opened count for the operation `${operationType}`
      | -| `kyuubi.operation.failed` | `${operationType}`
      `.${errorType}` | counter | 1.5.0 |
      cumulative failed count for the operation `${operationType}` with a particular `${errorType}`, e.g. `execute_statement.AnalysisException`
      | -| `kyuubi.operation.state` | `${operationState}` | meter | 1.5.0 |
      kyuubi operation state rate
      | -| `kyuubi.operation.exec_time` | `${operationType}` | histogram | 1.7.0 |
      execution time histogram for the operation `${operationType}`, now only `ExecuteStatement` is enabled.
      | -| `kyuubi.engine.total` | | counter | 1.2.0 |
      cumulative created engines
      | -| `kyuubi.engine.timeout` | | counter | 1.2.0 |
      cumulative timeout engines
      | -| `kyuubi.engine.failed` | `${user}` | counter | 1.2.0 |
      cumulative explicitly failed engine count for a `${user}`
      | -| `kyuubi.engine.failed` | `${errorType}` | counter | 1.2.0 |
      cumulative explicitly failed engine count for a particular `${errorType}`, e.g. `ClassNotFoundException`
      | -| `kyuubi.backend_service.open_session` | | timer | 1.5.0 |
      kyuubi backend service `openSession` method execution time and rate
      | -| `kyuubi.backend_service.close_session` | | timer | 1.5.0 |
      kyuubi backend service `closeSession` method execution time and rate
      | -| `kyuubi.backend_service.get_info` | | timer | 1.5.0 |
      kyuubi backend service `getInfo` method execution time and rate
      | -| `kyuubi.backend_service.execute_statement` | | timer | 1.5.0 |
      kyuubi backend service `executeStatement` method execution time and rate
      | -| `kyuubi.backend_service.get_type_info` | | timer | 1.5.0 |
      kyuubi backend service `getTypeInfo` method execution time and rate
      | -| `kyuubi.backend_service.get_catalogs` | | timer | 1.5.0 |
      kyuubi backend service `getCatalogs` method execution time and rate
      | -| `kyuubi.backend_service.get_schemas` | | timer | 1.5.0 |
      kyuubi backend service `getSchemas` method execution time and rate
      | -| `kyuubi.backend_service.get_tables` | | timer | 1.5.0 |
      kyuubi backend service `getTables` method execution time and rate
      | -| `kyuubi.backend_service.get_table_types` | | timer | 1.5.0 |
      kyuubi backend service `getTableTypes` method execution time and rate
      | -| `kyuubi.backend_service.get_columns` | | timer | 1.5.0 |
      kyuubi backend service `getColumns` method execution time and rate
      | -| `kyuubi.backend_service.get_functions` | | timer | 1.5.0 |
      kyuubi backend service `getFunctions` method execution time and rate
      | -| `kyuubi.backend_service.get_operation_status` | | timer | 1.5.0 |
      kyuubi backend service `getOperationStatus` method execution time and rate
      | -| `kyuubi.backend_service.cancel_operation` | | timer | 1.5.0 |
      kyuubi backend service `cancelOperation` method execution time and rate
      | -| `kyuubi.backend_service.close_operation` | | timer | 1.5.0 |
      kyuubi backend service `closeOperation` method execution time and rate
      | -| `kyuubi.backend_service.get_result_set_metadata` | | timer | 1.5.0 |
      kyuubi backend service `getResultSetMetadata` method execution time and rate
      | -| `kyuubi.backend_service.fetch_results` | | timer | 1.5.0 |
      kyuubi backend service `fetchResults` method execution time and rate
      | -| `kyuubi.backend_service.fetch_log_rows_rate` | | meter | 1.5.0 |
      kyuubi backend service `fetchResults` method that fetch log rows rate
      | -| `kyuubi.backend_service.fetch_result_rows_rate` | | meter | 1.5.0 |
      kyuubi backend service `fetchResults` method that fetch result rows rate
      | -| `kyuubi.backend_service.get_primary_keys` | | meter | 1.6.0 |
      kyuubi backend service `get_primary_keys` method execution time and rate
      | -| `kyuubi.backend_service.get_cross_reference` | | meter | 1.6.0 |
      kyuubi backend service `get_cross_reference` method execution time and rate
      | -| `kyuubi.operation.state` | `${operationType}`
      `.${state}` | meter | 1.6.0 |
      The `${operationType}` with a particular `${state}` rate, e.g. `BatchJobSubmission.pending`, `BatchJobSubmission.finished`. Note that, the terminal states are cumulative, but the intermediate ones are not.
      | -| `kyuubi.metadata.request.opened` | | counter | 1.6.1 |
      current opened count for the metadata requests
      | -| `kyuubi.metadata.request.total` | | meter | 1.6.0 |
      metadata requests time and rate
      | -| `kyuubi.metadata.request.failed` | | meter | 1.6.0 |
      metadata requests failure time and rate
      | -| `kyuubi.metadata.request.retrying` | | meter | 1.6.0 |
      retrying metadata requests time and rate, it is not cumulative
      | +| Metrics Prefix | Metrics Suffix | Type | Since | Description | +|--------------------------------------------------|----------------------------------------|-----------|--------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `kyuubi.exec.pool.threads.alive` | | gauge | 1.2.0 |
      threads keepAlive in the backend executive thread pool
      | +| `kyuubi.exec.pool.threads.active` | | gauge | 1.2.0 |
      threads active in the backend executive thread pool
      | +| `kyuubi.exec.pool.work_queue.size` | | gauge | 1.7.0 |
      work queue size in the backend executive thread pool
      | +| `kyuubi.connection.total` | | counter | 1.2.0 |
      cumulative connection count
      | +| `kyuubi.connection.total` | `${sessionType}` | counter | 1.7.0 |
      cumulative connection count with session type `${sessionType}`
      | +| `kyuubi.connection.opened` | | gauge | 1.2.0 |
      current active connection count
      | +| `kyuubi.connection.opened` | `${user}` | counter | 1.2.0 |
      current active connections count requested by a `${user}`
      | +| `kyuubi.connection.opened` | `${user}`
      `${sessionType}` | counter | 1.7.0 |
      current active connections count requested by a `${user}` with session type `${sessionType}`
      | +| `kyuubi.connection.opened` | `${sessionType}` | counter | 1.7.0 |
      current active connections count with session type `${sessionType}`
      | +| `kyuubi.connection.failed` | | counter | 1.2.0 |
      cumulative failed connection count
      | +| `kyuubi.connection.failed` | `${user}` | counter | 1.2.0 |
      cumulative failed connections for a `${user}`
      | +| `kyuubi.connection.failed` | `${sessionType}` | counter | 1.7.0 |
      cumulative failed connection count with session type `${sessionType}`
      | +| `kyuubi.operation.total` | | counter | 1.5.0 |
      cumulative opened operation count
      | +| `kyuubi.operation.total` | `${operationType}` | counter | 1.5.0 |
      cumulative opened count for the operation `${operationType}`
      | +| `kyuubi.operation.opened` | | gauge | 1.5.0 |
      current opened operation count
      | +| `kyuubi.operation.opened` | `${operationType}` | counter | 1.5.0 |
      current opened count for the operation `${operationType}`
      | +| `kyuubi.operation.failed` | `${operationType}`
      `.${errorType}` | counter | 1.5.0 |
      cumulative failed count for the operation `${operationType}` with a particular `${errorType}`, e.g. `execute_statement.AnalysisException`
      | +| `kyuubi.operation.state` | `${operationState}` | meter | 1.5.0 |
      kyuubi operation state rate
      | +| `kyuubi.operation.exec_time` | `${operationType}` | histogram | 1.7.0 |
      execution time histogram for the operation `${operationType}`, now only `ExecuteStatement` is enabled.
      | +| `kyuubi.engine.total` | | counter | 1.2.0 |
      cumulative created engines
      | +| `kyuubi.engine.timeout` | | counter | 1.2.0 |
      cumulative timeout engines
      | +| `kyuubi.engine.failed` | `${user}` | counter | 1.2.0 |
      cumulative explicitly failed engine count for a `${user}`
      | +| `kyuubi.engine.failed` | `${errorType}` | counter | 1.2.0 |
      cumulative explicitly failed engine count for a particular `${errorType}`, e.g. `ClassNotFoundException`
      | +| `kyuubi.backend_service.open_session` | | timer | 1.5.0 |
      kyuubi backend service `openSession` method execution time and rate
      | +| `kyuubi.backend_service.close_session` | | timer | 1.5.0 |
      kyuubi backend service `closeSession` method execution time and rate
      | +| `kyuubi.backend_service.get_info` | | timer | 1.5.0 |
      kyuubi backend service `getInfo` method execution time and rate
      | +| `kyuubi.backend_service.execute_statement` | | timer | 1.5.0 |
      kyuubi backend service `executeStatement` method execution time and rate
      | +| `kyuubi.backend_service.get_type_info` | | timer | 1.5.0 |
      kyuubi backend service `getTypeInfo` method execution time and rate
      | +| `kyuubi.backend_service.get_catalogs` | | timer | 1.5.0 |
      kyuubi backend service `getCatalogs` method execution time and rate
      | +| `kyuubi.backend_service.get_schemas` | | timer | 1.5.0 |
      kyuubi backend service `getSchemas` method execution time and rate
      | +| `kyuubi.backend_service.get_tables` | | timer | 1.5.0 |
      kyuubi backend service `getTables` method execution time and rate
      | +| `kyuubi.backend_service.get_table_types` | | timer | 1.5.0 |
      kyuubi backend service `getTableTypes` method execution time and rate
      | +| `kyuubi.backend_service.get_columns` | | timer | 1.5.0 |
      kyuubi backend service `getColumns` method execution time and rate
      | +| `kyuubi.backend_service.get_functions` | | timer | 1.5.0 |
      kyuubi backend service `getFunctions` method execution time and rate
      | +| `kyuubi.backend_service.get_operation_status` | | timer | 1.5.0 |
      kyuubi backend service `getOperationStatus` method execution time and rate
      | +| `kyuubi.backend_service.cancel_operation` | | timer | 1.5.0 |
      kyuubi backend service `cancelOperation` method execution time and rate
      | +| `kyuubi.backend_service.close_operation` | | timer | 1.5.0 |
      kyuubi backend service `closeOperation` method execution time and rate
      | +| `kyuubi.backend_service.get_result_set_metadata` | | timer | 1.5.0 |
      kyuubi backend service `getResultSetMetadata` method execution time and rate
      | +| `kyuubi.backend_service.fetch_results` | | timer | 1.5.0 |
      kyuubi backend service `fetchResults` method execution time and rate
      | +| `kyuubi.backend_service.fetch_log_rows_rate` | | meter | 1.5.0 |
      kyuubi backend service `fetchResults` method that fetch log rows rate
      | +| `kyuubi.backend_service.fetch_result_rows_rate` | | meter | 1.5.0 |
      kyuubi backend service `fetchResults` method that fetch result rows rate
      | +| `kyuubi.backend_service.get_primary_keys` | | meter | 1.6.0 |
      kyuubi backend service `get_primary_keys` method execution time and rate
      | +| `kyuubi.backend_service.get_cross_reference` | | meter | 1.6.0 |
      kyuubi backend service `get_cross_reference` method execution time and rate
      | +| `kyuubi.operation.state` | `${operationType}`
      `.${state}` | meter | 1.6.0 |
      The `${operationType}` with a particular `${state}` rate, e.g. `BatchJobSubmission.pending`, `BatchJobSubmission.finished`. Note that, the terminal states are cumulative, but the intermediate ones are not.
      | +| `kyuubi.metadata.request.opened` | | counter | 1.6.1 |
      current opened count for the metadata requests
      | +| `kyuubi.metadata.request.total` | | meter | 1.6.0 |
      metadata requests time and rate
      | +| `kyuubi.metadata.request.failed` | | meter | 1.6.0 |
      metadata requests failure time and rate
      | +| `kyuubi.metadata.request.retrying` | | meter | 1.6.0 |
      retrying metadata requests time and rate, it is not cumulative
      | +| `kyuubi.operartion.batch_pending_max_elapse` | | gauge | 1.10.1 |
      the batch pending max elapsed time on current kyuubi instance
      | Before v1.5.0, if you use these metrics: - `kyuubi.statement.total` diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala index e26033bf0f8..a493d7c4578 100644 --- a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala +++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala @@ -2023,6 +2023,19 @@ object KyuubiConf { .intConf .createWithDefault(65536) + val METADATA_SEARCH_WINDOW: OptionalConfigEntry[Long] = + buildConf("kyuubi.metadata.search.window") + .doc("The time window to restrict user queries to metadata within a specific period, " + + "starting from the current time to the past. It only affects `GET /api/v1/batches` API. " + + "You may want to set this to short period to improve query performance and reduce load " + + "on the metadata store when administer want to reserve the metadata for long time. " + + "The side-effects is that, the metadata created outside the window will not be " + + "invisible to users. If it is undefined, all metadata will be visible for users.") + .version("1.10.1") + .timeConf + .checkValue(_ > 0, "must be positive number") + .createOptional + val ENGINE_EXEC_WAIT_QUEUE_SIZE: ConfigEntry[Int] = buildConf("kyuubi.backend.engine.exec.pool.wait.queue.size") .doc("Size of the wait queue for the operation execution thread pool in SQL engine" + diff --git a/kyuubi-metrics/src/main/scala/org/apache/kyuubi/metrics/MetricsConstants.scala b/kyuubi-metrics/src/main/scala/org/apache/kyuubi/metrics/MetricsConstants.scala index f615467f3f0..336060e8f70 100644 --- a/kyuubi-metrics/src/main/scala/org/apache/kyuubi/metrics/MetricsConstants.scala +++ b/kyuubi-metrics/src/main/scala/org/apache/kyuubi/metrics/MetricsConstants.scala @@ -64,6 +64,7 @@ object MetricsConstants { final val OPERATION_TOTAL: String = OPERATION + "total" final val OPERATION_STATE: String = OPERATION + "state" final val OPERATION_EXEC_TIME: String = OPERATION + "exec_time" + final val OPERATION_BATCH_PENDING_MAX_ELAPSE: String = OPERATION + "batch_pending_max_elapse" final private val BACKEND_SERVICE = KYUUBI + "backend_service." final val BS_FETCH_LOG_ROWS_RATE = BACKEND_SERVICE + "fetch_log_rows_rate" diff --git a/kyuubi-server/src/main/scala/org/apache/kyuubi/operation/BatchJobSubmission.scala b/kyuubi-server/src/main/scala/org/apache/kyuubi/operation/BatchJobSubmission.scala index 129fbc8d9aa..aa17c5436a3 100644 --- a/kyuubi-server/src/main/scala/org/apache/kyuubi/operation/BatchJobSubmission.scala +++ b/kyuubi-server/src/main/scala/org/apache/kyuubi/operation/BatchJobSubmission.scala @@ -424,6 +424,14 @@ class BatchJobSubmission( Utils.deleteDirectoryRecursively(session.resourceUploadFolderPath.toFile) } } + + def getPendingElapsedTime: Long = { + if (state == OperationState.PENDING) { + System.currentTimeMillis() - createTime + } else { + 0L + } + } } object BatchJobSubmission { diff --git a/kyuubi-server/src/main/scala/org/apache/kyuubi/server/KyuubiRestFrontendService.scala b/kyuubi-server/src/main/scala/org/apache/kyuubi/server/KyuubiRestFrontendService.scala index 5b8b1686adc..b93dcf7b5b8 100644 --- a/kyuubi-server/src/main/scala/org/apache/kyuubi/server/KyuubiRestFrontendService.scala +++ b/kyuubi-server/src/main/scala/org/apache/kyuubi/server/KyuubiRestFrontendService.scala @@ -31,12 +31,14 @@ import org.eclipse.jetty.servlet.{ErrorPageErrorHandler, FilterHolder} import org.apache.kyuubi.{KyuubiException, Utils} import org.apache.kyuubi.config.KyuubiConf import org.apache.kyuubi.config.KyuubiConf._ +import org.apache.kyuubi.metrics.MetricsConstants.OPERATION_BATCH_PENDING_MAX_ELAPSE +import org.apache.kyuubi.metrics.MetricsSystem import org.apache.kyuubi.server.api.v1.ApiRootResource import org.apache.kyuubi.server.http.authentication.{AuthenticationFilter, KyuubiHttpAuthenticationFactory} import org.apache.kyuubi.server.ui.{JettyServer, JettyUtils} import org.apache.kyuubi.service.{AbstractFrontendService, Serverable, Service, ServiceUtils} import org.apache.kyuubi.service.authentication.{AuthTypes, AuthUtils} -import org.apache.kyuubi.session.{KyuubiSessionManager, SessionHandle} +import org.apache.kyuubi.session.{KyuubiBatchSession, KyuubiSessionManager, SessionHandle} import org.apache.kyuubi.util.{JavaUtils, ThreadUtils} import org.apache.kyuubi.util.ThreadUtils.scheduleTolerableRunnableWithFixedDelay @@ -202,6 +204,14 @@ class KyuubiRestFrontendService(override val serverable: Serverable) } } + private def getBatchPendingMaxElapse(): Long = { + val batchPendingElapseTimes = sessionManager.allSessions().map { + case session: KyuubiBatchSession => session.batchJobSubmissionOp.getPendingElapsedTime + case _ => 0L + } + if (batchPendingElapseTimes.isEmpty) 0L else batchPendingElapseTimes.max + } + def waitForServerStarted(): Unit = { // block until the HTTP server is started, otherwise, we may get // the wrong HTTP server port -1 @@ -220,6 +230,9 @@ class KyuubiRestFrontendService(override val serverable: Serverable) isStarted.set(true) startBatchChecker() recoverBatchSessions() + MetricsSystem.tracing { ms => + ms.registerGauge(OPERATION_BATCH_PENDING_MAX_ELAPSE, getBatchPendingMaxElapse, 0) + } } catch { case e: Exception => throw new KyuubiException(s"Cannot start $getName", e) } diff --git a/kyuubi-server/src/main/scala/org/apache/kyuubi/server/api/v1/BatchesResource.scala b/kyuubi-server/src/main/scala/org/apache/kyuubi/server/api/v1/BatchesResource.scala index e3e981abdc0..499110cf68c 100644 --- a/kyuubi-server/src/main/scala/org/apache/kyuubi/server/api/v1/BatchesResource.scala +++ b/kyuubi-server/src/main/scala/org/apache/kyuubi/server/api/v1/BatchesResource.scala @@ -67,6 +67,7 @@ private[v1] class BatchesResource extends ApiRequestContext with Logging { fe.getConf.get(ENGINE_SECURITY_ENABLED) private lazy val resourceFileMaxSize = fe.getConf.get(BATCH_RESOURCE_FILE_MAX_SIZE) private lazy val extraResourceFileMaxSize = fe.getConf.get(BATCH_EXTRA_RESOURCE_FILE_MAX_SIZE) + private lazy val metadataSearchWindow = fe.getConf.get(METADATA_SEARCH_WINDOW) private def batchV2Enabled(reqConf: Map[String, String]): Boolean = { fe.getConf.get(BATCH_SUBMITTER_ENABLED) && @@ -420,13 +421,15 @@ private[v1] class BatchesResource extends ApiRequestContext with Logging { s"The valid batch state can be one of the following: ${VALID_BATCH_STATES.mkString(",")}") } + val createTimeFilter = + math.max(createTime, metadataSearchWindow.map(System.currentTimeMillis() - _).getOrElse(0L)) val filter = MetadataFilter( sessionType = SessionType.BATCH, engineType = batchType, username = batchUser, state = batchState, requestName = batchName, - createTime = createTime, + createTime = createTimeFilter, endTime = endTime) val batches = sessionManager.getBatchesFromMetadataStore(filter, from, size, desc) new GetBatchesResponse(from, batches.size, batches.asJava) diff --git a/kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/MetadataManager.scala b/kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/MetadataManager.scala index 8d939aefc0c..9ca1948dec6 100644 --- a/kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/MetadataManager.scala +++ b/kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/MetadataManager.scala @@ -139,8 +139,13 @@ class MetadataManager extends AbstractService("MetadataManager") { from: Int, size: Int, desc: Boolean = false): Seq[Batch] = { - withMetadataRequestMetrics(_metadataStore.getMetadataList(filter, from, size, desc)).map( - buildBatch) + withMetadataRequestMetrics(_metadataStore.getMetadataList( + filter, + from, + size, + // if create_file field is set, order by create_time, which is faster, otherwise by key_id + orderBy = if (filter.createTime > 0) Some("create_time") else Some("key_id"), + direction = if (desc) "DESC" else "ASC")).map(buildBatch) } def countBatch( diff --git a/kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/MetadataStore.scala b/kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/MetadataStore.scala index d350050f142..b9fb52f779a 100644 --- a/kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/MetadataStore.scala +++ b/kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/MetadataStore.scala @@ -58,13 +58,16 @@ trait MetadataStore extends Closeable { * @param from the metadata offset. * @param size the size to get. * @param desc the order of metadata list. + * @param orderBy the order by column, default is the auto increment primary key, `key_id`. + * @param direction the order direction, default is `ASC`. * @return selected metadata list. */ def getMetadataList( filter: MetadataFilter, from: Int, size: Int, - desc: Boolean = false): Seq[Metadata] + orderBy: Option[String] = Some("key_id"), + direction: String = "ASC"): Seq[Metadata] /** * Count the metadata list with filter conditions. diff --git a/kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/jdbc/JDBCMetadataStore.scala b/kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/jdbc/JDBCMetadataStore.scala index e3db01f87eb..af965a220f6 100644 --- a/kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/jdbc/JDBCMetadataStore.scala +++ b/kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/jdbc/JDBCMetadataStore.scala @@ -257,15 +257,15 @@ class JDBCMetadataStore(conf: KyuubiConf) extends MetadataStore with Logging { filter: MetadataFilter, from: Int, size: Int, - desc: Boolean = false): Seq[Metadata] = { + orderBy: Option[String] = Some("key_id"), + direction: String = "ASC"): Seq[Metadata] = { val queryBuilder = new StringBuilder val params = ListBuffer[Any]() queryBuilder.append("SELECT ") queryBuilder.append(METADATA_COLUMNS) queryBuilder.append(s" FROM $METADATA_TABLE") queryBuilder.append(s" ${assembleWhereClause(filter, params)}") - queryBuilder.append(" ORDER BY key_id ") - queryBuilder.append(if (desc) "DESC " else "ASC ") + orderBy.foreach(o => queryBuilder.append(s" ORDER BY $o $direction ")) queryBuilder.append(dialect.limitClause(size, from)) val query = queryBuilder.toString JdbcUtils.withConnection { connection => diff --git a/kyuubi-server/src/test/scala/org/apache/kyuubi/server/api/v1/BatchesResourceSuite.scala b/kyuubi-server/src/test/scala/org/apache/kyuubi/server/api/v1/BatchesResourceSuite.scala index 1a376e93e5a..4d32de70262 100644 --- a/kyuubi-server/src/test/scala/org/apache/kyuubi/server/api/v1/BatchesResourceSuite.scala +++ b/kyuubi-server/src/test/scala/org/apache/kyuubi/server/api/v1/BatchesResourceSuite.scala @@ -486,6 +486,7 @@ abstract class BatchesResourceSuiteBase extends KyuubiFunSuite .queryParam("from", "0") .queryParam("size", "1") .queryParam("desc", "true") + .queryParam("createTime", "1") .request(MediaType.APPLICATION_JSON_TYPE) .header(AUTHORIZATION_HEADER, basicAuthorizationHeader("anonymous")) .get() From dc3ac89453f9dad3229e4bfbb35d88cb0bf33aad Mon Sep 17 00:00:00 2001 From: Cheng Pan Date: Thu, 5 Dec 2024 19:39:17 +0800 Subject: [PATCH 3/3] [KYUUBI #6839] Add example for service monitor in Helm chart ### Why are the changes needed? Adding an example to `servicemonitor` in values.yaml to improve the usability . ### How was this patch tested? execute command `helm install kyuubi-test .` check the servicemonitor after deployment ### Was this patch authored or co-authored using generative AI tooling? No Closes #6839 from zhifanggao/add_default_endpoints. Closes #6839 59aac4142 [Cheng Pan] Update charts/kyuubi/values.yaml 869f45e79 [zhifanggao] include the key itself Lead-authored-by: Cheng Pan Co-authored-by: zhifanggao <28681649@qq.com> Signed-off-by: Cheng Pan --- charts/kyuubi/values.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/charts/kyuubi/values.yaml b/charts/kyuubi/values.yaml index 1f35c9ba871..51c6717ec02 100644 --- a/charts/kyuubi/values.yaml +++ b/charts/kyuubi/values.yaml @@ -289,6 +289,8 @@ metrics: enabled: false # List of service endpoints serving metrics to be scraped by Prometheus, see Prometheus Operator docs for more details endpoints: [] + # endpoints: + # - port: prometheus # Additional labels to be used to make ServiceMonitor discovered by Prometheus labels: {}