[TASK][EASY] Spark Engine should throw an exception when the engine fails to start. #5331

Closed
3 of 4 tasks
ASiegeLion opened this issue Sep 26, 2023 · 4 comments
Labels: kind:bug, priority:major

Comments

@ASiegeLion
Contributor

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

In the main method of SparkSQLEngine, exceptions are caught but not rethrown. As a result, if the engine fails to start, K8s still shows the status as finished, and the Kyuubi server keeps waiting until the engine initialization times out.
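
The pattern described above can be illustrated with a minimal sketch. It does not reproduce the actual SparkSQLEngine code: `EngineStartSketch`, `startEngine`, `swallowingMain`, and `rethrowingMain` are hypothetical names assumed only for illustration. The first variant logs the failure and returns normally, so whoever launched it sees nothing wrong; the second rethrows, so the failure propagates to the caller.

```scala
import scala.util.Try

object EngineStartSketch {
  // Hypothetical stand-in for engine startup; it always fails here.
  def startEngine(): Unit =
    throw new IllegalStateException("engine failed to start")

  // Variant 1: the exception is caught and only logged.
  // The caller sees a normal return and cannot tell that startup failed.
  def swallowingMain(): Unit = {
    try {
      startEngine()
    } catch {
      case e: Exception =>
        System.err.println(s"Failed to start engine: ${e.getMessage}")
    }
  }

  // Variant 2: the exception is logged and then rethrown,
  // so the failure propagates to the caller.
  def rethrowingMain(): Unit = {
    try {
      startEngine()
    } catch {
      case e: Exception =>
        System.err.println(s"Failed to start engine: ${e.getMessage}")
        throw e
    }
  }

  def main(args: Array[String]): Unit = {
    println(s"swallowing variant failed: ${Try(swallowingMain()).isFailure}")
    println(s"rethrowing variant failed: ${Try(rethrowingMain()).isFailure}")
  }
}
```

Only the rethrowing variant gives the caller anything to react to, which is the behavior this issue asks for.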

Affects Version(s)

master/1.7.1

Kyuubi Server Log Output

No response

Kyuubi Engine Log Output

No response

Kyuubi Server Configurations

No response

Kyuubi Engine Configurations

No response

Additional context

No response

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
  • No. I cannot submit a PR at this time.
@ASiegeLion ASiegeLion added the kind:bug and priority:major labels Sep 26, 2023
@pan3793
Member

pan3793 commented Sep 26, 2023

Thanks for reporting this issue. It seems this is not limited to K8s but is a generic issue, right?

PS: Please use text as much as possible. Pictures are not searchable, which makes it hard for future readers to find this issue.

@ASiegeLion
Contributor Author

If the createSpark method fails, YARN seems to be able to detect the failure, but K8s cannot. If the startEngine method fails, neither of them detects it.

@cxzl25
Contributor

cxzl25 commented Sep 26, 2023

> YARN seems to be able to know

If the engine does not throw an exception, the final status reported in YARN is also SUCCEEDED.

org.apache.spark.deploy.yarn.ApplicationMaster

            mainMethod.invoke(null, userArgs.toArray)
            finish(FinalApplicationStatus.SUCCEEDED, ApplicationMaster.EXIT_SUCCESS)

https://github.com/apache/spark/blob/9a44dc4f9ee005f8e82f15bd731c9c870cc4a606/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala#L738-L739

For example, setting a non-existent catalog causes initialization to fail in Spark YARN cluster mode.

spark.sql.catalog.spark_catalog=NOT_EXIST_CATALOG
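
The behavior in the ApplicationMaster snippet above can be mimicked with plain reflection. The sketch below is not Spark code: `QuietFailure`, `LoudFailure`, and `Launcher` are hypothetical names, the status strings only echo the `SUCCEEDED` value quoted above, and an instance method named `main` stands in for the static `main` that Spark invokes. It shows that a launcher which calls an entry point reflectively can report a failure only when the invocation throws.

```scala
import java.lang.reflect.InvocationTargetException

// Hypothetical application whose entry point catches its own startup failure.
class QuietFailure {
  def main(args: Array[String]): Unit =
    try {
      throw new RuntimeException("catalog NOT_EXIST_CATALOG could not be loaded")
    } catch {
      case e: RuntimeException => System.err.println(e.getMessage)
    }
}

// Hypothetical application whose entry point lets the failure escape.
class LoudFailure {
  def main(args: Array[String]): Unit =
    throw new RuntimeException("catalog NOT_EXIST_CATALOG could not be loaded")
}

object Launcher {
  // Invoke the entry point reflectively and derive a final status
  // purely from whether the call returned normally or threw.
  def run(app: AnyRef): String = {
    val mainMethod = app.getClass.getMethod("main", classOf[Array[String]])
    try {
      mainMethod.invoke(app, Array.empty[String])
      "SUCCEEDED"
    } catch {
      // Reflection wraps whatever the invoked method threw.
      case _: InvocationTargetException => "FAILED"
    }
  }

  def main(args: Array[String]): Unit = {
    println(s"QuietFailure -> ${run(new QuietFailure)}")
    println(s"LoudFailure  -> ${run(new LoudFailure)}")
  }
}
```

In this sketch the launcher reports `SUCCEEDED` for `QuietFailure` even though startup failed, because the exception never leaves the entry point.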

@ASiegeLion
Contributor Author

You are right. The exception I saw was a failure to start the ApplicationMaster, not the Spark engine, so I mistakenly thought that YARN was able to detect the failure.

@ASiegeLion ASiegeLion changed the title from "[Bug] Spark Engine should throw an exception to let K8s know when the engine fails to start." to "[Bug] Spark Engine should throw an exception to let K8s or Yarn know when the engine fails to start." Sep 26, 2023
@pan3793 pan3793 changed the title from "[Bug] Spark Engine should throw an exception to let K8s or Yarn know when the engine fails to start." to "[TASK][EASY][Bug] Spark Engine should throw an exception when the engine fails to start." Nov 13, 2023
pan3793 pushed a commit that referenced this issue Nov 13, 2023
…o start

### _Why are the changes needed?_

Close #5331

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

No.

Closes #5332 from ASiegeLion/master.

Closes #5331

21342f5 [sychen] wrap InterruptedException
1f2542c [sychen] fix UT
e433b54 [liupeiyue] [KYUUBI #5331]Spark Engine should throw an exception to let K8s know when the engine fails to start

Lead-authored-by: liupeiyue <[email protected]>
Co-authored-by: sychen <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
(cherry picked from commit 7c5f583)
Signed-off-by: Cheng Pan <[email protected]>
@pan3793 pan3793 changed the title from "[TASK][EASY][Bug] Spark Engine should throw an exception when the engine fails to start." to "[TASK][EASY] Spark Engine should throw an exception when the engine fails to start." Dec 11, 2023
beryllw pushed a commit to beryllw/incubator-kyuubi that referenced this issue Jan 11, 2024
[Backport][KYUUBI apache#5331] Spark engine should throw an exception when it fails to start

See merge request !38
pan3793 added a commit that referenced this issue Feb 22, 2024
…se notes

# 🔍 Description
## Issue References 🔗

Currently, we use a rather primitive way to manually write release notes from scratch, and some of the mechanical and repetitive work can be simplified by the scripts.

## Describe Your Solution 🔧

Adds a script to simplify the process of creating release notes.

Note: it just simplifies some processes, the release manager still needs to tune the outputs by hand.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

```
RELEASE_TAG=v1.8.1 PREVIOUS_RELEASE_TAG=v1.8.0 build/release/pre_gen_release_notes.py
```

```
$ head build/release/commits-v1.8.1.txt
[KYUUBI #5981] Deploy Spark Hive connector with Scala 2.13 to Maven Central
[KYUUBI #6058] Make Jetty server stop timeout configurable
[KYUUBI #5952][1.8] Disconnect connections without running operations after engine maxlife time graceful period
[KYUUBI #6048] Assign serviceNode and add volatile for variables
[KYUUBI #5991] Error on reading Atlas properties composed of multi values
[KYUUBI #6045] [REST] Sync the AdminRestApi with the AdminResource Apis
[KYUUBI #6047] [CI] Free up disk space
[KYUUBI #6036] JDBC driver conditional sets fetchSize on opening session
[KYUUBI #6028] Exited spark-submit process should not block batch submit queue
[KYUUBI #6018] Speed up GetTables operation for Spark session catalog
```

```
$ head build/release/contributors-v1.8.1.txt
* Shaoyun Chen        -- [KYUUBI #5857][KYUUBI #5720][KYUUBI #5785][KYUUBI #5617]
* Chao Chen           -- [KYUUBI #5750]
* Flyangz             -- [KYUUBI #5832]
* Pengqi Li           -- [KYUUBI #5713]
* Bowen Liang         -- [KYUUBI #5730][KYUUBI #5802][KYUUBI #5767][KYUUBI #5831][KYUUBI #5801][KYUUBI #5754][KYUUBI #5626][KYUUBI #5811][KYUUBI #5853][KYUUBI #5765]
* Paul Lin            -- [KYUUBI #5799][KYUUBI #5814]
* Senmiao Liu         -- [KYUUBI #5969][KYUUBI #5244]
* Xiao Liu            -- [KYUUBI #5962]
* Peiyue Liu          -- [KYUUBI #5331]
* Junjie Ma           -- [KYUUBI #5789]
```
---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6074 from pan3793/release-script.

Closes #6074

3d5ec20 [Cheng Pan] credits
1765279 [Cheng Pan] Add a script to simplify the process of creating release notes

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
zhaohehuhu pushed a commit to zhaohehuhu/incubator-kyuubi that referenced this issue Mar 21, 2024
beryllw pushed a commit to beryllw/incubator-kyuubi that referenced this issue Jun 7, 2024
beryllw pushed a commit to beryllw/incubator-kyuubi that referenced this issue Jun 7, 2024
[KYUUBI apache#5331] Spark engine should throw an exception when it fails to start

See merge request !43