[Improvement][common] get application id in SHELL scripts #4025

gabrywu · 2020-11-04T14:21:34Z

Describe the question
Currently, when a SHELL script runs a YARN job, we find the application IDs by scanning the logs with the regex 'application_\d+_\d+'.
I think this is ugly and has performance issues. So I suggest registering an aspect when the 'yarn jar' command is executed:
we can weave a join point at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication, capture the submitted application ID and the tracking URL there, and write them to a local file.
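For context, here is a minimal sketch of what the log-scanning approach amounts to. It is not the DolphinScheduler implementation: the class name `AppIdLogScanner` and the sample log lines are made up for illustration, and only the regex comes from this issue. It shows why every log line has to be read and matched just to recover a value that the YARN client already knows at submission time.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: collects distinct YARN application IDs from log text. */
final class AppIdLogScanner {
    // The regex quoted in this issue: application_<digits>_<digits>.
    private static final Pattern APP_ID = Pattern.compile("application_\\d+_\\d+");

    /** Returns each distinct application ID in order of first appearance. */
    static List<String> extract(String logText) {
        Set<String> ids = new LinkedHashSet<>();
        Matcher matcher = APP_ID.matcher(logText);
        while (matcher.find()) {
            ids.add(matcher.group());
        }
        return List.copyOf(ids);
    }

    public static void main(String[] args) {
        // Made-up sample log lines, for illustration only.
        String log = "INFO client: Submitted application application_1604499694000_0001\n"
                   + "INFO job: Running job: job_1604499694000_0001\n"
                   + "INFO client: Application report for application_1604499694000_0001\n";
        System.out.println(extract(log));
    }
}
```

Note that this only works if the client actually logs the ID, which is why the log level has to stay at INFO.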

What are the current deficiencies and the benefits of improvement

deficiency:
requires the aspectjweaver-1.9.6.jar file, which is about 2 MB
benefit:
no need to scan the whole log with the regex 'application_\d+_\d+'.
no need to restrict the yarn client log level to INFO

Which version of DolphinScheduler:

all versions

Describe alternatives you've considered

Add the following two environment variables to the global environment:
export YARN_CLIENT_OPTS="-javaagent:/pathto/aspectjweaver-1.9.6.jar"

export YARN_USER_CLASSPATH=/pathto/Aop2YarnClient-1.0-SNAPSHOT.jar
Then, when an application is submitted to the YARN cluster, the aspect in Aop2YarnClient-1.0-SNAPSHOT.jar is registered, and we can get the submitted application ID and the tracking URL.

This is an example; it just prints the application ID to the console.

Here is the sample code

The solution works for Hive, Spark, Flink, and other tools that run on a YARN cluster. A test with hive -e 'hive sql' passed.


CalvinKirs · 2020-11-04T14:47:24Z

I think this is a good idea

gabrywu · 2020-11-07T04:01:28Z

Here is a public repo that implements this: https://github.com/gabrywu/Aop2YarnClient

xiejiajun · 2020-11-09T05:52:56Z

This will not be able to fetch the applicationId when the SQL is submitted through HiveServer2. Should we consider storing the appId information in public storage? @gabrywu

gabrywu · 2020-11-18T06:33:39Z

> This will not be able to fetch the applicationId when the SQL is submitted through HiveServer2. Should we consider storing the appId information in public storage? @gabrywu

Do you have any good ideas to resolve it? @xiejiajun

xiejiajun · 2020-11-19T07:47:35Z

>> This will not be able to fetch the applicationId when the SQL is submitted through HiveServer2. Should we consider storing the appId information in public storage? @gabrywu

> Do you have any good ideas to resolve it? @xiejiajun

I thought about writing the appId to public storage such as MySQL, but that would introduce additional third-party service configuration such as a JdbcUrl, so we still need to think about it carefully.

gabrywu · 2020-11-29T04:06:07Z

>>> This will not be able to fetch the applicationId when the SQL is submitted through HiveServer2. Should we consider storing the appId information in public storage? @gabrywu

>> Do you have any good ideas to resolve it? @xiejiajun

> I thought about writing the appId to public storage such as MySQL, but that would introduce additional third-party service configuration such as a JdbcUrl, so we still need to think about it carefully.

Yes, so the example project just writes it to a local file.
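As a rough illustration of the local-file handoff, here is a sketch using only the JDK. It is not the code from Aop2YarnClient: the class name `AppIdFile`, the one-record-per-line "appId<TAB>trackingUrl" format, and the sample URL are all assumptions. The idea is that the woven aspect would append a record at submission time, and the worker would read the IDs back instead of scanning logs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical handoff file: one "appId<TAB>trackingUrl" record per line. */
final class AppIdFile {
    /** Appends one record, creating the file if it does not exist yet. */
    static void append(Path file, String appId, String trackingUrl) {
        String line = appId + "\t" + trackingUrl + System.lineSeparator();
        try {
            Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back the application IDs; a missing file yields an empty list. */
    static List<String> readAppIds(Path file) {
        List<String> ids = new ArrayList<>();
        if (!Files.exists(file)) {
            return ids;
        }
        try {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                if (!line.isBlank()) {
                    ids.add(line.split("\t", 2)[0]); // first field is the app ID
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ids;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("yarn-app-ids", ".txt");
        // Made-up ID and tracking URL, for illustration only.
        append(file, "application_1604499694000_0001",
                "http://rm.example:8088/proxy/application_1604499694000_0001/");
        System.out.println(readAppIds(file));
        Files.delete(file);
    }
}
```

A file like this is only visible on the host where the YARN client runs, which is exactly the HiveServer2 limitation raised above.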

ruanwenjun · 2022-07-29T07:03:41Z

@caishunfeng

gabrywu self-assigned this Nov 4, 2020

gabrywu added the discussion, enhancement, and suggestion labels Nov 4, 2020

gabrywu changed the title ~~[Improvement][common] Improvement title~~ [Improvement][common] get application id in SHELL script Nov 5, 2020

gabrywu changed the title ~~[Improvement][common] get application id in SHELL script~~ [Improvement][common] get application id in SHELL scripts Nov 5, 2020

ruanwenjun mentioned this issue Aug 3, 2022

[Improvement][Task] Improved way to collect yarn job's appIds #11262

Closed

3 tasks

William-GuoWei mentioned this issue Jun 21, 2023

[Snyk] Fix for 1 vulnerabilities Realtime-BigData/EasyScheduler#199

Open

ruanwenjun removed the suggestion label Oct 25, 2023

SbloodyS closed this as completed Jul 18, 2024
