[FLINK-17130][web] Enable listing JM Logs and displaying Logs by filename #11731

Closed
wants to merge 3 commits into from

vthinkxie
Contributor

What is the purpose of the change

ref https://issues.apache.org/jira/browse/FLINK-17130

Brief change log

add JM log list
add JM log detail

Verifying this change

  • the job manager log list
  • the job manager log detail
  • check the download button
  • check the reload button
  • check the fullscreen button

before:

image

after:

list:
image

log-detail from the list:

image

full screen:
image

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not documented)

@flinkbot
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 2a9f46b (Tue Apr 14 09:34:38 UTC 2020)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!
  • This pull request references an unassigned Jira ticket. According to the code contribution guide, tickets need to be assigned before starting with the implementation work.

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@vthinkxie
Contributor Author

@simplejason do you have time to take a look?

@flinkbot
Collaborator

flinkbot commented Apr 14, 2020

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

Contributor

@simplejason simplejason left a comment

Everything works as expected, the PR looks good to me, I just left a few remarks.

Comment on lines 42 to 54
.breadcrumb {
  background: @component-background;
  border-bottom: 1px solid @border-color-split;
  margin-bottom: 16px;
  padding: 12px 24px;
  position: relative;
  display: block;
}

flink-refresh-download {
  position: absolute;
  right: 12px;
  top: 0;
  line-height: 47px;
Contributor

We seem to have duplicated some work from the Task Manager log list. Can we make this code reusable?

Contributor Author

These two components look very similar, but their styles are not the same (some borders and the full-screen style need to be adjusted for the different layouts).
They may also gain different features in the future, like #10228, so I think it would be better not to create a shared component for them.

(reload)="reload()"
(fullScreen)="toggleFullScreen($event)">
</flink-refresh-download>
</div>
<flink-monaco-editor [value]="logs"></flink-monaco-editor>
Contributor

When I scroll the log content, a border appears at the top of the editor, I think it would be nice to hide it, but it's also ok if we want to keep it :)

image

Contributor Author

@vthinkxie vthinkxie Apr 23, 2020

fixed
image

@GJL GJL self-assigned this Apr 21, 2020
Contributor

@simplejason simplejason left a comment

LGTM

@GJL
Member

GJL commented Apr 27, 2020

Feature works as expected but I have some remarks regarding the usability.

If a user starts and stops the cluster multiple times locally, the log list gets confusing quickly, i.e., it is difficult to find the latest/active log file. See screenshot below:

image

You can simulate this yourself by running:

for i in {0..10}; do bin/start-cluster.sh && bin/stop-cluster.sh; done

For task manager logs it is similar but perhaps a bit better since there is a way to access the "main TM log" directly from the sub vertex page.

Do you have any ideas how this can be improved? FLINK-16863 would not fully solve the above problem; if a user has many local TMs running, it could still be difficult to find the current/active JM log.

cc: @jinglining

@vthinkxie
Contributor Author

vthinkxie commented Apr 27, 2020

Hi @GJL

I had a discussion with @jinglining. What about the following solutions?

  1. Split the taskmanager and jobmanager logs into different folders so that they do not get mixed together in standalone mode (which may be a breaking change?).

  2. Add the lastModified field described in FLINK-16863 and a current: boolean field to each list item.

Here, current means that the file is the same one the previous API (such as /jobmanager/stdout) returns. I can add a highlight or a tag in the frontend to point users to it.

We can get it from:

public static class LogFileLocation {
    public final File logFile;
    public final File stdOutFile;
    public final File logDir;
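
As a rough illustration of the second proposal, here is a minimal TypeScript sketch of how the frontend might use the two fields. Only `lastModified` and `current` come from the proposal; the interface name `LogListItem`, the `name` and `size` fields, and the helper `sortLogs` are hypothetical and not taken from the PR.

```typescript
// Hypothetical shape of one entry in the JM log list.
// Only `lastModified` and `current` are part of the proposal; the rest is assumed.
interface LogListItem {
  name: string;
  size: number; // assumed to be in bytes
  lastModified: number; // assumed to be epoch milliseconds
  current: boolean; // true for the file served by the old log/stdout endpoints
}

// Returns a sorted copy: current files first, then most recently modified first.
function sortLogs(items: LogListItem[]): LogListItem[] {
  return [...items].sort((a, b) => {
    if (a.current !== b.current) {
      return a.current ? -1 : 1;
    }
    return b.lastModified - a.lastModified;
  });
}
```

With such an ordering, the active log would appear at the top of the list even after many restarts, and the `current` flag could additionally drive a highlight or tag in the table.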

@GJL
Member

GJL commented Apr 27, 2020

First of all, I am sorry that I did not see this problem earlier.

  1. Split the taskmanager and jobmanager logs into different folders so that they do not get mixed together in standalone mode (which may be a breaking change?).

  2. Add the lastModified field described in FLINK-16863 and a current: boolean field to each list item.

It would mitigate the issue. However, the user would still see additional log files belonging to other processes, e.g., when users are running multiple standalone TMs on the same host. Ideally, only files owned by the respective TM/JM process are listed. Therefore, an extension of your idea that @tillrohrmann had is to put all logs belonging to a single process into a separate directory. However, for a user who wants to configure custom log files, it could be difficult to predict the name of the log directory, which would render the whole feature useless.
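
To make the "only files owned by the respective process" idea concrete, here is a minimal TypeScript sketch that filters a shared log directory listing by file name. It assumes the default standalone naming scheme `flink-<user>-<daemon>-<id>-<host>.log` (with `.out` and rotated variants); the function `filesOwnedBy` and the sample file names are hypothetical, and nothing here is part of the PR.

```typescript
// Assumption: standalone log files are named
//   flink-<user>-<daemon>-<id>-<host>.log (or .out, or with a rotation suffix),
// where <daemon> is for example "standalonesession" or "taskexecutor".
// Keeps only the files whose name carries the given daemon type and id.
function filesOwnedBy(fileNames: string[], daemon: string, id: number): string[] {
  const marker = `-${daemon}-${id}-`;
  return fileNames.filter(name => name.startsWith('flink-') && name.includes(marker));
}

// Hypothetical directory listing shared by one JM and two TMs on the same host.
const sampleFiles = [
  'flink-alice-standalonesession-0-host1.log',
  'flink-alice-standalonesession-0-host1.out',
  'flink-alice-taskexecutor-0-host1.log',
  'flink-alice-taskexecutor-1-host1.log',
  'flink-alice-standalonesession-0-host1.log.1',
];

const jmFiles = filesOwnedBy(sampleFiles, 'standalonesession', 0);
const tm1Files = filesOwnedBy(sampleFiles, 'taskexecutor', 1);
```

Name-based filtering like this shares the weakness of the per-process directory idea: it only works while file names stay predictable, so a custom logging configuration could defeat it.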

A (temporary) workaround could be to always show the main log and stdout file by default and make the log list a separate page. This would allow us to roll out the feature while not changing existing behavior.

I think we should not make a decision on our own. Maybe it is acceptable to see other processes' log files. How about we bring this issue up on the dev mailing list and see what others are saying (especially the developers that voted on FLIP-103 in the first place)?

@vthinkxie
Contributor Author

vthinkxie commented May 11, 2020

I have reverted the deletion of the stdout and logs pages.
cc @GJL
image
image
image

Contributor

@simplejason simplejason left a comment

LGTM.
Stdout and Logs work for me.

@GJL GJL closed this in 74b850c May 12, 2020
klion26 pushed a commit to klion26/flink that referenced this pull request May 21, 2020