Add dag view to display dagrun history per dag level #19921

Closed
wants to merge 4 commits into from

Acehaidrey
Contributor

@Acehaidrey Acehaidrey commented Dec 1, 2021

At Airflow Summit 2021, we presented some additional views and tools we built to give users quicker insight into their workflows. The full talk is here: https://airflowsummit.org/sessions/2021/usability-improvements-debugging-inspection-tooling/.
This PR tackles the dagrun history view, which lists the dagruns that were run for a dag, with additional information about each run, formatted in a more readable way. It is similar to the dagrun tab in the webserver but simpler for the user, since it is already at the dag view. We also show duration to help identify outliers. Users can also follow a link to all the records, which goes to the dagrun tab. A final column shows the run type prefix, i.e. scheduled, backfill, manual, or other.
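As a rough illustration of the run type column described above, here is a minimal sketch that derives the type from a run ID prefix. It assumes run IDs follow a `<type>__<timestamp>` pattern; the function name `run_type_from_run_id` and the constant `KNOWN_RUN_TYPES` are hypothetical and are not taken from this PR's code.

```python
# Hypothetical helper, not the PR's implementation.
# Assumption: run IDs look like "<type>__<timestamp>", e.g. "manual__2021-11-30T01:49:50+00:00".
KNOWN_RUN_TYPES = ("scheduled", "backfill", "manual")


def run_type_from_run_id(run_id: str) -> str:
    """Return the run type implied by the run ID prefix, or "other"."""
    prefix, separator, _ = run_id.partition("__")
    # Require the "__" separator so a bare word like "manual" is not misclassified.
    if separator and prefix in KNOWN_RUN_TYPES:
        return prefix
    return "other"
```

Anything that does not match one of the three known prefixes falls into "other", which mirrors the four categories listed in the description.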

We had feedback asking for this to be shared, so this is the first part of adding new views. We welcome any feedback on how this view can be made more useful for everyone.

Screen Shot 2021-11-30 at 1 49 50 AM

Screen Shot 2021-11-30 at 1 50 00 AM

Screen Shot 2021-11-30 at 1 51 37 AM



@boring-cyborg boring-cyborg bot added area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues labels Dec 1, 2021
@bbovenzi
Contributor

bbovenzi commented Dec 1, 2021

I believe that the new tree view and the existing dag runs table already achieve all of the use cases of this page. But I don't think the duration is calculated well in the dag runs table, so it would be great to use the td_format() function there.
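For readers who have not seen the diff, a duration formatter of the kind discussed here could look roughly like the sketch below. This is not the PR's td_format() implementation: the name `format_duration`, the accepted input types, and the `1h:2m:32s` output style are all assumptions made for illustration.

```python
from datetime import timedelta
from typing import Optional, Union


def format_duration(value: Union[None, timedelta, float, int]) -> Optional[str]:
    """Render a duration as a compact string (hypothetical helper, not td_format itself)."""
    if value is None:
        return None
    # Accept either a timedelta or a plain number of seconds.
    total = value.total_seconds() if isinstance(value, timedelta) else float(value)
    if total < 1:
        return "<1s"
    remaining = int(total)
    parts = []
    # Peel off days, hours, minutes, and seconds, skipping any unit that is zero.
    for suffix, size in (("d", 86400), ("h", 3600), ("m", 60), ("s", 1)):
        count, remaining = divmod(remaining, size)
        if count:
            parts.append(f"{count}{suffix}")
    return ":".join(parts)
```

With this sketch, `format_duration(timedelta(seconds=3752))` gives `1h:2m:32s`, which is easier to scan in a table column than a raw number of seconds.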

@hterik
Contributor

hterik commented Dec 1, 2021

By coincidence I created almost exactly the same thing; see #19936 for an alternative approach. Seems like a popular feature request :)

@Acehaidrey
Contributor Author

Hi @bbovenzi, thanks for the reply. I think I understand why you feel this can be redundant in some ways, but the bigger issue is that there is no easy way to see all this history. Even in your case, embedding the dagrun view in a tab for the dag would be okay too, but users come to Airflow wanting to identify which runs are taking longer, what the anomalies are, etc.
We have many users who get health check alerts and then come to see which of the past workflows may be having issues before they click through each graph.

The other reason I worked on this is that, in the poll, over 10 people voted that they'd like this feature. And somehow on the same day @hterik also had the same idea :). That suggests this is still, for whatever reason, not easy for users. I think it's an opportunity for improvement.

On another note, if you like the td_format function, I can create a separate PR for just that and apply it to the dagrun view, but I only want to do that if you feel it's helpful and would get merged.

Lastly, we have also done this for the audit log, for example, and have added a plot of dagrun times to the duration history (it's in the same duration tab, under the tasks). Do you feel these would be desired? Are there any plans these would duplicate or conflict with? I know our org uses them a lot, and I have an initiative to get another 2 PRs in by the end of this year that will be helpful to the greater community, which I'm focused on making happen. See the images attached.
CC @potiuk in case you have thoughts too. Thanks all -

Screen Shot 2021-12-01 at 2 12 09 PM

Screen Shot 2021-12-01 at 2 12 49 PM

@bbovenzi
Contributor

bbovenzi commented Dec 2, 2021

Yes, let's do a separate pr for td_format. Another PR for the dag audit log also makes sense to me.

I am totally open to ways to improve how users understand their dag runs. I think we should leverage the new tree view to do so. Like, there should be an easy way to see the entire history at once. Or even collapse the task instances part and expand the bar chart of dag runs to the full view. Both can be done in-view as opposed to adding another tab.

I would also say that a bar chart is a better representation of this data than a line graph, so a user can identify the specific problematic dag run. We probably need to get around to updating the task duration chart too.

@Acehaidrey
Contributor Author

Hi @bbovenzi thanks for the reply -
I will do a separate PR for td_format, but do you know which areas it should be applied to? I just want to get a head start from you to verify :)

As for the audit logs view, I can add something if you feel it would be helpful. But if you all think it's not, that is okay with me too - I know all organizations run in their own way.

I did this PR because the poll we sent out had it as a higher priority. If you have screenshots or video clips of how the new tree view will handle this, I would love to see them. I couldn't find a way to see it with the latest version. I agree that if it can be in that view, that is even more concise and clear. I want to see if we can leverage that too.

Lastly, regarding the dagrun history duration: yes, I think the duration view could use some help. It can definitely be improved, but our customers like the table views more than the graph visualization, so we haven't done a whole lot of work there, just a way to have improved filtering.

I can add that and see if a bar graph would be able to show each day, but for us, seeing the change from day to day helps spot the outliers too. So both would work.

@Acehaidrey
Contributor Author

If that is the case, then for td_format I'd just apply it to the dagrun list column.

@Acehaidrey
Contributor Author

#20112

@bbovenzi if you can review

kaxil pushed a commit that referenced this pull request Dec 15, 2021
This PR came out of the review for: #19921. There was a review for adding an updated view for displaying the duration column in a more readable way.
@Acehaidrey
Contributor Author

Going to abandon this.

@Acehaidrey Acehaidrey closed this Dec 15, 2021
@Acehaidrey
Contributor Author

@bbovenzi here is the audit log PR btw: #20733
