The wayfire 0.8.0 expo plugin has a noticeable drop in frame rate and feels sluggish compared to 0.7.5. #1940

Closed
xiaohuirong opened this issue Oct 7, 2023 · 44 comments · Fixed by #2132

@xiaohuirong

Describe the bug
In wayfire 0.8.0, when using a keybinding to activate the expo plugin, there is a noticeable frame rate drop and sluggishness in the transition animations. In version 0.7.5, it felt much smoother.

To Reproduce
Steps to reproduce the behavior:

  1. Press key to activate the expo plugin.
  2. Animation lag and frame drops occur.

Expected behavior
The animations should be as smooth as in version 0.7.5.

Videos
0.8.0
https://drive.google.com/file/d/1k_rOv_si34aRSJ_8AZKmQpe6tEozLntO/view?usp=sharing

0.7.5
https://drive.google.com/file/d/1CGO2i3h6iyYDEeCBnUVsecDX-MH7miWL/view?usp=sharing

Wayfire version
0.8.0, git, built from commit 2059459

xiaohuirong added the bug label Oct 7, 2023

@ammen99
Member

ammen99 commented Oct 7, 2023

This is weird, because I supposedly optimized expo .. if you look at htop/intel_gpu_top/radeontop/whatever, does it seem like there is a cpu or a gpu bottleneck? Or neither? Do you have core/max_render_time set (and to what value)?

@xiaohuirong
Author

I haven't set core/max_render_time, should I set it to a reasonable value?

@ammen99
Member

ammen99 commented Oct 7, 2023

Well, if it is set to -1 (default), it should behave optimally wrt. framerate.

@ammen99
Member

ammen99 commented Oct 7, 2023

After playing around a bit with Expo, I managed to get it to stutter and use a lot of GPU power. However, I restarted Wayfire and the problem is gone, and I cannot figure out how to trigger it again .. Do you do anything specific after starting Wayfire? Any other plugins, maybe blur or similar?

@xiaohuirong
Author

I didn't use the 'blur' plugin; both tests were conducted with the same configuration file.

@soreau
Member

soreau commented Oct 7, 2023

If you use the bench plugin from wayfire-plugins-extra, do you notice a drop in framerate?

@xiaohuirong
Author

I'm using the wayfire-plugins-extra version that corresponds to the Wayfire version. It seems that the old version of the bench plugin would make the display frame rate reach its maximum value.

@soreau
Member

soreau commented Oct 7, 2023

Yes, there were changes to the way bench works, especially in light of vrr enabled outputs.

@xiaohuirong
Author

I found that running some GPU-intensive programs, such as https://github.com/amarao/fpscount, helps alleviate the sluggishness.

@xiaohuirong
Author

Testing some historical builds of wayfire, it seems that the sluggishness started to occur after the commit d7a9285.

@ammen99
Member

ammen99 commented Oct 7, 2023

@xiaohuirong It is possible that Wayfire's improvements actually made the GPU do less work, which causes the GPU to downclock .. which then causes missed frames. GNOME has a similar problem, see this PR: https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1441

Maybe try forcing your GPU to stay at maximum frequency to verify that this is the case.
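
On Intel integrated graphics, one way to run this check is to raise the minimum GPU frequency to the hardware maximum through sysfs. The following is a minimal sketch, assuming the i915 driver exposes `gt_RP0_freq_mhz`, `gt_max_freq_mhz` and `gt_min_freq_mhz` under `/sys/class/drm/card0` (the card index may differ on your machine) and that the function is called as root. The function name `pin_gpu_to_max` is made up for this illustration.

```python
from pathlib import Path


def pin_gpu_to_max(card_dir="/sys/class/drm/card0"):
    """Raise the minimum GPU frequency to the hardware maximum (RP0).

    Assumption: card_dir contains the i915 frequency files named below.
    """
    card = Path(card_dir)
    # gt_RP0_freq_mhz reports the highest frequency the hardware supports.
    rp0 = (card / "gt_RP0_freq_mhz").read_text().strip()
    # Raise the ceiling first, then the floor, so min never exceeds max.
    (card / "gt_max_freq_mhz").write_text(rp0)
    (card / "gt_min_freq_mhz").write_text(rp0)
    return int(rp0)
```

Calling `pin_gpu_to_max()` returns the pinned frequency in MHz. The setting does not survive a reboot, and the previous `gt_min_freq_mhz` value should be noted beforehand if you want to restore it.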

@xiaohuirong
Author

I set my Intel integrated graphics to work at the maximum frequency, and the sluggishness issue was indeed alleviated. I compared the usage of Render/3D when continuously activating expo at the maximum GPU frequency and found that version 0.8.0 was around 23%, while version 0.7.5 was only around 15%. It seems that the older version of wayfire has better performance?

@soreau
Member

soreau commented Oct 7, 2023

You can also try the showrepaint plugin to see what is being damaged/repainted and how often.

@ammen99
Member

ammen99 commented Oct 7, 2023

> I set my Intel integrated graphics to work at the maximum frequency, and the sluggishness issue was indeed alleviated. I compared the usage of Render/3D when continuously activating expo at the maximum GPU frequency and found that version 0.8.0 was around 23%, while version 0.7.5 was only around 15%. It seems that the older version of wayfire has better performance?

This is interesting. Here, intel_gpu_top reports lower Hz and lower usage (%) with newer Wayfire (and it runs smoothly, 60fps).

@xiaohuirong
Author

> You can also try the showrepaint plugin to see what is being damaged/repainted and how often.

I tried showrepaint, and it seems that in version 0.7.5, the flickering frequency is higher. It's worth noting that I have two screens, and in version 0.8.0, when I activate expo on the first screen, the second screen also undergoes repainting, whereas this issue does not occur in 0.7.5.

@xiaohuirong
Author

How can I obtain the number of repaints? It's flickering too quickly for me to visually observe the repaint count.

@soreau
Member

soreau commented Oct 7, 2023

You should be able to use bench plugin to get a rough estimate of the framerate. I have updated it (master branch) so it works with 0.8.0 expo.

Set average frames to 1 for instant, and frames per update to 1 as well, for 0.7.5.
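
To illustrate what averaging over a number of frames means for a frame rate readout, here is a small sketch of a rolling-average meter. It is not the bench plugin's code; the class name `FrameRateMeter` and its `average_frames` parameter are hypothetical and only mirror the idea that a window of 1 gives an instantaneous value.

```python
from collections import deque


class FrameRateMeter:
    """Estimate frames per second from the last few frame intervals."""

    def __init__(self, average_frames=1):
        # Only the most recent `average_frames` intervals are kept.
        self.intervals = deque(maxlen=average_frames)
        self.last_timestamp = None

    def frame(self, timestamp):
        """Record a frame at `timestamp` (seconds) and return the current FPS."""
        if self.last_timestamp is not None:
            self.intervals.append(timestamp - self.last_timestamp)
        self.last_timestamp = timestamp
        if not self.intervals:
            return None  # a single frame gives no interval yet
        return len(self.intervals) / sum(self.intervals)
```

With `average_frames=1` every dropped frame shows up immediately as a dip, while a larger window smooths the readout and hides short stutters.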

@ammen99
Member

ammen99 commented Oct 7, 2023

> I tried showrepaint, and it seems that in version 0.7.5, the flickering frequency is higher. It's worth noting that I have two screens, and in version 0.8.0, when I activate expo on the first screen, the second screen also undergoes repainting, whereas this issue does not occur in 0.7.5.

This I cannot reproduce (with two nested wayland outputs). When I activate expo on one of them, the other continues to have a very low refresh rate - is showrepaint telling you otherwise? I.e is the non-expo output repainting as quickly as the expo output??

@xiaohuirong
Author

> You should be able to use bench plugin to get a rough estimate of the framerate. I have updated it (master branch) so it works with 0.8.0 expo.
>
> Set average frames to 1 for instant, and frames per update to 1 as well, for 0.7.5.

Under version 0.8.0, I tested the frame rate with automatic GPU frequency and with maximum GPU frequency.
0.8.0 automatic GPU frequency:
https://drive.google.com/file/d/1XmJDbwddJKql_Xw2G-vGUt8yghoZBdnc/view?usp=sharing

0.8.0 max GPU frequency:
https://drive.google.com/file/d/1MoNti2a2jfTrweiS3wrB8sNeDaeoacYX/view?usp=sharing

I can't test 0.7.5; the benchmark always displays the highest frame rate of the screen.

@xiaohuirong
Author

> > I tried showrepaint, and it seems that in version 0.7.5, the flickering frequency is higher. It's worth noting that I have two screens, and in version 0.8.0, when I activate expo on the first screen, the second screen also undergoes repainting, whereas this issue does not occur in 0.7.5.
>
> This I cannot reproduce (with two nested wayland outputs). When I activate expo on one of them, the other continues to have a very low refresh rate - is showrepaint telling you otherwise? I.e is the non-expo output repainting as quickly as the expo output??

The issues I encountered are shown in the following video.
0.8.0
https://drive.google.com/file/d/1qPV6FTBPzG_fE8ov5fUUUkuUQuXrJ-us/view?usp=sharing

0.7.5
https://drive.google.com/file/d/1v3v9Xw3GbnfH9Mlaa6M0Q-8aru0QfsKc/view?usp=sharing

@ammen99
Member

ammen99 commented Oct 7, 2023

To me it seems like the other output is not repainted on every frame - maybe the clock is ticking, or something else?
By the way, which plugin/client do you use for the backgrounds? It seems that you have two different images on the two monitors, maybe it is something related to that?

@xiaohuirong
Author

The client for the backgrounds is swaybg.

@ammen99
Member

ammen99 commented Oct 7, 2023

Actually, I can reproduce the lower GPU usage with wayfire-0.7.x. I had to disable blur and not have any client which redraws itself (I had glxgears before) - which is the only situation where the older expo implementation is indeed better.

I suspect a triple buffer strategy like mutter will be able to solve this problem in Wayfire. We could also always use the older implementation, but this is a trade-off: do we want better performance for blur and videos/etc or do we want better perf for static expo?

@ammen99
Member

ammen99 commented Oct 7, 2023

Also, another thing to keep in mind, wayfire 0.7.x did not dim inactive workspaces in Expo. This does cost a bit, because we need to draw a semi-transparent rectangle over the inactive workspaces ..

@xiaohuirong
Author

> Actually, I can reproduce the lower GPU usage with wayfire-0.7.x. I had to disable blur and not have any client which redraws itself (I had glxgears before) - which is the only situation where the older expo implementation is indeed better.
>
> I suspect a triple buffer strategy like mutter will be able to solve this problem in Wayfire. We could also always use the older implementation, but this is a trade-off: do we want better performance for blur and videos/etc or do we want better perf for static expo?

Thank you for letting me know your considerations and trade-offs. When using eww (a dashboard on the background), I tested my browser's frame rate on the website www.testufo.com. It can reach the maximum frame rate of my monitor when expo is not activated. However, when I activate the expo view (eww is displayed on each workspace), the frame rate is only half. In version 0.7.5, it also runs at the full frame rate when expo is activated. It seems that eww significantly impacts the performance of Wayfire 0.8.0, whereas it has no noticeable impact on the performance of 0.7.5. Is this also caused by the trade-offs you mentioned?

@ammen99
Member

ammen99 commented Oct 7, 2023

Yes. The more static surfaces there are (esp layer-shell ones, which are visible on all workspaces), the bigger the impact is.
Earlier, expo would composite each workspace to a framebuffer, and then paint the framebuffers on the screen.

This means: if you have multiple static layers on each workspace, they are composited once, and then just drawn at different sizes.

The new expo: composite everything directly on the screen. For dynamic applications, this is better, because even the old expo would have had to re-draw such clients on each frame. But, for static applications, it means we draw them on each frame with the new size, as expo zooms out of the current desktop.
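
The trade-off described above can be illustrated with a toy draw-call count. This is only a rough model under stated assumptions: every surface costs one draw, every workspace holds the same number of surfaces, and damage tracking and pixel fill cost are ignored. The function names and parameters are made up for this sketch and do not correspond to Wayfire code.

```python
def old_expo_draws(workspaces, static_surfaces, dynamic_surfaces, frames):
    """Framebuffer-based expo: static surfaces are composited once per workspace.

    Each frame re-composites only the surfaces that redraw themselves and
    then paints one cached framebuffer per workspace onto the screen.
    """
    initial = workspaces * static_surfaces
    per_frame = workspaces * (dynamic_surfaces + 1)
    return initial + frames * per_frame


def new_expo_draws(workspaces, static_surfaces, dynamic_surfaces, frames):
    """Direct compositing: every surface is drawn to the screen on each frame."""
    return frames * workspaces * (static_surfaces + dynamic_surfaces)


# 9 workspaces, 20 animation frames, three static layers (for example
# background, panel and a dashboard) and nothing that redraws itself:
print(old_expo_draws(9, 3, 0, 20), new_expo_draws(9, 3, 0, 20))
# One static background plus one client that redraws every frame:
print(old_expo_draws(9, 1, 1, 20), new_expo_draws(9, 1, 1, 20))
```

In this model the static case needs far fewer draws with the cached framebuffers, while the case with a constantly redrawing client comes out slightly in favour of direct compositing, which is the direction of the trade-off described in this comment.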

@ammen99
Member

ammen99 commented Oct 8, 2023

@xiaohuirong I have been trying various strategies today, but I have not been able to figure out much .. Maybe we'll have to fall back to the old expo painting algorithm while the animation is active.

By the way, I created a very hacky implementation of 'triple' buffering, based on the track-wlroots branch here. I wonder whether you might be able to test it? The patch is http://ix.io/4IuI and applies on top of track-wlroots. If you do find time and desire to test it, keep in mind that track-wlroots is based on wlroots-git and wf-config-git, so if you install it to /usr it might overwrite your existing installation of those libraries.

@ammen99
Member

ammen99 commented Oct 8, 2023

Also, I pinned my gpu frequency to its minimum just to see what happens:

  • wayfire 0.7.2: ~0.4W
  • wayfire 0.8.0: ~0.6W
  • wayfire 0.8.0 with inactive workspace dimming disabled in the code: ~0.45W

I think we have the culprit ..
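
As a quick check of the arithmetic behind this conclusion, the share of the extra power that disappears when dimming is disabled can be computed from the three approximate readings above. The variable names are illustrative, and the readings are rounded values, so the result is only a rough estimate.

```python
baseline_w = 0.40    # wayfire 0.7.2
new_w = 0.60         # wayfire 0.8.0
new_no_dim_w = 0.45  # wayfire 0.8.0 with inactive workspace dimming disabled

# Extra power drawn by 0.8.0 compared to 0.7.2.
extra_w = new_w - baseline_w
# Part of that extra power that goes away without dimming.
dimming_w = new_w - new_no_dim_w
dimming_share = dimming_w / extra_w

print(f"dimming accounts for about {dimming_share:.0%} of the extra power")
```

With these numbers, roughly three quarters of the additional power on this machine is attributable to the dimming pass.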

@xiaohuirong
Author

I set the duration of my expo plugin to 300ms. Using the wtype -s 300 -k super_r command, I simulated a super key press operation every 300ms for 200 rounds. At the same time, I used sudo intel_gpu_top -J to obtain GPU information. Finally, I used Python to generate visualization images, resulting in the following two images:
Render/3D:
[image: render]

Power consumption:
[image: power]

'0.8.0p' represents the 0.8.0 version that includes the patch mentioned above. The 0.8.0 version generally exhibits higher GPU usage and increased power consumption compared to 0.7.5, and it is less smooth. 0.8.0p feels slightly smoother than 0.8.0, but it still has very high GPU usage.

@ammen99
Member

ammen99 commented Oct 9, 2023

Wow, those are some nice graphics :) I suggest that you also try commenting out the lines

auto fb_region = target.framebuffer_region_from_geometry_region(region);
OpenGL::render_begin(target);
for (auto& dmg_rect : fb_region)
{
    target.scissor(wlr_box_from_pixman_box(dmg_rect));
    const float a = 1.0 - dim;
    OpenGL::render_rectangle(target.geometry, {0, 0, 0, a},
        target.get_orthographic_projection());
}
OpenGL::render_end();

to disable the inactive workspace dimming. If that is enough to solve the problem, I will think of whether we could optimize that, or alternatively disable it conditionally.

@ammen99
Member

ammen99 commented Oct 9, 2023

By the way, do you happen to have scripts for generating those graphs (or would you be willing to upload them somewhere)? I feel like it could be quite useful in general if we had something like this for trying out various optimizations :)

@xiaohuirong
Author

> By the way, do you happen to have scripts for generating those graphs (or would you be willing to upload them somewhere)? I feel like it could be quite useful in general if we had something like this for trying out various optimizations :)

This is mainly done through two simple scripts, getgpudata.sh and plotgpuusage.py, along with some manual operations. getgpudata.sh primarily obtains GPU data, while plotgpuusage.py visualizes GPU usage.

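#!/bin/bash
# usage: ./getgpudata.sh <your json file name>

# Wrap the objects printed by intel_gpu_top in a JSON array.
echo "[" > "$1"
/usr/bin/intel_gpu_top -J >> "$1" &
# Toggle expo by simulating a Super key press every 300 ms.
for i in {1..400}
    do wtype -s 300 -k super_r
done
pkill intel_gpu_top
echo "]" >> "$1"

# Hand the output file over to the user with uid/gid 1000.
chown 1000:1000 "$1"

#!/bin/python
# plotgpuusage.py: plot Render/3D usage for each version, frequency mode and eww setting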

import json

import matplotlib.pyplot as plt

vers = ["0.7.5", "0.8.0", "0.8.0p"]
mode = ["min", "auto", "max"]
eww = ["eww", "noeww"]

contents = [
    [[None for _ in range(len(eww))] for _ in range(len(mode))]
    for _ in range(len(vers))
]
plotdatas = [
    [[[] for _ in range(len(eww))] for _ in range(len(mode))] for _ in range(len(vers))
]

# load json file to contents
for i in range(len(vers)):
    for j in range(len(mode)):
        for k in range(len(eww)):
            file_name = vers[i] + "-" + mode[j] + "-" + eww[k] + ".json"
            path_name = vers[i] + "/" + file_name
            with open(path_name, "r") as file:
                contents[i][j][k] = json.load(file)
            for item in contents[i][j][k]:
                plotdatas[i][j][k].append(item["engines"]["Render/3D/0"]["busy"])
                # plotdatas[i][j][k].append(item["power"]["GPU"])

fig, axes = plt.subplots(3, 2, figsize=(30, 20))
for r in range(len(mode)):
    for c in range(len(eww)):
        # After specifying the mode and whether eww is running, draw the curves corresponding to three different wayfire versions.
        axes[r, c].set_title("gpu " + mode[r] + " frequency mode " + "with " + eww[c])
        axes[r, c].set_xlabel("Time")
        axes[r, c].set_ylabel("Render/3D/0")
        #axes[r, c].set_ylabel("Power")
        axes[r, c].set_xlim([0, 125])
        axes[r, c].set_ylim([0, 100])
        #axes[r, c].set_ylim([0, 4])
        for i in range(len(vers)):
            plotdata = plotdatas[i][r][c]
            axes[r, c].plot(plotdata, label=vers[i])
        axes[r, c].legend()

plt.show()

.
├── 0.7.5
│   ├── 0.7.5-auto-eww.json
│   ├── 0.7.5-auto-noeww.json
│   ├── 0.7.5-max-eww.json
│   ├── 0.7.5-max-noeww.json
│   ├── 0.7.5-min-eww.json
│   └── 0.7.5-min-noeww.json
├── 0.8.0
│   ├── 0.8.0-auto-eww.json
│   ├── 0.8.0-auto-noeww.json
│   ├── 0.8.0-max-eww.json
│   ├── 0.8.0-max-noeww.json
│   ├── 0.8.0-min-eww.json
│   └── 0.8.0-min-noeww.json
├── 0.8.0p
│   ├── 0.8.0p-auto-eww.json
│   ├── 0.8.0p-auto-noeww.json
│   ├── 0.8.0p-max-eww.json
│   ├── 0.8.0p-max-noeww.json
│   ├── 0.8.0p-min-eww.json
│   └── 0.8.0p-min-noeww.json
├── getgpudata.sh
└── plotgpuusage.py

4 directories, 20 files
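
For a numeric comparison next to the plots, the same recordings can be reduced to one average per file. This is a minimal sketch that assumes the JSON files have the structure used by plotgpuusage.py above (a list of samples with an `engines` entry per engine). The helper name `mean_render_busy` is made up for this example.

```python
import json
from statistics import mean


def mean_render_busy(path, engine="Render/3D/0"):
    """Return the average busy percentage of one engine over a recording."""
    with open(path, "r") as file:
        samples = json.load(file)
    # Each sample is one intel_gpu_top measurement period.
    return mean(sample["engines"][engine]["busy"] for sample in samples)
```

For example, `mean_render_busy("0.8.0/0.8.0-auto-eww.json")` gives a single number for the 0.8.0 run with automatic frequency and eww, which is easier to compare across versions than overlapping curves.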

@ammen99
Member

ammen99 commented Oct 9, 2023

Nice, thanks :)

@ammen99
Member

ammen99 commented Oct 9, 2023

Some tests from my system; unfortunately, your findings are confirmed.

[image: expo]

There are many curves, the important ones:

  • wayfire072: the expo version which works smoothly for you, with static background
  • old-dynamic: wayfire 0.7.2, but with an animating background (glxgears)
  • 072-one-gears: wayfire 0.7.2 with just one animated workspace (glxgears only on one workspace)
  • with-overlay: wayfire 0.8.0 default, static background + wf-panel
  • no-overlay: wayfire 0.8.0, static background, no inactive workspace dim
  • dynamic-background: wayfire 0.8.0 default + glxgears background view

If nothing else, this clearly shows the trade-off I spoke about: expo is much more efficient with animated views, but less efficient with static workspaces. Also, about half the overhead comes from inactive workspace dimming.

@ammen99
Member

ammen99 commented Oct 9, 2023

Maybe workspace-wall (what expo uses) should be smart: first, render once to a framebuffer, like old expo. If we get 2+ updates, switch to the new rendering algorithm. That's quite the hack though, I'll have to check how easy it is to implement.
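
The heuristic proposed here can be sketched as a small state holder. This is an illustration only, not workspace-wall code: the class name, the method names and the threshold parameter are hypothetical, and the threshold of 2 is taken from the "2+ updates" wording above.

```python
class RenderStrategy:
    """Start with a cached framebuffer, switch to direct rendering after updates."""

    def __init__(self, update_threshold=2):
        self.update_threshold = update_threshold
        self.updates_seen = 0

    def on_content_update(self):
        """Call whenever a view on the workspace submits new content."""
        self.updates_seen += 1

    @property
    def mode(self):
        # Static workspaces stay on the cheap cached path; workspaces that
        # keep changing are rendered directly on each frame.
        if self.updates_seen >= self.update_threshold:
            return "direct"
        return "framebuffer"
```

A workspace with only a wallpaper and a panel would stay in "framebuffer" mode for the whole animation, while one showing glxgears would reach "direct" after its second update.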

@ammen99
Member

ammen99 commented Oct 9, 2023

@xiaohuirong This makes me think, was Expo actually smooth before if you had one or more animating views?

@soreau
Member

soreau commented Oct 10, 2023

Not to muddy the waters, but does this patch help anything? This should apply to master, 0.8.x and track-wlroots branches and affect the case of toggling expo animation without any clients on any workspaces.

@ammen99
Member

ammen99 commented Oct 10, 2023

I don't expect the patch to help at all: the next frame is already scheduled indirectly by wlr_output_damage. The problem here is about the efficiency of the rendering algorithm.

@xiaohuirong
Author

> @xiaohuirong This makes me think, was Expo actually smooth before if you had one or more animating views?

I use mpvpaper to set videos as wallpapers, and the smoothness of expo version 0.8.0 has greatly improved.
I'd like to ask how to set glxgears as a wallpaper. In version 0.7.5, I used toggle_sticky to make the glxgears window appear on all workspaces, but it seems like this feature is not working in version 0.8.0.

@soreau
Member

soreau commented Oct 10, 2023

> > @xiaohuirong This makes me think, was Expo actually smooth before if you had one or more animating views?
>
> I use mpvpaper to set videos as wallpapers, and the smoothness of expo version 0.8.0 has greatly improved. I'd like to ask how to set glxgears as a wallpaper. In version 0.7.5, I used toggle_sticky to make the glxgears window appear on all workspaces, but it seems like this feature is not working in version 0.8.0.

You can try the background-view plugin from wayfire-plugins-extra.

@xiaohuirong
Author

Summarizing some of the scenarios I have tested:

  1. When using mpvpaper to set a video wallpaper or using background-view to set glxgears and similar benchmark programs as wallpapers, the smoothness of the Expo plugin in version 0.8.0 is significantly better than in 0.7.5.

  2. In some other high-dynamic scenarios, such as enabling the blur plugin, playing a video in full screen in one workspace, opening web pages with dynamic content in one workspace, and running glxgears in full screen in another workspace, the smoothness of 0.7.5 is similar to 0.8.0, with slightly higher GPU usage.
    [image: dynamic]

  3. In some relatively static scenarios, 0.7.5 provides smoother performance and consumes lower GPU resources compared to 0.8.0.

ammen99 mentioned this issue Oct 20, 2023
ammen99 added this to the Wayfire 0.8.1 milestone Oct 23, 2023

@xiaohuirong
Author

I drew some flame graphs, hoping they could be helpful. All data was recorded with the command perf record -F 1000 -p $(pidof wayfire) -g -- sleep 60, while presses of the Super key were simulated with the command for i in {1..70}; do wtype -s 1000 -k super_r; done

0.7.5 with eww
[image: flame graph, 0.7.5]

0.7.5 without eww
[image: flame graph, 0.7.5 noeww]

0.8.0 with eww
[image: flame graph, 0.8.0]

0.8.0 without eww
[image: flame graph, 0.8.0 noeww]

@ammen99
Member

ammen99 commented Feb 10, 2024

@xiaohuirong I thought a bit more and I figured out an even better approach which is a hybrid of the old and the new Expo implementation + additional optimizations. I pushed my impl to a branch here: https://github.com/WayfireWM/wayfire/tree/reimplement-expo-once-again

It would be great if you could test and report how it compares to the other variants, whether it is smooth, etc. For a comparison, I plotted the 3 variants I have tested and the new one seems to be the fastest:

[image: plot]

I tested both the empty variants (where you have only the static background on all workspaces) and the glxgears variant where glxgears is shown on each workspace to force updates to the workspaces all of the time.

ammen99 added a commit that referenced this issue Feb 10, 2024
In Wayfire 0.7.2 we were using auxiliary buffers to composite the
workspaces before finally drawing them on the screen.

In Wayfire 0.8.0 the behavior changed: all windows were directly
composited on the screen. This introduced highly improved performance
for cases where the workspace contents were changing, because we could
render them at scale. However, it introduced problems with static
workspaces containing multiple windows, because we'd have to composite
them multiple times on each frame.

The new implementation takes a best-of-both-worlds approach. We
composite workspaces to auxiliary buffers, ensuring that we do not
re-composite static surfaces together on each frame.

To ensure that dynamic content also works well, we scale the buffers as
well, if enough of the content has changed so that a full redraw with
a different scale is less expensive than updating the current buffers.
We also have to be careful to avoid visual artifacts (popping etc) when
transitioning between different scales.

Fixes #1940
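
The rescaling rule in the commit message above compares two costs: updating the damaged part of the current buffer versus redrawing the whole workspace at a different scale. A minimal sketch of such a comparison follows, using pixel counts as a stand-in for cost. That proxy, the function name and all parameters are assumptions made for illustration; the actual implementation may weigh things differently.

```python
def should_rescale(damaged_pixels, buffer_pixels, scale_ratio):
    """Decide whether to re-render a workspace buffer at a new scale.

    damaged_pixels: pixels that changed, measured in the current buffer
    buffer_pixels:  total number of pixels in the current buffer
    scale_ratio:    new linear scale divided by the current linear scale
    """
    # Updating in place costs roughly the damaged area.
    update_cost = damaged_pixels
    # A full redraw at the new scale touches the whole, rescaled buffer;
    # area grows with the square of the linear scale.
    full_redraw_cost = buffer_pixels * scale_ratio ** 2
    return full_redraw_cost < update_cost


full_hd = 1920 * 1080
print(should_rescale(1_000_000, full_hd, 0.5))  # large damage, smaller target
print(should_rescale(100_000, full_hd, 0.5))    # small damage, smaller target
```

With a mostly static workspace the damaged area stays small and the existing buffer is kept, which matches the goal of not re-compositing static surfaces on each frame.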
ammen99 added a commit that referenced this issue Feb 11, 2024
ammen99 added a commit that referenced this issue Mar 13, 2024