The script is slow #6

Open
simlevesque opened this issue Nov 23, 2015 · 36 comments

Comments

@simlevesque
Contributor

Hi, I was wondering whether it's just me, or is the script really slow?

@Sweets

Sweets commented Nov 24, 2015

It uses ImageMagick's convert tool to modify images, twice at that. The script isn't asynchronous, because if it were you would have broken or missing components (some perhaps disabling the program altogether, at least from just a glance). Really the only thing you can do is deal with it being "slow" (although it could be much slower), or even find a different script to use.

Also, if this sounds rude, apologies, it isn't meant to.

@meskarune
Owner

It takes about 2.5s on my netbook to run. I resize the image down, blur it and then resize back up, which helped, but yeah, it still takes a few seconds.

You can speed it up a bit more by changing the blur setting from -filter Gaussian -resize 50% -define filter:sigma=4.5 -resize 200% to -filter Gaussian -resize 50% -define filter:sigma=2.5 -resize 200% but then the background isn't as blurred.

You could also skip doing a blur altogether and pixelate the background instead with scale.

I might play around with resizing even smaller than 50%, perhaps 30% would be better, especially for larger screen sizes. But I am also open to suggestions to help improve the speed.

The speed issue is partly just the nature of the image editing that is happening. Try doing a blur on an image with gimp or using a filter in gimp to see, it just takes some time to do. For my use, I use xautolock to lock automatically after 5 min so I don't really notice the initial run speed as I'm generally away from the computer anyways.

@meskarune
Owner

I just tested with 30% resize down and 333.33% resize up, and it didn't significantly improve the speed for me. It went from like 2.5s to 2.3s. Then I changed the sigma to 2.5 and the speed was like 2.2s. It's not really a huge improvement, but I could try with these settings if people don't mind the change in quality.
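For anyone who wants to try other ratios, here is a minimal sketch that builds the downscale, blur, upscale arguments and computes the matching upscale percentage (30% down needs 333.33% up). `blur_effect` is a hypothetical helper and not part of the script; it only prints the arguments and does not call convert.

```shell
# blur_effect DOWN SIGMA: print convert arguments that shrink to DOWN percent,
# blur with the given sigma, and scale back up to the original size.
# Hypothetical helper for illustration only.
blur_effect() {
    # The upscale percentage is the inverse of the downscale: 100 * 100 / DOWN.
    up=$(awk -v d="$1" 'BEGIN { s = sprintf("%.2f", 10000 / d); sub(/\.?0+$/, "", s); print s }')
    printf -- '-filter Gaussian -resize %s%% -define filter:sigma=%s -resize %s%%\n' "$1" "$2" "$up"
}

blur_effect 50 4.5   # the original setting
blur_effect 30 2.5   # the variant tested above
```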

screenshots:

regular script

lock screen shot, slow

resize changed to 30% and sigma 2.5

lock screen shot improved

@simlevesque
Contributor Author

@Coilest

"Really the only thing you can do is deal with it being "slow" (although it could be much slower), or even find a different script to use."

Or I could help you fix it... Maybe you like slow things, it's just not my thing.


Ok, here's where I'm at :

  • It's slow on my end because I have multiple monitors.
  • The crunching into a 1 pixel block looks like the most expensive operation.
  • The only part that needs to be crunched is the center monitor since it is the only one which will have text on it.
  • Every monitor needs to be blurred, but only the middle screen needs to be crunched, so there should be two scrot calls (scrot is not an expensive operation).
  • From my observations, it seems like crunching a jpg is faster than crunching a png. The screenshot to be crunched should be a .jpg.
  • It's even faster if you set the quality to one (-q 1) for the jpg scrot, since it's quicker to crunch a lossy jpg (that's how you make a thumbnail).

I'll try to send you a pull request ASAP !

@simlevesque
Contributor Author

Some more :

  • You can't tell scrot to take a picture of a single monitor, you'd need another step and it would add time.
  • While reading the man page for scrot, I discovered a feature that could speed up the process : the -t argument which creates a thumbnail at the same time as it shoots the picture. The thumbnail can be crunched faster since it's already smaller.
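A rough sketch of the thumbnail idea, assuming scrot's usual convention of writing the thumbnail next to the screenshot with a "-thumb" suffix. `thumb_path` and the /tmp path are made up for illustration, and the scrot call is left as a comment so nothing is captured when the snippet is sourced.

```shell
# thumb_path FILE: name of the thumbnail scrot -t would write for FILE
# (assumes the "-thumb" suffix is inserted before the extension).
thumb_path() {
    printf '%s-thumb.%s\n' "${1%.*}" "${1##*.}"
}

shot=/tmp/lock-shot.png   # hypothetical location
thumb_path "$shot"

# One capture then yields both images; the 10% thumbnail is the cheap one to crunch:
#   scrot -t 10 "$shot"
```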

@simlevesque
Contributor Author

My best try at fixing the script takes one more second than meskarune/i3lock-fancy: 11 seconds for my version and 10 seconds for yours. Three monitors, ImageMagick with OpenCL & OpenMP.

By the way : if you wanna compare performance between various imagemagick operations, add the --bench argument.

@meskarune
Owner

@simlevesque I was thinking of switching to maim and had a similar idea to yours. Basically just grab the center bit of the screen where the text will be to determine color instead of the whole screen. A 300px square area in the center would probably cover it. Maim can grab a selection of the screen as well as take screenshots.

https://github.com/naelstrof/maim
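The centered selection can be computed from the screen size rather than hard-coded. A small sketch, where `center_geometry` is a hypothetical helper and the resolutions are only examples:

```shell
# center_geometry WIDTH HEIGHT SIZE: print SIZExSIZE+X+Y for a square of
# SIZE pixels centered on a WIDTH x HEIGHT screen.
center_geometry() {
    echo "${3}x${3}+$(( ($1 - $3) / 2 ))+$(( ($2 - $3) / 2 ))"
}

center_geometry 1920 1080 300
center_geometry 2560 1440 300

# Usage with maim's -g/--geometry option (not run here):
#   maim -g "$(center_geometry 1920 1080 300)" /tmp/center.png
```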

The downside is maim isn't as popularly available as scrot in various linux distros, but then i3lock-color isn't either I suppose.

Thanks for the tip about --bench it will come in useful :)

@pid1

pid1 commented Nov 28, 2015

I am using maim in my fork, so I can play around with --localize. I will need to figure out a way to get relative locations, though, instead of hard-coding for a certain screen resolution.

@carnager

I use this in teiler: https://github.com/DaveDavenport/xininfo

carnager@caprica ~ > xininfo -mon-size
2560 1440
carnager@caprica ~ > xininfo -active-mon
0

@insanebits

What if we hid the problem from the user by first applying a transitional background (a greyish filter on top of the screenshot, for example) so we can show the locked screen ASAP, then do the calculations in the background and, once finished, switch to the "nice" background?

The biggest problem for me is that when I hit the lock shortcut it seems like nothing happened, and after ~10s it locks. I wouldn't care if it didn't look as good while the script renders the background, as long as I know my system is already locked and I can leave my computer instead of waiting 10s to verify it's locked.

What do you guys think?

@meskarune
Owner

@insanebits could you try the current script and let me know how slow it is for you? I have changed some things since this issue was opened that have helped the speed for me significantly.

@insanebits

Nice to hear about it. On my system latest version takes about ~2.5s which is acceptable. Awesome job! Will test it tomorrow on my work machine which is a bit older and see how it performs.

@meskarune
Owner

Ok, I made another change. Instead of taking an average color of the entire screen, I crop it to the center 100px. This has two advantages: it is faster, and it uses only the center color to determine whether it should be dark or light, so in situations like this:

screen shot of the lock with majority dark colors but light in the center

It does the right thing.
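A sketch of how that decision could look. This is not the script's actual code: `text_theme`, `center_brightness`, and the default threshold of 60 are assumptions made for illustration, and the convert call is only defined here, not run.

```shell
# text_theme BRIGHTNESS [THRESHOLD]: BRIGHTNESS is the mean of the center crop
# on a 0-100 scale. THRESHOLD (default 60) is an assumed cut-off.
text_theme() {
    if [ "$1" -gt "${2:-60}" ]; then
        echo dark     # bright center: use dark text and icon
    else
        echo light    # dark center: use light text and icon
    fi
}

# center_brightness FILE: mean gray level (0-100) of the central 100x100 px.
# Needs ImageMagick; defined for reference and not called below.
center_brightness() {
    convert "$1" -gravity center -crop 100x100+0+0 +repage \
        -colorspace Gray -format '%[fx:int(100*mean)]' info:
}

# Real use: text_theme "$(center_brightness /tmp/screen.png)"
text_theme 85
text_theme 20
```

Cropping first means the averaging only ever touches 10,000 pixels, whatever the screen size.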

@cer-nagas

Instead of using the Gaussian filter, you can use ImageMagick's -blur option directly, which is recommended and explained here: http://www.imagemagick.org/Usage/blur/
You can try changing line 7 to EFFECT=(-resize 20% -blur 6x3 -resize 500.5%).
The result is "more blurred" (acceptable in my opinion), but the time taken is smaller. In my case it decreased from 1.7s to 1.3s. It's said that the bigger (or more complex) the image is, the more significant the difference is. Tinker with the {radius}x{sigma} ratio and I think you will find the best solution for that (in my monitor the 6x3 seems good).
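To compare the two settings on your own screenshot, something like the following could be used. It is only a sketch: `elapsed_ms` is a hypothetical helper that relies on GNU date's %N, and /tmp/screen.png is an assumed path to an existing screenshot. The comparison is skipped if convert or the file is missing.

```shell
# elapsed_ms CMD [ARGS...]: print the wall-clock time of CMD in milliseconds.
# Relies on GNU date's %N (nanoseconds).
elapsed_ms() {
    start_ns=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end_ns=$(date +%s%N)
    echo $(( (end_ns - start_ns) / 1000000 ))
}

img=/tmp/screen.png   # assumed path to an existing screenshot
if command -v convert >/dev/null 2>&1 && [ -f "$img" ]; then
    echo "Gaussian filter: $(elapsed_ms convert "$img" -filter Gaussian -resize 50% -define filter:sigma=4.5 -resize 200% /tmp/blur-a.png) ms"
    echo "-blur 6x3:       $(elapsed_ms convert "$img" -resize 20% -blur 6x3 -resize 500.5% /tmp/blur-b.png) ms"
fi
```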

@PandorasFox
Contributor

Actually, on this note, my fork of i3lock will have a blurring option, so that'll be moved out of this script and will vastly speed up the locking. You can read a bit more about this in #57.

There'll be some interesting things to consider for implementation but I think the end result will be much, much better either way.

@AladW
Collaborator

AladW commented Oct 15, 2016

Random idea:

Run convert processes in parallel by generating both images for dark and light backgrounds. So 3x resources for 2x duration, I guess. Also exit codes are likely lost, unless relying on bash 4.4.
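Waiting on each PID individually is one way to keep both exit codes in plain POSIX sh. A sketch with placeholder jobs standing in for the two convert calls (`run_both`, `make_dark`, and `make_light` are made-up names):

```shell
# run_both CMD_A CMD_B: run two commands concurrently, report each exit
# status, and succeed only if both succeeded.
run_both() {
    "$1" & pid_a=$!
    "$2" & pid_b=$!
    wait "$pid_a"; rc_a=$?
    wait "$pid_b"; rc_b=$?
    echo "a=$rc_a b=$rc_b"
    [ "$rc_a" -eq 0 ] && [ "$rc_b" -eq 0 ]
}

make_dark()  { sleep 0.1; }             # placeholder for the dark-variant convert
make_light() { sleep 0.1; return 3; }   # placeholder that fails on purpose

run_both make_dark make_light || echo "at least one render failed"
```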

@Airblader

@PandorasFox
Contributor

I've no idea yet how ffmpeg's blurring will compare to what's pending for my i3lock fork (I have a feeling like ffmpeg's code will be a bit more polished, but may have some more overhead, so it'll balance out), but what I'd like to eventually nail down would be overlaying an image over what's blurred, so that i3lock can grab the screen and lock it ASAP, rather than wait for an image to be blurred.

That'd usually lead to a loss of things like custom overlaid text, lock icons, etc. unless that's also brought into i3lock/an i3lock fork (hint: no), unless I go tinker with it some more to allow overlaying transparent images over the blurred screenshot. I've got some hopes for that, so I'll see how it goes when I have time to tinker and implement that.

Hopefully this'll enable i3lock-fancy to be a lot faster and more streamlined (as well as eliminate some dependencies, I think).

@AladW
Collaborator

AladW commented Nov 1, 2016

No offense, but relying on an i3lock fork to do all of the heavy lifting is hardly a portable thing to do. At the very least ffmpeg should be kept as a fallback.

@PandorasFox
Contributor

Perhaps, but if we're already grabbing the screen, it's pretty trivial to use xcb to just capture it and blur it. While ffmpeg can do gaussian blurring pretty quickly, I'll go do some timings but I think the timings will be fairly similar, and any performance loss in i3lock blurring will be offset by the fact that i3lock will be grabbing the screen and locking sooner, which is preferable, for me, at least.

i3lock already requires libxcb and uses cairo for drawing stuff, so I don't think that the blurring that @sebastian-frysztak is working on will introduce any more dependencies or make it less portable.

@frysztak

frysztak commented Nov 1, 2016

@AladW: it will be portable. There are three blur implementations:

  • old, generic and slow - but at least it works everywhere
  • SSE2-optimized - with much better border handling than previous, and ~3..4 times faster. SSE2 is no news and everything since 2004 supports it, so this is effectively the default option.
  • SSSE3-optimized - should be even faster than SSE2, but needs some work.

We'll detect CPU's capabilities at runtime and use appropriate functions. There are no additional dependencies.
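The real detection happens in C inside i3lock. Purely as an illustration of the selection order, here is the same idea in shell, reading the flag names Linux exposes in /proc/cpuinfo. `pick_blur_impl` is a hypothetical name.

```shell
# pick_blur_impl FLAGS: choose the fastest implementation whose CPU flag
# appears in the space-separated FLAGS list.
pick_blur_impl() {
    case " $1 " in
        *" ssse3 "*) echo ssse3 ;;
        *" sse2 "*)  echo sse2 ;;
        *)           echo generic ;;
    esac
}

# On Linux the flags can be read from /proc/cpuinfo.
if [ -r /proc/cpuinfo ]; then
    flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    pick_blur_impl "$flags"
fi
```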

@PandorasFox
Contributor

PandorasFox commented Nov 1, 2016

Here's the timings from what I'm seeing with i3lock blurring vs ffmpeg:

arcana@archana:~/i3lock-color$ time (scrot in.png && ffmpeg -loglevel quiet -y -i in.png -vf "gblur=sigma=8" out.png && ./i3lock -i ./out.png)

real    0m0.380s
user    0m0.377s
sys     0m0.033s
arcana@archana:~/i3lock-color$ time ./i3lock -B

real    0m0.058s
user    0m0.043s
sys     0m0.007s

It may not be a perfect comparison, but handling the blurring in i3lock is much, much faster than anything external.

(The version of the blurring I used is the SSE2 version, I believe).

edit: blurring on my desktop with SSE3:

[arcana@archana i3lock-color]$ time ./i3lock -B

real    0m0.086s
user    0m0.047s
sys     0m0.010s
[arcana@archana i3lock-color]$ time (scrot in.png && ffmpeg -loglevel quiet -y -i in.png -vf "gblur=sigma=8" out.png && ./i3lock -i ./out.png)

real    0m0.777s
user    0m0.737s
sys     0m0.053s

I imagine that the main hit in performance (my CPU on my desktop is much stronger than my laptop CPU) comes from my desktop having ~4x as many pixels to work on. There's a ~9x speedup here compared to the ~7x speedup on my laptop, which suggests that this method is also better for larger resolutions.
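The speedup figures follow from the "real" times above; a quick way to reproduce the arithmetic (`speedup` is just a throwaway helper):

```shell
# speedup SLOW FAST: ratio of two wall-clock times, to one decimal place.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

speedup 0.380 0.058   # laptop:  ffmpeg pipeline vs i3lock -B
speedup 0.777 0.086   # desktop: ffmpeg pipeline vs i3lock -B
```

This prints 6.6 and 9.0, which is where the rounded ~7x and ~9x come from.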

@AladW
Collaborator

AladW commented Nov 1, 2016

My point is that right now, you have some functionality if you use the i3-shipped i3lock, which is included with most Linux/BSD distributions. Moving everything into a fork means you have to do more than just fetch a shell script. Hence at minimum, there should be an ffmpeg fallback if i3lock -B is not available.
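One possible shape for such a fallback, as a sketch only: it assumes a blur-capable i3lock mentions a --blur option in its usage text, which is not verified here. `choose_blur_backend` is a hypothetical helper that takes the usage text as an argument, so it can be tried without i3lock installed.

```shell
# choose_blur_backend HELP_TEXT: print "builtin" when the usage text mentions
# a --blur option (an assumption about the fork's output), else "ffmpeg".
choose_blur_backend() {
    case "$1" in
        *--blur*) echo builtin ;;
        *)        echo ffmpeg ;;
    esac
}

# Intended use (not run here):
#   backend=$(choose_blur_backend "$(i3lock --help 2>&1)")
choose_blur_backend "usage: i3lock [-B, --blur=sigma] [-i image.png]"
choose_blur_backend "usage: i3lock [-i image.png]"
```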

@PandorasFox
Contributor

PandorasFox commented Nov 1, 2016

Oh yeah, definitely, but the script has always used at least some of the features specific to a fork (primarily being the custom colors); originally, it relied on a fork that was ... 5 or 6 years out of date. I was kinda uncomfortable with using a fork from 5 or 6 years ago on my machine since that just seems like a bad idea, which is how I kinda came about to this. I figured as long as I was doing that, I may as well make it more customizable.

@PandorasFox
Contributor

There's blurring in the mainline of i3lock-color, so if you want to tentatively start using that, it should work fine.

So far for overlaying text/etc you'll have to pad out some images and then just use that as an image that's overlaid over the blur. I'll work on image offsets (and potentially multiple images, though implementing that will likely get messy) soonish, since that should make things easier for you (don't have to generate a large number of images to overlay for various resolutions), and potentially alignment flags (bottom middle, right middle, middle middle, etc).

Also potentially could do them per-monitor instead of on the screen as a whole. I'll probably be refactoring a lot of the image handling code if I do this.

@owenthewizard

Maybe blurring could be done using Imlib2?

@darddan

darddan commented Feb 22, 2018

I made a fork of scrot and added blurring and an icon overlay in the code. Instead of 2-5 seconds, it now runs in about 150-500 milliseconds.
You can check the code here darddan/scrot.

@AntonGitName

@darddan, I couldn't figure out how to use your fork with multiple displays.

If anyone is interested, here is what I came up with for two monitors. It works fast enough for me.

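# Icon to overlay on each monitor, and the total X screen size (e.g. 3840x1080).
LOCK="$HOME/.i3/i3lock-fancy/lock.png"
RES=$(xdpyinfo | grep dimensions | sed -r 's/^[^0-9]*([0-9]+x[0-9]+).*$/\1/')
# GNU mktemp: create the temp file with its .png suffix, so the file that is
# created is the same one ffmpeg writes to.
IMAGE=$(mktemp --suffix=.png)

# Grab one frame of the whole X screen, darken and blur it, then overlay the icon
# centered at 1/4 and 3/4 of the total width (assumes two equal monitors side by side).
ffmpeg -probesize 100M -thread_queue_size 32 -f x11grab -video_size "$RES" \
  -y -i "$DISPLAY" -i "$LOCK" -i "$LOCK" -filter_complex \
  "eq=gamma=0.75,boxblur=3:3,overlay=main_w/4-overlay_w/2:(main_h-overlay_h)/2,overlay=3*main_w/4-overlay_w/2:(main_h-overlay_h)/2" \
  -vframes 1 "$IMAGE"

# -n keeps i3lock in the foreground, so the temp file can be removed after unlocking.
i3lock -n -i "$IMAGE"
rm -f "$IMAGE"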

@Boruch-Baum
Contributor

Two more suggestions:

1] Add an option to use a static image, that could be of a screenshot that the user manually takes, once. I implemented this in PR #124 ;

2] Convert the script from bash to dash. This will require replacing several bash idioms present throughout, such as the use of arrays. Here are two links discussing the speed difference, with attempts to benchmark it:

https://askubuntu.com/questions/1059474/are-there-concrete-figures-on-the-speed-of-bash-vs-dash

https://unix.stackexchange.com/questions/148035/is-dash-or-some-other-shell-faster-than-bash
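As an example of what replacing the arrays would involve: in POSIX sh the positional parameters can stand in for a bash array such as EFFECT. A dry-run sketch (`apply_effect` is a made-up name, and it only prints the command instead of running convert):

```shell
# bash:  EFFECT=(-resize 20% -blur 6x3 -resize 500.5%)
#        convert in.png "${EFFECT[@]}" out.png
# POSIX: keep the option list in the function's positional parameters.
apply_effect() {
    src=$1
    dst=$2
    set -- -resize 20% -blur 6x3 -resize 500.5%
    # Dry run: print the command; drop the echo to actually run convert.
    echo convert "$src" "$@" "$dst"
}

apply_effect in.png out.png
```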

@AladW
Collaborator

AladW commented Aug 7, 2018

Re 2: Changing to dash will have minimal effect, it's convert et al that take the majority of compute time. See http://wiki.c2.com/?PrematureOptimization

@meskarune
Owner

I wrote a prototype lua script that runs in less than half a second: https://github.com/meskarune/i3lock-fancy/blob/fast-blur/fast-blur.lua

@AladW
Collaborator

AladW commented Aug 9, 2018

I suppose the main advantage is from using i3lock --blur?

@frysztak

frysztak commented Aug 9, 2018

If you want to make blurring really fast, you should consider using OpenGL and writing appropriate shader for it. Hopefully startup time won't be too long.

@meskarune
Owner

@AladW yes, also using lua pattern matching instead of awk/sed and not using the readline stuff which is actually pretty slow in itself. Lua is a pretty large package dependency though and I don't think everyone is going to want to install a whole programming language for one script.

I can't help but think that there has to be an easier way to get the monitor information than parsing xrandr output. There must be something on the OS in /dev that can be parsed to get this. I need the total resolution and current resolution of each connected monitor as well as their offsets.
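If xrandr has to stay, the parsing can at least be confined to one small awk filter. A sketch that runs on sample text in xrandr's output format, so it works without an X server. `monitor_geometries` is a hypothetical name, and the monitor names and sizes below are invented.

```shell
# monitor_geometries: read xrandr-style text on stdin and print
# "NAME WIDTH HEIGHT XOFF YOFF" for each connected output that has a mode set.
monitor_geometries() {
    awk '$2 == "connected" {
        for (i = 3; i <= NF; i++)
            if ($i ~ /^[0-9]+x[0-9]+\+[0-9]+\+[0-9]+$/) {
                split($i, g, /[x+]/)
                print $1, g[1], g[2], g[3], g[4]
            }
    }'
}

# Sample input; on a real system: xrandr --query | monitor_geometries
monitor_geometries <<'EOF'
Screen 0: minimum 8 x 8, current 3840 x 1080, maximum 32767 x 32767
eDP-1 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis) 309mm x 174mm
HDMI-1 connected 1920x1080+1920+0 (normal left inverted right x axis y axis) 527mm x 296mm
DP-1 disconnected (normal left inverted right x axis y axis)
EOF
```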

Maybe the bash script can be edited to use i3lock-color --blur 5, but my previous attempt to get that to work on dual monitors with readline failed. The image was made with a grey background instead of a clear one. It will probably be easier to do in the single-monitor script, but I would prefer dual monitor support to be the standard at some point.

@sebastian-frysztak that is probably something that would need to happen in the upstream i3lock-color if that dev wanted to do it. Their built in blur is pretty fast and decent right now though. I've also never coded anything with OpenGL >.> I don't know how difficult that would be.

There are a lot of improvements that can still be made in the bash script that I want to do when I have time, and I think there are a lot of people who would prefer something with a small dependency list. Most Linux systems already have bash, awk, coreutils, etc. installed by default.

@yvbbrjdr

yvbbrjdr commented Oct 3, 2018

I rewrote i3lock-fancy in C, and used (configurable) box blur instead of Gaussian blur to achieve high speed. Now it only takes about 200ms to generate the blurred screenshot. If you are interested, please check it out!

@medicalwei

I am currently using this effect to further obscure the screenshot while keeping the color. The render is much faster since the screenshot is scaled down to a much smaller size.

EFFECT=(-sample 5% -filter Gaussian -define filter:sigma=4 -resize 2000.5%)
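To give a feel for how much smaller the image becomes, here is a quick calculation of the intermediate size. `sampled_size` is a throwaway helper, 1920x1080 is only an example resolution, and ImageMagick's own rounding may differ when the result is not a whole number.

```shell
# sampled_size WIDTH HEIGHT PERCENT: print the scaled-down dimensions and the
# number of pixels left for the blur to process (integer arithmetic).
sampled_size() {
    w=$(( $1 * $3 / 100 ))
    h=$(( $2 * $3 / 100 ))
    echo "${w}x${h} $(( w * h ))"
}

sampled_size 1920 1080 5
```

For a 1920x1080 screenshot that is 96x54, or 5184 pixels instead of about 2 million.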
