pygame.Surface.fblits slower in 2.5.0.dev2 vs 2.4.1 #2821
Comments
Tested with my YouTube particles tutorial. On 2.4.1 it runs ~20K particles at 60 FPS; on 2.5.0 I'm getting about 6800. https://github.com/bigwhoopgames/youtube-particles/blob/master/particles.py Line 88 needs changing to work with Python 3.12: just wrap the values passed to random as ints. |
Could it be SDL's fault? We got EDIT: Unfortunately there's no reference in SDL's releases page about any change regarding blitting: https://github.com/libsdl-org/SDL/releases. |
How hard would making an SDL reproducer be so I can report it upstream to them? |
Do you even need to recompile? SDL_DYNAMIC_API=/my/actual/libSDL2.so.30.2 python3 benchmark.py |
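The same trick can be scripted so one benchmark runs against several SDL builds in turn. This is only a minimal sketch: the helper names are hypothetical, the library paths are whatever builds you have locally, and it assumes the SDL2 spelling of the variable (`SDL_DYNAMIC_API`), which is why the name is left as a parameter.

```python
import os
import subprocess
import sys


def env_with_sdl_override(lib_path, base_env=None, var_name="SDL_DYNAMIC_API"):
    """Return a copy of the environment with the SDL override variable set.

    var_name defaults to the SDL2 spelling; this is an assumption, adjust it
    if your SDL build reads a different variable.
    """
    env = dict(os.environ if base_env is None else base_env)
    env[var_name] = lib_path
    return env


def run_benchmark(script, lib_path):
    """Run `script` in a child interpreter with the SDL override applied."""
    return subprocess.run(
        [sys.executable, script],
        env=env_with_sdl_override(lib_path),
        capture_output=True,
        text=True,
        check=False,
    )
```

Calling `run_benchmark("benchmark.py", path)` once per library path and comparing the captured output avoids rebuilding pygame for every SDL version.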
Information from discord- itzpr said
oddbookworm posted a C reproducer attempt:

#include <stdio.h>
#include <sys/time.h>
#define SDL_MAIN_HANDLED
#include "SDL2/SDL.h"

/* Wall-clock time in milliseconds. */
long long timeInMilliseconds(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (((long long)tv.tv_sec) * 1000) + (tv.tv_usec / 1000);
}

/* Pack an RGB triple as 0x00RRGGBB. */
Uint32 ColorToUint(int R, int G, int B)
{
    return (Uint32)((R << 16) + (G << 8) + (B << 0));
}

/* Blit surf onto target `iterations` times, stopping at the first failure. */
void blitManyTimes(SDL_Surface* target, SDL_Surface* surf, SDL_Rect* rect, size_t iterations)
{
    for (size_t i = 0U; i < iterations; ++i)
    {
        if (SDL_BlitSurface(surf, rect, target, rect) < 0)
        {
            printf("Failed to perform blit\n");
            return;
        }
    }
}

int main(void)
{
    printf("SDL %d.%d.%d\n", SDL_MAJOR_VERSION, SDL_MINOR_VERSION, SDL_PATCHLEVEL);
    if (SDL_Init(SDL_INIT_VIDEO) < 0)
    {
        printf("Failed to initialize SDL2\n");
        return -1;
    }
    SDL_Window* window = SDL_CreateWindow("SDL2 Window", SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED, 680, 480, 0);
    if (!window)
    {
        printf("Failed to create window\n");
        return -1;
    }
    SDL_Surface* windowSurface = SDL_GetWindowSurface(window);
    if (!windowSurface)
    {
        printf("Failed to get surface from window\n");
        return -1;
    }
    SDL_Surface* surf = SDL_CreateRGBSurface(0, 100, 100, 32, 0, 0, 0, 0);
    if (!surf)
    {
        printf("Failed to create surface\n");
        return -1;
    }
    SDL_Rect rect = { .x = 0, .y = 0, .w = 100, .h = 100 };
    SDL_Color color = { .r = 255, .g = 0, .b = 0, .a = 255 };
    if (SDL_FillRect(surf, &rect, ColorToUint(color.r, color.g, color.b)) < 0)
    {
        printf("Failed to fill surface\n");
        return -1;
    }
    /* Time one million blits of the same surface. */
    long long start = timeInMilliseconds();
    blitManyTimes(windowSurface, surf, &rect, 1000000);
    long long end = timeInMilliseconds();
    /* %lld matches the long long difference. */
    printf("Total time elapsed: %lld milliseconds\n", end - start);
    return 0;
}

He said he didn't see any difference until he switched the 100x100 surfs to 10x20. |
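One plausible reading of why the surface size matters: if the regression is a fixed cost per blit call rather than a per-pixel cost, it is drowned out by pixel copying on large surfaces and dominates on small ones. The sketch below is a toy model with made-up numbers (the 500 ns and 0.5 ns figures are assumptions for illustration, not measurements).

```python
def per_call_overhead_share(width, height, overhead_ns, per_pixel_ns):
    """Fraction of one blit's time spent on fixed per-call work."""
    pixel_time = width * height * per_pixel_ns
    return overhead_ns / (overhead_ns + pixel_time)


# Illustrative numbers only, not measurements.
OVERHEAD_NS = 500.0
PER_PIXEL_NS = 0.5

large = per_call_overhead_share(100, 100, OVERHEAD_NS, PER_PIXEL_NS)
small = per_call_overhead_share(10, 20, OVERHEAD_NS, PER_PIXEL_NS)
print(f"100x100: {large:.0%} overhead, 10x20: {small:.0%} overhead")
```

Under these assumed costs the fixed part is a small slice of a 100x100 blit but most of a 10x20 one, so a reproducer has to use small surfaces to make a per-call slowdown visible.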
Here is a more isolated Python reproducer of the issue. It still uses fblits, but I expect a for loop of blits in Python would show the same effect, just harder to spot since that loop would have more overhead.

import random
import time
import pygame

random.seed(36)
width = 600
height = 400
screen = pygame.Surface((width, height))

def make_particle():
    p_surf = pygame.Surface((5, 5))
    p_surf.fill((random.randint(0, 255), random.randint(0, 255), random.randint(0, 255)))
    return (p_surf, (random.randint(0, width), random.randint(0, height)))

particles = [make_particle() for _ in range(1000000)]
start = time.time()
screen.fblits(particles)
print(time.time() - start)

Previously this ran in about a quarter of a second; now it takes about a second. I tracked it down to somewhere between SDL 2.29.2 and 2.29.3, but I really don't see what it would be in there: libsdl-org/SDL@prerelease-2.29.2...prerelease-2.29.3 |
I bisected the performance regression to libsdl-org/SDL@a6d5c1f (surprisingly, nothing about blitting in there... I guess this makes a logging function more expensive and something in blit calls it). We still need a clear C reproducer and to report this to SDL; I'd appreciate someone doing that. |
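For anyone unfamiliar with bisecting a timing regression, the idea is a binary search over the commit range with "is the benchmark slow at this commit?" as the test. The sketch below is generic: the commit labels, timings and 0.5 s threshold are invented for illustration, and in practice the predicate would build SDL at that commit and run the benchmark.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad(commit) is true.

    Assumes commits are ordered oldest to newest, the oldest is good,
    the newest is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical benchmark timings in seconds, keyed by made-up commit labels.
timings = {"c0": 0.24, "c1": 0.25, "c2": 0.26, "c3": 0.25, "c4": 0.98, "c5": 1.01}
commits = list(timings)
culprit = first_bad(commits, lambda c: timings[c] > 0.5)
print(culprit)
```

With a fixed threshold between the old and new timings, the search needs only about log2(n) benchmark runs for n commits.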
Ran some tests by removing parts of Lines 3816 to 3953 in 7d73fc8
SDL_GetColorKey is the issue. I don't know why, but removing both SDL_GetColorKey calls there brings back 2.4.1 performance.
|
Nice find! Okay, I think I have the full picture now. We call

int SDL_GetColorKey(SDL_Surface *surface, Uint32 *key)
{
    if (!surface) {
        return SDL_InvalidParamError("surface");
    }
    if (!(surface->map->info.flags & SDL_COPY_COLORKEY)) {
        return SDL_SetError("Surface doesn't have a colorkey");
    }
    if (key) {
        *key = surface->map->info.colorkey;
    }
    return 0;
}

Inside SDL, it calls SDL_SetError if the surface doesn't have a colorkey. But libsdl-org/SDL@a6d5c1f makes that way more expensive because it needs to check the environment for that variable now? I confirmed the slowdown is coming from GetColorKey rather than from rearranging the blit code paths by adding a bunch of calls into the function like so:

int res = SDL_GetColorKey(src, &key);
int res2 = SDL_GetColorKey(src, &key);
int res3 = SDL_GetColorKey(src, &key);
int res4 = SDL_GetColorKey(src, &key);
int res5 = SDL_GetColorKey(src, &key);

Luckily, there is a newer function called |
That reminds me: I should post the result of my investigation into palettes and the bug on Android. A palette can have the same colour twice, and if you load a paletted PNG where one colour has alpha transparency and the others do not, then SDL_image will just use the index of that colour as the colour key. This way, the loaded palette can have the same colour twice, with the same RGB values, and both times with an alpha of 255, even though the saved PNG had unique colours in the palette. We should probably expose SDL_GetColorKey (to get the mapped value of the colour key) and SDL_HasColorKey to Python. |
I just opened #2835 that starts to deal with this. There are other uses of SDL_GetColorKey in our code that need to be retrofitted, but the change made in 2835 is the most critical one and also the simplest, so I hope it's easy to review. |
If this is not fixed on the SDL side, we could just add something like the following and use it across our codebase when we don't care about the SDL error:

int PG_GetColorKey(SDL_Surface *surface, Uint32 *key) {
    if (!SDL_HasColorKey(surface)) {
        return -1;
    }
    return SDL_GetColorKey(surface, key);
}
|
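The effect of such a guard can be modeled in plain Python. This is a toy sketch, not pygame or SDL code: `FakeSurface`, `get_colorkey` and `guarded_get_colorkey` are hypothetical stand-ins, and appending to `error_log` stands in for the costly error-reporting call.

```python
error_log = []  # each entry stands in for one costly error-reporting call


class FakeSurface:
    """Hypothetical stand-in for a surface; not a pygame or SDL type."""

    def __init__(self, colorkey=None):
        self.colorkey = colorkey


def get_colorkey(surface):
    """Return the colorkey, recording an error when none is set."""
    if surface.colorkey is None:
        error_log.append("Surface doesn't have a colorkey")
        return None
    return surface.colorkey


def guarded_get_colorkey(surface):
    """Check cheaply first so the error path is never entered."""
    if surface.colorkey is None:
        return None
    return get_colorkey(surface)


surfaces = [FakeSurface() for _ in range(1000)]

for s in surfaces:
    get_colorkey(s)
unguarded_errors = len(error_log)

error_log.clear()
for s in surfaces:
    guarded_get_colorkey(s)
guarded_errors = len(error_log)

print(unguarded_errors, guarded_errors)
```

In this model the unguarded getter enters the error path once per surface without a colorkey, while the guarded version never does, which is the behavior the wrapper above aims for.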
@bigwhoopgames mentioned on discord that he's getting a 25%-33% fps drop when using the 2.5.0.dev2 prerelease vs the 2.4.1 release. I did some testing by giving him special wheels with specific changes to see if that fixed it, but didn't really see much difference. List of things I tried:
- surface.fblits code and handled generator exception #2679
- surface.fblits code and handled generator exception #2679 and Fix segfault in surface.fblits #2667
- /Ox to the MSVC build in meson
- optimization to 3 in meson
Image provided by Big Whoop showing some comparisons from a performance profile