VSG calls vkAllocateMemory() too much #1051
The vsg::DeviceMemory and vsg::MemoryBufferPools are both written specifically to batch Vulkan memory allocations and to reuse memory. This part of the VSG should be capable of handling allocation and reuse just fine. Could you check what types of objects are being allocated the most in your usage case, to see whether specific types of objects are triggering the large number of allocations? Could you also check whether you are using a modified version of vsgXchange that disables the sharing of state, as this could lead to an excessive number of DescriptorImage/DescriptorBuffer objects being created.

W.r.t. vsg::TransferTask, its role is multi-faceted. The use of a staging buffer is required for transferring data to the GPU, and also for multi-buffering the data to prevent a frame's dynamic updates on the CPU from inappropriately affecting a concurrently rendered frame. The latter issue is what tripped up the shadow mapping on NVidia cards, as the buffer wasn't being correctly synchronized. By doing away with the buffering and synchronization you open the door to lots of obscure bugs like the shadow-mapping one.
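The batching idea behind those classes can be sketched in a few lines. This is a simplified stand-in, not the actual vsg::DeviceMemory/vsg::MemoryBufferPools code: one large backing allocation (one `vkAllocateMemory` call) is carved into many aligned sub-ranges, so individual buffers and images don't each consume a slot of the driver's allocation limit.

```cpp
#include <cassert>
#include <cstdint>

// Minimal bump suballocator sketch (hypothetical, not VSG code): one big
// device-memory block serves many aligned sub-ranges. Each sub-range would
// be bound with vkBindBufferMemory/vkBindImageMemory at the given offset.
class MemoryBlock
{
public:
    explicit MemoryBlock(std::uint64_t size) : _size(size) {}

    // Returns the offset of a new sub-range, or UINT64_MAX if the block
    // cannot satisfy the request (a pool would then open another block).
    std::uint64_t suballocate(std::uint64_t size, std::uint64_t alignment)
    {
        // Round the cursor up to the required alignment.
        std::uint64_t offset = (_next + alignment - 1) / alignment * alignment;
        if (offset + size > _size) return UINT64_MAX;
        _next = offset + size;
        return offset;
    }

private:
    std::uint64_t _size = 0;
    std::uint64_t _next = 0; // bump cursor
};
```

A real pool also has to return sub-ranges to a free list on release; this sketch only shows why many logical buffers map onto a single driver allocation.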
OK, but I don't see how memory is returned to a memory pool. It's possible that I am wrong about maxMemoryAllocationCount and that the total number of allocations is decremented by vkFreeMemory. #942 covered a lot of the same ground, and in investigating that I thought that I had established that vkFreeMemory didn't help.
The direct cause of the small allocations is the reallocation of staging buffers by the TransferTask. vsgCs allocates a very small amount of dynamic data, and this should be deallocated when terrain tiles are paged out. I haven't investigated if "unloading" tiles is working as expected, but I generally don't see unbounded memory use when running vsgCs. The big allocations seem normal; terrain tiles do require texture and geometry data.
The only modification to vsgXchange that I'm using is the sRGB textures PR. There are a few models in this scene that are loaded by vsgXchange, but vsgCs doesn't use vsgXchange to build terrain tiles.
The TransferTask can't do much if the destination buffers aren't multi-buffered, other than protect the copy operations with a semaphore, and I understand that using staging buffers does simplify that. The alternative would be to use a fence to delay a direct write from the CPU until a safe time.
The vsg::BufferInfo has a mechanism in it for returning memory blocks. It's a while since I wrote this stuff though ;-) The vsg::TransferTask shouldn't be thrashing the staging buffer; instead it should only be reallocated when more memory is required, and should be kept around at the max size. If it's thrashing then this is a bug.
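The intended grow-only policy described above can be sketched as follows. The class and member names are hypothetical stand-ins, not the real vsg::TransferTask internals: the point is that the buffer only reallocates when a frame needs more than the current capacity, and stays at its high-water mark rather than shrinking back.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of a grow-only staging buffer policy (hypothetical names, not
// actual vsg::TransferTask code). reserve() reports whether a reallocation
// would be needed; this is the only point where vkAllocateMemory would run.
class StagingBuffer
{
public:
    bool reserve(std::uint64_t requiredSize)
    {
        if (requiredSize <= _capacity) return false; // reuse existing buffer
        _capacity = requiredSize;                    // grow to new high-water mark
        ++_reallocations;                            // hypothetical vkAllocateMemory here
        return true;
    }

    std::uint64_t capacity() const { return _capacity; }
    unsigned reallocations() const { return _reallocations; }

private:
    std::uint64_t _capacity = 0;
    unsigned _reallocations = 0;
};
```

Under this policy the allocation count is bounded by the number of times the per-frame transfer size sets a new record, so a steady workload should settle at zero further allocations; steady growth in allocations would indicate the thrashing bug described above.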
I looked at the source of the Vulkan Memory Allocator, and they count vkFreeMemory calls against the total allocations. They ought to know. I changed my debugging output to reflect that, and I now see:
That's a much more reasonable number for the total number of allocations. The total amount of allocated memory does seem kind of high, but that's something I need to look at in RenderDoc for vsgCs specifically. If there's any issue left, it's the behavior of TransferTask, but that's not critical. Incidentally, you might want to look at using the Vulkan Memory Allocator to replace the memory pools and DeviceMemory direct allocations. I know that it's yet another third-party dependency, but it has been hammered on by many developers for several years.
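The counting rule above can be mimicked with a trivial live-allocation counter. A hedged sketch, with hypothetical hook names: vkAllocateMemory increments, vkFreeMemory decrements, and it is the live count (not the cumulative total) that is compared against VkPhysicalDeviceLimits::maxMemoryAllocationCount.

```cpp
#include <cassert>

// Tracks live device-memory allocations the way the Vulkan Memory Allocator
// counts them: frees decrement the count. onAllocate()/onFree() are
// hypothetical hooks one would call next to vkAllocateMemory/vkFreeMemory.
struct AllocationCounter
{
    unsigned live = 0; // compare this against maxMemoryAllocationCount
    unsigned peak = 0; // high-water mark, useful for debugging output

    void onAllocate() { if (++live > peak) peak = live; }
    void onFree()     { --live; }
};
```

Logging `live` and `peak` rather than a monotonically increasing counter gives the more reasonable numbers mentioned above.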
FYI, the VSG's CPU memory allocation and GPU memory allocation share quite a bit of code, so just replacing the DeviceMemory/MemoryPools with the Vulkan Memory Allocator wouldn't save much code, and it would add another external dependency without gaining functionality we don't already have. That's not to say we can't learn stuff from the Vulkan Memory Allocator project, just that learning from it will be better than ripping out a bit of the VSG that mostly works already.
Vulkan implementations have a limit on the total number of times that vkAllocateMemory() can be called. On AMD drivers this limit is quite low at 4096 total allocations. I locally modified DeviceMemory.cpp to log a message every time vkAllocateMemory() is called. A little while into a run of vsgCs, we see:
The one large allocation comes from paging in terrain and is made on behalf of a memory pool. The many small allocations are coming from the transfer task, which doesn't use a pool and allocates memory large enough to copy all the dynamic buffers in use. I don't know why this number is slowly growing. It's very pessimistic with regard to the dynamic data actually transferred; unless LOD fading is enabled, the only dynamic data transferred in a vsgCs frame is the light data.
So, there are two bugs:

- In the not-too-long term, almost all the direct calls to DeviceMemory::create() should go away and be replaced by allocations from pools.
- In the long term, TransferTask doesn't need a big staging buffer. Highly dynamic buffer data should be stored in memory that is host-visible and thus doesn't need a staging buffer at all.
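The host-visible approach hinges on choosing a memory type with the right property flags. Below is a simplified sketch of the standard lookup; the bit values match Vulkan's VkMemoryPropertyFlagBits, but the container and search loop are illustrative stand-ins, not VSG code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Flag values copied from Vulkan's VkMemoryPropertyFlagBits.
constexpr std::uint32_t DEVICE_LOCAL_BIT  = 0x1; // VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr std::uint32_t HOST_VISIBLE_BIT  = 0x2; // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
constexpr std::uint32_t HOST_COHERENT_BIT = 0x4; // VK_MEMORY_PROPERTY_HOST_COHERENT_BIT

// Returns the index of the first memory type whose flags contain all the
// requested properties, or -1 if none does. Real code would take the flags
// from vkGetPhysicalDeviceMemoryProperties and also check the resource's
// memoryTypeBits mask.
int findMemoryType(const std::vector<std::uint32_t>& typeFlags, std::uint32_t wanted)
{
    for (std::size_t i = 0; i < typeFlags.size(); ++i)
        if ((typeFlags[i] & wanted) == wanted) return static_cast<int>(i);
    return -1;
}
```

Memory found with HOST_VISIBLE_BIT | HOST_COHERENT_BIT can be mapped with vkMapMemory and written directly by the CPU, which is what lets highly dynamic data skip the staging copy; synchronization with in-flight frames (e.g. a fence, as mentioned earlier in the thread) is still required.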