Omnibus #683
This is fantastic work and I look forward to being able to use it.
Thanks, Xenon!
This looks great - many thanks! I'm particularly interested in using these improvements in conjunction with GDAL.
Glad to be of service, Homme. I've got a few more tricks up my sleeve, will be checking them in shortly.
At first glance, this looks like a cool performance improvement for OpenJPEG!
Thanks, Petr. 8-core CPUs are coming to the desktop this year; we need to make use of them.
@boxerab Seems nice! BTW, there are many warnings on the build: http://my.cdash.org/viewBuildError.php?type=1&buildid=887331 Do you think you can get this down a bit?
Thanks @mayeut. I will update perf and memory consumption stats from my system. As for the warnings, yes, there are quite a lot. Will look into getting rid of them.
By the way, my last commit on this branch made a small change to opj_image_compt_t, so it looks like I have changed the ABI. I need to do this because in certain cases, I am passing the tile data back inside the user image, and this tile data is allocated using opj_aligned_malloc, so it must Currently, peak memory usage for DCI decoding on my branch is around 1/3 of master (!) so it is now much more reasonable. But, the price we pay is we need to track whether image data was allocated with opj_aligned_malloc. |
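To illustrate the trade-off described above, here is a minimal sketch of one way to track how a component buffer was allocated so that it is released with the matching routine. Every name in it (`sketch_aligned_malloc`, `sketch_comp_t`, `data_is_aligned`, and so on) is hypothetical and is not the OpenJPEG API or the actual change on this branch; the aligned allocator is a generic over-allocate-and-round-up scheme chosen only so the example is portable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration only; not OpenJPEG functions. */
#define SKETCH_ALIGN 32u /* alignment in bytes, must be a power of two */

/* Over-allocate, round the address up to SKETCH_ALIGN, and stash the
 * pointer returned by malloc just before the aligned block. */
void *sketch_aligned_malloc(size_t size)
{
    void *raw = malloc(size + SKETCH_ALIGN + sizeof(void *));
    uintptr_t addr;
    void **aligned;

    if (raw == NULL) {
        return NULL;
    }
    addr = ((uintptr_t)raw + sizeof(void *) + SKETCH_ALIGN - 1u)
           & ~(uintptr_t)(SKETCH_ALIGN - 1u);
    aligned = (void **)addr;
    aligned[-1] = raw; /* remembered so the free routine can find it */
    return aligned;
}

void sketch_aligned_free(void *ptr)
{
    if (ptr != NULL) {
        free(((void **)ptr)[-1]);
    }
}

/* A component that records which allocator produced its data. */
typedef struct {
    int *data;
    int data_is_aligned; /* nonzero: data came from sketch_aligned_malloc */
} sketch_comp_t;

/* Returns nonzero on success. */
int sketch_comp_alloc(sketch_comp_t *c, size_t count, int aligned)
{
    assert(c != NULL);
    c->data = aligned ? (int *)sketch_aligned_malloc(count * sizeof(int))
                      : (int *)calloc(count, sizeof(int));
    c->data_is_aligned = aligned;
    return c->data != NULL;
}

/* Frees with the routine that matches the allocation. */
void sketch_comp_free(sketch_comp_t *c)
{
    assert(c != NULL);
    if (c->data_is_aligned) {
        sketch_aligned_free(c->data);
    } else {
        free(c->data);
    }
    c->data = NULL;
    c->data_is_aligned = 0;
}
```

In this sketch the caller never calls `free` directly on `data`; it always goes through `sketch_comp_free`, which is the same constraint the comment above places on users of the library.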
Also, @mayeut , thanks again for setting up Travis and Appveyor. It made working out the problems with the branch pretty easy, although time consuming. It helped me a lot. |
@boxerab, even though the API/ABI check passes, some new declarations in |
Do you mean these methods?
Yes, I suppose you are right. The user needs to call these when freeing up image data, they can't just call If this is a problem, I can make a temporary change so these methods are not required. We will lose around 10% performance gain and add 1/3 memory usage back for single tile, 3 component decode. Also, there is an initialize call that is needed if OpenMP is turned on, but this is a new feature. |
image->comps[2].data = (int*)calloc((size_t)max, sizeof(int));
new_image->comps[0].data= NULL;
new_image->comps[1].data = NULL;
Git reports a whitespace warning here (spaces at end of line).
@stweil Thanks. I would like to get this merged in, and then we can worry about whitespace.
@mayeut I have temporarily reversed some of my memory optimizations; now a user doesn't need to worry about allocating and de-allocating image memory using the library. So, nothing should break if a user upgrades to the upcoming release; no behaviour changes. I will put these optimizations back in for the following release.
Hi Aaron, thanks for this great work ! I really would like to merge it before the upcoming release but I have a few comments / questions:
UCL is there as initiator, host, maintainer, funder, and main contributor of the project. A file would list all OpenJPEG contributors. For now, we did not yet manage to achieve this simplification but concerning the 4 new files you added, would you agree to:
Cheers and thanks again.
Hi Antonin, I will try to remove what I can of the spurious whitespace in the branch. Also, I am happy to change the copyright notice on my new files. Cheers,
@detonin well, I've changed the copyright notices, and I've done what I could to fix the whitespace. How does it look now?
1. zero copy reads for buffer/memory mapped stream interface 2. no code block allocation
the user should NOT free image data on their own; they must call opj_image_destroy.
…it of current ABI
This PR has been superseded by my decode_region PR. Closing.
Omnibus
This branch pulls all of my pending pull requests into a single branch.
It adds performance improvements and better memory management.
Memory-mapped and memory-buffered streams are also supported.
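As a rough illustration of what a memory-buffered stream with zero-copy reads can look like, here is a minimal sketch. The type and function names (`sketch_mem_stream_t`, `sketch_read_copy`, `sketch_read_zero_copy`) are hypothetical and are not the stream interface added by this branch; the same structure would apply to a memory-mapped file, where `base` would point at the mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stream; not the OpenJPEG stream type. */
typedef struct {
    const unsigned char *base; /* start of the buffer or mapping */
    size_t size;               /* total bytes available */
    size_t offset;             /* current read position */
} sketch_mem_stream_t;

/* Conventional read: copies up to n bytes into dst and returns
 * the number of bytes actually copied (0 at end of stream). */
size_t sketch_read_copy(sketch_mem_stream_t *s, void *dst, size_t n)
{
    size_t left;

    assert(s != NULL && dst != NULL);
    left = s->size - s->offset;
    if (n > left) {
        n = left;
    }
    if (n > 0) {
        memcpy(dst, s->base + s->offset, n);
    }
    s->offset += n;
    return n;
}

/* Zero-copy read: returns a pointer directly into the source buffer
 * and stores the number of bytes available there in *got. The caller
 * must not use the pointer after the underlying buffer is released. */
const unsigned char *sketch_read_zero_copy(sketch_mem_stream_t *s,
                                           size_t n, size_t *got)
{
    const unsigned char *p;
    size_t left;

    assert(s != NULL && got != NULL);
    p = s->base + s->offset;
    left = s->size - s->offset;
    if (n > left) {
        n = left;
    }
    s->offset += n;
    *got = n;
    return p;
}
```

The zero-copy variant avoids the `memcpy` entirely, at the cost of tying the lifetime of the returned pointer to the lifetime of the source buffer.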
Measurements
I compared master and omnibus branches for memory usage and performance.
(My system is a 4-year-old, 4-core i7 3770 with DDR3, running 64-bit Windows 7.)
Peak Memory Usage
For decode of 100 DCI frames:
So, memory usage has dropped to almost one third.
Performance:
Omnibus uses OpenMP to multi-thread the library.
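For readers unfamiliar with OpenMP, the sketch below shows the general pattern of parallelising independent per-row work with `#pragma omp parallel for`. The function `sketch_level_shift` and the choice of a level shift as the workload are assumptions made for illustration; they do not show which loops this branch actually parallelises.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: add a constant to every sample of a
 * width x height plane, one row per loop iteration. Rows do not
 * depend on each other, so iterations can run on different threads. */
void sketch_level_shift(int *samples, int width, int height, int shift)
{
    int y; /* signed loop index, private to each thread under OpenMP */

    assert(samples != NULL && width > 0 && height > 0);

#pragma omp parallel for
    for (y = 0; y < height; ++y) {
        int *row = samples + (size_t)y * (size_t)width;
        int x;
        for (x = 0; x < width; ++x) {
            row[x] += shift;
        }
    }
}
```

Built with OpenMP enabled (for example `-fopenmp` with GCC or Clang, `/openmp` with MSVC) the rows are distributed across threads; without it the pragma is ignored and the loop runs serially with the same result.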
For encode/decode of 100 DCI frames:
Encode: 0.85 FPS
Decode: 2.5 FPS
Encode: 5.5 FPS
Decode: 11.2 FPS
On upcoming 8-core FinFET desktop CPUs, this should reach real-time decode and
~15 FPS encode.