SSE2 optimization for horizontal pass of IDWT 5x3 #960

Open
rouault opened this issue Jun 26, 2017 · 0 comments

Collaborator

rouault commented Jun 26, 2017

It is possible to add SSE2 optimizations to the single-pass horizontal pass of the IDWT 5x3. This has been partially implemented in 288f472 in the https://github.com/uclouvain/openjpeg/tree/opj_idwt53_h_cas0_SSE2 branch, during the investigations done in #957, but not committed yet.
The implementation is restricted to cas == 0 (i.e. the origin of the array is on even coordinates) and to lengths that are a multiple of 8. It should be generalized to lengths that are not a multiple of 8 (the termination needs to be modified), and to cas == 1 (odd coordinates) as well. But the performance improvement in the implemented case was rather small, so given the additional code complexity, this is not merged for now.
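
For readers unfamiliar with the restriction described above, here is a minimal sketch of what a cas == 0, length-multiple-of-8 horizontal inverse 5x3 pass can look like with SSE2. It is not the code from 288f472. The function names, the separate `low`/`high` input arrays, and the caller-provided `even` scratch buffer are all assumptions made for illustration, and for simplicity it uses two loops rather than a true single pass. It assumes an x86 target with SSE2 and arithmetic right shift of negative integers.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Scalar reference (hypothetical helper): inverse 5/3 lifting on one row,
 * cas == 0. low[0..sn-1] holds low-pass, high[0..sn-1] holds high-pass
 * coefficients; out receives 2*sn interleaved samples. */
static void idwt53_h_cas0_scalar(const int32_t *low, const int32_t *high,
                                 int32_t *out, size_t sn)
{
    size_t i;
    /* Even samples: s[i] - ((d[i-1] + d[i] + 2) >> 2), d[-1] mirrored to d[0]. */
    for (i = 0; i < sn; i++) {
        int32_t dprev = high[i > 0 ? i - 1 : 0];
        out[2 * i] = low[i] - ((dprev + high[i] + 2) >> 2);
    }
    /* Odd samples: d[i] + ((s[i] + s[i+1]) >> 1), s[sn] mirrored to s[sn-1]. */
    for (i = 0; i < sn; i++) {
        size_t next = (i + 1 < sn) ? i + 1 : sn - 1;
        out[2 * i + 1] = high[i] + ((out[2 * i] + out[2 * next]) >> 1);
    }
}

/* SSE2 sketch (hypothetical helper): same result, 4 coefficient pairs
 * (8 output samples) per iteration. even must have room for sn + 1 values. */
static void idwt53_h_cas0_sse2(const int32_t *low, const int32_t *high,
                               int32_t *out, int32_t *even, size_t sn)
{
    const __m128i two = _mm_set1_epi32(2);
    size_t i;

    /* Row length 2*sn must be a non-zero multiple of 8. */
    assert(sn > 0 && sn % 4 == 0);

    for (i = 0; i < sn; i += 4) {
        __m128i s = _mm_loadu_si128((const __m128i *)(low + i));
        __m128i d = _mm_loadu_si128((const __m128i *)(high + i));
        __m128i dprev;
        if (i == 0) {
            /* Left boundary: lanes become (d0, d0, d1, d2). */
            dprev = _mm_shuffle_epi32(d, _MM_SHUFFLE(2, 1, 0, 0));
        } else {
            dprev = _mm_loadu_si128((const __m128i *)(high + i - 1));
        }
        __m128i upd = _mm_srai_epi32(
            _mm_add_epi32(_mm_add_epi32(dprev, d), two), 2);
        _mm_storeu_si128((__m128i *)(even + i), _mm_sub_epi32(s, upd));
    }
    even[sn] = even[sn - 1]; /* right boundary mirror */

    for (i = 0; i < sn; i += 4) {
        __m128i s = _mm_loadu_si128((const __m128i *)(even + i));
        __m128i snext = _mm_loadu_si128((const __m128i *)(even + i + 1));
        __m128i d = _mm_loadu_si128((const __m128i *)(high + i));
        __m128i odd = _mm_add_epi32(
            d, _mm_srai_epi32(_mm_add_epi32(s, snext), 1));
        /* Interleave even and odd samples into the output row. */
        _mm_storeu_si128((__m128i *)(out + 2 * i),
                         _mm_unpacklo_epi32(s, odd));
        _mm_storeu_si128((__m128i *)(out + 2 * i + 4),
                         _mm_unpackhi_epi32(s, odd));
    }
}
```

In this sketch the only boundary handling is the shuffle at i == 0 and the mirrored `even[sn]` entry. Generalizing to lengths that are not a multiple of 8 would need a scalar tail after the vector loops, and cas == 1 would swap the roles of the even and odd positions.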

rouault changed the title from "SSE2 optimization for IDWT 6x" to "SSE2 optimization for horizontal pass of IDWT 5x3" on Jun 26, 2017