Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: New C core #870

Merged
merged 66 commits into from
Jun 23, 2021
Merged

WIP: New C core #870

merged 66 commits into from
Jun 23, 2021

Conversation

kdt3rd
Copy link
Contributor

@kdt3rd kdt3rd commented Nov 23, 2020

This starts to implement a new C-based "core" for OpenEXR.

Current status:

  • most things are being written with doxygen-style comments, although some functions will need to be expanded upon and a cleanup pass made once we like the style to be made consistent.
  • base, lighter-weight attributes initial implementation sufficient for use, initial unit tests written
  • core file I/O / error reporting / memory allocation defined
  • initial file reading / parsing done, unit tests are in progress
    • only enough of the compression techniques to test have been ported, b44, pxr24, etc. are not difficult once we are happy with the core design, but I want to make that system a bit more ... replaceable
  • writing is largely incomplete at this stage
  • NB: This does not replace any C++ code at all currently (in fact, the current performance test relies on that!)

This is a massive PR, if it would be easier to re-construct it as individual layers, I can probably do that. But what I'm looking for is feedback on whether the "namespace" mechanism I've implemented degrades from the readability of the code too much to be worth it, or if we should eliminate that before there is too much inertia. Also looking for feedback on how I've chosen to break up the headers into component chunks - I dislike massive files. It is also written in a slightly different style, favoring underscores and such to try to differentiate it more from the C++.

Comment on lines 25 to 17
/** Used to declare / use types */
#define EXR_TYPE(n) exr_ ## n
/** Used to declare / use enums */
#define EXR_DEF(n) EXR_ ## n
/** Used to declare / call functions */
#define EXR_FUN(n) exr_ ## n

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be helpful to have

#define EXR_TYPE(n) exr_ ## n ## _t

to more easily distinguish types from functions and be sure they won't clash?

I'm a little leery of these macros. Maybe it's nothing to worry about, but it makes me immediately wonder if they will throw off modern editor's (such as Sublime, VSCode, CLion, etc.) ability to correctly syntax highlight, refactor, connect declarations and usage of functions, show information about a function's declarations and parameter when you hover over a call site, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, the _t is a good point, not sure why I didn't do that, I obviously subconsciously was thinking it with the separation.

But these macros are the biggest discussion I'd like to have - now is the time to decide whether to use it or not. And I think in libraries like jpeg where you can have 8-bit and 12-bit variants present, it makes sense. I was initially thinking people might like to customize for their facility, but I don't think this is that big a deal at this level, because the file format is the file format, and we should do the symbol versioning thing to manage different versions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about Larry's point on the IDEs ~ do the common IDEs recognize the expanded function names such that "go to definition/declaration" works? As a sanity check, GLEW's macros tend to work in IDEs, so it might be interesting to see if an editor can eat GLEW, can it eat these macros as well?

Comment on lines 53 to 57
d = _mm_add_epi8(d, _mm_slli_si128(d, 1));
d = _mm_add_epi8(d, _mm_slli_si128(d, 2));
d = _mm_add_epi8(d, _mm_slli_si128(d, 4));
d = _mm_add_epi8(d, _mm_slli_si128(d, 8));
d = _mm_add_epi8(d, vPrev);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this isn't the time and place to bring this up, but perhaps it's about time to kick these direct intrinsics to the curb and instead use wrappers with understandable names and that, inside them, have implementations for SSE, ARM, generic C fallback, etc. As an example, see OIIO's simd.h.

Yes, I know that my C++ classes are not a good solution for this C code, but the point is that we could make C wrappers these 4-wide ops, with friendly and self-documenting names and a central place to have alternate implementations -- such as we're going to want to get this running fast on the new Macs with ARM processors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, I'll have a look


#include "openexr_base.h"

extern void *priv_alloc( size_t bytes ) __attribute__((malloc));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attribute is surely gcc/clang only and won't work on MSVS, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, quite possibly, I don't have a windows machine at home, so I end up using CI to discover these :)

Copy link
Contributor

@meshula meshula Nov 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MSVC doesn't have an equivalent, I think. Am I reading the gcc docs right, that the attribute says "a returned pointer has no objects embedded in it, so a realloc is safe? I'm curious about what kind of optimizations it enables gcc to do in response? I think this attribute is quite gcc specific and should be restricted as such. As of apple clang 12, clang has started to lose attributes that were implemented for compatibility. Ultimately Apple will upstream these, and clang will diverge from gcc in the end.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my understanding is that it is two fold: one, it allows the optimizer know that the memory is not aliased, so it doesn't have to reload from other non-aliased stores, which realloc can not not.

However, the real reason I attached it was that I was under the impression that the various static analyzers would use this to report further down where memory was requested that the memory was not freed, so it would report the call site, not the routine

@ThiagoIze
Copy link

I get having a C based API in libraries, but why should the implementation itself be C instead of C++? If there are parts of C++ that aren't wanted, such as exceptions, one can always avoid them, so it's not clear to me what forcing C accomplishes. Since it's not apparent to me, I suppose it would be worth having that outlined in case someone in the future wants to port this to C++.

@kthurston
Copy link

I get having a C based API in libraries, but why should the implementation itself be C instead of C++? If there are parts of C++ that aren't wanted, such as exceptions, one can always avoid them, so it's not clear to me what forcing C accomplishes. Since it's not apparent to me, I suppose it would be worth having that outlined in case someone in the future wants to port this to C++.

@ThiagoIze - the idea is that this is C only because then it can be used in context where a c++ library is not desired (which has been requested numerous times), and I believe is a hindrance to adoption in things like the ffmpeg project, instead they have implemented a subset of the file format, which is problematic. Further, the roadmap (which is certainly not clear from what has been done so far) is to then re-implement the existing C++ library in terms of this library, although some API will have to be added to do so in a thread-safe manner. So C++ is not going away, just not part of this subset of the overall library. So, if you prefer C++, please continue to use the C++ API, as that will continue to be developed, and in future versions will gain the advantages this C layer was written to address (thread safety where multiple threads can read from the same file being the primary one)

@kthurston
Copy link

kthurston commented Jan 22, 2021

I have removed the pseudo namespace layer. If anyone was following along, I used the following script to convert the code, then touched up a couple of things in the tests which were working by luck only.

#! /bin/bash

echo 's/EXR_TYPE[ ]?[(][ ]?file[ ]?[)]/_priv_exr_file_t/g' > /tmp/fixup.$$
echo 's/EXR_TYPE[ ]?[(][ ]?FILE[ ]?[)]/exr_file_t/g' >> /tmp/fixup.$$
echo 's/EXR_TYPE[ ]?[(][ ]?([A-Za-z_][A-Za-z_0-9]*)[ ]?[)]/exr_\1_t/g' >> /tmp/fixup.$$
echo 's/EXR_DEF[ ]?[(][ ]?([A-Za-z_][A-Za-z_0-9]*)[ ]?[)]/EXR_\1/g' >> /tmp/fixup.$$
echo 's/EXR_FUN[ ]?[(][ ]?([A-Za-z_][A-Za-z_0-9]*)[ ]?[)]/exr_\1/g' >> /tmp/fixup.$$

find src/test/OpenEXRCoreTest -type f -exec sed -r -i -f /tmp/fixup.$$ {} \;
find src/lib/OpenEXRCore -type f -exec sed -r -i -f /tmp/fixup.$$ {} \;
find src/bin/exrinfo -type f -exec sed -r -i -f /tmp/fixup.$$ {} \;

rm /tmp/fixup.$$

@meshula
Copy link
Contributor

meshula commented Jan 23, 2021

Removal of the pseudo-namespaces has made the code infinitely easier to read.

src/lib/OpenEXRCore/openexr_priv_structs.h Outdated Show resolved Hide resolved
src/lib/OpenEXRCore/openexr_priv_structs.h Outdated Show resolved Hide resolved
@lgritz
Copy link
Contributor

lgritz commented Mar 30, 2021

Hey, can I make a request? Somewhere in one of the main headers (maybe OpenEXRConfig.h) can you please

#define HAS_OPENEXR_CORE 1

This will make it easier for downstream packages to know if they're dealing with a build of OpenEXR that has the new headers and C core functions.

@kdt3rd
Copy link
Contributor Author

kdt3rd commented Mar 30, 2021

This will make it easier for downstream packages to know if they're dealing with a build of OpenEXR that has the new headers and C core functions.

of course, we should do something like this. Although I don't actually like the name "core" - but I guess I will have to see how the rest of this goes for a little while

if (u)
if (!u) return pctxt->standard_error (ctxt, EXR_ERR_INVALID_ARGUMENT);

/* TODO: do we care if the incoming unpacked data is null? */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's correct to reject null.

@@ -154,13 +154,13 @@ undozip (
void* out,
size_t outsz)
{
uLongf outSize = outsz;
uLongf outSize = (uLongf)outsz;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this a do-over, now might be a time to replace uLongF with a sized type :) Every time I see uLongf, I have to go grep for what it is!

@@ -120,10 +120,17 @@ exr_attr_string_create_with_length (
/* someone might pass in a string shorter than requested length (or longer) */
if (len > 0)
{
#ifdef _MSC_VER
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my warnings cleanup elsewhere in OpenEXR, I'm just suppressing 4996 for the entire file. I can't think of a single case where spot-applying 4996 might be helpful. The warnings are just a constant reminder that Microsoft had an idea that never took hold in the programming community, outside of the windows centric community, anyway.

Copy link
Contributor

@meshula meshula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some small opinions

@kdt3rd kdt3rd force-pushed the new_c_core branch 3 times, most recently from fd0a680 to 3285dbb Compare May 30, 2021 12:20
kdt3rd added 10 commits June 21, 2021 00:51
This is an example of using the C library only

Signed-off-by: Kimball Thurston <[email protected]>
Popular consent on the TSC was that the namespace macros would serve
little purpose for the C layer, where we will instead focus on symbol
versioning and other techniques to maintain stability, obviating the
need for customization.

Signed-off-by: Kimball Thurston <[email protected]>
Signed-off-by: Kimball Thurston <[email protected]>
kdt3rd and others added 27 commits June 21, 2021 00:51
Signed-off-by: Kimball Thurston <[email protected]>
In refactoring how attributes worked and were accessed, introduced a
paste bomb with the tile description attribute.

Co-authored-by: Timothy Grant <[email protected]>
Signed-off-by: Kimball Thurston <[email protected]>
Also make a pass with -Weverything under clang to clean up warnings

Signed-off-by: Kimball Thurston <[email protected]>
Signed-off-by: Kimball Thurston <[email protected]>
Signed-off-by: Kimball Thurston <[email protected]>
Signed-off-by: Kimball Thurston <[email protected]>
Signed-off-by: Kimball Thurston <[email protected]>
Signed-off-by: Kimball Thurston <[email protected]>
@kdt3rd kdt3rd merged commit a04cbf5 into AcademySoftwareFoundation:master Jun 23, 2021
@cary-ilm cary-ilm added the v3.1.0 label Sep 2, 2021
@kdt3rd kdt3rd deleted the new_c_core branch February 12, 2022 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants