Invalidenum workaround #859

peterhillman
Contributor

Addresses https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=26641
exrenvmap checks for unknown values of the EnvMap attribute and avoids setting an EnvMap enum to an unknown value, which is undefined behavior. exrheader has similar workarounds for EnvMap attributes. exrcheck also confirms that any EnvMap and DeepImageState attributes are set to valid states.
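
The check described above can be sketched roughly as follows, assuming the raw attribute value is available as an int. The enum is a local copy for illustration, and `tryGetEnvmap` is a hypothetical helper, not an OpenEXR function.

```cpp
// Local copy of the enum for illustration; not the OpenEXR header.
enum Envmap
{
    ENVMAP_LATLONG = 0, // Latitude-longitude environment map
    ENVMAP_CUBE    = 1, // Cube map

    NUM_ENVMAPTYPES     // Number of different environment map types
};

// Hypothetical helper: copy the raw value into `out` only when it names a
// known type, so an Envmap variable never holds an out-of-range value.
inline bool tryGetEnvmap (int raw, Envmap& out)
{
    if (raw < 0 || raw >= NUM_ENVMAPTYPES)
        return false; // unknown value: leave `out` untouched

    out = static_cast<Envmap> (raw);
    return true;
}
```

A caller would test the return value and fall back to a default or report an error when it is false.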

else if (hasEnvmap (in.header()))
{
// validate type is known before using
const Envmap* typeInFile = &envmap (in.header());
Member

It might be easier to read the intention here if written as:
const Envmap& typeInFile = envmap (in.header());
const int envMapAsInt = * reinterpret_cast<const int*>(&typeInFile);
since envmap() returns const&. The use of pointers to integers is confusing, and at least this limits that to a single reinterpret_cast line.

Contributor

The underlying issue is this -

IMF_STD_ATTRIBUTE_IMP (envmap, Envmap, Envmap)

expands to

const TypedAttribute<type> &					 \
    IMF_NAME_ATTRIBUTE(name) (const Header &header)                    \
    {									 \
	return header.typedAttribute <TypedAttribute <type> >		 \
		(IMF_STRING (name));					 \
    }	

which would be a TypedAttribute<Envmap>, where Envmap is

enum Envmap
{
    ENVMAP_LATLONG = 0,		// Latitude-longitude environment map
    ENVMAP_CUBE = 1,		// Cube map

    NUM_ENVMAPTYPES		// Number of different environment map types
};

TypedAttribute has this:

    const T &				value () const;

so shouldn't the following work? It eliminates all the casting, and lets the compiler do the work in a bullet proof way.

if (ENVMAP_LATLONG == envmap (in.header()).value())

Contributor

I can't see the originating bug report, but this avoids setting a value, so I think it would avoid the issue as described in the PR comments?

Contributor

@meshula meshula left a comment

This is an interesting one. I have many questions :)

{
switch (e)
int envmapAsInt = *( reinterpret_cast<const int*>(&e) );
Contributor

Is the hard-core definition of default that it means "any known legal value", and that unknown values are UB? If that's the case, should we instead be declaring Envmap with an explicit int type?

i.e.

enum Envmap : int { ...

Would declaring it as such resolve the problem without all the casting?
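
A minimal sketch of what the explicit type buys, using a local copy of the enum. `envmapFromInt` and `isKnownEnvmap` are hypothetical names for illustration. When an enum has a fixed underlying type, every value of that type is a valid value of the enum, so the conversion below is well defined even for values with no enumerator.

```cpp
#include <type_traits>

// Local illustration only; mirrors the suggested declaration.
enum Envmap : int
{
    ENVMAP_LATLONG = 0,
    ENVMAP_CUBE    = 1,

    NUM_ENVMAPTYPES
};

static_assert (
    std::is_same<std::underlying_type<Envmap>::type, int>::value,
    "Envmap is stored as int");
static_assert (sizeof (Envmap) == sizeof (int), "same size as int");

// Well defined for any int because the underlying type is fixed.
inline Envmap envmapFromInt (int raw)
{
    return static_cast<Envmap> (raw);
}

// Unknown values can still be detected, just without any casting tricks.
inline bool isKnownEnvmap (Envmap e)
{
    return e == ENVMAP_LATLONG || e == ENVMAP_CUBE;
}
```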

Contributor Author

Changing the definition of Envmap does seem to resolve the problem. Is doing that an ABI change? I was trying to avoid that but maybe the simplicity of the code makes it worth doing.

Contributor

Would it be possible to do an nm on the .so with and without the : int? If you grep the output for Envmap, we should be able to see if the mangled names with Envmap have changed or not.

Contributor Author

The mangled names do not change, but they move around slightly, which is a little concerning.

Will all architectures guarantee that the Envmap enum was of type int before this change? I'd understood that some compilers might use an 8-bit type if every enumerator fits and no type is specified. If that's the case, it might cause issues in user code even if the library itself was identical.

Contributor

That's inherently a problem with any sort of reinterpret_cast based solution, but I think if it was an issue it would have raised its head by now.

If the names do not change, but move around, that's not going to cause linking problems. I'm in favor of moving to an explicit type to avoid compilers having opinions about types :)
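
For comparison with the reinterpret_cast approach, here is a sketch of a conversion that does not depend on the size the compiler picked. `enumToInt`, `Narrow` and `Wide` are hypothetical names for illustration only. It converts through the enum's real underlying type instead of reading the object through an int pointer.

```cpp
#include <type_traits>

// Hypothetical helper: widen any enum to int via its actual underlying type,
// whatever width that is, instead of reading it through an int*.
template <typename E>
inline int enumToInt (E e)
{
    static_assert (std::is_enum<E>::value, "enumToInt requires an enum type");
    using U = typename std::underlying_type<E>::type;
    return static_cast<int> (static_cast<U> (e));
}

// Illustration enums, not OpenEXR types.
enum Narrow : unsigned char { NARROW_A = 0, NARROW_B = 1 };
enum Wide : int { WIDE_A = 0, WIDE_B = 1 };
```

Both `enumToInt (NARROW_B)` and `enumToInt (WIDE_B)` read only the bytes that belong to the enum object.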

@cary-ilm
Member

cary-ilm commented Dec 3, 2020 via email

@peterhillman
Contributor Author

I believe compilers might optimize switch statements assuming enum variables only have valid values, so you might end up with an arbitrary case statement executing when the value is not valid. Code might also not expect the situation where no branch is taken, which could cause issues.
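
One way to make the unknown case explicit is sketched below, assuming the enum has a fixed int type. `describeEnvmap` is a hypothetical function for illustration. It switches on the integer value and gives unknown values their own branch, so some branch is always taken.

```cpp
#include <string>

// Local illustration only.
enum Envmap : int
{
    ENVMAP_LATLONG = 0,
    ENVMAP_CUBE    = 1,

    NUM_ENVMAPTYPES
};

// Switch on the integer value; the default branch handles anything that is
// not a currently known map type.
inline std::string describeEnvmap (Envmap e)
{
    const int value = static_cast<int> (e);

    switch (value)
    {
        case ENVMAP_LATLONG: return "latitude-longitude map";
        case ENVMAP_CUBE: return "cube map";
        default: return "unknown map type " + std::to_string (value);
    }
}
```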

I think the only enums of concern are Envmap and DeepImageState. All the other enums are validated when reading, and again by the sanityCheck function. That's because Compression and the enums in TileDescription are used to compute the number of chunks within the file, so if they aren't a currently known value then the file layout cannot be determined.

PixelType and LineOrder are also sanityChecked when a file is opened, so unknown values cannot be preserved (the *InputFile constructors throw an exception). That's not strictly necessary. Pixels cannot be read from an image which has an unknown PixelType, but exrstdattr could theoretically still copy the data from input to output, and it might be possible to read other parts of a multipart file.

Despite the potential ABI change, I like that @meshula's solution also bypasses any sanitizer warnings in user code and keeps things cleaner.

If more enums are added in the future, it might pay to have a more abstract class to handle them rather than just a plain enum, so unknown values can be handled properly and detected cleanly.
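
A rough sketch of what such a class could look like. `EnvmapValue` and all of its members are hypothetical and not part of OpenEXR. It keeps the raw integer from the file, so an unknown value can be detected and still written back unchanged.

```cpp
#include <string>

// Hypothetical wrapper, not part of OpenEXR.
class EnvmapValue
{
  public:
    enum Known : int { LATLONG = 0, CUBE = 1, NUM_KNOWN };

    explicit EnvmapValue (int raw) : _raw (raw) {}

    // True if the stored value maps to a currently known state.
    bool isKnown () const { return _raw >= 0 && _raw < NUM_KNOWN; }

    // The raw value is preserved so it can be copied to an output file as-is.
    int raw () const { return _raw; }

    std::string name () const
    {
        switch (_raw)
        {
            case LATLONG: return "latlong";
            case CUBE: return "cube";
            default: return "unknown";
        }
    }

  private:
    int _raw;
};
```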

@cary-ilm
Member

cary-ilm commented Dec 3, 2020 via email

@meshula
Contributor

meshula commented Dec 3, 2020

Thinking about Peter's earlier note that he's heard of compilers picking an arbitrary size for an enum (and I've heard that as well): if it's possible that an enum can be other than 32 bits, then a union of int and enum wouldn't allow strict comparison, due to endianness. This shows the basic problem with reinterpret casting as well.

key:

hi = high byte, mh = middle hi, ml = middle lo, lo = low byte
en = 8 bits encoding an enumeration
uu = undefined memory, could be anything, zero if we're lucky

memory arrangement:

[hi][mh][ml][lo]
[en][uu][uu][uu]

vs

[lo][ml][mh][hi]
[en][uu][uu][uu]
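
The two layouts above can be worked through in code. This is a sketch with hypothetical helper names that interprets the same four bytes under each byte order explicitly, so it does not depend on the host machine. The first byte plays the role of `en` and the other three are the `uu` bytes.

```cpp
#include <cstdint>

// Interpret four bytes with the lowest address as the LEAST significant byte.
inline std::uint32_t asLittleEndian (const unsigned char b[4])
{
    return std::uint32_t (b[0]) | (std::uint32_t (b[1]) << 8) |
           (std::uint32_t (b[2]) << 16) | (std::uint32_t (b[3]) << 24);
}

// Interpret four bytes with the lowest address as the MOST significant byte.
inline std::uint32_t asBigEndian (const unsigned char b[4])
{
    return (std::uint32_t (b[0]) << 24) | (std::uint32_t (b[1]) << 16) |
           (std::uint32_t (b[2]) << 8) | std::uint32_t (b[3]);
}
```

With bytes `{0x01, 0xAA, 0xBB, 0xCC}` the two readings are `0xCCBBAA01` and `0x01AABBCC`. Even when the three trailing bytes happen to be zero, the readings are `1` and `0x01000000`, so only one byte order yields the enum's value.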

@cary-ilm
Member

Revisiting this, have we settled on @meshula's "enum Envmap : int" solution? @peterhillman, would you like to rework the PR around that?

@peterhillman
Contributor Author

This now uses "enum : int" for attributes which may be extended with future states. The sanitizer warning suppressions for Envmap and DeepImageState attributes are no longer needed, so those have been removed too.

Contributor

@meshula meshula left a comment

lgtm :)

@peterhillman peterhillman merged commit 01d8d74 into AcademySoftwareFoundation:master Jan 25, 2021
@peterhillman peterhillman deleted the invalidenum_workaround branch January 25, 2021 19:38