Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

J9UTF8 FAM? #10244

Closed
Hello71 opened this issue Jul 24, 2020 · 25 comments · Fixed by #10404
Closed

J9UTF8 FAM? #10244

Hello71 opened this issue Jul 24, 2020 · 25 comments · Fixed by #10404
Assignees

Comments

@Hello71
Copy link
Contributor

Hello71 commented Jul 24, 2020

I'm a little confused about the J9UTF8 struct. It seems that the data member is actually a flexible array member. In that case, this is both confusing because it is both undefined behavior and raises the question of why 2 was selected instead of 0 or 1.

Code like https://github.com/eclipse/openj9/blob/master/runtime/cfdumper/romdump.c#L157 is also confusing.

@Hello71
Copy link
Contributor Author

Hello71 commented Jul 24, 2020

Specifically, beyond being undefined behavior, this code fails to compile with gcc 10:

In function 'charReplace',
    inlined from 'addModuleExportsOrOpens' at jvmtiModules.c:99:4:
jvmtiModules.c:34:25: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=]
   34 |    J9UTF8_DATA(utf8)[i] = newChar;
In file included from ../oti/j9generated.h:69,
                 from ../oti/j9.h:82,
                 from ../oti/jvmtiInternal.h:29,
                 from jvmtiHelpers.h:28,
                 from jvmtiModules.c:22:
jvmtiModules.c: In function 'addModuleExportsOrOpens':
../oti/j9nonbuilder.h:3202:6: note: at offset 2 to object 'data' with size 2 declared here
 3202 |  U_8 data[2];
      |      ^~~~

@gacholio
Copy link
Contributor

Given that we already have macros to access the size and data, the correct thing to do would be to declare J9UTF8 as an opaque structure and modify the macros to operate on the known offsets. We already have a setter for the size field.

It's convenient to retain the structure name (as opposed to just making them all void*) for documentation purposes in the code. It's legal to have a pointer to an opaque structure, and clearly we could never have any real instances of the structure used due to its variable length.

This is very similar to what I proposed in eclipse-omr/omr#5027

@gacholio
Copy link
Contributor

I agree the highlighted code is pretty obtuse.

@gacholio gacholio self-assigned this Jul 24, 2020
@Hello71
Copy link
Contributor Author

Hello71 commented Jul 24, 2020

Oh, looking at that code I guess it should be read as J9UTF8_LENGTH(utf8) - sizeof(J9UTF8_LENGTH(utf8)), except for padding considerations. This wouldn't be an issue if a proper flexible array member was used, but I guess that didn't exist in 1991. I still don't understand why 2 was selected instead of the more common 0 or 1, but I don't mind that much, I just want to fix it for gcc 10. I will replace data with a FAM and just use sizeof(J9UTF8) + J9UTF8_LENGTH(utf8). Thanks!

@gacholio
Copy link
Contributor

gacholio commented Jul 25, 2020

Yeah, this code predates any modern C compiler. 2 was selected so that the total structure size was 4 (size of a pointer back then). 0 wasn't allowed in old compilers. Also, back in the day, sizeof was unreliable - some compilers padded structures, others did not.

If you do as you propose, your code will not compile once the structure is opaque.

@gacholio
Copy link
Contributor

Rather than making the structure opaque, I'm trying it with just the data field removed (to support older compilers).

@dnakamura
Copy link
Contributor

I was looking at a similar issue in OMR, where we do the same thing for UTTraceRecord. This then causes problems because we then strcpy into the buffer which causes warnings because the compiler can detect the buffer overrun. The annoying part is all our platforms support either zero length arrays or flexible array members, but neither option is supported on all platforms. Flexible array members are the most promising option. There are 2 platforms with issues

  • zos. While the compiler manual says that everything we are doing is supported, the compiler still issues a warning/error (although IIRC I did find an option to disable the specific warning)
  • windows. The OMR appveyor build did not work, although it was intentionally using an older version of MSVC to catch compatibility issues for openj9. However I believe we have upgraded compilers since that build was set up. I have to do a bit digging to find out the oldest version of MSVC that we actually still use

@gacholio
Copy link
Contributor

I'm also having trouble with the verifier, which has somehow embedded the knowledge that UTF8 is 4 bytes. None of the three solutions will work until that is addressed.

@Hello71
Copy link
Contributor Author

Hello71 commented Jul 29, 2020

* windows. The OMR appveyor build did not work, although it was intentionally using an older version of MSVC to catch compatibility issues for openj9. However I believe we have upgraded compilers since that build was set up. I have to do a bit digging to find out the oldest version of MSVC that we actually still use

According to https://stackoverflow.com/a/39152330, it is supported in at least MSVC 2015 with a warning in C++ mode. However, regarding OMR, my understanding is that such constructs are not supposed to be used in C++ anyways, and instead the struct should contain a pointer which is allocated internally; see std::string, std::vector.

I'm also having trouble with the verifier, which has somehow embedded the knowledge that UTF8 is 4 bytes. None of the three solutions will work until that is addressed.

I also had this problem.

Error: Unable to initialize main class build.tools.jigsaw.AddPackagesAttribute
Caused by: java.lang.VerifyError: JVMVRFY041 invokespecial of wrong initializer; class=build/tools/jigsaw/AddPackagesAttribute, method=<init>()V, pc=1
Exception Details:
  Location:
    build/tools/jigsaw/AddPackagesAttribute.<init>()V @1: JBinvokespecial
  Reason:
    Type 'java/lang/Object' (current frame, stack[0]) is not assignable to 'build/tools/jigsaw/AddPackagesAttribute'
  Current Frame:
    bci: @1
    flags: { flagThisUninit }
    locals: { 'uninitializedThis' }
    stack: { 'java/lang/Object' }

@Hello71
Copy link
Contributor Author

Hello71 commented Jul 29, 2020

I was looking at a similar issue in OMR, where we do the same thing for UTTraceRecord.

Actually, there are more fake array members in OMR than that. Just searching for char.*\[1\], there are at least BlobString, UtTraceRecord, UtActiveSection, UtServiceSection, UtStartupSection, and UtTraceCfg. BlobString is the most egregious: char data[1]; /* flexible array member */.

@Hello71
Copy link
Contributor Author

Hello71 commented Jul 29, 2020

Also, I'd like to emphasize that the issue is not limited to compiler warnings. GCC has been documented to optimize aggressively based on invalid array sizes: https://lkml.org/lkml/2015/2/18/407. I don't know if this is still the case, but it's certainly a potential correctness issue, not just a warning issue.

@pshipton
Copy link
Member

We still use VS2010 to compile Windows. We're planning to move to VS2013 #10129 since this is the highest level supported by OpenJDK atm. https://github.com/ibmruntimes/openj9-openjdk-jdk8/blob/openj9/common/autoconf/toolchain_windows.m4#L61

@keithc-ca
Copy link
Contributor

The errors mentioned by #10244 (comment) should be fixed by #10299.

@gacholio
Copy link
Contributor

I've found the verifier issue - hopefully will have a fix out soon.

@gacholio
Copy link
Contributor

@Hello71 Can you give my branch a try and see if it it fixes your issues? It's here:

https://github.com/gacholio/openj9/tree/utf

@Hello71
Copy link
Contributor Author

Hello71 commented Aug 15, 2020

@gacholio seems to work on 15 (14 master doesn't compile atm?). for some reason without the patch I only get a warning, not an error. I'm not really convinced that removing the member is a better option than just making it a flexible array, but I guess it's fine. the commit messages are a little funny, but functionally it seems to work alright. thanks!

@gacholio
Copy link
Contributor

I've tried the FAM, but it doesn't work on the outdated windows compilers that we use.

@Hello71
Copy link
Contributor Author

Hello71 commented Aug 17, 2020

I've tried the FAM, but it doesn't work on the outdated windows compilers that we use.

That doesn't sound right, unless a different compiler is used for omr. As I point out in eclipse-omr/omr#5444, FAMs are already used in at least two places in omr. They're just hidden by #pragma warning(disable : 4200).

@gacholio
Copy link
Contributor

Yep, adding the pragma fixes the problem.

@Hello71
Copy link
Contributor Author

Hello71 commented Aug 17, 2020

I didn't find any FAMs elsewhere in openj9, but if eclipse-omr/omr#5444 is accepted, I think it would make sense to use /wd4200 here too.

@keithc-ca
Copy link
Contributor

I don't think we should use /wd4200 globally. I'd prefer to push/pop that warning only as necessary.

@Hello71
Copy link
Contributor Author

Hello71 commented Aug 17, 2020

I don't think we should use /wd4200 globally. I'd prefer to push/pop that warning only as necessary.

why is that? are there some contexts in which you want to keep pre-2015 MSVC compatibility?

@keithc-ca
Copy link
Contributor

It isn't about compatibility, it's about ignoring warnings cautiously and selectively.

@gacholio
Copy link
Contributor

gacholio commented Aug 18, 2020

While I agree completely with @keithc-ca in principle, I would be in favour of just disabling this warning. Windows is the only platform that doesn't support FAM by default, so there's no real value in causing a compile error on Windows when every other platform would succeed.

@Hello71
Copy link
Contributor Author

Hello71 commented Aug 18, 2020

I agree with @gacholio on this issue. I agree entirely that disabling warnings should be done cautiously, and have made a point to do so in exactly this case: I was and continue to be opposed to disabling the -Wstringop-overflow warning, since doing so potentially hides overly aggressive gcc optimization causing incorrect code generation. I am also opposed to globally disabling warnings that trigger on arguably-correct code, such as -Wimplicit-fallthrough, since those can potentially hide serious bugs.

In this case, I think this warning is there mostly for historic reasons (Microsoft doesn't like C99), and partially for strict C++ compliance reasons (flexible array members are not technically allowed in C++), but in practice there are no useful implementations which forbid it, and no plausible future implementation would silently generate incorrect code for a FAM.

Furthermore, I don't see any reason why FAMs would work in some contexts but not others in OpenJ9. Perhaps in some other projects, you might want to forbid them in header files (since other other projects may include them), but as far as I know, that's not the case here.

gacholio added a commit to gacholio/openj9 that referenced this issue Aug 18, 2020
Also fix all direct uses of J9UTF8 fields to use the access macros.

Fixes: eclipse-openj9#10244

Signed-off-by: Graham Chapman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants