Introduce new JSON parser utility #12531
I understand the desire to insulate ourselves from the specific JSON reader implementation by abstracting out an interface for us to use. However, I am a little uncomfortable inventing a new interface for a JSON reader. Relevant XKCD. Why not use json-parser's json.c/json.h symbols directly?
@lrbison For the most part, json-parser only provides an API to parse a string. This PR extends that API to accept a filename. Another reason is that I modeled the JSON object with a
It would probably be better to describe how to use the JSON parser than to point users to source code. Please also cover the new features added and how to convert existing tuning files into JSON files.
That's a good idea. I can add an example later.
Looks good, but there are still some questions pending.
- Could you avoid using the same name for the two .[ch] files? I know they are in different directories, but the identical names are confusing.
- What is the benefit of the OPAL_OBJECT support? Is it worth the extra memory and CPU (including the refcount) needed to handle the opal_object_t?
@bosilca Thank you very much for the review. I have renamed the files as suggested.
The primary motivation is memory management. In json-parser, only the root object should be freed after use - the child objects are merely pointers into the root. This can be tricky to use. With the OPAL object lifecycle hooks we can override the destructor to correctly free the root object only, while the application can safely
TBH I am not concerned with resource usage - the parser should only be used to read the file into memory, with the data then stored somewhere else, e.g. a schema with efficient query methods. Afterwards the parser should be torn down.
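The ownership model described above can be sketched as follows. This is a minimal, self-contained illustration and not the PR's code: `node_t`, `handle_t`, `handle_load`, `handle_get_child`, and `handle_release` are hypothetical names, and a plain integer counter stands in for OPAL's reference counting. Child handles borrow pointers into a tree that only the root owns; releasing a handle drops a reference, and the tree is freed exactly once when the last handle goes away.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical parsed tree: every node belongs to the root's allocation. */
typedef struct node {
    int value;
    struct node *child;
} node_t;

/* Hypothetical reference-counted handle onto one node of the tree. */
typedef struct handle {
    node_t *node;          /* borrowed pointer into the tree */
    struct handle *root;   /* NULL when this handle is the root itself */
    int refcount;
} handle_t;

/* Counts how many times a whole tree has been freed (demonstration only). */
static int trees_freed = 0;

static void free_nodes(node_t *n)
{
    while (n != NULL) {
        node_t *next = n->child;
        free(n);
        n = next;
    }
}

static handle_t *handle_new(node_t *node, handle_t *root)
{
    handle_t *h = malloc(sizeof(*h));
    if (h == NULL) {
        return NULL;
    }
    h->node = node;
    h->root = root;
    h->refcount = 1;
    if (root != NULL) {
        root->refcount++;  /* a child handle keeps the whole tree alive */
    }
    return h;
}

/* Build a chain of `depth` nodes holding 0, 1, ... and return the root handle. */
handle_t *handle_load(int depth)
{
    node_t *head = NULL;
    for (int i = depth - 1; i >= 0; i--) {
        node_t *n = malloc(sizeof(*n));
        if (n == NULL) {
            free_nodes(head);
            return NULL;
        }
        n->value = i;
        n->child = head;
        head = n;
    }
    handle_t *h = handle_new(head, NULL);
    if (h == NULL) {
        free_nodes(head);
    }
    return h;
}

/* Return a new handle onto the parent's child node, or NULL if there is none. */
handle_t *handle_get_child(handle_t *parent)
{
    if (parent == NULL || parent->node == NULL || parent->node->child == NULL) {
        return NULL;
    }
    handle_t *root = (parent->root != NULL) ? parent->root : parent;
    return handle_new(parent->node->child, root);
}

/* Drop one reference; only the root handle's destructor frees the tree. */
void handle_release(handle_t *h)
{
    if (h == NULL) {
        return;
    }
    assert(h->refcount > 0);
    if (--h->refcount > 0) {
        return;
    }
    if (h->root != NULL) {
        handle_t *root = h->root;
        free(h);
        handle_release(root);
    } else {
        free_nodes(h->node);
        trees_freed++;
        free(h);
    }
}
```

In this sketch the caller may release the root handle before a child handle; the child stays valid because it still holds a reference on the root. The cost is the one raised in the review: every handle obtained must be released.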
@juntangc I added an example program in
If you don't add all the OPAL objectification, you only have to free a single object, the initial JSON object, via
That is also my goal. But my opinion differs in that:
To state this differently: instead of trusting the user to release a single object once (i.e., the main JSON object obtained from the load function), you trust them to release each object resulting from
I think we can solve our disagreement if instead of
@bosilca Thinking more about OPAL_OBJECT I have a question about inheritance: In this PR I want to hide
Currently I take advantage of the subclass
This makes the intention clear that the user should not touch the internals beyond I wonder what the alternative is if I don't use OPAL_OBJECT. It is obvious to me that we can still do something similar without subclassing, e.g.
Would this be better in your opinion?
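One common way to hide internals without subclassing is the opaque-handle pattern, sketched below. This is an illustration under stated assumptions rather than the code proposed in the PR: every name (`demo_json_t`, `demo_json_from_int`, `demo_json_read_int`, `demo_json_free`) is hypothetical, and a single `int` stands in for the third-party parser's value type. The public part declares only an incomplete type, so callers cannot reach into the struct.

```c
#include <stdlib.h>

/* ---- What a public header would expose (hypothetical names) ---- */
typedef struct demo_json demo_json_t;   /* incomplete type: layout is hidden */
demo_json_t *demo_json_from_int(int value);
int demo_json_read_int(const demo_json_t *json, int *out);
void demo_json_free(demo_json_t *json);

/* ---- What the .c file would keep private ---- */
struct demo_json {
    int value;   /* stand-in for the wrapped third-party value */
};

demo_json_t *demo_json_from_int(int value)
{
    demo_json_t *json = malloc(sizeof(*json));
    if (json != NULL) {
        json->value = value;
    }
    return json;
}

/* Returns 0 on success and -1 on invalid arguments. */
int demo_json_read_int(const demo_json_t *json, int *out)
{
    if (json == NULL || out == NULL) {
        return -1;
    }
    *out = json->value;
    return 0;
}

void demo_json_free(demo_json_t *json)
{
    free(json);
}
```

With the struct definition kept in the implementation file, user code can hold and pass `demo_json_t *` but can only touch the contents through the getters.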
looks much better to me.
I left a comment about the tempfile, but otherwise approve. I prefer this opal util version, thank you!
The code generally looks good.
I notice that we're introducing some public json_* symbols into libopen-pal.so:
(venv) root@a21a97d3936c:~# nm /tmp/bogus/lib/libopen-pal.so| grep ' json_'
000000000004ec40 T json_parse
000000000004d810 T json_parse_ex
000000000004ecb0 T json_value_free
000000000004d754 T json_value_free_ex
00000000000ca9c8 R json_value_none
I know that these are the third-party symbols. We should probably prefix these so that there's no conflict with some other application linking in their own copy of a JSON library.
@jsquyres are you suggesting some link-time magic or simply renaming those symbols in the source?
Whatever is least disruptive. I just think we shouldn't have public symbols named
@jsquyres Very good point. I marked those symbols
This patch introduces a utility in OPAL based on the 3rd-party project https://github.com/json-parser/json-parser.git The utility provides APIs to read JSON into memory along with getters to retrieve C values. Signed-off-by: Wenduo Wang <[email protected]>
@lrbison and I had a discussion offline. He raised questions about the potential side effect of
I wrote a simple app to confirm that, the internal
Just to help me clarify my understanding since we did something similar in PMIx. If symbols are in
This is what we did and it works fine, although we don't "publish" the equivalent
Only issue was - what happens when the user configures with visibility disabled? In that case (IIRC), all symbols become visible and you can/do get
@rhc54 Thanks for the reminder. I experimented with
On Mac with clang 15 I observe that both
Did I miss anything?
I'm afraid that isn't how it works - I suspect the OMPI configure flag is no longer correct. See this article (https://www.akkadia.org/drepper/dsohowto.pdf) starting at section 2.2.2 (on page 18) for an explanation. Put simply - visibility is alive and well and definitely has an effect on symbols.
Just to be clear: I'm not saying you have a problem. Only suggesting you consider the case where the user requests visibility to be disabled. For example, the paper will explain why that is necessary when debugging, so it isn't an unreasonable use-case. I'm not sure how your proposed solution to the symbol pollution issue will impact it.
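For readers unfamiliar with the mechanism under discussion, here is a minimal sketch of per-symbol visibility using the GCC/Clang `visibility` attribute. The macro and function names (`DEMO_HIDDEN`, `DEMO_EXPORT`, `demo_parse_internal`, `demo_public_parse`) are hypothetical and are not taken from Open MPI or json-parser. When such code is built into a shared library, the hidden function should appear as a local symbol (`t`) rather than a global one (`T`) in `nm` output, while remaining callable from inside the library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper macros; GCC and Clang both define __GNUC__. */
#if defined(__GNUC__)
#define DEMO_HIDDEN __attribute__((visibility("hidden")))
#define DEMO_EXPORT __attribute__((visibility("default")))
#else
#define DEMO_HIDDEN
#define DEMO_EXPORT
#endif

/* Internal helper: not meant to be part of the library's exported interface. */
DEMO_HIDDEN int demo_parse_internal(const char *text)
{
    if (text == NULL) {
        return -1;
    }
    return (int) strlen(text);   /* stand-in for real parsing work */
}

/* Public entry point: prefixed and explicitly exported. */
DEMO_EXPORT int demo_public_parse(const char *text)
{
    return demo_parse_internal(text);
}
```

Explicit attributes like these apply per symbol, so they do not depend on the library's default visibility setting; how that interacts with a build configured to disable visibility is the open question in the comments here.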
I just checked the code, and here's a fun fact:
Meaning: Open MPI uses the same compiler flags regardless of whether you use
That being said, it looks like at least some environments enable hidden visibility by default these days. I did quick test builds in Fedora 38 and MacOS 14/Sonoma, and I see that OMPI's build is not passing
Doing a little spelunking, I think we offered
If symbol visibility is (generally?) the default these days, I don't know if we need to do anything further. Particularly since OMPI has a non-functional
Related question (but does not need to be part of this PR): should we remove the
As I said above, the paper explains that you need to make symbols visible for the debugger to work. So it isn't a case of the user getting some deserved pain - they might get the pain simply because they are attempting to debug their code and want to trace MPI calls (and hence require that the OMPI library also have visible symbols). (In case anyone is wondering, the paper is written by the person who heads up DSO design/spec for gcc - so he is considered the expert on visibility.) I don't claim to fully grok all the implications here - e.g., if I'm running a debugger on my code and want to trace down into MPI, does the OMPI library also have to expose their symbols? I believe the answer is "yes", but haven't dug enough to prove it. Just pointing out that there are some subtle things going on here that might merit further thought.
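This discussion is not grounded in current OMPI code. The
We can assess the visibility from
    #include <stdlib.h>

    /* Link-only probe: the declarations are deliberately unprototyped. */
    extern int json_parse();      /* third-party symbol */
    extern int opal_json_free();  /* OPAL wrapper symbol */

    int main(int argc, char* argv[])
    {
        /* Each call only matters at link time: it resolves if the symbol is exported. */
        json_parse(NULL);
        opal_json_free(NULL);
        return 0;
    }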
I think the conclusion is that this code is good to go as is, despite all the ongoing discussions.
So we are all looking into this. So much wasted time! I recall we were running a checker (at the end of debug builds or something) to make sure that all exported symbols are correctly prefixed?
The reason we forcefully make some symbols externally visible has nothing to do with debugging. Without these symbols being visible, our MCA components would not be able to access them when loaded at runtime.
FWIW, even with the current visibility settings (with
So I think the current visibility stuff is giving us what we want. The question as to whether to remove the compiler-specific visibility stuff is still relevant, but doesn't have to be part of this PR.
Did we use to have this? It would be very helpful.
@jsquyres helped me find it. It's in
@bosilca and I were talking about this in Slack today. This is also a separate issue than this PR, but we both agree that it would be a great CI test. Could be something as simple as:
That was on OSX -- could be easily adapted for Linux. It could also easily be done in Python to be a bit more fine-grained for checking, etc. The https://github.com/open-mpi/ompi/blob/main/config/find_common_syms script (which is called by the top-level
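The kind of prefix check being discussed could look roughly like the sketch below. It is not the command from the comment above and not an existing Open MPI tool: `line_is_violation` and the `allowed_prefixes` list are hypothetical, the prefix list is illustrative rather than the project's official one, and the parser assumes Linux-style `nm` lines of the form `address type name` (or `type name` for undefined symbols). A line counts as a violation when it describes a defined global symbol (uppercase type letter other than `U`) whose name has no allowed prefix.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Illustrative prefix list (an assumption, not the project's official policy). */
static const char *allowed_prefixes[] = { "opal_", "ompi_", "mca_", "MPI_", "PMPI_" };

static int has_allowed_prefix(const char *name)
{
    size_t count = sizeof(allowed_prefixes) / sizeof(allowed_prefixes[0]);
    for (size_t i = 0; i < count; i++) {
        if (strncmp(name, allowed_prefixes[i], strlen(allowed_prefixes[i])) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Returns 1 if one line of `nm` output names a global, defined, unprefixed symbol. */
int line_is_violation(const char *line)
{
    char a[64], b[64], c[256];
    const char *name;
    char type;

    int fields = sscanf(line, "%63s %63s %255s", a, b, c);
    if (fields == 3) {          /* "address type name" */
        type = b[0];
        name = c;
    } else if (fields == 2) {   /* "type name", e.g. undefined symbols */
        type = a[0];
        name = b;
    } else {
        return 0;               /* blank or unrecognized line */
    }

    /* Lowercase letters are local symbols; 'U' marks an undefined reference. */
    if (!isupper((unsigned char) type) || type == 'U') {
        return 0;
    }
    return !has_allowed_prefix(name);
}
```

Applied to the `nm` listing quoted earlier in this thread, a check like this would flag `json_parse` and `json_value_none` while accepting an `opal_`-prefixed symbol. A CI job could feed each line of `nm` output through it and fail when any line is flagged.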
My comments have been addressed. LGTM.
@rhc54 I'm merging the change as-is. Will also explore the symbol validation suggested by George.
Sounds reasonable to me! FWIW: @jsquyres and I wrote a script a few years back (i.e., 10) to automatically prefix symbols for just these purposes. See https://github.com/open-mpi/ompi/blob/main/contrib/symbol-hiding.pl
This patch introduces a utility in OPAL based on the 3rd-party project https://github.com/json-parser/json-parser.git
The utility provides APIs to read JSON into memory along with getters to retrieve C values.
Please see the unit test opal_json.c for example usage.