-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Meson build system for nanoarrow-ipc extension #483
Conversation
Meson's handling of subprojects is way easier I think than CMake's. If down the road we wanted to replace CMake with Meson there is a ton that we could simplify within these extensions, and maybe could even consider them as separate projects |
Nice! I've been wondering if the IPC and Device "extensions" should move into the main tree (e.g., #481 )...they started as separate projects but a few things like the "testing" module make it a little awkward to keep them separate. Any thoughts on that? |
# under the License. | ||
|
||
[wrap-git] | ||
url = ../../.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the main project right now we ask end users to run meson wrap install <lib>
, but that doesn't make sense to do with this nanoarrow.wrap, so I ended up putting all the wrap files here
Maybe even in the main project we should just add the wrap files? That would technically pin a version for third party dependencies (if not installed elsewhere on the system) which may or may not be desired
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this complexity would be solved if the IPC extension just lived with the main sources!
# under the License. | ||
|
||
[wrap-git] | ||
method = cmake |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flatcc is working to replace (?) CMake with Meson. That would simplify the wrap / subproject usage for flatcc if and when it goes live
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the intention is to support both, although we vendor flatcc anyway for now. I should be more active contributing to that upstream since the IPC reader 100% relies on it!
I haven't used these in particular so I'm just kind of guessing, but it feels like it would make sense to combine them. I think for consumers having to potentially install three different libraries for nanoarrow instead of just one library that you can choose the dependencies out of is overkill. And as you've mentioned already, there appears to be a ton of overlap between these projects' dependencies |
extensions/nanoarrow_ipc/meson.build
Outdated
instead of -DNANOARROW_IPC_CODE_COVERAGE=true''') | ||
endif | ||
|
||
if get_option('NANOARROW_IPC_BUNDLE') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@paleolimbot without being super knowledgeable on the end goal of these bundle options, are these used by consumers downstream or just for internal use? I see the Python build uses the bundle to auto generate a Cython header, but not sure of other uses cases and if its worth trying to replicate in Meson
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's any reason to expose this in Meson...this is currently how the files in dist/ are generated (e.g., the single-file bundles for each component). The fact that they are written in CMake is something that initially made it easier to test that the single-file distribution actually worked...I think it might make more sense to rewrite the amalgamation scripts in Python at some point though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't used these in particular so I'm just kind of guessing, but it feels like it would make sense to combine them.
Great! I will aim for that in a near-term PR. I am happy to work through this and merge it before that refactor or afterward (your call!)
# under the License. | ||
|
||
[wrap-git] | ||
method = cmake |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the intention is to support both, although we vendor flatcc anyway for now. I should be more active contributing to that upstream since the IPC reader 100% relies on it!
# under the License. | ||
|
||
[wrap-git] | ||
url = ../../.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this complexity would be solved if the IPC extension just lived with the main sources!
Let's just wait to merge this until after then - there is no rush for this and I think waiting will make that consolidation easier |
@@ -34,3 +34,14 @@ subdir('src/nanoarrow') | |||
if get_option('benchmarks') | |||
subdir('dev/benchmarks') | |||
endif | |||
|
|||
|
|||
if get_option('apps') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should really move the meson.build
from the src/nanoarrow directory up to here to mirror what is done for CMake, but am leaving that for a separate PR. For now just put this apps piece up here since it does not live in src/nanoarrow
@@ -470,7 +470,7 @@ TEST(NanoarrowTestingTest, NanoarrowTestingTestFieldMetadata) { | |||
NANOARROW_RETURN_NOT_OK(ArrowSchemaSetMetadata(schema, "\0\0\0\0")); | |||
return NANOARROW_OK; | |||
}, | |||
[](ArrowArray* array) { return NANOARROW_OK; }, &WriteFieldJSON, | |||
[](ArrowArray*) { return NANOARROW_OK; }, &WriteFieldJSON, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not strictly related but it showed as a warning we could control with the Meson warning configuration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me!
Interesting to see the valgrind error in CI right now. I assume this is also why the windows build is failing, although surprised our CMake setup didn't catch it |
Seems like the error has to do with the fields being sent to ArrowErrorSet in this function: arrow-nanoarrow/src/nanoarrow/array.c Line 1247 in f443e46
Not familiar with the run end encoded implementation but I think we are missing something with how fields are being initialized and accessed through that code block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments but generally looking great!
@@ -23,8 +23,6 @@ | |||
|
|||
void dump_schema_to_stdout(struct ArrowSchema* schema, int level, char* buf, | |||
int buf_size) { | |||
int n_chars = ArrowSchemaToString(schema, buf, buf_size, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is still needed (or else the top-level type will never appear in the dump). Maybe just remove int n_chars =
(if the unused variable was a problem?)
auto result = memcmp(arrays[i].children[0]->buffers[1], one_two_three_le, | ||
sizeof(one_two_three_le)); | ||
// discard result to silence -Wunused-value | ||
(void)result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would NANOARROW_UNUSED()
work here?
@@ -470,7 +470,7 @@ TEST(NanoarrowTestingTest, NanoarrowTestingTestFieldMetadata) { | |||
NANOARROW_RETURN_NOT_OK(ArrowSchemaSetMetadata(schema, "\0\0\0\0")); | |||
return NANOARROW_OK; | |||
}, | |||
[](ArrowArray* array) { return NANOARROW_OK; }, &WriteFieldJSON, | |||
[](ArrowArray*) { return NANOARROW_OK; }, &WriteFieldJSON, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me!
dev/release/rat_exclude_files.txt
Outdated
@@ -18,3 +18,4 @@ python/src/nanoarrow/dlpack_abi.h | |||
subprojects/google-benchmark.wrap | |||
subprojects/gtest.wrap | |||
subprojects/nlohmann_json.wrap | |||
subprojects/zlib.wrap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason these are being ignored rather than just adding the Apache license header to the files? (It seems like those file types support comments and some of the wrap files already have the license header added)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ones I have excluded were auto generated by meson wrap install <library>
whereas the one that I created from scratch (flatcc.wrap) included the Arrow licensing agreement.
Not sure if that distinction matters. Happy to add license agreement to even the auto generated ones
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be best to add the license headers rather than put it in the excluded files (although I think it is fine either way since it was generated for us).
cmake = import('cmake') | ||
cmake_opts = cmake.subproject_options() | ||
cmake_opts.add_cmake_defines({'CMAKE_POSITION_INDEPENDENT_CODE': true}) | ||
flatcc_subproj = cmake.subproject('flatcc', options: cmake_opts) | ||
flatcc_dep = flatcc_subproj.dependency('flatccrt') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would mean that CMake is required to build the flatcc runtime? I get that it is more "meson"ic to use the wrap file; however, it might also be a little strange that building nanoarrow_ipc via CMake will give you a slightly different result than using Meson. I think I'd prefer that the default build uses the vendored version (via a pure Meson compile of the four files in thirdparty/flatcc/src/runtime), and perhaps our dependency relationship with flatcc could be improved in both CMake/Meson (perhaps with some contributions to the upstream setup).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK cool will take a look. I believe flatcc is trying to switch to Meson as well, so yea can look and see if that makes for a good upstream contribution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is solved in https://github.com/apache/arrow-nanoarrow/pull/483/files/82a6b48e4218f24b320cfb9e05f972c59d2cdd37..b878e23bb721f07a058e9f63a4db3aa6588f0faa
Basically Meson allows you to provide your own "patch" file for systems that don't use Meson. The file(s) go in subprojects/packagefiles/<patch_dir>
and the wrap system still takes care of fetching / downloading the library for you, rather than requiring you to vendor into your source tree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh set up the subproject here correctly but forgot to come back and update the code block here. Will push up a follow up PR to correct
name + '-test', | ||
name.replace('-', '_') + '_test.cc', | ||
link_with: nanoarrow_ipc_lib, | ||
dependencies: config['deps'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to factor more common dependencies like gtest_dep
into link_with
?
(Also: I don't think that nanoarrow_dep
or flatcc_dep
is needed because it should be handled as a transitive dependency of nanoarrow_ipc_lib
? I might have put that in the CMake due to some weirdness about it being a library built on top of nanoarrow rather than part of nanoarrow itself)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand the first question - what problem are you trying to solve by making that a direct link argument? The way the dependency is declared (see wrapdb source) I believe Meson will take care of that linkage for you, alongside having the proper include and threads dependency
As far as the second question goes the way I had this previously the dependencies were not transitive, but I've since updated the nanoarrow_ipc_dep to include those:
nanoarrow_ipc_dep = declare_dependency(..., dependencies: [nanoarrow_dep, flatcc_dep])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand the first question
It seemed like the loop was handling linking each test with nanoarrow_ipc_lib
and I was wondering if the loop could also handle linking the dependencies that were common to every test executable. This is probably just me misunderstanding meson and feel free to ignore if it doesn't make sense 🙂
As far as the second question goes the way I had this previously the dependencies were not transitive, but I've since updated the nanoarrow_ipc_dep to include those
Great! I mostly just want to make sure my vauge mental model of what's going on here is correct 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think when coming from CMake it is helpful to understand that the word "dependency" means something else in Meson. In CMake, dependency refers to "something that needs to be built before this target can be". In Meson, a dependency helps you transitively use include directories / link targets. You declare this via the declare_dependency
function:
base_dep = declare_dependency(
include_directories: ['foo_dir'],
link_with: [threads_lib],
)
So when you end up using that dependency on another target, it will transitively carry over the include / link flags for you, i.e.
child_lib = library('child', sources: ['child.cc'], dependencies: [base_dep])
expands into
child_lib = library('child',
sources: ['child.cc'],
include_directories: ['foo_dir'],
link_with: [threads_lib],
)
(at least as far as I understand things)
ci/scripts/build-with-meson.sh
Outdated
@@ -68,12 +68,12 @@ function main() { | |||
show_header "Run test suite" | |||
meson configure -Dtests=true -Db_coverage=true | |||
meson compile | |||
meson test --wrap valgrind | |||
meson test --wrap='valgrind --track-origins=yes' --print-errorlogs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I often use --tool=memcheck --leak-check=full
(but this seems to give great output so take or leave!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure can definitely add the leak-check=full - its just a tradeoff between performance and visibility. This runs pretty quick in CI as is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it! Let's leave it (the memcheck CI job + verification both already do the memcheck/full version).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
The Meson build failure is a separate issue related to attempting to print very large numbers (unrelated to this PR).
No description provided.