Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add SimpleArray caster #266

Merged
merged 3 commits into from
Mar 5, 2024

Conversation

tigercosmos
Copy link
Collaborator

@tigercosmos tigercosmos commented Dec 26, 2023

  1. Introduced new SimpleArrayPlex and its wrapper WrapSimpleArrayPlex. The interface of SimpleArray and WrapSimpleArrayPlex should be the same as SimpleArray and WrapSimpleArray, but this should be a to-do for the next PR.
  2. Implemented the customized caster for SimpleArray to allow SimpleArrayPlex implicit casting.
  3. add a testhelper module for testing

the first step for #27

@@ -39,6 +39,7 @@ add_subdirectory(python)
add_subdirectory(spacetime)
add_subdirectory(view)
add_subdirectory(inout)
add_subdirectory(testhelper)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some tests require python binding functions or classes. We can put those test helpers in this testhelper module in the future.

@@ -82,13 +84,17 @@ inline size_t buffer_offset(small_vector<size_t> const & stride, small_vector<si
namespace detail
{

using shape_type = small_vector<size_t>;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the shape_type is not related to T, I think it's better to put it outside.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

template <typename T>
struct SimpleArrayInternalTypes
{
using shape_type = detail::shape_type;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

backward compatibility

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not reorder them. The diff looks ugly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the order should be back.

@@ -654,6 +659,86 @@ using is_simple_array = std::is_same<
template <typename S>
inline constexpr bool is_simple_array_v = is_simple_array<S>::value;

using SimpleArrayBool = SimpleArray<bool>;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for later macro usage in SimpleArrayCaster.hpp

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not beautiful, but I can live with it.

return m_instance_ptr;
}

/// TODO: add all SimpleArray public methods
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to let the PR be simple, no other methods for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


#define ARRAYPLEX_TYPE_CASTER(DATATYPE) \
template <> /* NOLINTNEXTLINE(bugprone-macro-parentheses) */ \
struct type_caster<modmesh::SimpleArray##DATATYPE> : public type_caster_base<modmesh::SimpleArray##DATATYPE> \
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the most important part for casting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should use a comment to highlight the importance.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comment at the beginning of this header file.

{ return wrapped_type(make_shape(shape), datatype); }),
pybind11::arg("shape"),
pybind11::arg("dtype"))
/// TODO: should have the same interface as WrapSimpleArray
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the python module should have more methods, but keep it simple for now.

@@ -49,6 +50,10 @@ void initialize(pybind11::module_ mod)
initialize_spacetime(spacetime_mod);
pybind11::module_ onedim_mod = mod.def_submodule("onedim", "onedim");
initialize_onedim(onedim_mod);

pybind11::module_ testhelper_mod = mod.def_submodule("testhelper", "testhelper");
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the helper module for testing python binding.

.clang-tidy Outdated
@@ -16,6 +16,7 @@ Checks: >
-cppcoreguidelines-non-private-member-variables-in-classes,
-cppcoreguidelines-macro-usage,
-fuchsia-overloaded-operator,-fuchsia-default-arguments*,
-cppcoreguidelines-pro-type-reinterpret-cast,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are too many reinterpret_cast operations due to our low-level implementation manner, so disable the warning directly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it is not the devastating, I would like to ask you to try suppressing this in only the file having the new reinterpret_cast. I am a little bit worried by the relaxation on the free casting.

@tigercosmos
Copy link
Collaborator Author

@yungyuc the PR should be ready for review. There are CI errors for the debug build on macos/windows; however the release works for both platforms. I am still figuring out the root cause.

@yungyuc yungyuc added the enhancement New feature or request label Jan 3, 2024
Copy link
Member

@yungyuc yungyuc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great patch. Please leave a global comment when you are ready for the next review. Please make sure all CI runs are clean for review.

  • Try to relax reinterpret_cast on the file-basis.
  • Keep the buffer_type alias in SimpleArrayInternalTypes mixin.
  • Review and add back all end marking.
  • Rename argument name data_type to type_string or something including string.
  • Python.h always need to be included in the first place.
  • Use your name for the new files of your major contribution.
  • Make a plan to remove testing code from production build.
  • Also assert the type name string.

.clang-tidy Outdated
@@ -16,6 +16,7 @@ Checks: >
-cppcoreguidelines-non-private-member-variables-in-classes,
-cppcoreguidelines-macro-usage,
-fuchsia-overloaded-operator,-fuchsia-default-arguments*,
-cppcoreguidelines-pro-type-reinterpret-cast,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it is not the devastating, I would like to ask you to try suppressing this in only the file having the new reinterpret_cast. I am a little bit worried by the relaxation on the free casting.

@@ -82,13 +84,17 @@ inline size_t buffer_offset(small_vector<size_t> const & stride, small_vector<si
namespace detail
{

using shape_type = small_vector<size_t>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@@ -82,13 +81,17 @@ inline size_t buffer_offset(small_vector<size_t> const & stride, small_vector<si
namespace detail
{

using shape_type = small_vector<size_t>;
using sshape_type = small_vector<ssize_t>;
using buffer_type = ConcreteBuffer;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the buffer_type alias in SimpleArrayInternalTypes mixin. It's not impossible to use other buffer types in the (distant) future.

template <typename T>
struct SimpleArrayInternalTypes
{
using shape_type = detail::shape_type;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not reorder them. The diff looks ugly.

@@ -654,6 +659,86 @@ using is_simple_array = std::is_same<
template <typename S>
inline constexpr bool is_simple_array_v = is_simple_array<S>::value;

using SimpleArrayBool = SimpleArray<bool>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not beautiful, but I can live with it.

.def("test_cast_to_arrayint32", []() -> modmesh::SimpleArrayInt32
{
SimpleArrayInt32 arr(100);
return arr; })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt this passes clang-format. Let me know I am correct or not.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like it as well, this is formatted by clang-format...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I can add // clang-format off for this line to make it pretty.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. If clang-format does it, then let it be so. It's just a small part and perhaps clang-format will correct it in the future.

@@ -73,6 +74,7 @@
'Bezier3dFp64',
'WorldFp32',
'WorldFp64',
'testhelper',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not like to have testing code shipped with the production code. Please come up with a plan to make it disappear from production build before we can merge it. Once we conclude the plan it should be put in an issue.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a USE_PYTEST_HELPER_BINDING flag for this. will have explanation comments on the code.

def test_SimpleArray_casting(self):
# first check the caster works if the argument is exact the same type
array_float64 = modmesh.SimpleArrayFloat64((2, 3, 4))
self.assertEqual(modmesh.testhelper.TestSimpleArrayHelper.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-lines like this look ugly. Please beautify.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yungyuc current 80 length limit of flake8 makes it difficult if any variable has a long name. 80 sounds too short for a mordern monitor, maybe consider to 120?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It brings more pain to violate PEP-8. We should stick with it.

arrayplex_uint64 = modmesh.SimpleArray((2, 3, 4), dtype="uint64")
arrayplex_float64 = modmesh.SimpleArray((2, 3, 4), dtype="float64")

# check the type is the same with different data types
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also assert the type name string. It will make the test code more descriptive.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking the type strings

.TestSimpleArrayHelper \
.test_cast_to_arrayint32() # SimpleArray32 from casting
self.assertTrue(type(array_int32) is type(array_int32_2))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following does not seem to work (I am ok with it now):

# Obtain a plex from typed array
array_int32 = modmesh.SimpleArrayInt32((2, 3, 4))
arrayplex_int32 = array_int32.plex

# Obtain the typed array from a plex
arrayplex_int32 = modmesh.SimpleArray((2, 3, 4), dtype='int32')
array_int32 = arrayplex_int32.typed

How can we make the above work?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we can do it. already added in the PR. will leave explanation comments later.

@tigercosmos
Copy link
Collaborator Author

  • Try to relax reinterpret_cast on the file-basis.
  • Keep the buffer_type alias in SimpleArrayInternalTypes mixin.
  • Review and add back all end marking.
  • Rename argument name data_type to type_string or something including string.
  • Python.h always need to be included in the first place.
  • Use your name for the new files of your major contribution.
  • Make a plan to remove testing code from production build.
  • Also assert the type name string.

@@ -159,7 +159,7 @@ jobs:
VERBOSE=1 USE_CLANG_TIDY=OFF \
BUILD_QT=OFF \
CMAKE_BUILD_TYPE=${{ matrix.cmake_build_type }} \
CMAKE_ARGS="-DPYTHON_EXECUTABLE=$(which python3)"
CMAKE_ARGS="-DPYTHON_EXECUTABLE=$(which python3) -DUSE_PYTEST_HELPER_BINDING=ON"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DUSE_PYTEST_HELPER_BINDING flag is used to build the pytest helper.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most of the time we need this flag, since almost all builds are used for testing.

@@ -177,6 +177,17 @@ jobs:
CMAKE_ARGS="-DPYTHON_EXECUTABLE=$(which python3)"
make buildext VERBOSE=1

# build with pytest helper binding for testing
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we first build the version without the helper, then build again with the helper, since later on we will run pytest.

CMakeLists.txt Outdated
@@ -82,8 +89,10 @@ message(STATUS "DEBUG_SYMBOL: ${DEBUG_SYMBOL}")
if(DEBUG_SYMBOL)
if(MSVC)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /DEBUG")
add_compile_options(/O2 /Ob2)
Copy link
Collaborator Author

@tigercosmos tigercosmos Jan 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately, pybind11 has bug for -O0 on mac/windows, customized caster will not work.
explicitly set -O2 for debug build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the pybind11 bug with O0 and debug build?

Setting O2 may be a workaround but it may not be merged without describing the root cause clearly. A clear description must be given when the issue is found, or it will be overly difficult for reproduction later.

#include <modmesh/buffer/buffer.hpp>

/*
* The purpose of including this header is to facilitate implicit casting of
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the comment to point out the purpose of the header.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the doxygen format /** * */

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

@@ -1,7 +1,7 @@
#pragma once

/*
* Copyright (c) 2019, Yung-Yu Chen <[email protected]>
* Copyright (c) 2024, An-Chi Liu <[email protected]>
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the author of the file is also me, so just changed the name at this time as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@@ -27,8 +27,9 @@
*/

#include <modmesh/buffer/pymod/buffer_pymod.hpp> // Must be the first include.
#include <modmesh/buffer/pymod/TypeBroadcast.hpp>

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a space here can prevent clang-format to reorder the includes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know

@@ -156,6 +157,7 @@ class MODMESH_PYTHON_WRAPPER_VISIBILITY WrapSimpleArray
.def_property_readonly("has_ghost", &wrapped_type::has_ghost)
.def_property("nghost", &wrapped_type::nghost, &wrapped_type::set_nghost)
.def_property_readonly("nbody", &wrapped_type::nbody)
.def_property_readonly("plex", &get_arrayplex)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is for

array_plex = array_typed.plex

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a lambda for plex.

(*this)
.def("test_cast_to_arrayint32", []() -> modmesh::SimpleArrayInt32
{
// clang-format off
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-format is stupid here, added this to make the code pretty

@@ -73,6 +74,9 @@
'Bezier3dFp64',
'WorldFp32',
'WorldFp64',

# TODO: have a way to enable this according to `USE_PYTEST_HELPER_BINDING`
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't find out a good way to enable testhelper in python when USE_PYTEST_HELPER_BINDING is set on C++.
there are some complicated ways, such as doing file editing in CMake, but I think it isn't worth it.

{ return wrapped_type(make_shape(shape), datatype); }),
pybind11::arg("shape"),
pybind11::arg("dtype"))
.def_property_readonly("typed", &get_typed_array)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is for array_typed = array_plex.typed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great

@tigercosmos
Copy link
Collaborator Author

@yungyuc the PR is ready for review. all aforementioned issues are addressed. There are also comments for more details.

@yungyuc
Copy link
Member

yungyuc commented Jan 27, 2024

@yungyuc the PR is ready for review. all aforementioned issues are addressed. There are also comments for more details.

Thanks, @tigercosmos . You added a lot of changesets and it is a little bit confusing to me. If you have time, could you please reorganize them into fewer commits and rebase?

@tigercosmos
Copy link
Collaborator Author

@yungyuc the old commits are like spaghetti, not easy to rebase them into a few commits, so I just rebase them into a single commit. hope this will help you.

Copy link
Member

@yungyuc yungyuc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated patch is very good! We need to pin-point the pybind11 issue discovered.

  • IMPORTANT Clarify why the custom pybind11 caster does not work with O0 and debug build with mac/windows
  • Use doxygen format for file header comments.
  • Use initialization list for SimpleArrayPlex::m_has_instance_ownership.
  • Please use a lambda to wrap plex.
  • Make sure there is a build action without USE_PYTEST_HELPER_BINDING=ON

#include <modmesh/buffer/buffer.hpp>

/*
* The purpose of including this header is to facilitate implicit casting of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the doxygen format /** * */

return; // other does not have instance
}

m_has_instance_ownership = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use initialization list

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yungyuc in this case is m_has_instance_ownership already in default member initializer as false, here 236 line changes to true if the previous if statement is true. Or, do you mean I should use initialization list to declare m_has_instance_ownership as false and not to use default member initializer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! I overlooked it. In this case we should not move this to initialization list. Please keep it here. Thanks for pointing it out.

@@ -40,7 +40,6 @@ typedef SSIZE_T ssize_t;

namespace modmesh
{

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If change of blank lines is not by clang-format, please revert the unnecessary change.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

@@ -1,7 +1,7 @@
#pragma once

/*
* Copyright (c) 2019, Yung-Yu Chen <[email protected]>
* Copyright (c) 2024, An-Chi Liu <[email protected]>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@@ -27,8 +27,9 @@
*/

#include <modmesh/buffer/pymod/buffer_pymod.hpp> // Must be the first include.
#include <modmesh/buffer/pymod/TypeBroadcast.hpp>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know

@@ -156,6 +157,7 @@ class MODMESH_PYTHON_WRAPPER_VISIBILITY WrapSimpleArray
.def_property_readonly("has_ghost", &wrapped_type::has_ghost)
.def_property("nghost", &wrapped_type::nghost, &wrapped_type::set_nghost)
.def_property_readonly("nbody", &wrapped_type::nbody)
.def_property_readonly("plex", &get_arrayplex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a lambda for plex.

{ return wrapped_type(make_shape(shape), datatype); }),
pybind11::arg("shape"),
pybind11::arg("dtype"))
.def_property_readonly("typed", &get_typed_array)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great

arrayplex_uint64 = modmesh.SimpleArray((2, 3, 4), dtype="uint64")
arrayplex_float64 = modmesh.SimpleArray((2, 3, 4), dtype="float64")

# check the type is the same with different data types
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking the type strings

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a build action without USE_PYTEST_HELPER_BINDING=ON?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, please check USE_PYTEST_HELPER_BINDING=OFF case in CI

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally, I made a build without the helper first, and then had a build with the helper for pytest.

CMakeLists.txt Outdated
@@ -82,8 +89,10 @@ message(STATUS "DEBUG_SYMBOL: ${DEBUG_SYMBOL}")
if(DEBUG_SYMBOL)
if(MSVC)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /DEBUG")
add_compile_options(/O2 /Ob2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the pybind11 bug with O0 and debug build?

Setting O2 may be a workaround but it may not be merged without describing the root cause clearly. A clear description must be given when the issue is found, or it will be overly difficult for reproduction later.

@tigercosmos
Copy link
Collaborator Author

tigercosmos commented Feb 15, 2024

here is a short report for early studying of the behavoiur of different -O levels

I inserted some stub by using asm like asm("brk #0x99"), to check if pybind11::detail::type_caster<modmesh::SimpleArrayInt32>::load()(in SimpleArrayCaster.hpp) is used, and use objdump -D -S -g -m _modmesh.cpython-311-darwin.so to dump the assembly code.

for example, we can see asm("brk #0x99") in -O0 build assembly code.

__ZN8pybind116detail11type_casterIN7modmesh11SimpleArrayIiEEvE4loadENS_6handleEb:
    345bbc:	ff 43 02 d1	sub	sp, sp, #144
    345bc0:	fd 7b 08 a9	stp	x29, x30, [sp, #128]
    345bc4:	fd 03 02 91	add	x29, sp, #128
    345bc8:	a1 03 1f f8	stur	x1, [x29, #-16]
    345bcc:	a0 83 1e f8	stur	x0, [x29, #-24]
    345bd0:	28 00 80 52	mov	w8, #1
    345bd4:	48 00 08 0a	and	w8, w2, w8
    345bd8:	a8 73 1e 38	sturb	w8, [x29, #-25]
    345bdc:	a0 83 5e f8	ldur	x0, [x29, #-24]
    345be0:	e0 0f 00 f9	str	x0, [sp, #24]
    345be4:	20 13 20 d4	brk	#0x99

-O0 build

asm("brk #0x99") is found in the following symbol (called S1 later), but it is never used in -g0 build code.

__ZN8pybind116detail11type_casterIN7modmesh11SimpleArrayIiEEvE4loadENS_6handleEb
pybind11::detail::type_caster<modmesh::SimpleArray<int>, void>::load(pybind11::handle, bool))

also, this S1 symbol doesn't exis in -O1, -O2, -O3 build

-O1, -O2, -O3 build

on the other hand, for -O1, -O2, -O3 build, asm("brk #0x99") are all found in the following symbol (called S2 later).

__ZZN8pybind1112cpp_function10initializeIRPFbRN7modmesh11SimpleArrayIiEEEbJS5_EJNS_4nameENS_9is_methodENS_7siblingEEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE_8__invokeESP_
void pybind11::cpp_function::initialize<bool (*&)(modmesh::SimpleArray<int>&), bool, modmesh::SimpleArray<int>&, pybind11::name, pybind11::is_method, pybind11::sibling>(bool (*&)(modmesh::SimpleArray<int>&), bool (*)(modmesh::SimpleArray<int>&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::'lambda'(pybind11::detail::function_call&)::__invoke(pybind11::detail::function_call&)

Summary

The S2 symbol is the entrance of casting SimpleArray, and for -O1, -O2, -O3 build, they all have asm("brk #0x99") in S2.

For -O0 build, the S2 symbol also exists, but it never calls S1 (where the asm("brk #0x99") exists), and that's why the casting failed.

Somehow the compiler just ignores the customized caster template in -O0 build, but the reason is still unknown.

@yungyuc I would like to have your advice for further investigation.

@yungyuc
Copy link
Member

yungyuc commented Feb 17, 2024

I am guessing -O1 -O2 -O3 inlines load while -O0 does not. Perhaps you can try forcing clang to inline it and see what may happen.

@tigercosmos
Copy link
Collaborator Author

tigercosmos commented Feb 17, 2024

@yungyuc very interesting, if I force inline for pybind11::detail::type_caster<modmesh::SimpleArray<int>, void>::load(pybind11::handle, bool)), the function is not complied. I cannot find #0x99 in the assembly in -O0 build.

@tigercosmos
Copy link
Collaborator Author

could be a related issue: pybind/pybind11#1153

@tigercosmos
Copy link
Collaborator Author

also a possible related issue: pybind/pybind11#2763

@yungyuc
Copy link
Member

yungyuc commented Feb 18, 2024

@yungyuc very interesting, if I force inline for pybind11::detail::type_caster<modmesh::SimpleArray<int>, void>::load(pybind11::handle, bool)), the function is not complied. I cannot find #0x99 in the assembly in -O0 build.

By "not compiled", do you mean it is totally optimized away? Please try tricks like https://programfan.github.io/blog/2015/04/27/prevent-gcc-optimize-away-code/ . We need the asm showing up to know what's going on.

@tigercosmos
Copy link
Collaborator Author

tigercosmos commented Feb 19, 2024

@yungyuc with using asm volatile trick:

  • inline __attribute__((always_inline)) bool load(pybind11::handle src, bool convert), I cannot find asm code in the compiled file
  • inline bool load(pybind11::handle src, bool convert), the asm is compiled, but the result is the same as the above report, and the function is never used.

the asm code is like this:

        inline bool load(pybind11::handle src, bool convert)                                                                                                         \
        {                                      
            asm volatile("mov w8, #0");
            asm volatile("" : : : "memory");
            asm("brk #0x99");       

@tigercosmos
Copy link
Collaborator Author

aslo tried to use latest LLVM 17 toolchain to build ... the result is the same

@yungyuc
Copy link
Member

yungyuc commented Feb 20, 2024

Perhaps compiler explorer may help you find a way to insert the asm code.

@tigercosmos
Copy link
Collaborator Author

tigercosmos commented Feb 20, 2024

@yungyuc please let me clarify again. By using asm volatile trick, I confirmed the following result in both modmesh, and a simple code on compiler explorer.

  • inline __attribute__((always_inline)) foo(): by using this syntax for inline under -O0, the code needs to be called somewhere, otherwise the compiler will do nothing for this foo. Therefore, if I set inline __attribute__((always_inline)) for the customized caster load, we cannot find the inserted asm code due to the function is never used.
  • inline foo(): by using this syntax for inline -O0, seems the compiler will just ignore the inline keyword and still compile foo as a function. In modmesh case, the customized caster inline load is compiled as a function but the function is never called by others.

To sum up, the compiler decides not to use the pybind11 customized caster under -O0 (it decides to take the generic caster for unknown reasons), and it doesn't matter whether the load function is inline, force inline or not inline.

@yungyuc
Copy link
Member

yungyuc commented Feb 20, 2024

Can we say that, when we use LLVM, the custom caster is called only when it is inlined? If that statement is (or may be) true, it may be used to create a smaller test case for studying.

@tigercosmos
Copy link
Collaborator Author

tigercosmos commented Mar 5, 2024

  • IMPORTANT Clarify why the custom pybind11 caster does not work with O0 and debug build with mac/windows
  • Use doxygen format for file header comments.
  • Use initialization list for SimpleArrayPlex::m_has_instance_ownership. (no need)
  • Please use a lambda to wrap plex.
  • Make sure there is a build action without USE_PYTEST_HELPER_BINDING=ON
  • Add a cmake switch to control optimization level for both debug and release build (DEBUG_BUILD_OPTIMIZATION_ONE)
  • In Github Actions turn O1 for the problematic compiler setup
  • Draft a Github issue to track the compiler issue and file it on GitHub (Bad assembly with SimpleArray caster #283)

@@ -423,11 +445,26 @@ jobs:
Get-Command clang-tidy
Get-Command flake8

- name: cmake ALL_BUILD
- name: cmake ALL_BUILD USE_PYTEST_HELPER_BINDING=OFF DEBUG_BUILD_OPTIMIZATION_ONE=ON
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's an instance where USE_PYTEST_HELPER_BINDING is set to OFF. While there are additional scenarios, I'm highlighting one for reference.

DEBUG_BUILD_OPTIMIZATION_ONE is ON for both the debug build and the release build, but it only works for the debug build. I don't think we need to have a special condition branch to separate the two builds.

@@ -148,7 +148,7 @@ jobs:
strategy:
matrix:
# https://github.com/actions/runner-images/blob/main/images/macos/macos-13-Readme.md
os: [macos-13]
os: [macos-13, macos-14]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add missing macos-14

@@ -31,6 +31,17 @@ option(LINT_AS_ERRORS "clang-tidy warnings as errors" OFF)
# - Address Sanitization: Protecting against address-related vulnerabilities. See https://clang.llvm.org/docs/AddressSanitizer.html
option(USE_SANITIZER "use sanitizer (undefined, leak, address)" OFF)

# FIXME: A workaround flag, check more on https://github.com/solvcon/modmesh/issues/283
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the issue #283 is pointed out here.

# We found that only /O2 can have the correct behavior, so we set /O2 here.
# /Ob2 is the default value of /O2, but we found that it need to explicitly set /Ob2 to have the correct behavior.
# See more: https://learn.microsoft.com/en-us/cpp/build/reference/o1-o2-minimize-size-maximize-speed
add_compile_options(/O2 /Ob2)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a little complicated here, but please check the comment in the code.

@tigercosmos
Copy link
Collaborator Author

  • IMPORTANT Clarify why the custom pybind11 caster does not work with O0 and debug build with mac/windows
  • Use doxygen format for file header comments.
  • Use initialization list for SimpleArrayPlex::m_has_instance_ownership. (no need)
  • Please use a lambda to wrap plex.
  • Make sure there is a build action without USE_PYTEST_HELPER_BINDING=ON
  • Add a cmake switch to control optimization level for both debug and release build (DEBUG_BUILD_OPTIMIZATION_ONE)
  • In Github Actions turn O1 for the problematic compiler setup
  • Draft a Github issue to track the compiler issue and file it on GitHub (SimpleArray caster issue #283)

@yungyuc All comments have been addressed. Please have a look at your convenience.

Copy link
Member

@yungyuc yungyuc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'd like to merge it. I guess there are some rough edges to round, but we can do it against master.

@@ -31,6 +31,17 @@ option(LINT_AS_ERRORS "clang-tidy warnings as errors" OFF)
# - Address Sanitization: Protecting against address-related vulnerabilities. See https://clang.llvm.org/docs/AddressSanitizer.html
option(USE_SANITIZER "use sanitizer (undefined, leak, address)" OFF)

# FIXME: A workaround flag, check more on https://github.com/solvcon/modmesh/issues/283
# For macOS and Windows development, please turn on this flag to avoid the issue.
option(DEBUG_BUILD_OPTIMIZATION_ONE "use -O1 for debug build. This is a workaround for issue#283." OFF)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great.

@yungyuc yungyuc merged commit 2787aa2 into solvcon:master Mar 5, 2024
13 checks passed
@tigercosmos tigercosmos deleted the simple_array_caster20231227 branch March 5, 2024 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants