Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Fix generated sources not being included as dependencies in cython transpilation #11000

Merged
merged 1 commit into from
Jan 15, 2023

Conversation

lithomas1
Copy link
Member

@lithomas1 lithomas1 commented Nov 4, 2022

Generated graph looks correct.
xref #8961.
(Note: This is for the .pyx-> c phase for a target for pandas)

image

In pandas, we are including the generated .pxi as sources, but I also tested using them as dependencies, and it still works.

I'm not sure how to test this, though.

Ideas could be write test cases and compare against a golden ninja.build file, or running ninja directly against the pyx target to make sure the deps work correctly (not sure how to hook into the build system to invoke ninja directly)

cc @rgommers @eli-schwartz @xclaesse (Sorry for pinging so close to the weekend).

@codecov
Copy link

codecov bot commented Nov 4, 2022

Codecov Report

Merging #11000 (d8ea5f9) into master (58cfd8f) will increase coverage by 27.41%.
The diff coverage is 87.50%.

❗ Current head d8ea5f9 differs from pull request most recent head def41be. Consider uploading reports for the commit def41be to get more accurate results

@@             Coverage Diff             @@
##           master   #11000       +/-   ##
===========================================
+ Coverage   41.39%   68.81%   +27.41%     
===========================================
  Files         207      414      +207     
  Lines       44609    90222    +45613     
  Branches     9159    20690    +11531     
===========================================
+ Hits        18466    62086    +43620     
+ Misses      24382    23467      -915     
- Partials     1761     4669     +2908     
Impacted Files Coverage Δ
mesonbuild/backend/ninjabackend.py 85.35% <87.50%> (+37.28%) ⬆️
mesonbuild/_typing.py 0.00% <0.00%> (ø)
mesonbuild/modules/cuda.py 0.00% <0.00%> (ø)
mesonbuild/modules/modtest.py 100.00% <0.00%> (ø)
mesonbuild/dependencies/cuda.py 20.19% <0.00%> (ø)
mesonbuild/scripts/env2mfile.py 14.44% <0.00%> (ø)
mesonbuild/backend/vs2010backend.py 0.00% <0.00%> (ø)
mesonbuild/dependencies/__init__.py 100.00% <0.00%> (ø)
scripts/delwithsuffix.py 0.00% <0.00%> (ø)
scripts/depfixer.py 63.31% <0.00%> (ø)
... and 360 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@eli-schwartz
Copy link
Member

Thanks for looking into this.

I'm not sure how to test this, though.

Ideas could be write test cases and compare against a golden ninja.build file, or running ninja directly against the pyx target to make sure the deps work correctly (not sure how to hook into the build system to invoke ninja directly)

I would write a test case in test cases/unit/, and run it as a unittests/*.py test -- pass the pyx target the same way, for example, unittests/allplatformstests.py -> test_build_by_default does it.

@rgommers
Copy link
Contributor

Thanks for working on this @lithomas1!

I've started to look at this to see what it does, using the meson branch in the NumPy repo which has some generated Cython code:

$ # first use Meson 0.64.0
$ meson setup build
$ ninja -C build -t inputs numpy/random/_generator.cpython-39-x86_64-linux-gnu.so.p/numpy/random/_generator.pyx.c
../numpy/random/_generator.pyx
numpy/random/_generator.pyx

# That shows the problem - the transpile step depends on nothing.

# Install Meson from this PR and repeat:
$ ninja -C build -t inputs numpy/random/_generator.cpython-39-x86_64-linux-gnu.so.p/numpy/random/_generator.pyx.c
../numpy/__init__.cython-30.pxd
../numpy/__init__.pxd
../numpy/__init__.py
../numpy/core/code_generators/generate_numpy_api.py
../numpy/core/code_generators/generate_umath.py
../numpy/core/src/npymath/npy_math_internal.h.src
../numpy/random/__init__.pxd
../numpy/random/__init__.py
../numpy/random/_bounded_integers.pxd.in
../numpy/random/_common.pxd
../numpy/random/_generator.pyx
../numpy/random/bit_generator.pxd
../numpy/random/c_distributions.pxd
/home/rgommers/code/numpy/numpy/_build_utils/process_src_template.py
/home/rgommers/code/numpy/numpy/_build_utils/tempita.py
/home/rgommers/mambaforge/envs/numpy-dev/bin/python3.9
numpy/__init__.cython-30.pxd
numpy/__init__.pxd
numpy/__init__.py
numpy/core/__multiarray_api.h
numpy/core/__umath_generated.c
numpy/core/npy_math_internal.h
numpy/random/__init__.pxd
numpy/random/__init__.py
numpy/random/_bounded_integers.pxd
numpy/random/_common.pxd
numpy/random/_generator.pyx
numpy/random/bit_generator.pxd
numpy/random/c_distributions.pxd

So we went from the transpile step depending on nothing to it depending on everything it needs to. And it does not depend on the static libraries that are linked into the final .so target - which is correct. So first impression is that this does the right thing.

I would write a test case in test cases/unit/, and run it as a unittests/*.py test -- pass the pyx target the same way, for example, unittests/allplatformstests.py -> test_build_by_default does it.

I'm not quite sure I understand if that tests the change here correctly. All that does is .init() (configure), `.build(), and then checking paths. I don't see the paths changing - e.g., for my example above:

$ tree build/numpy/random/_generator.cpython-39-x86_64-linux-gnu.so.p/
build/numpy/random/_generator.cpython-39-x86_64-linux-gnu.so.p/
├── meson-generated_numpy_random__generator.pyx.c.o
└── numpy
    └── random
        └── _generator.pyx.c

Looking at the source code for test case 129 doesn't help me. If we're building the final .so, everything is going to exist, it just gets into the build dir in the wrong order. Am I missing something there?

@rgommers
Copy link
Contributor

Ah wait, I didn't realize we could build a single target directly:

ninja -C build numpy/random/_generator.cpython-39-x86_64-linux-gnu.so.p/numpy/random/_generator.pyx.c
ninja: Entering directory `build'
[14/14] Compiling Cython source numpy/random/_generator.pyx

That's going to turn up more dependency issues for me that may need fixing, if I build all intermediate targets one by one.

@rgommers
Copy link
Contributor

That's going to turn up more dependency issues for me that may need fixing, if I build all intermediate targets one by one.

If I may ask another random question. I did some searching, but couldn't turn up a prebaked tool/script to do something like this:

  1. run meson setup builddir
  2. run ninja -C build -t targets depth 0 and parse it to get a list of unique build targets
  3. run cp builddir newtmpdir && ninja -C new_tmpdir a_target for each target

That would be fairly slow, but it only needs to run once in a while, and would smoke out a bunch of potential issues. Not too difficult to write from scratch, but would I be reinventing a wheel there?

@eli-schwartz
Copy link
Member

eli-schwartz commented Nov 14, 2022

I'd probably do ninja clean && rm .ninja_deps, the latter file is the thing that ninja uses to store dependencies that came from depfiles. Then build another target.

Since ninja 1.11, you can do ninja -t missingdeps, which finds cases where you have dependency issues but due to random ordering it successfully built once, and the depfile detected that dependency link after the fact. (For custom_targets that do not use depfiles, the missingdeps tool won't pick that up).

I've pondered writing a smoketest script for Meson and making it invokable via meson --internal, that would do the former (repeated builds + cleaning and removing the deps file in between).

@rgommers
Copy link
Contributor

Nice, thanks @eli-schwartz. I'll give it a go for Cython then. Depfiles won't cut it there, because the issues are mostly coming in because of Cython's output depending on the whole source tree (including __init__.py files).

@lithomas1
Copy link
Member Author

Sorry, I'm not following here. Just to be clear, this does work for numpy/scipy?

@rgommers
Copy link
Contributor

Sorry, I'm not following here. Just to be clear, this does work for numpy/scipy?

Yes, this does fix a race condition for numpy, and the changes I see in dependencies for the .pyx.c targets all look correct. I haven't been able to try for scipy yet, that will take quite a while because we're currently not using the Meson support for Cython (we only invoke Cython through a Python script called with generator.process().

@rgommers
Copy link
Contributor

rgommers commented Jan 7, 2023

Circling back to this - this is a fairly minimal patch, that fixes issues in the Pandas and NumPy builds and seems correct from the dependency graph. The change doesn't affect SciPy. Testing with SciPy after changing how it uses Cython is quite low on my prio list and doesn't seem needed.

Is there a reason not to merge this?

@eli-schwartz
Copy link
Member

Adding a test case, IIRC.

@lithomas1 lithomas1 force-pushed the cython-pxi-deps branch 3 times, most recently from 3bff8bb to 5d512b9 Compare January 15, 2023 16:33
@lithomas1
Copy link
Member Author

@eli-schwartz This is ready now.

target = next(filter(lambda t: "includestuff.cpython" in t["name"], targets)) # Only will be one target
# Meson introspection doesn't give us the pyx.c target :(
# This is really hacky
name = target["target_sources"][0]["generated_sources"][0]
Copy link
Member

@eli-schwartz eli-schwartz Jan 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess in theory you could also check to see which targets have a target_sources -> generated_sources ending in "includestuff.pyx.c"? Not sure whether this matters.

I guess it hardcodes cpython but we don't currently run tests for anything else.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I've updated this.

# TODO: introspection?
cython_sources.append(output)
else:
generated_sources[ssrc] = mesonlib.File.from_built_file(gen.get_subdir(), ssrc)
# Following logic in L883-900 where we determine whether to add generated source
# as a header(order-only) dep to the .so compilation rule
if not self.environment.is_source(ssrc) and not self.environment.is_object(ssrc) and not self.environment.is_library(ssrc) and not modules.is_module_library(ssrc):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is very long and should really be broken up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Split it into four, hopefully the style is good.

The file passes flake8 FWIW, but meson's flake8 config seems to be very lax.

@lithomas1 lithomas1 force-pushed the cython-pxi-deps branch 2 times, most recently from 941eb65 to 164a76d Compare January 15, 2023 19:52
Copy link
Member

@eli-schwartz eli-schwartz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks great.

@eli-schwartz eli-schwartz merged commit 9b999dd into mesonbuild:master Jan 15, 2023
@lithomas1 lithomas1 deleted the cython-pxi-deps branch January 15, 2023 22:00
@nirbheek nirbheek added this to the 1.0.1 milestone Feb 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants