PDF writing fails with joblib.Parallel using (default) Loky backend #181
Hi @leoschwarz, thanks for the report. I've transferred this over to the vl-convert repo, which implements the image export logic. vl-convert bundles the Deno JavaScript runtime, which only supports running on a single thread, but my understanding is that the loky backend uses separate processes, so I'm not certain that's the issue. Do you have the same issue using the multiprocessing API?
Thank you for transferring the issue. With the multiprocessing backend, the following works:

```python
from multiprocessing import freeze_support

import altair as alt
import joblib
import os
import pandas as pd

os.environ["RUST_BACKTRACE"] = "full"


def write_chart(filename):
    df = pd.DataFrame({"x": [2, 3, 4], "y": [5, 5, 3]})
    chart = alt.Chart(df).mark_point().encode(x="x", y="y")
    chart.save(filename)


if __name__ == "__main__":
    freeze_support()
    filenames = [f"chart{i}.pdf" for i in range(2)]
    joblib.Parallel(n_jobs=10, backend="multiprocessing")(
        joblib.delayed(write_chart)(filename) for filename in filenames
    )
```

Taken from the loky README: "All processes are started using fork + exec on POSIX systems. This ensures safer interactions with third party libraries. On the contrary, multiprocessing.Pool uses fork without exec by default, causing third party runtimes to crash (e.g. OpenMP, macOS Accelerate...)." So my understanding is that they use different fork models, but in this case the default multiprocessing works whereas loky does not. I'm not enough of an expert on the details of multiprocessing to understand how this relates to Deno's runtime.
Thanks for the investigation @leoschwarz. Documentation is probably the best first step, just to let people know that the multiprocessing backend works but the loky backend does not. One thing we might be able to do is expose an alternative API that doesn't rely on a global instance of the Rust object that wraps Deno.
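For illustration, such a non-global API might look like the sketch below; the VlConverter class and its methods are hypothetical and do not exist in vl-convert today:

```python
# Hypothetical sketch only: VlConverter is invented here for illustration.
import vl_convert as vlc

# Today, the module-level functions share one global Deno runtime, e.g.:
#   png_data = vlc.vegalite_to_png(vl_spec)
#
# A per-instance API would let each worker process create its own runtime
# after forking, instead of inheriting broken global state:
#   converter = vlc.VlConverter()                  # hypothetical constructor
#   png_data = converter.vegalite_to_png(vl_spec)  # hypothetical method
```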
I'm not sure that would fully resolve the problem, because my workflow is basically joblib distributing tasks that execute the plotting in a new subprocess, each starting its own Python interpreter (largely to avoid this type of problem). So I suspect the problem lies in a native extension doing something unusual with memory somewhere. I'm looking into creating a better example for this.
So I've finally gotten around to tracing this a bit further, since it is still a problem for me.

```python
import pandas as pd
import altair as alt
import sys
import os
import threading
import argparse


def write_chart(filename):
    df = pd.DataFrame({"x": [2, 3, 4], "y": [5, 5, 3]})
    chart = alt.Chart(df).mark_line().encode(x="x", y="y")
    chart.save(filename)


def run_with_thread():
    thread = threading.Thread(target=write_chart, args=("chart_thread.pdf",))
    thread.start()
    thread.join()


def run_with_fork():
    if os.fork() == 0:
        write_chart("chart_fork.pdf")
        sys.exit(0)
    os.wait()


parser = argparse.ArgumentParser()
parser.add_argument("variant", choices=["thread", "fork", "thread-fork", "fork-thread"])
args = parser.parse_args()
if args.variant == "thread":
    run_with_thread()
elif args.variant == "fork":
    run_with_fork()
elif args.variant == "thread-fork":
    run_with_thread()
    run_with_fork()
elif args.variant == "fork-thread":
    run_with_fork()
    run_with_thread()
```

Every variant except "thread-fork" works as expected.
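Each variant is meant to run in a fresh interpreter; a small driver for that (my addition, assuming the script above is saved as repro.py):

```python
# Run each variant of the repro script in its own interpreter (my addition;
# the filename repro.py is an assumption, not from the issue).
import subprocess
import sys

for variant in ["thread", "fork", "fork-thread", "thread-fork"]:
    try:
        result = subprocess.run([sys.executable, "repro.py", variant], timeout=120)
        print(f"{variant}: exit code {result.returncode}")
    except subprocess.TimeoutExpired:
        print(f"{variant}: timed out")
```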
Newer Python versions also emit a DeprecationWarning here, noting that use of fork() in a multi-threaded process may lead to deadlocks in the child.
From what I understand, when a process forks, only the main thread exists in the child, so if Deno holds references to threads that existed in the parent, that state is lost in the forked process, which leads to these issues (a minimal illustration follows at the end of this comment). I'm not sure there is an easy way to avoid this type of problem... My personal takeaway is to stop using software that calls fork() from multi-threaded processes.

Initially I created this test case, which someone might find handy later:

```python
import pytest
import joblib
import pandas as pd
import altair as alt
from multiprocessing import freeze_support


def write_chart(filename):
    df = pd.DataFrame({"x": [2, 3, 4], "y": [5, 5, 3]})
    chart = alt.Chart(df).mark_point().encode(x="x", y="y")
    chart.save(filename)


@pytest.mark.parametrize("backend", ["multiprocessing", "threading", "loky", None])
@pytest.mark.parametrize("ext", ["png", "pdf"])
def test_joblib(tmpdir, backend, ext):
    filenames = [str(tmpdir / f"chart{i}.{ext}") for i in range(2)]
    if backend:
        freeze_support()
        joblib.Parallel(n_jobs=2, backend=backend)(
            joblib.delayed(write_chart)(filename) for filename in filenames
        )
    else:
        for filename in filenames:
            write_chart(filename)
```

The good news is that it works with joblib's other backends.
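To make the failure mode concrete without Altair or Deno in the picture, here is a minimal self-contained sketch (my illustration of the general mechanism, not code from the issue; POSIX only):

```python
# A lock acquired by a worker thread in the parent is copied into the forked
# child in its locked state, but the thread that would release it does not
# exist there, so the child can never acquire it.
import os
import sys
import threading
import time

lock = threading.Lock()


def hold_lock_briefly():
    with lock:
        time.sleep(2)  # the fork below happens while this thread owns the lock


t = threading.Thread(target=hold_lock_briefly)
t.start()
time.sleep(0.1)  # make sure the worker thread has acquired the lock

pid = os.fork()
if pid == 0:
    # Child process: only the main thread was copied, so the lock owner is gone.
    if lock.acquire(timeout=5):
        print("child: acquired lock (unexpected)")
    else:
        print("child: lock owner vanished in the fork; would deadlock forever")
    sys.exit(0)
os.waitpid(pid, 0)
t.join()
```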
Thanks for the detailed writeup. If you have a chance, it would be great to add a short summary to the Limitations section of the README.
What happened?
Dear developers,
I'm not sure if this is well known, but the following code results in an error (full message below).
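Presumably (reconstructed from the workaround snippet earlier in the thread, so this exact form is an assumption) the failing loop is the same chart-saving code with joblib's defaults, i.e. the loky backend:

```python
# Reconstruction (an assumption): the same chart-saving loop, but with
# joblib's default loky backend and without freeze_support().
import altair as alt
import joblib
import pandas as pd


def write_chart(filename):
    df = pd.DataFrame({"x": [2, 3, 4], "y": [5, 5, 3]})
    chart = alt.Chart(df).mark_point().encode(x="x", y="y")
    chart.save(filename)


filenames = [f"chart{i}.pdf" for i in range(2)]
joblib.Parallel(n_jobs=10)(
    joblib.delayed(write_chart)(filename) for filename in filenames
)
```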
I'm reporting it here since I triggered it with Altair, and it would be nice to address it with a fix or documentation, but the issue may originate in another project and be beyond the scope of this issue tracker. If you think this would fit better into the loky or Deno tracker, I'm happy to move it there.
What would you like to happen instead?
The loop should work without an error, which is the case if you set either of:
- `n_jobs=1`
- `backend="multiprocessing"` and add `freeze_support()`

Especially the latter is interesting, and it is what I am using as a workaround now.
Which version of Altair are you using?
5.4.0