
Standalone Apache Spark with .Net on Windows ... Is performance supposed to be so poor? #922

Answered by imback82
dbeavon asked this question in Q&A

> Those 2 worker processes in Java will spawn executors (purple lines in image), and will also spawn .Net as well (Microsoft.Spark.Worker.exe in green). Here is a discussion where I learned that the .Net processes are respawned for each task: #852

> ... that seemed problematic on its own

Yes, this is how the interop works (similar to pyspark). How else would you implement it?
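To make the per-task respawn cost concrete, here is a toy sketch (not the actual Spark/.NET interop, and the function name is made up for illustration): each task that launches a fresh external worker process pays full process-startup cost, which is the overhead #852 discusses.

```python
# Toy illustration of the process-per-task cost model: each "task" spawns a
# fresh helper process (like a respawned Microsoft.Spark.Worker.exe would be),
# so every task pays the full process-startup overhead.
import subprocess
import sys
import time

def run_task_with_fresh_worker():
    # Hypothetical stand-in for one task's worker: start a process, do nothing, exit.
    subprocess.run([sys.executable, "-c", "pass"], check=True)

n_tasks = 5
start = time.perf_counter()
for _ in range(n_tasks):
    run_task_with_fresh_worker()
respawn_seconds = time.perf_counter() - start
print(f"{n_tasks} tasks, one fresh worker process each: {respawn_seconds:.3f}s")
```

A long-lived, reused worker would pay that startup cost once instead of once per task, which is why the respawn behavior stands out in process monitors.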

> But what is even worse are these Java "executors". When you add them together, they consume a massive amount of CPU and RAM (purple lines).

The numbers look reasonable to me. You should configure your job so that it fits your machine.

> So my question is whether this behaves approximately the same on Windows as on Linux? Or …

Answer selected by dbeavon