Build OpenBLAS with CROSS option to prevent tests at compile time #158
Just curious, do you know why the tests fail in those conditions and whether they are actually false negatives (so we can ignore them)? |
When building OpenBLAS in docker on some machines, the following test seems to fail:
This also happened to @sbak5 recently. I suspect it's due to process-spawning limitations placed on the docker runtime, which cause fork to fail (i.e. nothing to do with OpenBLAS itself). We could confirm this hypothesis by running a normal process fork in a test program inside this environment and seeing if it works. |
@manopapad Thanks for sharing your finding. Yes, let's confirm that hypothesis (shouldn't be too hard to do that). |
@manopapad I'm building it on a machine where I had the problem. |
It is built without any error. |
I assume you mean that with this PR the full legate image build gets past the OpenBLAS phase and completes successfully. That is good to know. Could you also help us test the hypothesis about what is causing the OpenBLAS test failure in the first place, by building a test program that checks whether it can successfully perform a fork and running it during docker build? |
Sure, I'll add a small test program to the Dockerfile and see if our hypothesis is correct. |
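For reference, a fork check of the kind discussed above could look roughly like the sketch below. This is not the test program that was actually run in this thread; it is a minimal Python illustration, and the name `can_fork` is hypothetical. It only verifies that a plain `fork` succeeds and that the child can be reaped, so it does not reproduce any memory pressure.

```python
import os


def can_fork() -> bool:
    """Return True if a child process can be forked and reaped cleanly.

    Hypothetical helper for illustration; not part of OpenBLAS or legate.
    """
    try:
        pid = os.fork()
    except OSError as err:
        # fork itself failed (for example with EAGAIN or ENOMEM).
        print(f"fork failed: {err}")
        return False
    if pid == 0:
        # Child: exit immediately, skipping the parent's cleanup handlers.
        os._exit(0)
    # Parent: wait for the child and check that it exited with status 0.
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print("fork ok" if can_fork() else "fork failed")
```

Running this as a step during docker build would show whether a basic fork works in the build environment.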
I tried to run a test program which forks a process and prints I'm building the image on different kinds of node at Skipping tests helps to avoid any issue I had on different nodes. |
More specifically, the error happens at |
I also tried to run a fork bomb in the container in which we build, and there was no problem forking to many levels. So it is not a simple |
I tried to just run OpenBLAS make, and the error happens even outside of our build process:
|
@marcinz Can you add some code here to check what the value of |
So this is the line that fails: https://github.com/xianyi/OpenBLAS/blob/v0.3.15/utest/test_post_fork.c#L112 The failure is |
From what I've been reading, this may be related to fork's pessimistic view on memory allocation where
while on the machine where swap is disabled, the build fails:
@sbak5 Could you check whether swap is enabled or disabled on your machine, and also whether your memory overcommit is set to |
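One way to gather both settings on a Linux machine is sketched below. It assumes the standard `/proc/meminfo` and `/proc/sys/vm/overcommit_memory` files; the function names are hypothetical and only for illustration. The parsing is kept separate from the file reads so it can be checked against sample text.

```python
def swap_total_kib(meminfo_text: str) -> int:
    """Extract the SwapTotal value (in kB) from /proc/meminfo contents."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # The line looks like "SwapTotal:       2097148 kB".
            return int(line.split()[1])
    raise ValueError("SwapTotal not found")


def overcommit_mode(overcommit_text: str) -> int:
    """Parse /proc/sys/vm/overcommit_memory contents.

    0 = heuristic overcommit, 1 = always overcommit, 2 = strict accounting.
    """
    return int(overcommit_text.strip())


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        swap = swap_total_kib(f.read())
    with open("/proc/sys/vm/overcommit_memory") as f:
        mode = overcommit_mode(f.read())
    print(f"swap enabled: {swap > 0} ({swap} kB), overcommit mode: {mode}")
```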
So I enabled swap on the second machine from my previous comment, and the build worked fine. Note that there also had to be enough swap: 1 GB was too little. The next value I tried, 100 GB, made the build work, but I am certain that the minimum amount of swap necessary is closer to 1 GB (see the first machine in the previous comment) than to 100 GB. The conclusion here, in my opinion, is that the authors of OpenBLAS have never run this test in unfavorable conditions, such as the combination of the memory overcommit policy and the lack of swap found on the machines that are giving us trouble. I think that this should not be a problem most of the time since the problems with We have a few options:
|
The machine I used has the following config. |
Let's go with option (3). @marcinz, could you please open an issue about this on the OpenBLAS bug tracker? I would especially note that this particular test fails even when running bare-metal on your local machine, but doesn't stop OpenBLAS from running correctly there. And let's merge this change, so we skip the OpenBLAS tests when building, and don't have to deal with such issues. |
This change prevents OpenBLAS from running its tests during compilation. This is needed when building on machines or in conditions (e.g., using docker) that cause some of the OpenBLAS tests to fail. The potential downside is that we would want the OpenBLAS checks to run by default when users build in the same environment in which they will run. If we want to run the OpenBLAS tests by default, we could instead add an option to prevent testing at build time.
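The optional variant mentioned above could be sketched as follows. This is not the code in this PR: the function name and parameters are hypothetical, and the sketch assumes that the CROSS option from the PR title is passed to the OpenBLAS `make` as `CROSS=1`.

```python
from typing import List


def openblas_make_command(skip_tests: bool = False, jobs: int = 1) -> List[str]:
    """Build the argv for an OpenBLAS make invocation.

    Hypothetical helper: tests run by default, and skip_tests=True adds
    CROSS=1 (assumed spelling of the CROSS option) to skip them.
    """
    cmd = ["make", f"-j{jobs}"]
    if skip_tests:
        # Assumption: CROSS=1 stops OpenBLAS from running tests at build time.
        cmd.append("CROSS=1")
    return cmd


if __name__ == "__main__":
    print(" ".join(openblas_make_command(skip_tests=True, jobs=4)))
```

With this shape, users building and running in the same environment keep the checks, while docker builds opt out explicitly.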