
ARM results in error #647

Open
culurciello opened this issue Sep 16, 2015 · 8 comments

@culurciello

Dear developers, thank you for your great work on OpenBLAS.

Using it on 32-bit ARM platforms with Ubuntu 14.04, we found some erroneous results when it is used with Torch7:

The Lua code below should always give 1 as the result. On ARM it gives random numbers if compiled with OpenMP (and 0.99999999993838 if compiled without).

require 'nn'

torch.setdefaulttensortype('torch.FloatTensor')

-- Input tensor with deterministic contents
data = torch.Tensor(4, 58, 58)
for i = 1, 4 do
  for j = 1, 58 do
    for k = 1, 58 do
      data[i][j][k] = i + j + k
    end
  end
end

n = nn.Sequential()
n:add(nn.SpatialConvolutionMM(4, 64, 5, 5, 1, 1))
n.modules[1].weight = torch.Tensor(64, 100)
for i = 1, 100 do
  n.modules[1].weight[1][i] = i
end
n.modules[1].bias = torch.Tensor(64)

n2 = nn.Sequential()
n2:add(nn.SpatialConvolutionMM(64, 64, 5, 5, 1, 1))
n2.modules[1].weight = torch.Tensor(64, 1600)
for i = 1, 1600 do
  n2.modules[1].weight[1][i] = i
end
n2.modules[1].bias = torch.Tensor(64)

data = n:forward(data)
data = n2:forward(data)

-- Sum a 50x50 window of the first output plane; normalized, this should be 1
out = 0
for i = 1, 50 do
  for j = 1, 50 do
    out = out + data[1][i][j]
  end
end
print(out / 259643747536)
@xianyi
Collaborator

xianyi commented Oct 5, 2015

@culurciello, which kernel do you use? Currently, OpenBLAS only supports the ARM hard FP ABI. Could it be an ABI issue?
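For reference, one way to check whether a given libopenblas build targets the hard-float ABI is to inspect its ELF attributes with readelf (from GNU binutils); the library path below is a placeholder, not taken from this issue:

```shell
# Placeholder path - point LIBOPENBLAS at your actual libopenblas shared object
LIB=${LIBOPENBLAS:-/usr/lib/libopenblas.so.0}

# On an armhf (hard FP ABI) build, readelf reports
# "Tag_ABI_VFP_args: VFP registers"; a soft-float build lacks this tag.
if [ -f "$LIB" ] && command -v readelf >/dev/null 2>&1; then
  readelf -A "$LIB" | grep Tag_ABI_VFP_args \
    || echo "no Tag_ABI_VFP_args tag (not a hard-float ARM build)"
else
  echo "library not found at $LIB - set LIBOPENBLAS and re-run"
fi
```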

@mvitez

mvitez commented Oct 5, 2015

I am working with @culurciello. We use the Odroid U3 and XU3; both use the hard FP ABI. This problem was present a year ago and is still present; we have tried various kernels and OpenBLAS versions. I have tried to write a simple C program that reproduces this defect, but unfortunately I did not succeed. The problem only appears in complex environments, but by printing intermediate results I found that the errors in the calculations come from OpenBLAS. Thank you.

@xianyi
Collaborator

xianyi commented Oct 27, 2015

@mvitez , could you try export OMP_NUM_THREADS=1? It looks like the application uses float, sgemm. Am I right?
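A sketch of the suggested check, with repro.lua standing in as a hypothetical file name for the Lua snippet above. Setting both variables covers both threading backends: OMP_NUM_THREADS is read by OpenMP builds, OPENBLAS_NUM_THREADS by OpenBLAS's own pthread pool.

```shell
# Pin both threading backends to a single thread; if the wrong results
# disappear, the bug is in the multithreaded code paths.
export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1

# repro.lua is a hypothetical file holding the Lua snippet from this issue
if command -v th >/dev/null 2>&1 && [ -f repro.lua ]; then
  th repro.lua
fi
```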

@xianyi xianyi added the Bug label Oct 27, 2015
@mvitez

mvitez commented Oct 27, 2015

It works correctly with only one thread. We actually build OpenBLAS without NO_AFFINITY=1 USE_OPENMP=1, as we should, and in that case it works with some limitations, but without errors, apart from some segmentation faults, which are fortunately quite rare.

The application uses float sgemm; you are right.
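For context, the two build configurations being compared look roughly like this; TARGET=ARMV7 is an assumption for the Cortex-A based Odroid U3/XU3 boards, not a flag stated in this issue.

```shell
# Flag sets for the two build configurations discussed above.
OPENMP_FLAGS="TARGET=ARMV7 USE_OPENMP=1 NO_AFFINITY=1"   # OpenMP threading
PTHREAD_FLAGS="TARGET=ARMV7"                             # default pthread pool

# Inside an OpenBLAS source checkout one would run, e.g.:
#   make $OPENMP_FLAGS
#   make $PTHREAD_FLAGS
echo "make $OPENMP_FLAGS"
```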

@martin-frbg
Collaborator

This old issue will hopefully have been fixed by the several rounds of thread-safety improvements after about December 2016.

@martin-frbg
Collaborator

Actually still same results unfortunately (though the OPENMP build seems to give "correct" results of the 0.99999...38 type with OMP_NUM_THREADS=2 as well, on a quad-core Asus tinkerboard). The recently added NUM_PARALLEL option does not appear to have any effect either. Not sure how to debug this, as both helgrind and tsan do not work well with OpenMP.

@martin-frbg
Collaborator

Switching to USE_SIMPLE_THREADED_LEVEL3 "solves" it however.
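USE_SIMPLE_THREADED_LEVEL3 is a build-time option, so this workaround amounts to rebuilding the library; a sketch, with TARGET=ARMV7 again an assumption for the boards discussed here:

```shell
# USE_SIMPLE_THREADED_LEVEL3=1 replaces the blocked multithreaded level-3
# BLAS path with a simpler per-thread partitioning, trading some
# performance for thread safety.
FLAGS="TARGET=ARMV7 USE_OPENMP=1 USE_SIMPLE_THREADED_LEVEL3=1"

# Inside an OpenBLAS source checkout:  make clean && make $FLAGS
echo "make $FLAGS"
```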

@martin-frbg martin-frbg added this to the 0.3.6 milestone Jan 1, 2019
@martin-frbg
Collaborator

This appears to have been fixed in the meantime (to the extent that it now returns 0.99999..38 in every case), probably by the correction for #1851 that went into 0.3.4 already.

@martin-frbg martin-frbg modified the milestones: 0.3.6, 0.3.7 Apr 27, 2019
@martin-frbg martin-frbg modified the milestones: 0.3.7, 0.3.8 Aug 11, 2019
@martin-frbg martin-frbg modified the milestones: 0.3.8, 0.3.9 Feb 5, 2020
@martin-frbg martin-frbg modified the milestones: 0.3.9, 0.3.10 Mar 1, 2020
@martin-frbg martin-frbg removed this from the 0.3.10 milestone Jun 14, 2020