different results even if seed is fixed #63

Open
bab2min opened this issue Jul 12, 2020 · 5 comments · Fixed by #64
Labels
bug Something isn't working

Comments

@bab2min
Owner

bab2min commented Jul 12, 2020

Depending on the environment (32-bit or 64-bit; SSE2, AVX, or AVX2) in which tomotopy is installed, the same seed produces different results.
This is possibly related to #60.

@bab2min bab2min added the bug label Jul 12, 2020
@bab2min
Owner Author

bab2min commented Jul 14, 2020

There are three possible causes.

  1. Random Number Engine

    #if _WIN32 || _WIN64
    #if _WIN64
    typedef std::mt19937_64 RandGen;
    #else
    typedef std::mt19937 RandGen;
    #endif
    #endif
    #if __GNUC__
    #if __x86_64__ || __ppc64__
    typedef std::mt19937_64 RandGen;
    #else
    typedef std::mt19937 RandGen;
    #endif
    #endif

  2. Prefix Sum

    inline void prefixSum(float* arr, size_t K)
    {
        size_t Kf = (K >> 2) << 2;
        if (Kf) prefix_sum_SSE(arr, Kf);
        else Kf = 1;
        for (size_t i = Kf; i < K; ++i)
        {
            arr[i] += arr[i - 1];
        }
    }
    #else
    inline void prefixSum(float* arr, size_t K)
    {
        for (size_t i = 1; i < K; ++i)
        {
            arr[i] += arr[i - 1];
        }
    }

  3. Eigen's redux function (sum())
    The summation order can vary depending on the SIMD options.

(1) and (2) can be fixed soon, but (3) does not seem easy to fix.
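
As a rough illustration of the three causes listed above, here is a standalone sketch. It is not tomotopy's code, and the function names `enginesAgree`, `sumSequential`, and `sumFourLanes` are made up for this example. It shows that `std::mt19937` and `std::mt19937_64` produce different streams from the same seed, and that a float sum accumulated in four lanes, roughly as a SIMD reduction might do, can round differently from a sequential sum.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Cause 1: the 32-bit and 64-bit Mersenne Twister engines are different
// generators, so the same seed yields different first outputs.
inline bool enginesAgree(std::uint32_t seed)
{
    std::mt19937 gen32(seed);
    std::mt19937_64 gen64(seed);
    return static_cast<std::uint64_t>(gen32()) == gen64();
}

// Causes 2 and 3: plain left-to-right accumulation in single precision.
inline float sumSequential(const std::vector<float>& v)
{
    float s = 0.0f;
    for (float x : v) s += x;
    return s;
}

// Scalar emulation of a 4-lane reduction (an assumption about how a SIMD
// sum may be organized): four partial sums that are combined at the end.
inline float sumFourLanes(const std::vector<float>& v)
{
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < v.size(); ++i) lane[i % 4] += v[i];
    return (lane[0] + lane[1]) + (lane[2] + lane[3]);
}
```

For the input `{1e8f, 1.0f, -1e8f, 1.0f}`, the sequential sum is 1 while the four-lane sum is 0, assuming ordinary single-precision arithmetic with no intermediates kept in higher precision. The same data and the same seed can therefore lead to different sampling decisions once the summation order changes.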

bab2min added a commit that referenced this issue Jul 14, 2020
fixed #59, partially fixed #63
@bab2min bab2min mentioned this issue Jul 14, 2020
@bab2min bab2min reopened this Jul 14, 2020
@mayankchatteron1

@bab2min could you share a timeline for when this will be fixed?
Even after setting
np.random.seed(123)
random.seed(123)

I get different results across runs on the same AVX2 machine.

@bab2min
Owner Author

bab2min commented Feb 1, 2021

Hi, @mayankchatteron1
Since version 0.8.2, you can obtain the same result on the same machine with a fixed seed, but only with workers=1.
If the results differ on the same machine even though the seed is fixed, please first check that you set workers=1.
If you still get different results with workers=1, this may be another bug, so please share your source code.
Thank you!

@mayankchatteron1

@bab2min Thanks, it is working now. Thank you again for the quick resolution.
I am curious, though, why different numbers of workers give different results. Is it due to some kind of model parallelism?

@bab2min
Owner Author

bab2min commented Feb 2, 2021

@mayankchatteron1 Yes.
In a multithreaded environment, software cannot completely control the execution order.
To control it, every inference step would need to be synchronized, which reduces parallel efficiency to the point where it is virtually no different from running on a single thread.

To avoid confusion, I'll update the package to print a warning message when the seed is fixed and workers > 1.
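
To make the effect of execution order concrete, here is a single-threaded simulation. It is a sketch under the assumption that per-worker partial results are merged in whatever order the workers finish; it does not reproduce tomotopy's actual parallel scheme. The helper `mergeInOrder` and the `arrival` parameter are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Adds per-worker partial results in the order listed in `arrival`.
// With real threads this order depends on scheduling; here it is passed
// in explicitly so that two possible orders can be compared.
inline float mergeInOrder(const std::vector<float>& partial,
                          const std::vector<std::size_t>& arrival)
{
    float total = 0.0f;
    for (std::size_t idx : arrival) total += partial[idx];
    return total;
}
```

With partial results `{1e8f, 1.0f, -1e8f}`, merging in order 0, 1, 2 gives 0, while merging in order 0, 2, 1 gives 1 in ordinary single-precision arithmetic. Forcing one fixed order would make the result reproducible, but that is the kind of synchronization described above as removing most of the benefit of running in parallel.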


2 participants