Different results with each run #883

nucleosynthesis · 2023-12-16T11:29:25Z

As pointed out by @adavidzh, when running HybridNew with the same (default) seed, the limit is different on each run. The RooFit seed is set, but it is not clear that ROOT's gRandom is.

Possibly related to this line

HiggsAnalysis-CombinedLimit/src/HybridNew.cc

Line 1682 in ec31c30

TRandom3 *rnd = new TRandom3();

Does anyone know (maybe @guitargeek?) if this could be the cause and how to resolve it?
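To make the seeding question concrete, here is a minimal sketch of the difference between a fixed seed and an entropy seed. It uses the standard library's `std::mt19937` as a stand-in rather than ROOT, so it says nothing definitive about `TRandom3` or `gRandom`. The helper names (`firstDraws`, `reproducibleDraws`, `entropyDraws`) are hypothetical. As far as I know, ROOT's `TRandom3` default constructor also uses a fixed seed, and a seed of 0 requests a unique one, but treat that as an assumption to check against the ROOT documentation.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Draw the first n values from a generator. The generator is taken by value,
// so the caller's copy is left untouched.
std::vector<std::uint32_t> firstDraws(std::mt19937 gen, int n) {
    assert(n >= 0);
    std::vector<std::uint32_t> out;
    for (int i = 0; i < n; ++i) out.push_back(static_cast<std::uint32_t>(gen()));
    return out;
}

// Fixed seed: every run of the program produces the same sequence.
std::vector<std::uint32_t> reproducibleDraws(std::uint32_t seed, int n) {
    return firstDraws(std::mt19937(seed), n);
}

// Entropy seed: the sequence normally differs from run to run, which is the
// kind of behaviour reported in this issue.
std::vector<std::uint32_t> entropyDraws(int n) {
    std::random_device rd;
    return firstDraws(std::mt19937(rd()), n);
}
```

If every generator in the chain is seeded the way `reproducibleDraws` is, repeated runs should agree; a single generator seeded the way `entropyDraws` is would be enough to make them differ.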

guitargeek · 2023-12-16T18:10:36Z

Hi, off the top of my head I have no idea. How can this be reproduced? Then I can help figure out what's going on.

nucleosynthesis · 2023-12-18T15:46:43Z

Running the following will show the behaviour

combine template-analysis-datacard.txt -M HybridNew --LHCmode LHC-limits --rMax 2.0 --clsAcc 0.01

The datacard can be found in

data/tutorials/CAT23001/template-analysis-datacard.txt

nucleosynthesis · 2023-12-18T16:23:14Z

Some updates. It seems to be related to the "on the fly" workspace creation. If one first creates the binary workspace and uses that as input, then repeated runs produce the same output, e.g.:

With the text file directly:

combine template-analysis-datacard.txt -M HybridNew --LHCmode LHC-limits --rMax 2.0 --clsAcc 0.01
 -- Hybrid New --
Limit: r < 0.335698 +/- 0.0143468 @ 95% CL
Done in 0.15 min (cpu), 0.15 min (real)

combine template-analysis-datacard.txt -M HybridNew --LHCmode LHC-limits --rMax 2.0 --clsAcc 0.01
 -- Hybrid New --
Limit: r < 0.356193 +/- 0.0349485 @ 95% CL
Done in 0.18 min (cpu), 0.18 min (real)

First building the workspace:

text2workspace.py template-analysis-datacard.txt

combine template-analysis-datacard.root -M HybridNew --LHCmode LHC-limits --rMax 2.0 --clsAcc 0.01
 -- Hybrid New --
Limit: r < 0.346362 +/- 0.0134581 @ 95% CL
Done in 0.24 min (cpu), 0.25 min (real)

combine template-analysis-datacard.root -M HybridNew --LHCmode LHC-limits --rMax 2.0 --clsAcc 0.01 
 -- Hybrid New --
Limit: r < 0.346362 +/- 0.0134581 @ 95% CL
Done in 0.24 min (cpu), 0.24 min (real)

Could the binary file creation have something to do with it?

nucleosynthesis · 2023-12-19T10:12:40Z

Digging a bit more, it seems like this might be intentional. In these lines

HiggsAnalysis-CombinedLimit/src/Combine.cc

Lines 311 to 322 in ec31c30

    
    TString tmpDir = "", tmpFile = "", pwd(gSystem->pwd());
    if (makeTempDir_) {
        tmpDir = "roostats-XXXXXX"; tmpFile = "model";
        mkdtemp(const_cast<char *>(tmpDir.Data()));
        gSystem->cd(tmpDir.Data());
        garbageCollect.path = tmpDir.Data(); // request that we delete this dir when done
    } else if (!hlfFile.EndsWith(".hlf") && !hlfFile.EndsWith(".root")) {
        char buff[99]; snprintf(buff, 98, "roostats-XXXXXX");
        int fd = mkstemp(buff); close(fd);
        tmpFile = buff;
        unlink(tmpFile); // this is to be deleted, since we'll use tmpFile+".root"
    }

a temporary file is created to store the workspace if the input is the datacard. This allows one to run multiple commands in parallel without worrying about them overwriting each other.

However, these C++ calls (mkdtemp and mkstemp) cannot be seeded for reproducibility, and since it seems RooFit's random number generator is tied to TRandom (which in turn is tied to this one; is that right, @guitargeek?), we can't avoid non-reproducible results when running the same command over and over on the .txt file.

@adavidzh, I think we just need to make it clear to users that commands that use toys will give different but consistent results each time, unless they first convert the datacard to a binary file and use that as the input.
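As a small illustration of the temporary-name mechanism quoted above, the sketch below wraps `mkstemp` with the same `roostats-XXXXXX` template. The helper name `makeTempName` and its `dir` parameter are hypothetical and not part of combine. It only shows that each call yields a different name; whether and how that name influences the random-number state is not established in this thread.

```cpp
#include <cassert>
#include <stdlib.h>   // mkstemp (POSIX)
#include <string>
#include <unistd.h>   // close (POSIX)
#include <vector>

// Hypothetical helper: create an empty, uniquely named file in `dir` using
// the "roostats-XXXXXX" template and return its path (empty string on error).
// The caller is responsible for unlinking the file afterwards.
std::string makeTempName(const std::string &dir = ".") {
    assert(!dir.empty());
    const std::string tmpl = dir + "/roostats-XXXXXX";
    // mkstemp overwrites the trailing XXXXXX in place, so it needs a
    // writable, NUL-terminated buffer.
    std::vector<char> buff(tmpl.begin(), tmpl.end());
    buff.push_back('\0');
    int fd = mkstemp(buff.data());
    if (fd == -1) return std::string();
    close(fd);
    return std::string(buff.data());
}
```

Calling `makeTempName("/tmp")` twice while both files exist returns two different paths, and there is no argument through which a seed could be supplied to fix the generated suffix.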

adavidzh · 2023-12-24T16:47:30Z

a temporary file is created to store the workspace if the input is the datacard. This allows one to run multiple commands in parallel without worrying about each one writing over each other.

I would argue that we should be able to create a unique - per job - deterministic identifier. I.e., we do not need a random thing, just a unique one, that can e.g., be made out of a hash of things like the datacard and command line options.
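A minimal sketch of what such a deterministic identifier could look like, assuming the datacard text and the command-line arguments are available as strings. The function names (`fnv1a64`, `deterministicStem`) are hypothetical and not part of combine, and FNV-1a is just one convenient choice of hash whose values do not depend on the platform.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// 64-bit FNV-1a. Unlike std::hash, the result is the same on every platform.
// Passing a previous hash as `h` chains several inputs together.
std::uint64_t fnv1a64(const std::string &data,
                      std::uint64_t h = 0xcbf29ce484222325ULL) {
    for (unsigned char c : data) {
        h ^= c;
        h *= 0x100000001b3ULL;
    }
    return h;
}

// Hypothetical helper: derive a file stem from the datacard contents and the
// command-line arguments, so identical jobs always get the same name.
std::string deterministicStem(const std::string &datacardText,
                              const std::vector<std::string> &args) {
    std::uint64_t h = fnv1a64(datacardText);
    for (const auto &a : args) {
        h = fnv1a64(a, h);
        h = fnv1a64(std::string(1, '\0'), h);  // separator: {"ab","c"} != {"a","bc"}
    }
    char buf[32];
    int n = std::snprintf(buf, sizeof(buf), "roostats-%016llx",
                          static_cast<unsigned long long>(h));
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
    return std::string(buf);
}
```

The same datacard and options always map to the same stem, while changing any option changes it. Two identical jobs running at the same time would still share a name, which this sketch does not address.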

nucleosynthesis · 2023-12-24T16:58:22Z

To be clear, as far as I can tell this is done by ROOT rather than something we can control. Perhaps someone more expert in the inner workings of ROOT can see if it can be bypassed.


nucleosynthesis · 2023-12-24T17:07:22Z

Hint @guitargeek 😉

adavidzh · 2023-12-24T17:27:09Z

I don't get it @nucleosynthesis: the code you reference (with mkdtemp and mkstemp) is combine's.

nucleosynthesis · 2023-12-24T17:38:16Z

Right,

https://man7.org/linux/man-pages/man3/mkstemp.3.html

https://man7.org/linux/man-pages/man3/mkdtemp.3.html

are what we use to guarantee uniqueness, but it seems they use the same seed that RooFit will then use.

Using the same datacard and command would imply the same hash, so I'm not sure that would work with concurrent identical command lines (which we can't do currently).

Too much of this is my own speculation (based on some minimal testing) though, so perhaps there is simply a way to avoid the clash between the seed that RooFit uses and the one being triggered by the call to mkstemp.
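One possible way to reconcile a deterministic name with concurrent identical commands is to claim the name atomically and fall back to a numbered suffix when it is already taken. The sketch below does this with POSIX `open` and `O_CREAT | O_EXCL`. The helper `claimName` is hypothetical and not part of combine, and the suffix a job receives still depends on which job arrives first.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>    // open, O_CREAT, O_EXCL (POSIX)
#include <string>
#include <unistd.h>   // close (POSIX)

// Hypothetical helper: try to create `stem`, then `stem-1`, `stem-2`, ...
// O_EXCL makes the creation fail if the file already exists, so two
// processes can never claim the same name. Returns the claimed path, or an
// empty string if nothing could be created.
std::string claimName(const std::string &stem, int maxAttempts = 100) {
    assert(!stem.empty());
    for (int i = 0; i < maxAttempts; ++i) {
        const std::string name =
            (i == 0) ? stem : stem + "-" + std::to_string(i);
        int fd = open(name.c_str(), O_CREAT | O_EXCL | O_WRONLY, 0600);
        if (fd != -1) {
            close(fd);
            return name;
        }
        if (errno != EEXIST) break;  // a real error, not a name clash
    }
    return std::string();
}
```

With this scheme a single job always gets the plain deterministic name, and only a second identical job running at the same time is pushed to a suffixed name.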

kcormi · 2024-04-10T16:07:58Z

Just for completeness, I don't think this is limited to toy-based methods like HybridNew. I see the same effect (slightly different numerical values) when running the asymptotic Significance method from the datacards, but if I produce the workspace first and rerun, the results are identical. The differences are negligibly small, but it appears to be the same issue.

nucleosynthesis added the question label Dec 16, 2023
