make `pseudodata_table` correctly deal with multiple replicas #2034

RoyStegeman · 2024-04-03T13:50:52Z

For the thcovmat alphas stuff it makes a big difference whether I use the central data or the average over data replicas. While looking into it I want to use this function with multiple replicas.

scarlehoff · 2024-04-11T08:31:02Z

validphys2/src/validphys/n3fit_data.py

@@ -343,7 +343,7 @@ def replica_nnseed_fitting_data_dict(replica, exps_fitting_data_dict, replica_nn

 replicas_nnseed_fitting_data_dict = collect("replica_nnseed_fitting_data_dict", ("replicas",))
 groups_replicas_indexed_make_replica = collect(
-    "indexed_make_replica", ("group_dataset_inputs_by_experiment", "replicas")
+    "indexed_make_replica", ("replicas", "group_dataset_inputs_by_experiment")


Why is the change of order necessary?

I think it makes what we do in pseudodata_table more readable. There we group the entries of groups_replicas_indexed_make_replica that correspond to a given replica. I think that's easier to understand the way it's done now than if we had to take, e.g., index 0 and then skip a number of indexes equal to the number of groups to get the second input corresponding to the same replica.
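To make the ordering argument concrete, here is a toy sketch in plain Python. It assumes that the flattened list is ordered with the first key of the tuple as the outer loop, which is what the slicing in this PR suggests but is not checked against the collect implementation here. The replica numbers and group names are made up for illustration.

```python
from itertools import product

replicas = [1, 2]         # hypothetical replica numbers
groups = ["A", "B", "C"]  # hypothetical data group names

# Replica-major layout: all groups of replica 1 first, then all groups of replica 2.
replica_major = list(product(replicas, groups))

# Group-major layout: entries belonging to one replica are interleaved with the others.
group_major = [(rep, grp) for grp, rep in product(groups, replicas)]

# In the replica-major layout, one contiguous slice holds everything for replica 1.
first_replica_chunk = replica_major[: len(groups)]

print(replica_major)
print(group_major)
```

In the replica-major layout a plain slice of length len(groups) selects one replica, which is the property the new ordering relies on.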

scarlehoff · 2024-04-11T08:32:55Z

validphys2/src/validphys/n3fit_data.py

+    df = [
+        pd.concat(groups_replicas_indexed_make_replica[i : i + groups_per_replica])
+        for i in range(0, len(groups_replicas_indexed_make_replica), groups_per_replica)
+    ]


Why can't you achieve this with a reshape or permutation of groups_replicas_indexed_make_replica?

Because groups_replicas_indexed_make_replica is a flat list of indexed_make_replica outputs, containing number_of_replicas x number_of_datagroups dataframes. It's not a really clean input.

Here I group the list items (all different data groups) that correspond to the same replica into a single dataframe for each replica.
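As a rough illustration of that grouping, the sketch below builds a toy stand-in for groups_replicas_indexed_make_replica and applies the same chunked pd.concat as in the diff. The sizes, index names, and values are invented for the example and are not taken from validphys.

```python
import pandas as pd

n_replicas = 2          # hypothetical number of replicas
groups_per_replica = 3  # hypothetical number of data groups

# Toy stand-in for the flat input list, in replica-major order:
# one small dataframe per (replica, group) pair.
flat = []
for rep in range(1, n_replicas + 1):
    for grp in range(groups_per_replica):
        index = pd.MultiIndex.from_tuples(
            [(f"group{grp}", f"dataset{grp}", 0), (f"group{grp}", f"dataset{grp}", 1)],
            names=["group", "dataset", "id"],
        )
        flat.append(pd.DataFrame({"data": [float(rep), float(rep)]}, index=index))

# Same pattern as in the diff: concatenate each run of groups_per_replica
# consecutive frames into one dataframe per replica.
per_replica = [
    pd.concat(flat[i : i + groups_per_replica])
    for i in range(0, len(flat), groups_per_replica)
]

print(len(per_replica))
print(per_replica[0])
```

Each element of per_replica then holds all data groups of one replica, with the index labels of the original frames intact.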

Just complicating your life here, but would it be possible to do something along the lines of

np.array(groups_replicas_indexed_make_replica).reshape(replicas, groups_per_replica) ?

(or the other way around)

No, because it's a list of dataframes and I need to retain the label information.
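A small sketch of why the labels matter, using two made-up dataframes: converting a list of dataframes to a NumPy array yields a plain ndarray with no index attached, whereas pd.concat carries the index labels along. The frame contents and label names are hypothetical.

```python
import numpy as np
import pandas as pd

# Two toy frames with distinct, meaningful index labels (hypothetical names).
df_a = pd.DataFrame({"data": [1.0, 2.0]}, index=["ds_a_0", "ds_a_1"])
df_b = pd.DataFrame({"data": [3.0, 4.0]}, index=["ds_b_0", "ds_b_1"])

# np.array returns a plain ndarray; the index labels are not carried along.
as_array = np.array([df_a, df_b])

# pd.concat keeps the labels of every input frame.
combined = pd.concat([df_a, df_b])

print(type(as_array).__name__)
print(list(combined.index))
```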

RoyStegeman requested review from scarlehoff and andreab1997 April 3, 2024 13:50

scarlehoff reviewed Apr 11, 2024

RoyStegeman and others added 5 commits April 11, 2024 10:32

make pseudodata_table correctly deal with multiple replicas

5432e68

Update validphys2/src/validphys/n3fit_data.py

4040eb3

Co-authored-by: Juan M. Cruz-Martinez <[email protected]>

clarify inline comment in pseudodata_table

1b46e3b

Update n3fit_data.py

2722aa7

Update validphys2/src/validphys/n3fit_data.py

ddf7702

Co-authored-by: Juan M. Cruz-Martinez <[email protected]>

RoyStegeman force-pushed the rs-quickfix branch from 44f0068 to ddf7702 Compare April 11, 2024 09:32

RoyStegeman merged commit 0207a00 into master Apr 11, 2024
6 checks passed

RoyStegeman deleted the rs-quickfix branch April 11, 2024 14:16

scarlehoff mentioned this pull request May 20, 2024

To Do for 4.0.10 #1854

Open

33 tasks
