Error: Object 'reps' Not Found During Parallel Execution #32

Open
timverlaan opened this issue Nov 28, 2024 · 0 comments

Description:
I encountered an error when running microsynth with n.cores > 1 (parallel execution enabled). The issue appears to be related to the get.stats1 function, specifically the handling of the reps object. The error does not occur when using single-core execution (n.cores = 1), which suggests it is linked to the way objects are passed or handled in parallel processing.

Steps to Reproduce:
Below is an example of the code used:

microsynth_low_Crime <- microsynth(
  data = df_low,
  idvar = "GG_AREA_ID",
  intvar = "GG_TREATMENT_LOW",
  timevar = "GG_TIME",
  start.pre = 1,
  end.pre = (86 - cutoff_point),
  end.post = (89 - cutoff_point),
  match.out = c("Crime"),
  perm = 500,
  result.var = c("Crime"),
  use.survey = TRUE,
  confidence = 0.95,
  n.cores = 10,  # Parallel execution enabled
  test = "upper",
  check.feas = TRUE,
  use.backup = TRUE,
  omnibus.var = FALSE  # Explicitly set to FALSE
)

Observed Behaviour:
With n.cores = 10, the following error is raised during the parallelised calculation of survey statistics for permutation groups:

Calculating survey statistics for end.post = 89...
Completed survey statistics for main weights: Time = 0.04
Calculating survey statistics for permutation groups...
Parallelizing with n.cores = 10...
Error in get.stats1(data, w.tmp, Inter.tmp, mse.tmp, result.var, end.pre = end.pre,  : 
  object 'reps' not found

The same code runs successfully when n.cores = 1, suggesting the issue lies with the parallelisation.

Likely Cause:
From inspecting the get.stats1 function (via getAnywhere(get.stats1)), it appears the reps object is defined in the main weights computation but is not propagated correctly to worker processes during parallel execution. Since reps is critical for group assignment, its absence in parallelised runs results in the error.
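The general mechanism can be checked without microsynth. The sketch below is only an illustration under assumptions: the reps vector is made up, and nothing here reflects the package's internals. On Windows the parallel package cannot fork, so workers are fresh R sessions that start without the caller's objects.

```r
library(parallel)

# Hypothetical group assignment, defined in the calling session only
reps <- c(1L, 2L, 1L, 2L)

cl <- makeCluster(2)  # PSOCK workers: separate R sessions
# Ask each worker whether it can see an object called 'reps'
visible_on_workers <- unlist(clusterEvalQ(cl, exists("reps")))
stopCluster(cl)

print(exists("reps"))      # lookup in the calling session
print(visible_on_workers)  # lookup on each worker
```

The calling session finds reps, while the workers are expected to report that no such object exists unless it has been exported or passed to them.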

Expected Behaviour:
Parallel execution should replicate the behaviour of single-core execution, ensuring that all necessary objects, such as reps, are properly passed to the worker processes or recreated locally in each of them.
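A check along the following lines could guard this expectation. It is a minimal sketch with a made-up statistic (group_size) and a made-up reps vector, not code from microsynth; it passes reps as an argument so both paths receive the same input.

```r
library(parallel)

# Hypothetical statistic; 'reps' is an argument, not a free variable
group_size <- function(g, reps) sum(reps == g)

reps <- c(1L, 2L, 1L, 2L, 1L)  # hypothetical group assignment

# Single-core path
serial_res <- vapply(1:2, group_size, integer(1), reps = reps)

# Parallel path on two PSOCK workers
cl <- makeCluster(2)
parallel_res <- unlist(parLapply(cl, 1:2, group_size, reps = reps))
stopCluster(cl)

# Both paths should return identical results
stopifnot(identical(serial_res, parallel_res))
```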

Suggestions for Fix:
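Recreate reps in Each Worker: Modify the code to ensure reps is computed within each worker process.

# exists() is checked first because is.null(reps) itself errors when reps is undefined
if (!exists("reps") || is.null(reps)) {
    reps <- assign.groups(as.numeric(treat), G = G.tmp)
}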

Pass reps Explicitly: Use a mechanism such as parallel::clusterExport to copy reps to each worker process.

cl <- parallel::makeCluster(n.cores)
parallel::clusterExport(cl, varlist = c("reps"), envir = environment())
# ... run the parallel computation, then shut the workers down
parallel::stopCluster(cl)
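As a self-contained illustration of the export approach, the sketch below uses a made-up perm_stat function and a made-up reps vector; how microsynth actually sets up its cluster is not shown in this report, so treat the structure as an assumption.

```r
library(parallel)

# Hypothetical per-group statistic that reads 'reps' as a free variable
perm_stat <- function(i) sum(reps == i)

run_parallel <- function(n.cores = 2) {
  reps <- c(1L, 2L, 1L, 2L, 1L)  # hypothetical group assignment, local to this function
  cl <- makeCluster(n.cores)
  on.exit(stopCluster(cl), add = TRUE)
  # Copy 'reps' from this function's frame into each worker's global environment
  clusterExport(cl, varlist = "reps", envir = environment())
  unlist(parLapply(cl, 1:2, perm_stat))
}

res <- run_parallel()
print(res)
```

The envir = environment() argument matters here: by default clusterExport looks in the global environment, which would miss a reps that exists only inside the calling function.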

Adjust Parallelisation Scope: If reps is not inherently tied to parallelisation, consider restructuring the computation to avoid requiring reps in parallelised sections.
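One possible reading of this suggestion is to do everything that depends on reps in the calling process and send the workers only the derived pieces. The sketch below assumes made-up reps and values vectors and is not taken from microsynth.

```r
library(parallel)

reps   <- c(1L, 2L, 1L, 2L, 1L)  # hypothetical group assignment
values <- c(5, 3, 8, 1, 4)       # hypothetical per-unit outcomes

# 'reps' is used here, in the calling process, to split the data into groups
groups <- split(values, reps)

cl <- makeCluster(2)
# Workers receive only the already-split groups and never look up 'reps'
group_means <- parSapply(cl, groups, mean)
stopCluster(cl)

print(group_means)
```

With this layout the parallel section has no free variables at all, so nothing needs to be exported to the workers.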

Environment:
R Version: R version 4.4.0
Operating System: Windows 11

Additional Notes:
This issue may be affecting other parallelised computations in the package. Disabling parallelisation (n.cores = 1) serves as a workaround but significantly increases computation time for larger datasets.
