Truncated proposals for SNPE (TSNPE) implementation #1354
-
Replies: 3 comments
-
Hi there, thanks for raising this! A quick question: what is …? Also, what is the dimensionality of theta and x? Michael
-
Hi @ali-akhavan89,

In general, when you sample from a posterior that uses a large embedding net or is conditioned on large data, the samples are generated through forward passes through the underlying embedding net and density estimator, so the custom embedding net is always used. If you draw 1,000,000 samples, they all accumulate on the GPU, which is the likely cause of the out-of-memory error. If you really need that many samples, a workaround is to draw them in a for-loop, e.g. 10 x 100,000 or 100 x 10,000 samples, and move each batch to the CPU inside the loop (see the sketch below).

Regarding the device-mismatch error after moving the posterior estimator to the CPU: you need to make sure that everything lives on the CPU then, i.e. the prior, the posterior object, the data, and the net. Ideally, you just create a new posterior object that lives entirely on the CPU.
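For illustration, a minimal sketch of the batched-sampling workaround. It assumes a trained `posterior` and an observation `x_o` already exist on the GPU; the total count and batch size are only examples.

```python
import torch

# Assumed to exist from the earlier rounds: `posterior` (trained sbi posterior)
# and `x_o` (the observation), both living on the GPU.
num_total = 1_000_000   # illustrative totals; adjust to what you actually need
batch_size = 100_000

samples_cpu = []
for _ in range(num_total // batch_size):
    # Each batch is generated on the GPU via forward passes through the
    # embedding net and density estimator ...
    batch = posterior.sample((batch_size,), x=x_o, show_progress_bars=False)
    # ... and is immediately moved to the CPU so GPU memory does not accumulate.
    samples_cpu.append(batch.cpu())

samples = torch.cat(samples_cpu, dim=0)  # shape: (num_total, theta_dim), on CPU
```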
Does this help? Cheers,
-
Thank you. We use licensed software (Vensim) that makes it challenging for us to share the simulator. I am in the process of developing a minimal example using only Python, which I hope to share soon. We are dealing with time-series data, which is why we rely on the GPU (mostly RNN networks that we borrowed from BayesFlow and revised based on the feedback we received from this community).

Regarding the comments: the problems sometimes have 12-24 dimensions and become more complex as the number of time points goes above 200. I have also been working on the CPU solution, but I have struggled a bit, which I shared in #1368. In low-dimensional problems, using 10,000 samples for the accept_reject_fn works fine. I'm not sure why the default value was set to 1,000,000, which confused me and kept me from testing smaller sample sizes. In high-dimensional problems (12 dimensions, 300 time points, and 3 outcome time series), however, a sample size of 10,000 already requires 44 GB of RAM on CUDA, which I'm still investigating.

Thanks again for the suggestions, and I'll keep you updated. Best,
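For reference, a hedged sketch of lowering the 1,000,000-sample default mentioned above in the TSNPE proposal-truncation step. The `num_samples_to_estimate_support` keyword of `get_density_thresholder` is an assumption about the current sbi API and may be named differently or absent in other versions; `posterior` and `prior` are assumed to exist from the previous round.

```python
from sbi.utils import get_density_thresholder, RestrictedPrior

# Build the accept/reject function with fewer support-estimation samples than the
# library default (assumed to be 1,000,000), to reduce peak GPU memory.
accept_reject_fn = get_density_thresholder(
    posterior,
    quantile=1e-4,
    num_samples_to_estimate_support=10_000,  # assumed keyword; check your sbi version
)

# Truncated proposal for the next TSNPE round.
proposal = RestrictedPrior(prior, accept_reject_fn, sample_with="rejection")
```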