
missing features and todos for score estimation #1226

Open
janfb opened this issue Aug 19, 2024 · 3 comments
Assignees
Labels
enhancement New feature or request

Comments

@janfb
Contributor

janfb commented Aug 19, 2024

There are a couple of unsolved problems and enhancements for NPSE:

MAP

MAP estimation uses the score directly to do gradient ascent on the posterior to find the MAP. This is currently not working accurately.
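For context, a minimal sketch of what this is supposed to do, assuming the score_estimator interface quoted further down in this thread (forward(input, condition, time) and t_min); the step size and iteration count are made up:

import torch

def map_via_score_ascent(score_estimator, x_o, theta_init, num_steps=1000, lr=1e-2):
    """Gradient ascent on the posterior using the score directly: the score at a
    small diffusion time approximates grad_theta log p(theta | x_o), so no
    log_prob evaluation or ODE solve is needed."""
    theta = theta_init.clone()
    t_min = torch.tensor([score_estimator.t_min])
    for _ in range(num_steps):
        score = score_estimator.forward(input=theta, condition=x_o, time=t_min)
        theta = theta + lr * score  # ascend along the posterior score
    return theta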

IID sampling

  • iid sampling is implemented as proposed in Geffner et al., i.e., using the iid_bridge and accumulating the individual scores over a batch of iid samples. However, this is not working accurately either. I have not found the source of the error yet; I added a couple of TODOs, e.g., here (a sketch of the factorization the accumulation relies on follows after the snippet):
    # TODO: for the iid setting, self.batch_shape.numel() will be the iid batch.
    # But we don't want to generate num_obs samples, only one sample given the
    # iid batch.
    # TODO: the solution will probably be to distinguish between the iid setting
    # and the batched sampling setting with a flag.
    # TODO: this fixes the iid setting shape problems, but iid inference via
    # iid_bridge is not accurate.
    # num_batch = self.batch_shape.numel()
    # init_shape = (num_batch, num_samples) + self.input_shape
    init_shape = (num_samples,) + self.input_shape  # just use num_samples, not num_batch
    # NOTE: for the IID setting we might need to scale the noise with the iid
    # batch size, as in equation (7) in the paper.
    eps = torch.randn(init_shape, device=self.device)
    mean, std, eps = torch.broadcast_tensors(self.init_mean, self.init_std, eps)
    return mean + std * eps
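The factorization the accumulation relies on is grad_theta log p(theta | x_1, ..., x_N) = sum_i grad_theta log p(theta | x_i) - (N - 1) * grad_theta log p(theta), which only holds exactly as t -> 0; the iid_bridge is the correction for t > 0. A rough sketch of the plain accumulation (prior_score_fn is a hypothetical helper, shapes are simplified):

import torch

def accumulated_posterior_score(score_estimator, theta, x_iid, prior_score_fn, time):
    """Accumulate the individual posterior scores over the iid batch:
        sum_i grad log p(theta | x_i) - (num_obs - 1) * grad log p(theta).
    Exact only as time -> 0; for larger times the iid_bridge correction applies.
    """
    num_obs = x_iid.shape[0]
    individual_scores = torch.stack(
        [
            score_estimator.forward(input=theta, condition=x_i[None], time=time)
            for x_i in x_iid  # x_i[None] keeps a batch dimension of 1
        ]
    )
    return individual_scores.sum(dim=0) - (num_obs - 1) * prior_score_fn(theta)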

Log prob and sampling via CNF

Once trained, we can use the score_estimator to define a probability-flow ODE, e.g., a CNF via zuko, and directly call log_prob and sample on it. At the moment this already happens when constructing the ScorePosterior with sample_with="ode". However, it is a bit all over the place, e.g., log_prob comes from the potential via zuko anyway, and for the sampling we construct a flow on each call. A possible solution to make things clearer is to create an ODEPosterior that could be used by flow matching as well.
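A rough sketch of what such an ODEPosterior could look like; drift, diffusion, t_max, and input_shape are assumed accessors/attributes (they don't necessarily exist under these names), and a plain Euler integration of the probability-flow ODE stands in for the zuko CNF:

import torch

class ODEPosterior:
    """Wraps a trained score estimator and samples by integrating the
    probability-flow ODE d theta / dt = f(theta, t) - 0.5 * g(t)**2 * score
    backwards from t_max to t_min (Euler steps for brevity)."""

    def __init__(self, score_estimator, x_o, num_steps=500):
        self.score_estimator = score_estimator
        self.x_o = x_o
        self.num_steps = num_steps

    def sample(self, num_samples):
        t_max, t_min = self.score_estimator.t_max, self.score_estimator.t_min
        ts = torch.linspace(t_max, t_min, self.num_steps)
        dt = (t_max - t_min) / (self.num_steps - 1)
        theta = torch.randn((num_samples,) + self.score_estimator.input_shape)
        for t in ts[:-1]:
            score = self.score_estimator.forward(
                input=theta, condition=self.x_o, time=t.reshape(1)
            )
            ode_drift = (
                self.score_estimator.drift(theta, t)
                - 0.5 * self.score_estimator.diffusion(t) ** 2 * score
            )
            theta = theta - dt * ode_drift  # integrate backwards in time
        return theta

log_prob would come from the same ODE via the instantaneous change-of-variables formula, which is what the zuko CNF already provides, so sample, log_prob, and the flow-matching case could all live behind this one object.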

Allow transforms for potential

See score_estimator_based_potential, which currently asserts enable_transform=False, i.e., transforms of theta are not supported yet.
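Enabling transforms would require correcting the score for the change of variables, not just mapping theta: with phi = transform(theta), grad_phi log p(phi | x) = (d theta / d phi)^T grad_theta log p(theta | x) + grad_phi log|det d theta / d phi|. A sketch using autograd for both terms, assuming transform follows the torch.distributions.Transform interface:

import torch

def transformed_score(score_fn, transform, phi):
    """Sketch: score of phi = transform(theta), given the score of theta.

    Uses log p(phi | x) = log p(theta | x) + log|det d theta / d phi| with
    theta = transform.inv(phi); the first term is handled by the chain rule
    (a vector-Jacobian product), the second by autograd.
    """
    phi = phi.detach().requires_grad_(True)
    theta = transform.inv(phi)

    # chain-rule term: (d theta / d phi)^T * score(theta)
    base_score = score_fn(theta).detach()
    (chain_term,) = torch.autograd.grad(
        theta, phi, grad_outputs=base_score, retain_graph=True
    )

    # gradient of the log-abs-det-Jacobian correction w.r.t. phi
    log_det = -transform.log_abs_det_jacobian(theta, phi).sum()
    (jac_term,) = torch.autograd.grad(log_det, phi)

    return chain_term + jac_term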

Better converged checks

Unlike the ._converged method in base.py, this method does not reset to the best model; we noticed that this improves performance. Deleting this method will make the C2ST tests fail, because the loss is very stochastic, so resetting might restore an underfitted model. Ideally, we would write a custom ._converged() method that checks whether the loss is still going down for all t.
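A sketch of what such a check could look like: bin the per-sample validation losses by diffusion time and only stop once no bin is still improving. All names and the windowing heuristic here are assumptions, not the current implementation:

import torch

def loss_still_decreasing_per_time(val_losses_per_epoch, times, num_bins=10, window=6):
    """Sketch of a per-time convergence check.

    val_losses_per_epoch: list of per-sample validation losses, one tensor per
        epoch, evaluated on the same validation set (so `times` matches each).
    times: diffusion times at which those per-sample losses were computed.

    Returns True if the binned mean loss is still decreasing in any time bin,
    i.e. training should continue.
    """
    recent = torch.stack(val_losses_per_epoch[-window:])  # (window, num_val_samples)
    edges = torch.linspace(float(times.min()), float(times.max()), num_bins)
    bin_ids = torch.bucketize(times, edges)
    for b in bin_ids.unique():
        per_epoch = recent[:, bin_ids == b].mean(dim=1)  # mean loss per epoch in bin b
        # crude trend estimate: compare the first and second half of the window
        if per_epoch[window // 2 :].mean() < per_epoch[: window // 2].mean():
            return True
    return False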

@janfb janfb added the enhancement New feature or request label Aug 19, 2024
@gmoss13 gmoss13 self-assigned this Sep 6, 2024
@gmoss13
Contributor

gmoss13 commented Sep 6, 2024

MAP

MAP estimation uses the score directly to do gradient ascent on the posterior to find the MAP. This is currently not working accurately.

Is it the case that it does not work accurately, or that this doesn't run at all? Trying to find the MAP with gradient ascent requires differentiating through the backward method of zuko CNFs, which causes autograd errors for me.

Regardless, even if we can backprop through log_prob as constructed with CNFs, this would be incredibly slow as evaluating the log prob in this way requires an ODE solve. I am wondering if we could instead find an approximate MAP by using a variational lower bound of the log prob, e.g. Eq. 11 of Maximum Likelihood Training of Score-Based Diffusion Models. This way we don't need to compute a lot of ODE solves. @manuelgloeckler do you have any thoughts on this?

@gmoss13
Contributor

gmoss13 commented Sep 11, 2024

Update after talking to @manuelgloeckler: the easiest way to calculate the MAP here would be to use the score directly at a time t = epsilon, instead of calculating and taking gradients through the exact log_prob, which, as stated above, would be really inefficient. I will implement this soon.

@janfb
Contributor Author

janfb commented Sep 11, 2024

Update after talking to @manuelgloeckler: the easiest way to calculate the MAP here would be to use the score directly at a time t = epsilon, instead of calculating and taking gradients through the exact log_prob, which, as stated above, would be really inefficient. I will implement this soon.

I think that's actually what @michaeldeistler had implemented already. It's in the backup branch, here:

sbi/sbi/utils/sbiutils.py

Lines 946 to 954 in 2b233ce

        optimize_inits.requires_grad_(False)  # type: ignore
        gradient = potential_fn.gradient(optimize_inits)
    except (NotImplementedError, AttributeError):
        optimize_inits.requires_grad_(True)  # type: ignore
        probs = potential_fn(optimize_inits).squeeze()
        loss = probs.sum()
        loss.backward()
        gradient = optimize_inits.grad
    assert isinstance(gradient, Tensor), "Gradient must be a tensor."

and then, in the case of the score-based potential, it would just use the gradient directly from here:

def gradient(
    self, theta: Tensor, time: Optional[Tensor] = None, track_gradients: bool = True
) -> Tensor:
    r"""Returns the potential function gradient for score-based methods.

    Args:
        theta: The parameters at which to evaluate the potential.
        time: The diffusion time. If None, then `t_min` of the
            self.score_estimator is used (i.e. we evaluate the gradient of the
            actual data distribution).
        track_gradients: Whether to track gradients.

    Returns:
        The gradient of the potential function.
    """
    if time is None:
        time = torch.tensor([self.score_estimator.t_min])

    if self._x_o is None:
        raise ValueError(
            "No observed data x_o is available. Please reinitialize "
            "the potential or manually set self._x_o."
        )

    with torch.set_grad_enabled(track_gradients):
        if not self.x_is_iid or self._x_o.shape[0] == 1:
            score = self.score_estimator.forward(
                input=theta, condition=self.x_o, time=time
            )

Or are you referring to yet a different approach?
