Hi Morvan,
I am trying to apply your Batch Normalization tutorial to your DDPG tutorial, but I am having a hard time understanding some of the details.
One of my problems is the following:
```python
# Actor update: maximize Q by minimizing its negative.
self.a_loss = -tf.reduce_mean(q)
self.atrain = tf.train.AdamOptimizer(LR_A).minimize(self.a_loss, var_list=a_params)

# BN moving mean/variance updates collected by tf.layers.batch_normalization.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    with tf.control_dependencies(target_update):  # soft replacement happens here
        # (1 - self.Done) zeroes the bootstrap term on terminal transitions;
        # since Done is 0 or 1, it only needs to be applied once.
        self.q_target = self.R + GAMMA * (1 - self.Done) * q_
        self.td_error = tf.losses.mean_squared_error(labels=self.q_target, predictions=q)
        self.ctrain = tf.train.AdamOptimizer(LR_C).minimize(self.td_error, var_list=c_params)
```
Since you said the update_ops dependency is needed, I imagined it should look something like the above, but then atrain is not covered by the dependency (unless my version is incorrect in some other way, of course).
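For example, is something like this closer to what you meant? Just a sketch of what I have in mind, moving both training ops under the update_ops dependency and reusing the names from the snippet above (q, q_, target_update, a_params, c_params, LR_A, LR_C, GAMMA); I may well be wrong:

```python
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)  # BN moving-average updates

with tf.control_dependencies(update_ops):
    # Now the actor update also waits for the BN statistics to refresh.
    self.a_loss = -tf.reduce_mean(q)  # maximize Q
    self.atrain = tf.train.AdamOptimizer(LR_A).minimize(self.a_loss, var_list=a_params)

    with tf.control_dependencies(target_update):  # soft replacement, as in your code
        self.q_target = self.R + GAMMA * (1 - self.Done) * q_
        self.td_error = tf.losses.mean_squared_error(labels=self.q_target, predictions=q)
        self.ctrain = tf.train.AdamOptimizer(LR_C).minimize(self.td_error, var_list=c_params)
```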
Furthermore, if you could give some pointers on how to implement it in your DDPG implementation, that would be nice.
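To make the question concrete, this is roughly how I picture adding BN inside the actor network, following your Batch Normalization tutorial. It is a hypothetical sketch, not your actual code: the `self.is_training` placeholder and the layer size are my own guesses.

```python
import tensorflow as tf

# Hypothetical: self.is_training would be a tf.placeholder(tf.bool) that I feed
# True on learning steps and False when choosing actions, so BN uses batch
# statistics during training and the moving averages at action time.
def _build_a(self, s, scope, trainable):
    with tf.variable_scope(scope):
        net = tf.layers.dense(s, 30, activation=None, trainable=trainable)
        net = tf.layers.batch_normalization(net, training=self.is_training, trainable=trainable)
        net = tf.nn.relu(net)
        a = tf.layers.dense(net, self.a_dim, activation=tf.nn.tanh, trainable=trainable)
        return tf.multiply(a, self.a_bound, name='scaled_a')
```

Is placing the BN layer between the dense layer and its activation the right spot here, or should it go elsewhere in the actor and critic?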
Jan