Commit

For mpi: v^0 set to min_(s, a) r(s, a) / (1-beta) to guarantee convergence

oyamad committed Aug 25, 2015
1 parent 487c12d commit b0d754d
Showing 1 changed file with 6 additions and 3 deletions.
9 changes: 6 additions & 3 deletions quantecon/markov/mdp.py
@@ -598,8 +598,11 @@ def solve(self, method='policy_iteration',
             Solution method.
         v_init : array_like(float, ndim=1), optional(default=None)
-            Initial value function, of length n. If None, set v_init(s)
-            = max_a r(s, a).
+            Initial value function, of length n. If None, `v_init` is
+            set such that v_init(s) = max_a r(s, a) for value iteration
+            and policy iteration; for modified policy iteration,
+            v_init(s) = min_(s', a) r(s', a)/(1 - beta) to guarantee
+            convergence.
         epsilon : scalar(float), optional(default=None)
             Value for epsilon-optimality. If None, the value stored in
@@ -733,7 +736,7 @@ def midrange(z):

         v = np.empty(self.num_states)
         if v_init is None:
-            self.s_wise_max(self.R, out=v)
+            v[:] = self.R[self.R > -np.inf].min() / (1 - self.beta)
         else:
             v[:] = v_init
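Why the new default guarantees convergence: modified policy iteration
converges when started from an initial value v^0 satisfying v^0 <= T v^0,
where T is the Bellman operator. The constant function
v^0(s) = min_(s', a) r(s', a) / (1 - beta) satisfies this, because
T v^0(s) = max_a r(s, a) + beta * min r / (1 - beta)
>= min r + beta * min r / (1 - beta) = v^0(s).
A minimal numpy sketch checking the bound on a toy MDP; the rewards R,
transitions Q, and beta below are made up for illustration, and only the
v0 line mirrors the commit:

import numpy as np

beta = 0.95
# Toy rewards r(s, a), shape (n, m); -inf marks an infeasible action
R = np.array([[5.0, 10.0],
              [-1.0, -np.inf]])
# Toy transition probabilities Q[s, a, s'], shape (n, m, n)
Q = np.array([[[0.5, 0.5], [0.0, 1.0]],
              [[1.0, 0.0], [0.5, 0.5]]])

# Initialization from the commit: min over feasible (s, a) pairs,
# turned into an infinite-horizon lower bound
v0 = np.full(R.shape[0], R[R > -np.inf].min() / (1 - beta))

# One application of the Bellman operator T
Tv0 = (R + beta * (Q @ v0)).max(axis=1)

# v0 <= T v0 holds, so the MPI iterates improve monotonically
assert np.all(v0 <= Tv0)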

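The docstring change only affects the default: callers can still supply
v_init explicitly. A hypothetical usage sketch, assuming the DiscreteDP
class defined in this file (the result attribute names v and sigma are an
assumption here) and reusing R, Q, beta from the sketch above:

from quantecon.markov import DiscreteDP

ddp = DiscreteDP(R, Q, beta)

# With v_init=None, MPI now starts from min r / (1 - beta)
# rather than max_a r(s, a)
res = ddp.solve(method='modified_policy_iteration', epsilon=1e-4)
print(res.sigma)  # computed policy (assumed attribute name)
print(res.v)      # approximate optimal value function (assumed attribute name)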