
Issues Running MAPPO #38

Open
roggirg opened this issue Jul 18, 2022 · 11 comments · Fixed by #39
Comments

@roggirg

roggirg commented Jul 18, 2022

Hi Folks,

I'm trying to run "on-policy PPO" using `python examples/on_policy_files/nocturne_runner.py algorithm=ppo`, and there are a couple of issues I'm encountering.

  • `algo` vs. `algorithm`: the `config.yml` file uses `algorithm`, whereas the script uses `cfg.algo`. Switching `algo` to `algorithm` seems to fix the issue.
  • `wandb_name` seems to be missing from the cfg. To make it work, I just disabled use of wandb.
  • The wrapper environment calls `len(self.vehicles)` on line 30, which throws `AttributeError: 'BaseEnv' object has no attribute 'vehicles'`. Replacing `self.vehicles` with `self.controlled_vehicles` seems to solve the issue. Is this the correct way to fix it?
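For anyone hitting the same `AttributeError`, the third fix boils down to counting the vehicles the policy actually controls. A minimal sketch (the classes below are stand-ins for illustration, not the real Nocturne code):

```python
# Stand-in sketch of the wrapper fix; the real BaseEnv and wrapper live in
# the Nocturne repo, and these toy classes only illustrate the change.

class BaseEnv:
    def __init__(self):
        # BaseEnv exposes controlled_vehicles, not vehicles.
        self.controlled_vehicles = ["veh_0", "veh_1", "veh_2"]

class OnPolicyWrapper(BaseEnv):
    @property
    def num_agents(self):
        # Before: len(self.vehicles)            -> AttributeError
        # After:  len(self.controlled_vehicles) -> works
        return len(self.controlled_vehicles)

env = OnPolicyWrapper()
print(env.num_agents)  # 3 for this toy setup
```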

Thanks for your help.

@xiaomengy
Contributor

@eugenevinitsky Could you help take a look?

@eugenevinitsky
Collaborator

Hi, sorry this bug is here! I am out today, but this will be definitively fixed by tomorrow afternoon.

@eugenevinitsky
Collaborator

I believe the fixes you have there are correct, though.

@eugenevinitsky
Collaborator

Thanks for your patience, working on getting this merged but the relevant fixes are in:
#39

@eugenevinitsky
Collaborator

Heads up, though: that code has not been extensively hyperparameter-tuned.

@eugenevinitsky
Collaborator

eugenevinitsky commented Jul 20, 2022

No rush at all, but let us know if this resolves your issue.

@roggirg
Author

roggirg commented Jul 21, 2022

Hi @eugenevinitsky ,

Everything is running now, thanks for the fixes.

Just out of curiosity before we close this issue, what should the FPS be during training? I'm getting 25-30:

```
average episode rewards is 0.33026985824108124
maximum per step reward is 0.058307357132434845

 Algo rmappo Exp intersection updates 50/1250000 episodes, total num timesteps 4080/100000000.0, FPS 29.

average episode rewards is 2.849382162094116
maximum per step reward is 8.059619903564453
episode reward of rendered episode is: 0.8622641801569368

 Algo rmappo Exp intersection updates 55/1250000 episodes, total num timesteps 4480/100000000.0, FPS 25.

average episode rewards is 0.9344396740198135
maximum per step reward is 0.05804213136434555

 Algo rmappo Exp intersection updates 60/1250000 episodes, total num timesteps 4880/100000000.0, FPS 26.

average episode rewards is 1.3483695685863495
maximum per step reward is 8.056236267089844

 Algo rmappo Exp intersection updates 65/1250000 episodes, total num timesteps 5280/100000000.0, FPS 27.

average episode rewards is 1.1445978283882141
maximum per step reward is 0.057421959936618805
```
Thanks!

@xiaomengy
Contributor

> Just out of curiosity before we close this issue, what should the fps be during training? I'm getting 25-30

It's hard to say what the normal FPS is; it depends on lots of things. Could you provide more details, such as what machine you are using and what and how many CPU cores and GPUs you have?

@eugenevinitsky
Collaborator

eugenevinitsky commented Jul 21, 2022

Hey @roggirg, it depends on the number of rollout threads you're using and whether you are using a GPU or just a CPU; the MAPPO code uses an RNN by default and includes the time for backprop when computing the FPS. Can you try increasing the value of `algorithm.n_rollout_threads`? It should scale roughly linearly with the number of threads or workers.
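The scaling claim can be sketched with a toy model (illustrative only; the per-thread rate and update-time fraction below are made-up numbers, not measured Nocturne figures):

```python
# Toy model of rollout-thread scaling: env FPS grows roughly linearly with
# n_rollout_threads until backprop or CPU cores become the bottleneck.
def estimated_fps(per_thread_fps, n_rollout_threads, update_fraction=0.0):
    # update_fraction: share of wall-clock time spent in backprop,
    # which the reported FPS number includes.
    return per_thread_fps * n_rollout_threads * (1.0 - update_fraction)

print(estimated_fps(27, 1))  # one thread: ~27 FPS, in the observed range
print(estimated_fps(27, 4))  # four threads under perfect scaling: ~108 FPS
```

In practice scaling is sub-linear once another resource saturates, which is why the measured numbers are worth reporting.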

@roggirg
Author

roggirg commented Jul 21, 2022

Ah cool, thanks @eugenevinitsky @xiaomengy. I played around with `n_rollout_threads=4` (I did not know it existed) and the FPS jumped up to ~50.
FYI, I'm running on a 1080 Ti with a 12-core CPU.
Thanks for your help.

@roggirg roggirg closed this as completed Jul 21, 2022
@eugenevinitsky
Collaborator

We're going to re-open this because that's a good deal slower than we expect it to be. @xiaomengy, any chance you could run
`python examples/on_policy_files/nocturne_runner.py algorithm=ppo algorithm.n_rollout_threads=4` and report the FPS? I don't have GPU access for a little while, so I can't check it myself.
