[Question] Optimal hyperparameters and scripts to reach 2000 steps/sec training speed #58

Open
wenjie-mo opened this issue Oct 12, 2022 · 3 comments
Labels
question Further information is requested

Comments

@wenjie-mo

Question

Hello, I am wondering which script and hyperparameters achieve the 2000+ steps/sec training speed mentioned in the paper. I have tried the following:

  1. run_sample_factory.py algorithm=APPO
    Problem: with the sample_factory library, the parameters lr_schedule and max_entropy_coeff are missing, and I am not sure what values I should use.

  2. run_rllib.py
    Problem: every worker fails with the same runtime error, attached below:

image

  3. nocturne_runner.py
    Problem: training is not that fast (around 100 steps/sec at around 40 FPS). I tried the suggestion in Issues Running MAPPO #38, which raised FPS to around 80, but steps/sec stayed about the same.

My settings:
Code: newest code from main branch
OS: Ubuntu 20.04
GPU: RTX 3080 with CUDA 11.6
sample_factory: I have tried latest and aed6cc92a7eb3510c4d4bcfac083ced07b5222f9 (as mentioned in paper)

Please let me know if I did anything wrong when running the scripts. Thanks so much for answering!

@wenjie-mo wenjie-mo added the question Further information is requested label Oct 12, 2022
@eugenevinitsky
Collaborator

Hi! Sorry you've been having trouble. Let me answer each one piece by piece. First off, that 2k number corresponds to environment stepping time (i.e. no RL algo in the loop), so during training you'll see an FPS that differs significantly from algorithm to algorithm, depending on the type of policy used and whether the environment calculates a per-agent FPS or an overall "amount of experience generated per second in total". As for each item:

  1. For the first one, we didn't freeze our sample factory version, and the newest one has an additional hparam that we didn't have in our version. This is fixed in Freeze sample-factory version, add missing hparams #59, which will be merged shortly. If you run that PR on the machine you have, you should see about 10k-20k FPS.

  2. I'm looking into this one; it usually means something went wrong when setting the config.

  3. For this one, you need to increase the value of n_training_threads. The environment runs without any vectorization by default. Hope that helps!
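To make the "environment stepping time" point concrete, here is a minimal sketch of how a raw stepping benchmark (no RL algorithm in the loop) can report two different numbers for the same run. `DummyMultiAgentEnv` and `benchmark` are hypothetical stand-ins written for this illustration, not the Nocturne API, and the timings it prints say nothing about Nocturne's actual speed.

```python
import time
from dataclasses import dataclass


@dataclass
class DummyMultiAgentEnv:
    """Hypothetical stand-in for a multi-agent environment (NOT the Nocturne API)."""
    num_agents: int = 8
    horizon: int = 80
    _t: int = 0

    def reset(self):
        # Start a new episode and return one observation per agent.
        self._t = 0
        return {i: 0.0 for i in range(self.num_agents)}

    def step(self, actions):
        # Advance one simulation step for all agents at once.
        self._t += 1
        obs = {i: float(self._t) for i in actions}
        rewards = {i: 0.0 for i in actions}
        done = self._t >= self.horizon
        return obs, rewards, done


def benchmark(env, num_steps=10_000):
    """Step the environment with fixed actions and no learner in the loop."""
    obs = env.reset()
    agent_steps = 0
    start = time.perf_counter()
    for _ in range(num_steps):
        actions = {i: 0 for i in obs}   # fixed action, so no policy cost is measured
        obs, _, done = env.step(actions)
        agent_steps += len(actions)     # one unit of experience per controlled agent
        if done:
            obs = env.reset()
    elapsed = time.perf_counter() - start
    return {
        "env_steps": num_steps,
        "agent_steps": agent_steps,
        "env_steps_per_sec": num_steps / elapsed,
        "agent_steps_per_sec": agent_steps / elapsed,
    }


if __name__ == "__main__":
    stats = benchmark(DummyMultiAgentEnv(num_agents=8), num_steps=10_000)
    print(f"env steps/sec:   {stats['env_steps_per_sec']:.0f}")
    print(f"agent steps/sec: {stats['agent_steps_per_sec']:.0f}")
```

With 8 agents the second number is 8 times the first, so it matters which one a library logs as "FPS". Once a policy forward pass and learner updates are added to the loop, both numbers drop below the raw stepping rate.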

@wenjie-mo
Author

Hi Eugene, thanks so much for the reply and clarification! I will try out these solutions soon and let you know if they all work!

@wenjie-mo
Author

Hi, sorry, I accidentally closed the issue. I would like to keep it open for tracking purposes. Thanks!

@wenjie-mo wenjie-mo reopened this Oct 12, 2022