
An experiment showing the performance of random search methods on a simple control problem


egg-west/random-search-for-control-problem


Normal random search

1. Task description: 'CartPole-v0' in OpenAI Gym. The agent uses a left or right action to move a small cart and keep the pole standing on it balanced. See figure-0.

​ figure-0

2. Method: optimize a weight W that defines a policy of the form $$ W \times observation = a $$ We generate a new candidate weight with $$ \hat{W} = W + random\_bias $$ and replace W with \hat{W} whenever \hat{W} performs better.
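The update rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in the repository's script: the names `linear_policy`, `random_search`, `evaluate`, and `noise_scale` are hypothetical, the random bias is assumed to be Gaussian, and the action is assumed to be picked from the sign of the product of W and the observation (the text above only states that the product yields the action).

```python
import random


def linear_policy(W, observation):
    """Map an observation to a discrete action (0 = left, 1 = right).

    Assumption: the sign of the dot product W . observation selects the action.
    """
    score = sum(w * o for w, o in zip(W, observation))
    return 1 if score > 0 else 0


def random_search(evaluate, dim, n_iters=200, noise_scale=0.1, seed=0):
    """Hill-climbing random search over a weight vector W.

    `evaluate` is a caller-supplied function that returns the reward of a
    weight vector, for example the total reward of one CartPole episode.
    """
    rng = random.Random(seed)
    W = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    best_reward = evaluate(W)
    history = [best_reward]
    for _ in range(n_iters):
        # W_hat = W + random_bias (Gaussian noise is an assumption)
        W_hat = [w + noise_scale * rng.gauss(0.0, 1.0) for w in W]
        reward = evaluate(W_hat)
        # Keep the candidate only when it performs better than the current W
        if reward > best_reward:
            W, best_reward = W_hat, reward
        history.append(best_reward)
    return W, best_reward, history
```

To try it without a simulator, pass a toy objective such as `lambda W: -sum((w - 1.0) ** 2 for w in W)`; for CartPole, `evaluate` would instead run an episode with `linear_policy` and return the accumulated reward.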

Script: src/random_search_epoch.py uses a random bias to generate each new weight. Running it gives figure-1.

a.png

​ figure-1

The average reward is clearly above 100, but we can also see that the reward slips down quickly.

To fix the slip-down problem, I added a control on the epoch reward, such as epoch_reward > 150, expecting it to prevent small rewards.

if epoch_reward < 150 and epoch_reward > all_reward[-1]:
    update_weight
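Read literally, the condition in the snippet above accepts a new weight only when the epoch reward is below the threshold and higher than the most recent recorded reward. A minimal sketch of that check as a standalone function follows; `should_update` and `threshold` are hypothetical names, and returning False for an empty history is an assumption, since the snippet does not say what happens before any reward is recorded.

```python
def should_update(epoch_reward, all_reward, threshold=150):
    """Return True when the condition from the snippet holds.

    Mirrors: epoch_reward < 150 and epoch_reward > all_reward[-1]
    """
    if not all_reward:
        # Assumption: with no previous reward there is nothing to compare against
        return False
    return epoch_reward < threshold and epoch_reward > all_reward[-1]
```

With this rule a candidate scoring 120 after a previous 100 is accepted, while one scoring 160 is not, because it fails the `epoch_reward < threshold` part of the check.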

With this check in place, we got:

b.png

figure-2

It is clear that the average reward gets higher, but the slip-down problem is still there.

Conclusion

From the simple experiments above, we can see that random search works on a simple problem. You can see more results in the Appendix, where you will find that the result can sometimes be perfect (figure-3). In other words, cart-pole is a task for testing, so do not take it too seriously.

Appendix

Other results

​ figure-3

​ figure-4
