A better documentation would be nice 😄 #2

Open
xXWarMachineRoXx opened this issue Aug 8, 2023 · 0 comments
a: 0.1
e: 9.251777478947598e-05
g: 0.9

I don't know what these mean; it's just an example, but comments and an explanation of how to tune them are definitely missing. I know it's your learning project, but do tell us what you learned, as I too want to do this Q-learning project as my first and want to make the perfect snake 🥇.

On a closer look, I think it's just:

a (α - Alpha): This is the learning rate, denoted by α. It determines how quickly the Q-values are updated based on new experiences. A higher value means that the agent will adjust its Q-values more rapidly in response to new information. A lower value makes the agent more resistant to changing its Q-values based on new experiences.

e (ε - Epsilon): This is the exploration factor, denoted by ε. It determines the likelihood that the agent will choose a random action instead of following its learned policy. Exploration is important to discover new actions and states, which helps the agent find better policies. A higher ε encourages more exploration, while a lower ε favors exploitation of the current knowledge.

g (γ - Gamma): This is the discount factor, denoted by γ. It determines the agent's consideration of future rewards in the decision-making process. A higher value of γ makes the agent prioritize long-term rewards, while a lower value makes it focus more on immediate rewards. It is used to calculate the cumulative discounted future rewards when updating Q-values.
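For context, here is how these three values usually fit together in tabular Q-learning. This is only a minimal sketch assuming an epsilon-greedy policy and a plain Q-table; the names (`Q`, `ACTIONS`, `choose_action`, `update`) are mine and not necessarily how this repo implements it:

```python
import random
from collections import defaultdict

# Hypothetical hyperparameters matching the values printed above
ALPHA = 0.1        # a: learning rate
EPSILON = 9.25e-5  # e: exploration rate (apparently already decayed to a tiny value)
GAMMA = 0.9        # g: discount factor

ACTIONS = ["up", "down", "left", "right"]

# Q-table: maps (state, action) pairs to estimated returns, defaulting to 0.0
Q = defaultdict(float)

def choose_action(state):
    """Epsilon-greedy policy: explore with probability EPSILON, else exploit."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """Standard Q-learning update:
    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    td_target = reward + GAMMA * best_next
    Q[(state, action)] += ALPHA * (td_target - Q[(state, action)])
```

The tiny `e` in the output above suggests epsilon is being decayed over episodes (start exploring a lot, then mostly exploit), which is a common pattern, but that's a guess without seeing the code.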

If this were written somewhere in README.md, it would be greatly helpful.
