Changing the way reward is given so that the agent performs well #16

Open
satinder147 wants to merge 1 commit into base: master

Conversation

satinder147

When I trained your model for the first time, I saw that the agent was not buying anything on actual data, so I made some changes to the way the reward is given. I have tested it on a few stocks and I am getting decent profits.
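
As a sketch of the kind of change described above (not the actual diff), the shaping might look like this in Python. The counter `steps_since_buy`, the `-200` idle penalty, and the `idle_limit` window are hypothetical stand-ins for the `a`, `b`, and -200 values discussed in the comments:

```python
def shaped_reward(action, price, inventory, steps_since_buy,
                  idle_penalty=-200, idle_limit=50):
    """Realized profit on a sell, plus a penalty that discourages the
    agent from sitting idle (never buying) for too long."""
    reward = 0.0
    if action == "buy":
        inventory.append(price)            # open a position at this price
    elif action == "sell" and inventory:
        bought_price = inventory.pop(0)    # close the oldest position
        reward = price - bought_price      # realized profit or loss
    if action != "buy" and steps_since_buy > idle_limit:
        reward += idle_penalty             # push the agent to start trading
    return reward
```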

eneszv commented Apr 3, 2020

This is a really good idea, but do you think the variables a and b should be part of the agent's state? Without them, the agent doesn't know why it receives a penalty (reward) of -200; the penalty could be attributed to the wrong action for a particular state S, since the agent has no knowledge of its previous actions.
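
A hypothetical sketch of this suggestion: append a and b to the observation so the network can associate the idle penalty with its cause. `market_features` stands in for whatever price window the original model actually feeds the network:

```python
import numpy as np

def augmented_state(market_features, a, b):
    # State = market window plus the two counters the reward depends on,
    # so the penalty is no longer invisible to the agent.
    return np.concatenate([market_features, [a, b]])

state = augmented_state(np.zeros(10), a=3, b=7)  # shape (12,)
```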


satinder147 commented Apr 4, 2020

@eneszv as far as I know, the performance of a reinforcement learning agent depends on the design of the reward function. The reward function used by the original author is no doubt good; I modified it only so that it converges faster. Ideally, we should not be adding state variables like a and b, because they constrain what the agent can learn. For example, the use of a and b forced the agent to buy, but buying may not be the optimal action at a given point in time.
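
One way to act on this point (all names here hypothetical): keep the shaping in the training reward only, judge the trained agent on raw profit, and leave a and b out of the state entirely:

```python
def step_reward(action, price, inventory, steps_since_buy, training=True):
    profit = 0.0
    if action == "buy":
        inventory.append(price)
    elif action == "sell" and inventory:
        profit = price - inventory.pop(0)
    if not training:
        return profit                      # evaluation: raw profit only
    if action != "buy" and steps_since_buy > 50:
        return profit - 200                # training: add the idle penalty
    return profit
```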
