Changing the way reward is given so that the agent performs well #16

Open
satinder147 wants to merge 1 commit into base: master

Conversation

satinder147

When I trained your model for the first time, I saw that the agent was not buying anything on actual data, so I made some changes to the way the reward is given. I have tested it on a few stocks and I am getting decent profits.
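
As a sketch of the kind of change described above (not the actual diff), the shaping might look like this in Python. The counter `steps_since_buy`, the `-200` idle penalty, and the `idle_limit` window are hypothetical stand-ins for the `a`, `b`, and -200 values discussed in the comments:

```python
def shaped_reward(action, price, inventory, steps_since_buy,
                  idle_penalty=-200, idle_limit=50):
    """Realized profit on a sell, plus a penalty that discourages the
    agent from sitting idle (never buying) for too long."""
    reward = 0.0
    if action == "buy":
        inventory.append(price)            # open a position at this price
    elif action == "sell" and inventory:
        bought_price = inventory.pop(0)    # close the oldest position
        reward = price - bought_price      # realized profit or loss
    if action != "buy" and steps_since_buy > idle_limit:
        reward += idle_penalty             # push the agent to start trading
    return reward
```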

eneszv commented Apr 3, 2020

This is a really good idea, but do you think the variables a and b should be part of the agent's state? Without them, the agent doesn't know why it receives a penalty (reward) of -200; the penalty could be attributed to the wrong action for a particular state S, since the agent has no knowledge of its previous actions.
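
A hypothetical sketch of this suggestion: append a and b to the observation so the network can associate the idle penalty with its cause. `market_features` stands in for whatever price window the original model actually feeds the network:

```python
import numpy as np

def augmented_state(market_features, a, b):
    # State = market window plus the two counters the reward depends on,
    # so the penalty is no longer invisible to the agent.
    return np.concatenate([market_features, [a, b]])

state = augmented_state(np.zeros(10), a=3, b=7)  # shape (12,)
```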


satinder147 commented Apr 4, 2020

@eneszv as far as I know, the performance of a reinforcement learning agent depends on the design of the reward function. The reward function used by the original author is no doubt good; I modified it only so that it converges faster. Ideally, we should not be adding state variables like a and b, because they constrain what the agent can learn. For example, the use of a and b forced the agent to buy, but buying may not be the optimal action at a given point in time.
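
One way to act on this point (all names here hypothetical): keep the shaping in the training reward only, judge the trained agent on raw profit, and leave a and b out of the state entirely:

```python
def step_reward(action, price, inventory, steps_since_buy, training=True):
    profit = 0.0
    if action == "buy":
        inventory.append(price)
    elif action == "sell" and inventory:
        profit = price - inventory.pop(0)
    if not training:
        return profit                      # evaluation: raw profit only
    if action != "buy" and steps_since_buy > 50:
        return profit - 200                # training: add the idle penalty
    return profit
```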
