Deep-Q Learning demo / example #3481

dwctic · 2024-09-19T16:54:08Z


The only example using the QAgent is the tic-tac-toe one, and after playing with it, it doesn't appear to actually train a worthwhile model. After running some simulations, it doesn't even appear to have learned how to block the opposing player when a win is imminent. I don't mind putting in the effort to update the example, but I need to better understand how (or if) the QAgent API even works, and am hoping someone else who has used it can help.

Is the agent training only against itself? If so, could this be part of the problem? Even when the validation win rate is 90%, the model still doesn't seem to predict optimal moves.
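
The blocking behaviour described above can be measured separately from the win rate. Below is a minimal sketch in plain Python, independent of DJL and not using the QAgent API. The board encoding (0 = empty, 1 = agent, -1 = opponent), the helper names, and the `policy` callable are all assumptions made for illustration; `policy` stands in for however the trained model's chosen move is obtained.

```python
from typing import Callable, List, Optional

# All eight winning lines on a 3x3 board, cells indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

Board = List[int]  # assumed encoding: 0 = empty, 1 = agent, -1 = opponent


def winning_cells(board: Board, player: int) -> List[int]:
    """Empty cells that would complete a line for `player` on its next move."""
    cells = set()
    for line in LINES:
        values = [board[i] for i in line]
        if values.count(player) == 2 and values.count(0) == 1:
            cells.add(line[values.index(0)])
    return sorted(cells)


def block_rate(policy: Callable[[Board], int],
               boards: List[Board]) -> Optional[float]:
    """Fraction of must-block positions in which `policy` picks the block.

    A position counts only if the agent cannot win immediately and the
    opponent has exactly one winning cell. Returns None if none qualify.
    """
    relevant = 0
    blocked = 0
    for board in boards:
        if winning_cells(board, 1):
            continue  # agent can win outright, so blocking is not required
        threats = winning_cells(board, -1)
        if len(threats) != 1:
            continue  # no threat, or a double threat that cannot be blocked
        relevant += 1
        if policy(list(board)) == threats[0]:
            blocked += 1
    return blocked / relevant if relevant else None


def first_empty(board: Board) -> int:
    """Hypothetical stand-in policy: play the first empty cell."""
    return board.index(0)


# Two hand-made positions with the agent to move.
boards = [
    [-1, -1, 0, 0, 1, 0, 0, 0, 1],   # opponent threatens cell 2
    [1, 0, 0, -1, -1, 0, 0, 0, 1],   # opponent threatens cell 5
]
rate = block_rate(first_empty, boards)
```

A low block rate alongside a high validation win rate would suggest that the opponent used for validation rarely creates threats, which is one way self-play against a weak copy could inflate the win rate.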

frankfliu · 2024-09-19T17:08:04Z


You might want to take a look at this example: https://towardsdatascience.com/train-undying-flappy-bird-using-reinforcement-learning-on-java-98ff68eb28bf
