Bug in the 2-area Kundur case #16

Open
frostyduck opened this issue May 26, 2021 · 1 comment

@frostyduck

I've had some free time over the last few days, and I think I've found the bug that keeps the agent from getting past the reward plateau of about -602 in your Kundur 2-area case. During both training and testing in this environment (the Kundur scheme), short circuits are never simulated; I checked this. In other words, the agent learns purely on the normal operating conditions of the system, and in that case the optimal policy is to never apply the dynamic brake, i.e. the actions are always 0, which matches the observed reward value (-602 or -603).
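
To make this easier to reproduce, here is a minimal diagnostic sketch (my own code, not from the repo) that rolls out the trained policy and counts how often the dynamic brake is actually applied. `env` and `policy` are placeholders for however you construct the v7 environment and query the trained agent; I'm assuming the usual gym-style `reset`/`step` interface here.

```python
import numpy as np

def check_for_degenerate_policy(env, policy, n_episodes=20):
    """Roll out the policy and report episode rewards and brake usage."""
    stats = []
    for _ in range(n_episodes):
        obs = env.reset()
        done = False
        total_reward = 0.0
        brake_actions = 0
        while not done:
            action = policy(obs)                 # action from the trained agent
            obs, reward, done, info = env.step(action)
            total_reward += reward
            if np.any(np.asarray(action) != 0):  # dynamic brake applied?
                brake_actions += 1
        stats.append((total_reward, brake_actions))

    rewards = [r for r, _ in stats]
    brakes = [b for _, b in stats]
    print(f"mean episode reward: {np.mean(rewards):.1f}")
    print(f"episodes with zero brake actions: {sum(b == 0 for b in brakes)}/{n_episodes}")
    # If every episode lands near -602/-603 and the brake is never used,
    # the agent has almost certainly been trained without any fault events.
    return stats
```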

I'm guessing this has something to do with the modifications to PowerDynSimEnvDef: the original results used PowerDynSimEnvDef_v2, while I am working with PowerDynSimEnvDef_v7.
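
A quick way to narrow this down is to diff the fault-related logic between the two module versions. The sketch below is just how I would compare them; the file paths are guesses, so adjust them to wherever PowerDynSimEnvDef_v2.py and PowerDynSimEnvDef_v7.py live in your checkout.

```python
import difflib

def diff_fault_logic(old_path="PowerDynSimEnvDef_v2.py",
                     new_path="PowerDynSimEnvDef_v7.py"):
    """Print unified-diff lines between the two versions that mention 'fault'."""
    with open(old_path) as f_old, open(new_path) as f_new:
        old_lines = f_old.readlines()
        new_lines = f_new.readlines()
    diff = difflib.unified_diff(old_lines, new_lines,
                                fromfile=old_path, tofile=new_path)
    for line in diff:
        # Keep hunk headers plus any changed line touching fault handling.
        if "fault" in line.lower() or line.startswith(("---", "+++", "@@")):
            print(line, end="")

if __name__ == "__main__":
    diff_fault_logic()
```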

@qhuang-pnl (Collaborator) commented Jun 2, 2021

Thanks for your feedback and your email. I will look into this and reply to you asap.
