-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Export aggregation difference to excel #237
Export aggregation difference to excel #237
Conversation
Workflow seems fine from my side. "clones the corresponding workflow directory locally" refers in my case to |
Exactly, as this is just a first quick draft. I'll work on making the instructions better.
You would run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @phackstock, a few suggestions below.
One more comment: please make sure to |
Not sure I understand the point of why do to this, the output that I currently get of the difference file looks as attached. That should be fine, no? |
Thanks @phackstock for showing the file, I guess that pandas changed their output structure recently... But I still think that |
Not sure why test tests are failing, they're running fine on my local machine ... |
Tests are passing now. There are however, some API changes that I've had to make that I'm not super happy with:
def apply(self, ...):
return self.apply_and_capture_differences()[0] this would preserve the interface.
I can also keep it as is for now and cross that bridge when we get there, which will be very soon. |
624f364
to
6adf8ca
Compare
Co-authored-by: Daniel Huppmann <[email protected]>
I have implemented the following updates:
|
@danielhuppmann since you think the cli is too much for the non-expert user should I just remove the option? |
No, keep the CLI - I think having both the CLI implementation and the Python-code-example is the most user-friendly approach! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, a few minor suggestions and clarifications below…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, a few minor suggestions and clarifications below…
Co-authored-by: Daniel Huppmann <[email protected]>
Updated the return value of |
Almost there... |
Co-authored-by: Daniel Huppmann <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
Started by a feature request from @orichters (here: https://github.com/iiasa/ngfs-phase-4-internal-workflow/discussions/6) I have implemented a first version of exporting the differences found in region aggregation to excel.
To be precise, this tackles the case where a model natively reports a region, e.g. World that is also re-calculated as part of our region aggregation. So far only the top rows of the difference data frame are printed as part of the log file (e.g. https://data.ece.iiasa.ac.at/ngfs-phase-4-internal/#/uploads/jobs/21/log):
I have added an
export_difference
option toRegionProcessor.apply()
that writes the data to a file called difference.xlsx.So far, the workflow I envision, should work as follows:
nomenclature run-region-processing your_data.xlsx --export-differences
, they then get the differences into a file called difference.xlsx for more close inspection.@orichters and @danielhuppmann, does this sound like a good workflow to you?
Update
For the data frame of the differences between original and aggregated data I now added a relative (absolute) difference column, calculated as
100*abs((original-aggregated)/original)
.I also sort the data frame by the newly created column to show the user the biggest differences first.
Finally, I changed the relative tolerance from 1e-5 to 1% to reduce the rate of false positives.