Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a bug where meta-indicators are dropped during region-processing #228

Merged
merged 9 commits into from
Mar 15, 2023

Conversation

danielhuppmann
Copy link
Member

@danielhuppmann danielhuppmann commented Mar 1, 2023

This PR fixes a bug where meta-indicators are dropped in a specific combination of arguments. The culprit is fixed specifically in 71d7ec7 (and no, it wasn't a pyam bug).

The subsequent commit refactors the apply-code to only operate on the timeseries data and add meta-indicators at the end of the loop.

closes #227

@danielhuppmann danielhuppmann self-assigned this Mar 1, 2023
@danielhuppmann
Copy link
Member Author

fyi @phackstock

@phackstock
Copy link
Contributor

Ah right, thanks for the quick PR, seems like there is an issue indeed.

@phackstock
Copy link
Contributor

Coming to think of it, if the processing only uses meta preserving operations, is this a pyam bug?

@danielhuppmann danielhuppmann marked this pull request as ready for review March 13, 2023 09:03
Copy link
Contributor

@phackstock phackstock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, looks good, just three small comments.
One clarifying question from my side is how did we lose the meta indicators in the first place?
Was it the concatenation step or the aggregation? And is it something that needs to be addressed on the pyam side as well or is the behavior as expected and it was simply not correct here on the nomenclature side of things?

nomenclature/processor/region.py Show resolved Hide resolved
nomenclature/processor/region.py Outdated Show resolved Hide resolved
nomenclature/processor/region.py Outdated Show resolved Hide resolved
@danielhuppmann
Copy link
Member Author

The problem of the dropped meta-indicators was caused by the line

common_region_df.append(aggregate_df._data[index])

in the compare-and-merge between aggregated and original (provided) data.

If there was no common-region-data, an empty IamDataFrame (i.e., no meta indicators) was merged with only the timeseries data (_data) of the aggregated IamDataFrame.

This was fixed with the commit 71d7ec7, merging both IamDataFrame instances, ensuring that meta indicators were carried over.

However, this bug made me realize that the previous implementation did a lot of concatenating of IamDataFrame instances with identical meta indicators. Hence the refactoring where only the timeseries data is added to _processed_data in each region-processing step, and then concatenated and cast to a new IamDataFrame instance with the meta indicators added explicitly using

processed_dfs.append(IamDataFrame(_data, meta=model_df.meta))

@phackstock
Copy link
Contributor

If there was no common-region-data, an empty IamDataFrame (i.e., no meta indicators) was merged with only the timeseries data (_data) of the aggregated IamDataFrame.

Ah of course, we did hit an edge case there with the GENIE upload where we only had aggregated data.

Copy link
Contributor

@phackstock phackstock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to be merged (apart from the sticker)

@danielhuppmann danielhuppmann merged commit 0b3dec1 into IAMconsortium:main Mar 15, 2023
@danielhuppmann danielhuppmann deleted the bugfix/drop-meta branch March 15, 2023 15:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RegionProcessor.apply drops meta data
2 participants