Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add validation test for emissions factor references provided in zone configs #6001

Merged
merged 34 commits into from
Oct 24, 2023
Merged

Add validation test for emissions factor references provided in zone configs #6001

merged 34 commits into from
Oct 24, 2023

Conversation

jayaddison
Copy link
Contributor

Description

While developing #5987 it was nontrivial for me to check whether the source field defined for an emission factor matched an existing top-level sources entry in the zone.

This pull request provides a test case to perform that validation automatically, and also caters for a number of existing situations where any of the three of {averaged Electricity Maps data, or implicit well-known source references, or assumptions} are made about the emissions factor estimates.

Double check

  • I have tested my parser changes locally with poetry run test_parser "zone_key"
  • I have run pnpx prettier --write . and poetry run format to format my changes.

@github-actions github-actions bot added tests zone config Pull request or issue for zone configurations labels Oct 11, 2023
@@ -7,6 +7,24 @@


class ZonesJsonTestcase(unittest.TestCase):
# Add well-known sources that don't require config-based references here
GLOBAL_SOURCE_REFERENCES = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be preferable to create a separate YAML reference sources file that contains these and any relevant metadata -- extracting a constant set here to hold them is something of a migration phase to figure out how many items exist and to spot patterns in them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After more thought: first it'd make sense to check whether this would be a worthwhile and acceptable change.

…sions factors stored directly on zone config objects as read from YAML"

This reverts commit 88ea0a6.
… reference classes defined above it within the module"

This reverts commit 3ca1990.
…rence, so assert that it is a required value
…d a link to the calculation that derived it

While doing this I also confirmed that HOPS is an ENTSOE transmission system operator in Croatia; the linked document is also available on their English-language homepage under Documents -> Basic Data at 'http://hops.hr/page-file/iXgT7cSSC4U28AfzW8lBZ6/publications/Temeljni_podaci_2018.pdf'

Cross-references-commit d3072b4.
… URL by defining 'sources' entries for it per-zone

Also confirmed while doing this that the comment is accurate, in that the percentages mentioned refer to the Y2020 energy mix reported by the charts at the given URL
…atemalan AMM (Administrador del Mercado Mayorista)

The link is to the page containing the API where data was extracted from and that belongs to the authority source; the docs.google.com URL for the derived analysis remains as a _url property on the relevant emissions factors
…ng the sentence, hopefully with no loss of information (and retaining the explicit disclaimer note)
… (that was a temporary measure to allow the unit tests to pass while cleaning up the zone config files)
@@ -90,7 +90,7 @@ emissionFactors:
_comment: estimated yearly share of solar & wind in this mixed renewable category
based on installed capacity in 2018
_url: https://en.wikipedia.org/wiki/Renewable_energy_in_Ukraine
source: renewable mix; assumes 50% solar, 50% wind
source: assumes 50% solar, 50% wind renewable mix
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to figures from the latest currently-available iteration of the same Wikipedia source page, in Y2021 the ratio of solar:wind capacity was 6227:1673.

However: those figures predate the conflict in Ukraine that began in Y2022 and so the numbers may not correspond to the situation more recently. Note also that data for Ukraine is currently unavailable on Electricity Maps (the parser has been down for a while; I'm not cross-linking the relevant issue thread here because it's not relevant to most of this pull request).

Copy link
Contributor Author

@jayaddison jayaddison Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: installed capacity ratio doesn't necessarily indicate actual output capacity. So even though the capacity ratio may have changed, the ratio itself may not be suitable to map 1-1 to the percentages assumed here.

- http://srldc.org/2018-19srallocation.aspx
source: calculated 20% nuclear, 80% thermal (coal)
_comment: calculated 20% nuclear, 80% thermal (coal)
source: Souther Regional Load Dispatch Center Allocation Limits 2018-2019; SRLDC 2018-2019 derived calculation for Karnataka
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zuron7 this is data you collected a few years ago now, but I wondered whether you know where there might be updated data for Karnataka from SRLDC (or elsewhere)? It looks like their website has moved from srldc.org to srldc.in, and I'm having difficulty accessing it.

@jayaddison jayaddison mentioned this pull request Oct 13, 2023
2 tasks
…are required (not optional) on each CO2 parameters model
@jayaddison
Copy link
Contributor Author

I'm going to open this as 'ready for review'. Initially my goal here was solely to write a test case to make it easier to detect source-related problems - along the way the work has branched off into a few other source related cleanups and questions, but I think I should pause before going any further into the details on those.

@jayaddison jayaddison marked this pull request as ready for review October 13, 2023 15:40
@VIKTORVAV99 VIKTORVAV99 self-requested a review October 13, 2023 15:49
@VIKTORVAV99
Copy link
Member

Since as you mentioned this has gotten a little bigger than initially intended I won't have time to review it today but at first glance everything seems pretty straightforward. So hopefully I can test and review it this weekend.

@jayaddison
Copy link
Contributor Author

Thanks @VIKTORVAV99 - it can (dare I say should?) wait until next week, though.

Copy link
Member

@VIKTORVAV99 VIKTORVAV99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works really well, most of the changes are straight forward and nothing special to note.

Just some small comments but nothing that should affect functionality.

continue
if source in self.GLOBAL_SOURCE_REFERENCES:
continue
self.assertIn(source, CONFIG_MODEL.zones[zone_key].sources)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line here has a type error and prints the following using pylance:

Argument of type "dict[str, Source] | None" cannot be assigned to parameter "container" of type "Iterable[Any] | Container[Any]" in function "assertIn"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok; and it seems that this is due to the fact that the field is optional (an x in foo check is invalid if foo is None). But it is correct for it to be an optional field. And pyright isn't convinced-enough by adding a assertIsNotNone on the previous line to eliminate the | None possibility.

Thinking about how to resolve this.. (adding or {} to the end of this line would be a quick although not-ideal way).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And pyright isn't convinced-enough by adding a assertIsNotNone on the previous line to eliminate the | None possibility.

See microsoft/pyright#2007 for the (reasonable) rationale behind that. However, the workaround suggested (add an assert ... is not None - attempted locally as assert CONFIG_MODEL.zones[zone_key].sources is not None) does not seem to be working for me. I'll try to figure out whether that's a pyright bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I think there could be a pyright bug here. assert .. is not None doesn't seem to preserve the asserted non-empty condition in the case where attribute access is used here:

                            zone_sources = CONFIG_MODEL.zones[zone_key].sources
                            assert CONFIG_MODEL.zones[zone_key].sources is not None
                            assert zone_sources is not None
                            self.assertIn(source, CONFIG_MODEL.zones[zone_key].sources)  # pyright complains here
                            self.assertIn(source, zone_sources)  # pyright does not complain here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Maybe that's as-expected - possibly because the pydantic.BaseModel that Zone derives from implements __getattr__, so the results of attribute access could change from call to call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this should now be resolved (96974de - hold the container's value in a local variable and use an assert ... is not None to help pyright narrow the typechecking for it).

}

def test_zone_sources(self):
for measurement_basis, model in CO2EQ_CONFIG_MODEL:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since measurement_basis is not used I would underscore it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, my mistake committing a partial change-in-progress there: I was considering adding that variable to the error message here to help people figure out where the source entry is missing. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully resolved; I think that making the error messages from this unit test as unambiguous as possible would make sense, so I've standardized the format of them a bit, and included the full path to the source key when it's missing (making use of the measurement_basis variable).

@jayaddison jayaddison changed the title Draft: add validation test for emissions factor references provided in zone configs Add validation test for emissions factor references provided in zone configs Oct 16, 2023
@VIKTORVAV99 VIKTORVAV99 self-requested a review October 19, 2023 07:30
Copy link
Member

@VIKTORVAV99 VIKTORVAV99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 🎉

Thanks for making the changes!

Edit: Merging as soon as the merge conflicts are resolved. :)

@jayaddison
Copy link
Contributor Author

Thank you @VIKTORVAV99! Taking a look at resolving the merge conflicts now.

@jayaddison
Copy link
Contributor Author

I've resolved the conflicts locally, although tests fail (as intended, I think) due to a recently-added CH zone source without a top-level definition in the zone config.

I'll wait to hear back from the related CH zone thread to confirm whether there's a preferred way to attribute that updated source.

@pierresegonne
Copy link
Member

I think I understand what you had in mind for CH. Yes you can definitely create:

sources:
  - SFOE, Electricity Maps:
    link: ....

And then only reference SFOE, ELectricity Maps in the individual entries for emission factors

…rived from Electricity Maps worksheet with SFOE contribution
@jayaddison
Copy link
Contributor Author

Thanks @pierresegonne - yep, you understood my question :) (I also read your earlier comment before this).

@jayaddison
Copy link
Contributor Author

Note: if this test causes too much friction in future when editing zone configs, please feel free to skip/remove them, I won't be offended.

On the other hand, if they turn out to be particularly useful, I'd be willing to try refactoring the logic into a pydantic field validator (like these I see in the events module) but I think it's worth figuring out whether the test is valuable, first.

@VIKTORVAV99 VIKTORVAV99 merged commit e673724 into electricitymaps:master Oct 24, 2023
17 checks passed
@VIKTORVAV99
Copy link
Member

Note: if this test causes too much friction in future when editing zone configs, please feel free to skip/remove them, I won't be offended.

On the other hand, if they turn out to be particularly useful, I'd be willing to try refactoring the logic into a pydantic field validator (like these I see in the events module) but I think it's worth figuring out whether the test is valuable, first.

I just merged it so let's let it run a few weeks maybe (we don't get a ton of updates to the config files). But at least in my opinion the more work that can be done by CI (and is reliable) the more time we can spend doing other things like more in depth reviews or developing new features for example.

Thanks once again!

@jayaddison jayaddison deleted the tests/emissions-factor-reference-validation branch October 24, 2023 22:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests zone config Pull request or issue for zone configurations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants