Interoperability between IF and PCF Data Exchange protocol #905

jmcook1186 · 2024-07-11T11:15:54Z

jmcook1186
Jul 11, 2024
Maintainer

What is PCF?

The PCF data exchange protocol specifies an API that serves product carbon footprint (PCF) data in a specific format: https://wbcsd.github.io/data-exchange-protocol/v2/#dt-pf. It's a concept defined by PACT (the Partnership for Carbon Transparency).

If I've understood correctly, each node/server keeps a database of pre-calculated carbon footprints, and if requests cannot be served using data in the existing database (because it is missing or expired) then there is a separate API method that triggers new calculations.

They call nodes "host systems" - they expose the API and can make secondary requests to individual permissioned hosts for specific datasets. The idea is to facilitate peer-to-peer exchange of product carbon footprint assessments in a way that is interoperable and intercomparable.

The data exchanged has to conform to a certain schema. The basic schema is known as a data model, which can be thought of as equivalent to an interface in programming. Then, for specific industries or use-cases there can be data model extensions that build on top of the basic data model, just like extending an interface.
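To make the interface analogy concrete, here is a minimal TypeScript sketch. The names `BaseFootprint` and `SoftwareFootprint`, and every field in them, are hypothetical and chosen only for illustration. They are not the actual PACT schema, and the sketch does not show how PACT encodes extensions on the wire; the spec linked above is the reference for both.

```typescript
// Hypothetical, heavily simplified stand-in for a base data model.
// Field names are illustrative assumptions, not the real PACT schema.
interface BaseFootprint {
  id: string;
  productName: string;
  declaredUnit: string; // the unit the footprint is expressed per
  carbonKgCo2e: number; // footprint per declared unit
}

// Hypothetical extension for software products: it adds fields
// without changing the base, just like extending an interface.
interface SoftwareFootprint extends BaseFootprint {
  functionalUnit: string;    // e.g. "request" or "user"
  observationWindow: string; // e.g. "2024-07-01/2024-07-31"
}

// A consumer written against the base model still accepts the extension.
function summarize(fp: BaseFootprint): string {
  return `${fp.productName}: ${fp.carbonKgCo2e} kgCO2e per ${fp.declaredUnit}`;
}

const example: SoftwareFootprint = {
  id: "example-1",
  productName: "example-service",
  declaredUnit: "kilowatt hour",
  carbonKgCo2e: 0.42,
  functionalUnit: "request",
  observationWindow: "2024-07-01/2024-07-31",
};

console.log(summarize(example));
```

The point of the sketch is only that anything that understands the base model keeps working when an industry-specific extension adds fields.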

From my reading of the docs, I understand that there's an expectation that organizations will do their own measurements and calculations according to the GHG protocol, and then export the results into an appropriate data model for peer-to-peer exchange via the PACT REST API, whereas IF also retrieves observations and generates results before exporting them in the manifest file format.

The question I want to explore is whether IF and PACT can be interoperable. For example:

Can an IF run be made to output its results to a PACT data model?
Do we need a data model extension specifically for software (and do we need sub-models for different types of software)?
Can IF and PACT enjoy two-way interoperability, where data models can become manifest files and vice versa?
Can we establish a pipeline in which IF does the observations and calculations and PACT serves the data p2p?

Can IF export PACT-conformant data?

I think it could be very achievable to create an IF exhaust script that transcribes an executed IF manifest into a PACT data model format.

The bulk of the plugin logic would be validation to ensure the data conforms to the data model standards, and then it's a case of populating a JSON object with values from the manifest file.
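A rough sketch of that "validate, then populate" shape might look like the following. Everything here is an assumption made for illustration: `ExecutedManifest` is a cut-down stand-in for a real executed manifest, `FootprintRecord` is a hypothetical target object rather than the PACT data model, and carbon is assumed to be reported in gCO2e per time step.

```typescript
// Simplified, assumed shape of an executed manifest: only the parts used here.
interface ExecutedManifest {
  name: string;
  outputs: Array<{ timestamp: string; duration: number; carbon?: number }>;
}

// Hypothetical target object; not the real PACT data model.
interface FootprintRecord {
  productName: string;
  declaredUnit: string;
  carbonKgCo2e: number;
  periodStart: string;
}

// Validate first, then populate. Errors are collected so a user sees
// every problem in one pass instead of fixing them one at a time.
function toFootprint(manifest: ExecutedManifest, declaredUnit: string): FootprintRecord {
  const errors: string[] = [];
  if (manifest.name.trim() === "") errors.push("manifest has no name");
  if (manifest.outputs.length === 0) errors.push("manifest has no outputs");
  manifest.outputs.forEach((o, i) => {
    if (typeof o.carbon !== "number" || !isFinite(o.carbon) || o.carbon < 0) {
      errors.push(`outputs[${i}] has no valid carbon value`);
    }
  });
  if (errors.length > 0) {
    throw new Error(`cannot export: ${errors.join("; ")}`);
  }
  // Assumption: carbon is in gCO2e per time step, so sum and convert to kg.
  const totalGrams = manifest.outputs.reduce((sum, o) => sum + (o.carbon ?? 0), 0);
  return {
    productName: manifest.name,
    declaredUnit,
    carbonKgCo2e: totalGrams / 1000,
    periodStart: manifest.outputs[0]?.timestamp ?? "",
  };
}

const record = toFootprint(
  {
    name: "example-service",
    outputs: [
      { timestamp: "2024-07-01T00:00:00Z", duration: 3600, carbon: 1500 },
      { timestamp: "2024-07-01T01:00:00Z", duration: 3600, carbon: 500 },
    ],
  },
  "kilowatt hour"
);
console.log(JSON.stringify(record));
```

A real exhaust script would validate against the published schema rather than hand-written checks, but the overall flow would likely be similar.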

Given that IF itself does not have strict rules around naming or units, it's likely that you'd want to know in advance that the IF results were destined for a PACT data model so you can get your naming conventions and units configured, but it is also possible to add plugins that map elements between the two formats.
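One way to picture such a mapping step is a small table of rules, each saying which manifest parameter to read, what to call it in the target format, and how to convert its unit. The rule names below (`carbonKgCo2e`, `energyMegajoules`) and the assumption that IF reports gCO2e and kWh are hypothetical; only the conversion factors themselves (1000 g per kg, 3.6 MJ per kWh) are fixed facts.

```typescript
// One mapping rule: which manifest parameter to read, what to call it
// in the target format, and how to convert its unit.
interface FieldRule {
  from: string;                       // parameter name used in the manifest
  to: string;                         // hypothetical field name in the target format
  convert: (value: number) => number; // unit conversion
}

// Assumed rules for illustration: IF output in gCO2e and kWh.
const rules: FieldRule[] = [
  { from: "carbon", to: "carbonKgCo2e", convert: (g) => g / 1000 },
  { from: "energy", to: "energyMegajoules", convert: (kwh) => kwh * 3.6 },
];

function mapFields(
  source: Record<string, number>,
  fieldRules: FieldRule[]
): Record<string, number> {
  const target: Record<string, number> = {};
  for (const rule of fieldRules) {
    const value = source[rule.from];
    // Fail loudly if the manifest does not contain the expected parameter.
    if (typeof value !== "number") {
      throw new Error(`missing "${rule.from}" in IF output`);
    }
    target[rule.to] = rule.convert(value);
  }
  return target;
}

const mapped = mapFields({ carbon: 2500, energy: 10 }, rules);
console.log(mapped);
```

Keeping the rules as data rather than code means the same plugin could serve different manifests by swapping the rule table.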

While the validation logic might be pretty substantial, the IF -> PACT flow feels like a lighter lift than the converse. In fact, it might not even make sense to do it in reverse in most cases. The idea of the manifest is to have very granular time series data and to capture fine details about an application's architecture. That granularity is needed to enable the computations and transformations defined by the pipeline, but if I've understood correctly it's really only the results summary that's captured in the PACT data model. The manifest -> data model conversion is lossy and therefore not really doable in reverse.
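A tiny example shows why a summary cannot be turned back into a time series. The `Observation` shape and the gCO2e unit are assumptions for illustration.

```typescript
// One observation in a manifest time series (assumed shape).
interface Observation {
  timestamp: string;
  carbon: number; // assumed gCO2e for this time step
}

// Collapse a time series into a single total, as a summary-style export would.
function aggregate(series: Observation[]): number {
  return series.reduce((sum, o) => sum + o.carbon, 0);
}

// Two different usage profiles: one steady, one with a spike.
const steady: Observation[] = [
  { timestamp: "2024-07-01T00:00:00Z", carbon: 10 },
  { timestamp: "2024-07-01T01:00:00Z", carbon: 10 },
  { timestamp: "2024-07-01T02:00:00Z", carbon: 10 },
];
const spiky: Observation[] = [
  { timestamp: "2024-07-01T00:00:00Z", carbon: 2 },
  { timestamp: "2024-07-01T01:00:00Z", carbon: 26 },
  { timestamp: "2024-07-01T02:00:00Z", carbon: 2 },
];

// Both collapse to the same summary value, so the total alone cannot
// tell us which time series produced it.
console.log(aggregate(steady), aggregate(spiky));
```

Many different manifests map to the same summary, so there is no unique manifest to recover from a data model.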

Do the current data models serve the needs of software products?

Possibly not. Honestly, I need to dive deeper to judge this, but my initial sense is that a software-specific data model extension is needed. There might even need to be data model extensions for subcategories of software, e.g. LLMs, games, SaaS, blockchains, etc.

There seems to be high potential for the lessons we've learned in developing manifests to be useful in defining a software-specific data model extension, and then we could tailor an IF exhaust script to target those data models.

Can we build an IF -> PACT pipeline?

My a priori assumption is that we can do this. The main challenge is to align the initial manifest configuration with the requirements of the data model. But a flow from observing usage metrics -> executing a computation pipeline -> doing any necessary unit conversions -> aggregating -> exporting to data model format -> serving via PACT API seems achievable.
There are some interesting nuances, such as the way each system treats units. We allow users to prescribe their own units, but track them in metadata. The idea is to be able to audit units through a pipeline, and eventually we'd like to do some static analysis, almost like compiler checks, using the metadata to ensure units are handled correctly right through a pipeline. PACT, on the other hand, asserts that declared units have to be one of a set of enum variants.
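The contrast between free-form units tracked in metadata and a closed enum can be sketched as a pre-export audit. The unit list below is an illustrative assumption and should be replaced with the values defined in the PACT spec; `ParameterMetadata` is likewise a hypothetical, simplified shape for IF's parameter metadata.

```typescript
// Illustrative closed set of allowed units. The real list is defined by the
// PACT spec and should be taken from there; these values are an assumption.
const ALLOWED_UNITS = ["kilogram", "liter", "kilowatt hour", "megajoule"] as const;
type AllowedUnit = (typeof ALLOWED_UNITS)[number];

// Assumed shape of per-parameter metadata that records a free-form unit.
interface ParameterMetadata {
  name: string;
  unit: string;
}

// Type guard: narrows a free-form string to the closed set.
function isAllowedUnit(unit: string): unit is AllowedUnit {
  return (ALLOWED_UNITS as readonly string[]).indexOf(unit) !== -1;
}

// Audit every declared unit before export and report those that do not fit.
function auditUnits(params: ParameterMetadata[]): string[] {
  return params
    .filter((p) => !isAllowedUnit(p.unit))
    .map((p) => `${p.name}: unit "${p.unit}" is not in the allowed set`);
}

const problems = auditUnits([
  { name: "energy", unit: "kilowatt hour" },
  { name: "carbon", unit: "gCO2e" },
]);
console.log(problems);
```

Running a check like this before the export step would surface unit mismatches early, which is in the spirit of the compiler-style checks described above.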

Next steps

I think ultimately we can make these systems interoperable, and that this could be well worth investing some time and energy into. However, seeing as IF is mostly for computing manifests and PACT is mostly for exchanging the resulting data, with the expectation that the data has been generated according to the GHG protocol, two fundamental questions emerge:

Can IF perform computations that conform to the GHG protocol? If so, reformatting outputs for PACT should be no problem.
Can PACT handle alternative measurement protocols to GHG? If so, we have a larger design space to work in.

We can start a discussion here with the aim of designing a prototype exhaust plugin for IF, and see what issues and opportunities we bump into!

jmcook1186 · 2024-07-11T11:48:21Z

jmcook1186
Jul 11, 2024
Maintainer Author

cc @grugnog @jawache

jawache · 2024-07-23T15:09:29Z

jawache
Jul 23, 2024
Maintainer

Thanks @jmcook1186, great analysis.

I would lean towards two areas:

Can an IF manifest file format (and eventual IMP ISO spec) be "the" communication format for PACT, in the same way that HTML is the format of the web and HTTP is the communication mechanism? If so, that will build true interoperability across the board, and it matches some of the conversations we're having outside of this: if a piece of software is communicating an environmental impact, communicate it as a manifest file. I mean, even if everyone just spoke YAML we'd be a long way towards interoperability, but if they spoke the same dialect of YAML we'd be even further along.
I agree that perhaps an IF manifest file can be the mechanism for an organization to generate its own bespoke data-pack. I think it speaks to the complexity and bespokeness of each organization's software stack that they would need their own manifest files.

Personally, I think that a manifest file that generates an SCI score would only need to be adjusted by a few plugins to turn it into a manifest that generates a GHG value. In fact, maybe a manifest file can just generate both. But we've not really explored GHG much so far, so I'm guessing here.
