Interoperability between IF and PCF Data Exchange protocol #905
Thanks @jmcook1186, great analysis. I would lean towards two areas:
Personally, I think a manifest file that generates an SCI score would need only a few plugins adjusted to become a manifest that generates a GHG value. In fact, a single manifest file might be able to generate both. But we haven't really explored GHG much so far, so I'm guessing here.
What is PCF?
The PCF data exchange protocol specifies an API that serves product carbon footprint data in a specific format (see https://wbcsd.github.io/data-exchange-protocol/v2/#dt-pf). It's a concept defined by PACT.
If I've understood correctly, each node/server keeps a database of pre-calculated carbon footprints, and if requests cannot be served using data in the existing database (because it is missing or expired) then there is a separate API method that triggers new calculations.
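To make that lookup-then-recalculate behaviour concrete, here is a minimal sketch. It is not the PACT API: the class, method and field names (`HostSystem`, `lookup`, `request_calculation`, `valid_until`) are all hypothetical, and it only illustrates my reading of the flow.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Optional


@dataclass
class StoredFootprint:
    product_id: str
    kg_co2e: float
    valid_until: datetime  # after this moment the record counts as expired


class HostSystem:
    """Toy stand-in for a host system (hypothetical, not the PACT API)."""

    def __init__(self, calculate: Callable[[str], StoredFootprint]):
        self._db: Dict[str, StoredFootprint] = {}
        self._calculate = calculate
        self.calculations_triggered = 0

    def lookup(self, product_id: str, now: datetime) -> Optional[StoredFootprint]:
        """Serve from the existing database, or None if missing or expired."""
        record = self._db.get(product_id)
        if record is None or record.valid_until <= now:
            return None
        return record

    def request_calculation(self, product_id: str) -> StoredFootprint:
        """Separate method that triggers a new calculation and stores the result."""
        self.calculations_triggered += 1
        record = self._calculate(product_id)
        self._db[product_id] = record
        return record
```

A caller would try `lookup` first and fall back to `request_calculation` only when it returns `None`.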
They call nodes "host systems" - they expose the API and can make secondary requests to individual permissioned hosts for specific datasets. The idea is to facilitate peer-to-peer exchange of product carbon footprint assessments in a way that is interoperable and intercomparable.
The data exchanged has to conform to a certain schema. The basic schema is known as a data model, which can be thought of as equivalent to an interface in programming. Then, for specific industries or use-cases there can be data model extensions that build on top of the basic data model, just like extending an interface.
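The interface analogy can be sketched in a few lines. The field names below are illustrative only and are not copied from the PACT specification; `SoftwareFootprint` is a hypothetical extension invented for this example.

```python
from dataclasses import dataclass


@dataclass
class BaseFootprint:
    """Hypothetical core data model: fields every record must carry."""
    product_id: str
    declared_unit: str
    kg_co2e: float


@dataclass
class SoftwareFootprint(BaseFootprint):
    """Hypothetical extension: adds fields without changing the core ones."""
    functional_unit: str
    energy_kwh: float


def summarise(footprint: BaseFootprint) -> str:
    # Written against the base model, so it accepts any extension as well.
    return f"{footprint.product_id}: {footprint.kg_co2e} kgCO2e per {footprint.declared_unit}"
```

Anything written against the base model keeps working when it is handed an extended record, which is the property that makes extensions safe to exchange.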
From my reading of the docs, the expectation is that organizations do their own measurements and calculations according to the GHG protocol, and then export the results into an appropriate data model for peer-to-peer exchange via the PACT REST API. IF, by contrast, also retrieves observations and generates results before exporting them in the manifest file format.
The question I want to explore is whether IF and PACT can be interoperable.
e.g.
Can IF export PACT-conformant data?
I think it could be very achievable to create an IF exhaust script that transcribes an executed IF manifest into a PACT data model format.
The bulk of the plugin logic would be validation to ensure the data conforms to the data model standards, and then it's a case of populating a JSON object with values from the manifest file.
Given that IF itself does not have strict rules around naming or units, it's likely that you'd want to know in advance that the IF results were destined for a PACT data model so you can get your naming conventions and units configured, but it is also possible to add plugins that map elements between the two formats.
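As a rough sketch of what such an exhaust script might do, assuming the executed manifest has already been loaded into a dict with a results summary under `tree.aggregated` (adjust to wherever your manifest keeps its totals). `FIELD_MAP`, `REQUIRED` and the target field names are hypothetical and would need to be replaced with the real data model's names and rules.

```python
from typing import Any, Dict

# Hypothetical mapping from manifest output names to data-model field names.
FIELD_MAP = {"carbon": "kgCo2e", "energy": "energyKwh"}

# Hypothetical required fields and the types they must have.
REQUIRED = {"productId": str, "declaredUnit": str, "kgCo2e": (int, float)}


def validate(record: Dict[str, Any]) -> None:
    """Raise ValueError if a required field is missing or has the wrong type."""
    for name, expected in REQUIRED.items():
        if name not in record:
            raise ValueError(f"missing required field: {name}")
        if not isinstance(record[name], expected):
            raise ValueError(f"field {name} has unexpected type {type(record[name]).__name__}")


def to_data_model(manifest: Dict[str, Any], product_id: str, declared_unit: str) -> Dict[str, Any]:
    """Transcribe the manifest's summary into a flat, validated record."""
    summary = manifest["tree"]["aggregated"]  # assumed location of the totals
    record: Dict[str, Any] = {"productId": product_id, "declaredUnit": declared_unit}
    for source_name, target_name in FIELD_MAP.items():
        if source_name in summary:
            record[target_name] = summary[source_name]
    validate(record)
    return record
```

Keeping the name mapping in data rather than in code is one way to handle the naming mismatch: the same script can serve manifests with different conventions by swapping the map.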
While the validation logic might be pretty substantial, the IF -> PACT flow feels like a lighter lift than the converse. In fact, it might not even make sense to do it in reverse in most cases. The idea of the manifest is to have very granular time series data and to capture fine details about an application's architecture. That granularity is needed to enable the computations and transformations defined by the pipeline, but if I've understood correctly, it's really only the results summary that's captured in the PACT data model.
manifest -> data model is lossy and therefore not really doable in reverse.
Do the current data models serve the needs of software products?
Possibly not. Honestly, I need to dive deeper to judge this, but my initial sense is that a software-specific data model extension is needed. There might even need to be data model extensions for subcategories of software, e.g. LLMs, games, SaaS, blockchains, etc.
The lessons we've learned in developing manifests could well be useful in defining a software-specific data model extension, and then we could tailor an IF exhaust script to target those data models.
Can we build an IF -> PACT pipeline?
My a priori assumption is that we can do this. The main challenge is to align the initial manifest configuration with the requirements of the data model. But a flow from observing usage metrics -> executing a computation pipeline -> doing any necessary unit conversions -> aggregating -> exporting to data model format -> serving via PACT API seems achievable.
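Here is a toy version of that flow, stopping short of the serving step. Everything in it is an assumption made for illustration: the observation shape (`energy_kwh`), the single grid-intensity factor, and the output field names are hypothetical, and a real pipeline would be a chain of IF plugins rather than plain functions.

```python
from typing import Any, Dict, List


def compute(observations: List[Dict[str, float]], grid_intensity_g_per_kwh: float) -> List[float]:
    """Computation step: turn observed energy use into grams of CO2e."""
    return [obs["energy_kwh"] * grid_intensity_g_per_kwh for obs in observations]


def grams_to_kilograms(values_g: List[float]) -> List[float]:
    """Unit conversion step: the target format here is assumed to want kilograms."""
    return [value / 1000.0 for value in values_g]


def aggregate(values_kg: List[float]) -> float:
    """Aggregation step: collapse the time series into one total."""
    return sum(values_kg)


def export(total_kg: float, product_id: str, declared_unit: str) -> Dict[str, Any]:
    """Export step: populate a flat record with hypothetical field names."""
    return {"productId": product_id, "declaredUnit": declared_unit, "kgCo2e": total_kg}


def run_pipeline(observations: List[Dict[str, float]], grid_intensity_g_per_kwh: float,
                 product_id: str, declared_unit: str) -> Dict[str, Any]:
    grams = compute(observations, grid_intensity_g_per_kwh)
    kilograms = grams_to_kilograms(grams)
    return export(aggregate(kilograms), product_id, declared_unit)
```

Note that the per-timestep values disappear at the aggregation step, which is why the resulting record cannot be turned back into a manifest.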
There are some interesting nuances, such as the way each system treats units. We allow users to prescribe their own units but track them in metadata. The idea is to be able to audit units through a pipeline, and eventually we'd like to do some static analysis, almost like compiler checks, using the metadata to ensure units are handled correctly right through a pipeline. PACT, on the other hand, asserts that declared units have to be one of a set of enum variants.
Next steps
I think ultimately we can make these systems interoperable, and that this could be well worth investing some time and energy into. However, seeing as IF is mostly for computing manifests and PACT is mostly for exchanging the resulting data, with the expectation that data has been generated according to the GHG protocol, two fundamental questions emerge:
We can start a discussion here with the aim of designing a prototype exhaust plugin for IF, and see what issues and opportunities we bump into!