Research Writing
A Data Contract is a Service agreement between producer and consumer with attribute dependencies for downstream Data Product evolution with dedicated lineage. A Data Contract can provide tools for collaboration on data requirements as product promises within a shared context that informs policies for contract mutation alongside Data Product releases.
A Data Contract’s Product Promises are what the data product owners expect from its data consumer up to the latest block of information. These promises may include data quality, data usage terms and conditions, schema, service objectives, billing, etc. Data Contract policy mutations cascade downstream as bilateral agreements that “fork” lineage as a new Data Product version. For example, the consumer takes the risk of violating privacy. Data Producers create Data Contracts on Organization and Business Terms. The consumer of the Data Contract enforces Governance policies. The producer of the Data Contract owns the Data Product if the organization doesn't have a Governance body.
Governance policies are discussed between data producers and consumers to agree upon data producer requirements. These discussions should culminate in an amenable data structure / dataset. Structured data is conducive to pre-existing policies and requires less discussion. Less structured data will need more discussion and more policy feedback loops. We need a Minimal Viable Data Contract that includes what an organization needs in order to govern, and that supports policy feedback loops in a way that guides discussion and balances the prioritization of outcomes and methodologies.
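
As a rough illustration of what a Minimal Viable Data Contract could hold, the sketch below models a contract as a producer, a consumer, a set of product promises, and a version lineage in which every mutation forks a new version rather than editing the old one. The class name `DataContract`, its fields, and the example promise keys are all hypothetical; the source does not define a concrete schema.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class DataContract:
    """Hypothetical minimal contract: parties, promises, and version lineage."""
    producer: str
    consumer: str
    promises: dict  # e.g. schema, data quality, usage terms, service objectives
    version: int = 1
    parent: Optional["DataContract"] = None  # the contract this one was forked from

    def mutate(self, **changed_promises) -> "DataContract":
        """Fork a new contract version instead of changing this one in place."""
        merged = {**self.promises, **changed_promises}
        return replace(self, promises=merged, version=self.version + 1, parent=self)

    def lineage(self) -> list:
        """Version numbers from the root contract down to this one."""
        node, versions = self, []
        while node is not None:
            versions.append(node.version)
            node = node.parent
        return versions[::-1]


# Illustrative values only (assumed, not from the source).
v1 = DataContract(
    producer="sales-team",
    consumer="analytics-team",
    promises={"schema": {"account_id": "str"}, "freshness_hours": 24},
)
v2 = v1.mutate(freshness_hours=6)  # a policy feedback loop tightens one promise

print(v2.lineage())                   # [1, 2]
print(v1.promises["freshness_hours"])  # 24 (the earlier version is unchanged)
```

Keeping each version immutable means a policy discussion can always point back at the exact promises that were in force for an earlier Data Product release.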
Interdependent data domains have sub-domains with identifiers for generating Data Products. CAT Nodes will generate and execute Virtual Data Products composed as Data Contracts that enforce Data Provenance using Bills of Materials (BOMs). BOMs are CATs' Content-Addressed Data Provenance records for verifiable data processing and transport on a Mesh network of CAT Nodes. Data Contracts will contain BOM lineages and act as block headers for Content-Addressed Transformer (CAT) instances. Data Products are mutated during policy feedback loops informed by collaborators communicating their understanding of knowledge domains. Collaborators will identify knowledge sub-domains with references and will access sub-domains using Content-Addresses. Access is federated via knowledge domain hierarchies in abstractions that enable collaborators to participate in governance cycles by leveraging their understanding of knowledge.
CATs Data Products will consist of Data Contracts with provenance as executable BOM lineages, and will act as block headers for Content-Addressed Transformer (CAT) instances that contain Data Assets. BOMs are CATs' Content-Addressed Data Provenance records for verifiable data processing and transport on a Mesh network of CAT Nodes that can contain Data Assets. A data asset may be a system or application output (dataset) that is accessible and holds value for an organization or individual. A Data Asset's value can derive from the data's potential for generating insights, informing decision-making, contributing to product development, enhancing operational efficiency, or creating economic benefits through its sale or exchange.
CATs' Content-Addressed Data Assets are processed and then sold, exchanged, or published on CATs’ Data Mesh via CAT Nodes, and are subsumed by downstream CATs’ Data Products. Data Assets consist of the following:
- Data Domains - "A predefined or user-defined Model repository object that represents the functional meaning of an" attribute "based on column data or column name such as" account identification.
- Data Objects - Content-Addresses of data sources used to extract metadata for analysis.
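
To make "Content-Address" concrete, here is a minimal sketch in which the address of a data object is a hash of its content, so identical content always resolves to the same address and any change produces a new one. It assumes SHA-256 over a canonical JSON encoding purely for illustration; the source does not say which addressing scheme CATs use, and `content_address` is a hypothetical helper.

```python
import hashlib
import json


def content_address(obj) -> str:
    """Hypothetical helper: SHA-256 hex digest of a canonical JSON encoding."""
    # Sorting keys and fixing separators makes the encoding independent of key order.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


a = content_address({"account_id": "A-1", "balance": 10})
b = content_address({"balance": 10, "account_id": "A-1"})  # same content, different key order
c = content_address({"account_id": "A-1", "balance": 11})  # one value changed

print(a == b)  # True
print(a == c)  # False
```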
CATs are governable and support multi-disciplinary collaboration on data processing because CATs’ Architectural Quantum is an abstract governance model enforced within CATs’ Bills-Of-Materials (BOMs), in which knowledge domains are represented as metadata of data provenance records to support domain ownership.
BOMs are unique identifiers that provide the means of data production (assembly) and transportation as reproducible lineage contextualised by knowledge domains for federated governance. BOMs consist of Data Product service Orders of data processing that are Invoiced as fulfillments of service agreements specified by Data Products’ Data Contracts.
Federated Governance is enabled by BOMs for the following reason. Domain-specific data provenance BOMs establish the legitimacy of network policy changes suggested by Fractional Stewards of Data Products by enabling them to identify data quality issues at their source on a self-serviced Data Platform of many Data Products.
CATs enable Fractional Stewards to do this because historical data production is contextualised and reproducible within the scope of their knowledge domains by design, during development and production, as a requirement of a service Order. CATs data processes submitted by their service Orders are Invoiced to fulfil agreements within Data Products’ Data Contracts.
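
The Order / Invoice / BOM relationship described above can be sketched as three small records chained by content address: an Order names the contract, the process, and the input; an Invoice records the output that fulfilled it; and a BOM ties the two together and points at its parent BOM to form a lineage. All names here (`Order`, `Invoice`, `BOM`, `run_order`, `address`) and the use of SHA-256 are assumptions made for illustration, not the CATs implementation.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


def address(record: dict) -> str:
    """Hypothetical content address: SHA-256 of a key-sorted JSON encoding."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Order:
    contract_id: str  # the Data Contract this processing is ordered under
    function_id: str  # name of the data process to run
    input_id: str     # content address of the input data


@dataclass(frozen=True)
class Invoice:
    order_id: str     # content address of the fulfilled Order
    output_id: str    # content address of the produced data


@dataclass(frozen=True)
class BOM:
    order_id: str
    invoice_id: str
    parent_bom_id: Optional[str] = None  # link to the upstream BOM (lineage)


def run_order(contract_id, transform, data, parent_bom_id=None):
    """Execute one Order and return (output, BOM, BOM content address)."""
    order = Order(contract_id, transform.__name__, address({"data": data}))
    output = transform(data)
    invoice = Invoice(address(asdict(order)), address({"data": output}))
    bom = BOM(invoice.order_id, address(asdict(invoice)), parent_bom_id)
    return output, bom, address(asdict(bom))


def double(values):
    return [2 * v for v in values]


out1, bom1, id1 = run_order("contract-v1", double, [1, 2, 3])
out2, bom2, id2 = run_order("contract-v1", double, out1, parent_bom_id=id1)

print(out2)                       # [4, 8, 12]
print(bom2.parent_bom_id == id1)  # True
```

Because every identifier is derived from content, re-running the same Order on the same input yields the same BOM address, which is one way to read "reproducible lineage" in this sketch.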
A Data Contract is a Service agreement between producer and consumer with attribute dependencies for downstream Data Product evolution with dedicated lineage. Governance policy discussions between data producers and consumers in policy feedback loops about data production requirements should balance the prioritization of outcomes and methodologies, and should culminate in an amenable data structure / dataset.
“Data as an asset” enables the consumption, production, and prosumption of Data Assets on CATs Data Mesh
“Data as an asset” [0] conceptually emphasizes recognizing and treating data as a strategic investment that organizations can leverage to deliver future economic benefits by enabling the consumption, production, and prosumption of one's own data as an asset. Prosumption is the consumption and production of value, "either for self-consumption or consumption by others, and can receive implicit or explicit incentives from organizations involved in the exchange." [1]
The availability of high-quality and domain-specified Data Assets enables Data Products on inter-connected CAT Nodes on CATs Data Mesh to facilitate cross-functional asset utilization within Data Initiatives in a way that supports Data Sovereignty. "Data sovereignty refers to a group or individual’s right to control and maintain their own data, which includes the collection, storage, and interpretation of data." [2]
Registering and cataloging CATs can accelerate innovative Data Product creation and facilitate Data Sovereignty in Data Initiatives that discover and utilize “Data as an asset”. Data Products use and operate CAT Nodes to produce, register, and catalog “Data as an asset” as Data Assets that other Data Products on CATs Data Mesh can search for and discover. CATs Data Assets enhance strategic, operational, and analysis-informed decision-making by using BOMs as feedback loop mechanisms across domains in a way that suits specific collaborative contexts across organizations.
Data Products' CATs are executed by Data Contract deployments with Data Provenance: CATs are Ordered and then Invoiced within Bills of Materials (BOMs). BOMs are CATs' Content-Addressed Data Provenance records for verifiable data processing and transport on CAT Mesh. Data Contracts will contain BOM lineages and act as headers for Content-Addressed Transformer instances (CATs). Their inclusion of BOMs is necessary for organizations to rapidly mutate Data Products alongside discussions that affect product outcomes and development methodologies.
Data Products are mutated during stakeholder discussions about Data Contracts with respect to network policy / protocol. These discussions continuously inform multi-lateral Data Product agreements between stakeholders and collaborators that produce and consume data using BOMs as feedback loop mechanisms for (re)submitting CAT Orders. These discussions should also culminate in a CAT Order of amenable data structures / datasets for which processing is Invoiced within BOMs. Collaborators can participate in data-provenance-supported product development by Content-Addressing Data as an Asset.
- Governance Plane: z(t) is for the Stewardship of a Data Product Supply Network of CATs, represented as a Directed Acyclic Graph of Data Product Supply
- Control Plane: y(t) is for the Networking of what is Produced as a result of Science & Engineering CATs
- Action Plane: x(t) is for the Science & Engineering of Data Transformation as Computational Processing, a.k.a. CATs
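
The Governance Plane's view of supply as a Directed Acyclic Graph can be illustrated with Python's standard-library `graphlib`: each Data Product lists the products it consumes, and a topological sort gives an order in which they can be produced. The product names and the `is_acyclic` helper are hypothetical, chosen only to show the idea.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical supply network: each Data Product maps to the products it consumes.
supply = {
    "raw_accounts": set(),
    "clean_accounts": {"raw_accounts"},
    "risk_scores": {"clean_accounts"},
    "monthly_report": {"clean_accounts", "risk_scores"},
}


def is_acyclic(graph) -> bool:
    """Return True if the supply graph has no circular dependencies."""
    try:
        TopologicalSorter(graph).prepare()  # raises CycleError on a cycle
        return True
    except CycleError:
        return False


# Upstream products come before the products that consume them.
production_order = list(TopologicalSorter(supply).static_order())

print(production_order)
# ['raw_accounts', 'clean_accounts', 'risk_scores', 'monthly_report']
print(is_acyclic({"a": {"b"}, "b": {"a"}}))  # False
```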
Design Description
- Integrating CATs with LangGraph can enable a row-wise business function as a Chart Tool of Multi-Agent Collaboration (MAC), provided CAT Orders act as a Transfer (Network) Function implemented as an OOP Command Pattern whose CATs Ingress and Egress sub-processes can be executed by CATs’ Content-Addressable Router (CAR).
- Architectural Considerations: CATs can inform business decisions given the following:
  - Action Plane: x(t)
    - CAT Functions can be defined as LangGraph Call Tools executed by LangGraph's Tool Node
    - CAT Factory produces CAT Executors integrated with LangGraph's Tool Executor.
  - Control Plane: y(t) [aka Content-Addressable Router (CAR)]
    - CAR integrated with LangGraph's Router.
    - cadCAD (Network) Policies, aka “Algorithmic Suggestions”, can be deployed on LangGraph's Agent Nodes with specified Domain-Name references as Rule Asset RIDs
  - Governance Plane: z(t)
    - A GreyBox Model as a feature-parameterized Tensor Field with process variable (PV) as label
- The business function is a CATs Control & Action Matrix - a 2-dimensional representation of 3-dimensional space
- Action Plane: x(t)