Home
JEJodesty edited this page Jul 26, 2024 · 98 revisions
Update readme and refactor
- Review Designs within the context of Data Sovereignty
- Research CLI wrapper alternative to CDKTF
- Review Database Sharding within the context of Data Products’ data: https://aws.amazon.com/what-is/database-sharding/
- Review Value of data
- Verify CATs’ Project Update: Factory & Executor components; Invoice, Order, Function, Executor, & BOM Block Designs; Structure’s Ray Cluster Deployment on Kubernetes; BOM Initialization; CAT Node & Node Design
- Research Dynamic Terraform Providers for Plant Deployments
- Verify CATs’ Project Update: Structure Block Design, Data Service Collaboration Diagram, Ray Integration
- Watched Computational Governance Panel
- Review Ray documentation for InfraFunction Hooks
- Research Open Contracting Data Standard with respect to Data Product Teams: https://standard.open-contracting.org/latest/en/
- 1/22:
- Updated CATs integration tests and demo
- Resolved dependency bug
- Verify CATs’ Project Update: Process Component, Sub-Process Logging, Executor & Function Components
- 1/23:
- Updated Documentation and Demo
- Added License and Packaging for CATs
- Verify CATs’ Project Update: s3 & CoD Integration
- 1/24:
- Updated Documentation & Refactor
- CATs Data Verification
- Verify CATs’ Project Update: Updating Order Structure, Node, Service & Structure Components
- 1/25 - 1/26:
- Updated Documentation & Refactor
- Update Factory
- Reviewed Novo Nordisk Data Mesh Platform discussion
- Verify CATs’ Project Update: CATs s3 cache, BOM ERD
- Included Ubuntu 20.04 Installation Update
- Refactored CATs
- Researched CAT cache access management
- Research Economic Adapters for CATs from Ocean Protocol
- Research multilevel linked-list for CATs’ subgraph
- Research bidirectional mapping that supports a multilevel linked-list for CATs’ subgraph
- Consider Transducers for CAT MIMO
- Updated PR Template
- Review Model-Driven Engineering: https://en.wikipedia.org/wiki/Model-driven_engineering
- 2/12: Drafted CATs capabilities in GitHub Project and reviewed Activity Artifact Policy
- 2/13: Reviewed implementation examples of Data Contracts
- 2/14 - 2/15:
- Reviewed Data Mesh Roundtable Discussions about Data Contracts and “Agile” Data Products
- Attended Protocol Labs project updates
- 2/16:
- Research System Architecture layers and wrote notes as Data Contract Article for CATs
- Wrote Article: What does a CATs data contract do?
Data Mesh Resources:
- “Inside a Data Contract”: https://www.youtube.com/watch?v=ye4geXMuJKs
- “Agile in Data”: https://www.youtube.com/watch?v=XnstATam0jM
- Data Contract Articles: https://www.datamesh-architecture.com/#data-contract
Data Contract Implementation Examples:
- https://blog.det.life/data-contracts-a-guide-to-implementation-86cf9b032065
- https://levelup.gitconnected.com/create-a-web-scraping-pipeline-with-python-using-data-contracts-281a30440442
- https://docs.soda.io/soda/data-contracts.html
System Architecture:
- 2/19 - 2/21: Contextualize value of BOM within the context of Data as a Product that contains Data Contracts
- 2/22 - 2/23: Updated Readme informed by examples of Data Assets within the context of Machine-Readable Cataloging
- Wrote Article: What is a Content-Addressed Data Asset (CADA)?
Resources:
- https://www.loc.gov/marc/umb/um01to06.html
- https://docs.informatica.com/data-engineering/data-engineering-quality/10-2-1/business-glossary-guide/glossary-content-management/business-term-links/data-asset.html
- 2/26: Researched Digital Asset Management related Data Contracts and Data Mesh Registry & considered a Rule Asset being used for Network Policies in addition to Attribute Quality
- 2/27: Considered Data & Rule Assets for Data Mesh Registry Artifact Schema
- https://towardsdatascience.com/the-data-mesh-registry-a-window-into-your-data-mesh-20dece35e05a
- https://docs.informatica.com/data-engineering/data-engineering-quality/10-2-1/business-glossary-guide/glossary-content-management/business-term-links/data-asset.html
- https://docs.informatica.com/data-engineering/data-engineering-quality/10-2-1/business-glossary-guide/glossary-content-management/business-term-links/rule-asset.html
- 2/28: Verify CATs Executing FaaS on PaaS
- 2/29: Review Domain-Oriented Ownership with respect to Conway's law
- 3/1: Review the value of Data Column Lineage in establishing Domain-Oriented Ownership in CATs Invoice in a way that makes BOMs searchable and discoverable
- Wrote Article: What makes CATs Governable by including BOMs within Data Product’s Data Contracts?
- 3/4: Contextualize “Data as an asset” with CATs Architecture
- 3/5: Contextualize Data sovereignty with “Data as an asset” for CATs Data Mesh
- 3/6: Contextually map Data Contract initialization roles to cross-functional Operational Model for Data Products
- 3/7: Contextually map "Fractional Ownership" of "Decentralized Data Objects" ("DDOs" / "Data Assets") to "Data as an asset" and Data Partitioning / Sharding
- 3/8: Contextualize Ocean Protocol & CATs Architecture with prosumption
- Wrote Article: “Data as an asset” enables the consumption, production, prosumption of Data Assets on CATs Data Mesh
Resources:
- 3/11: Review Ocean Protocol Data NFTs and Datatokens and relate Hexagonal architecture to Data Contract SLAs
- https://docs.oceanprotocol.com/developers/contracts/datanft-and-datatoken
- https://en.wikipedia.org/wiki/Non-fungible_token#:~:text=A%20non%2Dfungible%20token%20(NFT,to%20be%20sold%20and%20traded.
- https://en.wikipedia.org/wiki/Hexagonal_architecture_(software)
- https://blog.thepete.net/blog/2020/09/25/service-templates-service-chassis/
- 3/12
- Review Bidirectional Mapping libraries for Data Mesh BOM graph for cataloged representation
- Review Custom Terraform Provider software that enables providers to be written in any language for CATs Plant
- Review Model-Based System Engineering and relate it to knowledge organization infrastructure
- https://medium.com/block-science/knowledge-networks-and-the-politics-of-protocols-af81ad0fa2d4
- 3/13 - 3/15
- Review 4 kinds of data moats within the context of data’s strategic value as a “data asset”
- Review Model-driven architecture approaches for CATs Architectural Quantum
- Review ocean.py for integration into CATs’ ingress and egress
- Review “Commons-based peer production” for CAT Node
- Updated CATs architecture, readme, and interactive logs
- 3/18 - 3/20: Contextualize the data contract creation team's role responsibilities into modern roles
- 3/21 - 3/22:
- Contextualize the modern data contract creation team's role responsibilities into CATs Control and Action planes for an operational model for the placement of Data Stewardship responsibilities
- Communicate the value of Data Contract inclusion in BOM below.
- Wrote Article: Why should Data Contracts be included in CATs' BOMs for Data Product development on a Data Mesh?
- 3/25 - 3/27:
- Review Bitol's Data Contract examples
- Review Data Contract Implementation Guide for CATs
- Review Wayfair's differentiation of Data Mesh design lean personas: Data Producer, Data Consumers, and Data Engineer
- Contextualize IBM's Knowledge Catalog as a DataOps tool in consideration of KMS and CAT-aloging
- Review Statistical Process Control to contextualize the inclusion of https://www.soda.io/
- Research data product life cycle to contextualize Data Product Manager, Data Steward, and Data Engineer
- 3/28 - 3/29:
- Contextualize a Federated Governance Model within Federated Computational Governance
- Research types of Data Valuation to avoid confirmation bias
- Contextualize Event-Driven programming for CAT Plant and Dataflow programming for CATs Process and InfraFunction
- 4/1 - 4/3:
- Research "Stewardship Fractalization" and System Architecture facilitating it and relate it to Data Stewardship
- Consider Dynamic Prompt engineering using Generative AI via an LLM for contextualization of CAT Actions that fulfill Data Contracts. These actions are initially contextualized with CATs Architectural Quantum.
- 4/4 - 4/5:
- Distinguish between Quantitative and Qualitative design drivers for end-user and data product consumer contextualization
- Consider a Streaming Data Integration for Stewardship lineage views and metadata management
- Consider each CAT Factory Client a Stream Broker as a Consumer and Producer (https://www.scaler.com/topics/kafka-broker/)
- Consider "IoT Edge-Application Management" for "IoT Analytics"
- Consider a language like SISAL for stream dataflow composition
- Review updated CoD Architecture
- Research how Analysts support domain-oriented ownership in consideration of data procurement
- Research "telemetry data pipelines" from starburst.io to contextualize a “telemetry-catalog” in "data lakehouse" as a flatfile store
- Consider Data Engineering pain points to split and contextualize Data Engineering within CATs Action & Control Planes
- Distinguish the difference between Data Lakes and Data Federation for the implementation of a data lake solution
- Research GPT to communicate a Federated Governance Model designed to be a GPT
- 4/15:
- Contextualize LLMs and Generative AI for Fractional Data Stewardship
- Reduce scope of Data Product with Stewardship Fractionalization dApp steps
- Note Dataflow Programming for CAT
- Note Data Flow Architecture for project definition
- Note Statistical process control (SPC) (as user responsibility)
- 4/16-18:
- Apply Manufacturing Production to BOM design with respect to Engineering & Manufacturing BOM types
- Contextualize CAT orders with a Transfer (Network) Function
- Contextually lift Mesh partnership with Model-Based Institution Design (MBID) and relate to Model-Based System Engineering in preparation to include Computer-Aided Governance in CATs3
- Research LangGraph for CAT Mesh reification
- Note different types of SBOMs for each CAT Arch Quantum SubComponents
- Consider Multi-Agent Conversation for row-wise business function
- https://arxiv.org/abs/2308.08155
- https://github.com/langchain-ai/langgraph/blob/main/examples/multi_agent/multi-agent-collaboration.ipynb
- Consider Pro-curation for on-boarding information onto CAT Mesh reflective of Prosumer
- Research integrating langgraph `tool_node` into CAR (Content-Addressable Router)
- Research integrating langgraph `tool_executor` into CATs' Executor
- Review LangChain Agents for Network Governance Reification graph state tracking
- Review "Knowledge Networks and the Politics of Protocols" within the context of Roles
- Review "Engineering for Legitimacy"
- Review Scaled and Leveled Stewardship
- Review contextualization of responsibilities based on Prompt Engineering Questions & general responsibilities of "Fractional Stewards"
- Review Project Roadmap for Stewardship Fractalization in consideration for CAT Team Dynamics
- Review Fractional Stewardship MVP approach in consideration to publishing a Policy development in Steward profile to Agent Nodes in LangGraph. These Policies are front loaded as "algorithmic suggestions"
- Note Abstract User Stories as application references
- Review "DAO Governance Model" for comparison to Federated Computational Governance Model
- Consider Marketing Steward using Prompt Engineering / partial input being a "Comparison Table/Matrix summarizing different Stewardship Organization/Solutions missions/purposes, designs and features"
- Removing s3 cache from CATs and replace with local storage solution
- Removed s3 cache from CATs and replaced with local storage solution
- Research adaptive Retrieval Augmented Generation (aRAG)
- Reviewed KMS-identity for integration into CATs
- Read "A Language for Studying Knowledge Networks: The Ethnography of LLMs"
- The Plant is a Transfer Function that accepts an Order as input and produces an output by executing a Function (Process) with an Executor (Actuator). The Plant exposes the control variable (u(t)) for the Control Feedback Loop, and the Function (Process) produces the process variable (y(t)). The Process Variable is the Statistical Process Control of CATs Dataset I/O (Ingress/Egress)
- Docker can be executed within an Alpine Linux Docker container ["Docker in Docker" (DinD)] for cadCAD's upcoming nested Block executions as a summation of the control variable (u(t)) that configures the CATs Data Product and the summation of the process variable (y(t))
- Concern: "Integral windup particularly occurs as a limitation of physical systems, compared with ideal systems, due to the ideal output being physically impossible (process saturation: the output of the process being limited at the top or bottom of its scale, making the error constant)."
- https://en.wikipedia.org/wiki/Integral_windup
- Alleviated by "A CAT at its core is a unit of computational work specified by the triplet 1) what the input is, 2) what does the computation, and 3) what the output is. Controllers require feedback, which is currently outside of the scope of a single cat. Any cyclic orchestration must be external to CATs." - BlockScience
- Alpine Linux Docker can be the execution paradigm of cadCAD and CATs Plant because both can run as Docker in Docker ("DinD") and functionally map cadCAD multi-dimensional blocks to CAT Functions
- Review RAG stewardship fractionalization context
- Review Software Governance with respect to fractional stewardship
- Consider a Stewardship Profile that maps to agents within a Multi-agent system
- Consider roles as Architectural Responsibilities with respect to RolePlayer
- Review Docker workload on-boarding for cat Refactor
- Consider homestar (Everywhere Computer network) for IPVM inclusion for "resilience, certainty or portability"
- Updated Bacalhau Node and refactor for CoD interoperability for CATs v3
- Exposed ingress and egress to action plane via Process with an interoperable integration point for CATs v3
- Included data product disciplines in CATs Architectural Quantum for CATs v3
- Implement InfraStructure Sub Component separately
- IPFS daemon initiated by CAT Node
- Partially implement function for applying SBOM
- Refactor InfraFunction to compose Processor, Plant, and InfraStructure
- Installed KMS locally for cat/rid Integration
- Bring your own cache; otherwise it is local (background: Expanso introduces breaking changes to Bacalhau without a stable release)
- Reviewed PID controller to suggest the Architectural purpose of CATs in consideration of a Monad
- Expressed Architectural purpose of CATs data mesh as a function
- Described design for Multi-Agent Collaboration (MAC) for CATs using Content-Addressable Router (CAR)
- Contextualize CATs explanation with "Data Mesh Architecture: Interoperability, Co-Operation, and Co-Regulation"
- Contextualize PID Controller within the context of CATs & cadCAD
- Consider a Monadic Transformer for CATs
- Investigating kuberay helm chart not being available due to GitHub server connectivity issues and trying alternative chart sources:
- Old Sources:
- Alternative Sources:
- Conclusion: Consider a chart directly from a cloned kuberay repo offline or hosting elsewhere
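
The control-loop notes above (the Plant as a Transfer Function, the control variable u(t), the process variable y(t), and the integral windup concern) can be illustrated with a small sketch. This is not CATs code: `Order`, `execute_cat`, `PIController`, and `run_loop` are hypothetical names, the process is a toy first-order model, and the clamping anti-windup shown is one common technique chosen for illustration. Each CAT execution is a single input, computation, output step; the feedback loop lives outside it, in line with the BlockScience quote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Hypothetical stand-in for a CAT Order; carries the desired y(t)."""
    setpoint: float


def execute_cat(u: float, y_prev: float) -> float:
    """One acyclic unit of work: input (u, y_prev) -> computation -> output y.

    Toy process (an assumption): y moves halfway toward u and saturates
    in [0, 10], which is the condition that can cause integral windup.
    """
    y = y_prev + 0.5 * (u - y_prev)
    return max(0.0, min(10.0, y))


@dataclass
class PIController:
    """Proportional-integral controller with clamping anti-windup."""
    kp: float
    ki: float
    u_min: float
    u_max: float
    integral: float = 0.0

    def step(self, setpoint: float, y: float, dt: float = 1.0) -> float:
        error = setpoint - y
        candidate = self.integral + error * dt
        u_unsat = self.kp * error + self.ki * candidate
        u = max(self.u_min, min(self.u_max, u_unsat))
        # Only commit the new integral when the actuator is not saturated,
        # so a constant, unreachable error cannot grow it without bound.
        if u == u_unsat:
            self.integral = candidate
        return u


def run_loop(order: Order, controller: PIController, steps: int = 50) -> float:
    """External cyclic orchestration: feeds y(t) back to compute u(t)."""
    y = 0.0
    for _ in range(steps):
        u = controller.step(order.setpoint, y)  # control variable u(t)
        y = execute_cat(u, y)                   # process variable y(t)
    return y


if __name__ == "__main__":
    controller = PIController(kp=0.5, ki=0.2, u_min=0.0, u_max=15.0)
    print(run_loop(Order(setpoint=5.0), controller, steps=100))
```

With a setpoint above the toy process ceiling (for example 20.0), the error stays constant once the output saturates, and the clamp keeps `controller.integral` from growing on every step.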