Migrating from Python-based ag-solo provisioning and setup #1238

Closed
18 of 25 tasks
michaelfig opened this issue Jun 29, 2020 · 3 comments

@michaelfig
Member

michaelfig commented Jun 29, 2020

What is the Problem Being Solved?

We want to have the entire Agoric provisioning process use idiomatic blockchain transactions (rather than Magic Wormhole) to provision new ag-solos, and migrate provisioning state to the successor chain via automated export and upgrade.

In the process, we will remove the Python provisioning-server entirely, along with the "controller" ag-solo vat and the ag-setup-solo Python setup script.

Description of the Design

Provisioning (swingset egress) state is already kept in the chain's Merkle tree (#1183), as are (trivially) the token balances of all cosmos accounts.

To migrate smoothly to the new implementation, we will (in order):

  • Translate all HTTP calls from the provisioning-server via the "controller" ag-solo into ag-cosmos-helper tx swingset provision-one ...
  • Bump/bootstrap the testnet so that all migrated and future provisioning is captured in the Merkle tree
  • Add an ag-setup-cosmos play export-genesis to explicitly export the chain state (as in Cosmos state export and restart #1231) from all nodes
  • Change the bootstrap process to:
    • merge any freshly-exported chain state with new genesis (except for validators, since we don't have zero-downtime upgrade) if, say, --import-from=node3 is given
    • override genesis params with the ones in chain-params.js
    • make genaccounts and create validators only if --import-from is not given; otherwise just unsafe-reset-all
    • configure a static web server as another terraformed asset, used instead of node0
    • publish the genesis.json and network-config on the static server
  • Create ag-setup-cosmos add-egress <nickname> <my-address> to contact the chain and add an egress
  • Create ag-setup-cosmos add-delegate <nickname> <address> to contact the chain and send staking tokens to the address
  • Create ag-solo add-chain to:
    • create a new ag-cosmos-helper address/keypair if not present or --reset is given
    • download network config and genesis.json and update local configuration
    • confirm connectivity from ag-cosmos-helper to the chain
    • if ag-cosmos-helper query swingset egress <my-address> fails, then prompt the user to join Keybase and request provisioning for the address
  • Modify ag-solo start to check egress if a chain is configured, prompting to add-chain again if failed
  • Implement a new, robust agoric_faucet bot that lets Agoric admins easily:
    • transfer staking tokens to a delegate address
    • add a new ag-solo egress to the chain with provision-one
  • At some future release, bump the testnet again to verify that the export of state works
  • Remove deprecated code:
    • remove the "controller" ag-solo from the bootstrap process and from node0
    • remove packages/cosmic-swingset/provisioning-server
    • remove packages/cosmic-swingset/setup-solo
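
As a rough illustration of the bootstrap merge described in the list above, here is a minimal Python sketch. It is not the deployment code: the function name build_genesis, the simplified app_state layout, and the choice of which modules count as validator state (VALIDATOR_MODULES) are all assumptions made for the example. chain_params stands in for the values in chain-params.js.

```python
import copy

# Assumption: modules whose exported state describes validators and so
# must NOT be imported (the plan has no zero-downtime upgrade).
VALIDATOR_MODULES = {"staking", "genutil"}


def build_genesis(new_genesis, chain_params, exported=None):
    """Return the genesis to publish for one bootstrap (hypothetical helper).

    new_genesis:  freshly generated genesis, as a dict
    chain_params: per-module param overrides (stand-in for chain-params.js)
    exported:     state exported from a node, e.g. for --import-from=node3,
                  or None when bootstrapping from scratch
    """
    genesis = copy.deepcopy(new_genesis)

    if exported is not None:
        for module, state in exported.get("app_state", {}).items():
            if module in VALIDATOR_MODULES:
                continue  # keep the new genesis's validator state
            genesis["app_state"][module] = copy.deepcopy(state)

    # Overrides are applied last, so they win over imported params.
    for module, params in chain_params.items():
        module_state = genesis["app_state"].setdefault(module, {})
        module_state.setdefault("params", {}).update(params)

    return genesis
```

Applying the overrides after the import means a stale parameter in the exported state cannot survive into the successor chain, which seems to be the intent of listing the override as its own step.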

Security Considerations

We introduce a new static https://testnet.agoric.com site, which we should verify has a reliable and secure default configuration while still allowing straightforward content changes from the bootstrap script.

Test Plan

  1. Test the bootstrap process locally under Docker
    • check that provisioning with ag-setup-solo works as before
  2. After add-chain is created, check that provisioning works with ag-solo add-chain <docker-static-web-address>
    • introduce the egress with ag-setup-cosmos add-egress me agoric1926eb97ec...
  3. Check that a second bootstrap allows add-chain to work without needing to add the egress address a second time
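
The egress check that the add-chain tests above exercise could be sketched in Python as follows. This is only an illustration: it assumes the query signals a missing egress through a non-zero exit status, and it omits any node or chain-id options a real invocation might need. The helper names has_egress and egress_prompt are hypothetical; the run parameter exists so the check can be tested without a chain.

```python
import subprocess


def has_egress(address, run=subprocess.run):
    """Return True if the egress query for `address` succeeds.

    Assumption: `ag-cosmos-helper query swingset egress <address>` exits
    non-zero when no egress has been provisioned for the address.
    """
    result = run(
        ["ag-cosmos-helper", "query", "swingset", "egress", address],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def egress_prompt(address, run=subprocess.run):
    """Return None if provisioned, otherwise the message to show the user."""
    if has_egress(address, run):
        return None
    return (
        f"No egress found for {address}. "
        "Join Keybase and request provisioning for this address."
    )
```

After a second bootstrap that imports the exported state, egress_prompt should return None for an address that was provisioned before the bump, without add-egress being run again.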
@michaelfig michaelfig added the enhancement New feature or request label Jun 29, 2020
@michaelfig michaelfig self-assigned this Jun 29, 2020
@michaelfig
Member Author

@warner, would you PTAL at this plan?

@warner
Member

warner commented Jun 30, 2020

That sounds reasonable to me. "migrating the egresses" sounds accurate but weird, so let me write down what I think it means from an E-era vat-centric perspective (I'm imagining @erights as my audience here):

  • Vat 0 was created with exactly one magic exported object. This object responds to a provision message that creates a new bundle of interesting user-facing stuff and exports it as an object reference. This message requires some low-level pubkey handshake to happen.
  • The result of provision is that some external machine now has the right crypto bits and the right kind of c-list entries to get a copy (the only copy) of that bundle reference. Let's call this external machine A0. We record the crypto bits used in the provision call for later.
  • The magic provision object isn't referenced through a c-list. It's magic.
  • provision is called many times during Vat 0's lifetime. Many external machines have crypto bits that let them talk to the bundle objects in Vat 0. Machines A0, B0, C0, etc.
  • Later, we shut down Vat 0. We also shut down machines A0, B0, C0, etc.
  • We start up Vat 1, with new code.
  • We copy the cosmos token allocation table from Vat 0 (which maps cosmos-sdk pubkey to balance).
  • We take the crypto bits we recorded from before and effectively replay all the provision messages. This creates new bundle objects (because Vat 1 has maybe different code than Vat 0 did). We put them into c-lists which are indexed by the same external pubkeys as before.
  • We tell A to start up a new machine A1, with the same crypto bits as before. A1 can pretend that they participated in the provisioning process, but they don't have to do it interactively: we use the pubkey they submitted the first time around. A1 winds up with access to Vat 1's bundle for A.
  • B1, C1, etc do the same thing.
  • Only the initial bundle reference is provided to the new vats. All other state from Vat 0 is lost, so whatever A0/B0/C0 did with their Vat 0 bundles, and whatever other objects they obtained, is gone. A1/B1/C1 are new machines with just the one initial provisioning step made for them.

Of course in the old world we talked about Vats, and in the new world we talk about swingsets (which contain multiple vats).

So from a vat/cap-tp point of view, this provisioning/export step is a small shortcut: the only thing it does is let us avoid re-gathering pubkeys from all the participants each time we replace the testnet with a new one. They're still starting new machines each time; they're just not starting them entirely from scratch.
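
To make the replay concrete, here is a toy Python model of the sequence described above. It is purely illustrative and not Agoric code: the Swingset class, its fields, and the successor function are hypothetical names, and a bundle is represented by a plain string.

```python
class Swingset:
    """Toy model of one testnet generation (hypothetical, for illustration)."""

    def __init__(self, version, balances=None):
        self.version = version
        self.balances = dict(balances or {})  # pubkey -> token balance
        self.provisioned = []                 # recorded pubkeys, in order
        self.clists = {}                      # pubkey -> initial bundle reference
        self.other_objects = {}               # anything obtained after provisioning

    def provision(self, pubkey):
        """Create a fresh bundle for `pubkey` using this generation's code."""
        if pubkey not in self.clists:
            self.provisioned.append(pubkey)   # record the crypto bits for later
        bundle = f"bundle-v{self.version}-for-{pubkey}"
        self.clists[pubkey] = bundle
        return bundle


def successor(old, version):
    """Build the next generation from the old one's recorded provisioning."""
    new = Swingset(version, balances=old.balances)  # copy the token table
    for pubkey in old.provisioned:
        new.provision(pubkey)                       # replay, non-interactively
    # old.other_objects is deliberately not copied: that state is lost.
    return new
```

In this model, a machine restarted with the same pubkey finds a new-generation bundle waiting in its c-list entry, while everything it accumulated in the previous generation is absent.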

@michaelfig
Member Author

The original purpose of this issue is satisfied. The plan is incomplete, but I am closing the issue now that our emphasis has shifted to a more decentralized network (its operation won't require the deployment package).

2 participants