
feat(multichain-testing): stakeIca contract e2e test #9534

Merged
merged 13 commits into master from 9042-stake-atom-e2e-rebased
Jul 3, 2024

Conversation

@0xpatrickdev (Member) commented Jun 19, 2024

closes: #8896

Description

Adds e2e tests and relevant tooling for stakeIca.contract.js to multichain-testing. More specifically:

  • Adds multichain-testing/tools/deploy.ts to: 1) build a contract and proposal with agoric run (local bin), 2) copy the files to the container, and 3) run installBundles and runCoreEval (see the sketch after this list)
  • Adds logic to gather chain info from the Starship environment (registry node on localhost:8081) and execute the revise-chain-info proposal in local testing and CI
  • Tests happy-path wallet flows of stakeIca via the stakeOsmo and stakeAtom instances
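
For orientation, the deploy flow described in the first bullet can be pictured roughly as below. This is a sketch only: the signatures of makeAgdTools/makeDeployBuilder and the builder script path are assumptions, not the PR's literal code.

```ts
import childProcess from 'node:child_process';
import fsp from 'node:fs/promises';
import { makeAgdTools } from '../tools/agd-tools.js';
import { makeDeployBuilder } from '../tools/deploy.js';

async function main() {
  // wraps agd/kubectl access to the Starship pods (signature assumed)
  const agdTools = await makeAgdTools(console.log, childProcess);
  const deployBuilder = makeDeployBuilder(
    agdTools,
    fsp.readFile,
    childProcess.execFileSync,
  );
  // 1) `agoric run` the proposal builder, 2) copy bundles and the plan into the
  // container, 3) install the bundles and submit the core-eval proposal
  await deployBuilder('path/to/proposal.builder.js'); // placeholder path
}

main().catch(err => {
  console.error(err);
  process.exit(1);
});
```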

Areas for improvement:

  • Notifiers for wallet offer results are not working correctly here. Instead of relying on them, we poll vstorage on an interval (see makeRetryUntilCondition, sketched below) for the initial offer, and we verify the behavior by querying state on the remote chains. If an offer result is an error, that is not currently captured; see orchestration e2e testing: validate offer results #9643
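
The polling helper referred to above works along these lines; a minimal sketch, with names, defaults, and the vstorage path assumed rather than taken from the repo:

```ts
// Interval polling in the spirit of makeRetryUntilCondition (illustrative only).
const retryUntilCondition = async <T>(
  operation: () => Promise<T>,
  condition: (result: T) => boolean,
  message: string,
  { maxRetries = 30, retryIntervalMs = 2000 } = {},
): Promise<T> => {
  for (let i = 0; i < maxRetries; i += 1) {
    const result = await operation();
    if (condition(result)) return result;
    await new Promise(resolve => setTimeout(resolve, retryIntervalMs));
  }
  throw Error(`timed out waiting for: ${message}`);
};

// Example shape of how a test might wait for a makeAccount offer to appear in
// the smart wallet's `.current` vstorage node (client and path are assumed):
// await retryUntilCondition(
//   () => vstorageClient.queryData(`published.wallet.${address}.current`),
//   current => current.offerToUsedInvitation.some(([id]) => id === offerId),
//   `${offerId} accepted`,
// );
```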


cloudflare-workers-and-pages bot commented Jun 20, 2024

Deploying agoric-sdk with Cloudflare Pages

Latest commit: 55f1896
Status: ✅  Deploy successful!
Preview URL: https://26faff3b.agoric-sdk.pages.dev
Branch Preview URL: https://9042-stake-atom-e2e-rebased.agoric-sdk.pages.dev


@0xpatrickdev force-pushed the 9042-stake-atom-e2e-rebased branch from f0c9bb1 to 37f2bd3 on June 20, 2024 03:51
mergify bot added a commit that referenced this pull request Jun 21, 2024
refs: #8896

## Description

Extracted from #9534 to focus that PR on multichain and lighten the review. Also it had merge conflicts with master that this resolves.

### Security Considerations
nothing new

### Scaling Considerations
no

### Documentation Considerations
none

### Testing Considerations
new coverage

### Upgrade Considerations
none
@LuqiPan (Contributor) left a comment:

Some initial comments


```sh
make setup-kind
make setup
```

Contributor:

I'm seeing an error running this command, but I'm not quite sure why. Did you happen to see this error before?

Expand to see logs (it's long):

```
make setup
bash /Users/luqi/github/Agoric/agoric-sdk/multichain-testing/scripts/dev-setup.sh
All binaries are installed
kind create cluster --name agship
Creating cluster "agship" ...
 ✓ Ensuring node image (kindest/node:v1.30.0) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✗ Starting control-plane 🕹️
Deleted nodes: ["agship-control-plane"]
ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged agship-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1
Command Output: I0625 00:52:21.621598     139 initconfiguration.go:260] loading configuration from "/kind/kubeadm.conf"
W0625 00:52:21.631689     139 initconfiguration.go:348] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
[init] Using Kubernetes version: v1.30.0
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0625 00:52:21.660190     139 certs.go:112] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
I0625 00:52:21.832105     139 certs.go:483] validating certificate period for ca certificate
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [agship-control-plane kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local localhost] and IPs [10.96.0.1 172.18.0.2 127.0.0.1]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0625 00:52:22.096770     139 certs.go:112] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
I0625 00:52:22.191994     139 certs.go:483] validating certificate period for front-proxy-ca certificate
[certs] Generating "front-proxy-client" certificate and key
I0625 00:52:22.277570     139 certs.go:112] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
I0625 00:52:22.389020     139 certs.go:483] validating certificate period for etcd/ca certificate
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [agship-control-plane localhost] and IPs [172.18.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [agship-control-plane localhost] and IPs [172.18.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
I0625 00:52:22.775910     139 certs.go:78] creating new public/private key files for signing service account users
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
I0625 00:52:22.918423     139 kubeconfig.go:112] creating kubeconfig file for admin.conf
[kubeconfig] Writing "admin.conf" kubeconfig file
I0625 00:52:23.175013     139 kubeconfig.go:112] creating kubeconfig file for super-admin.conf
[kubeconfig] Writing "super-admin.conf" kubeconfig file
I0625 00:52:23.303598     139 kubeconfig.go:112] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0625 00:52:23.472604     139 kubeconfig.go:112] creating kubeconfig file for controller-manager.conf
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0625 00:52:23.680414     139 kubeconfig.go:112] creating kubeconfig file for scheduler.conf
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
I0625 00:52:23.816255     139 local.go:65] [etcd] wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0625 00:52:23.816525     139 manifests.go:103] [control-plane] getting StaticPodSpecs
I0625 00:52:23.817755     139 certs.go:483] validating certificate period for CA certificate
I0625 00:52:23.818030     139 manifests.go:129] [control-plane] adding volume "ca-certs" for component "kube-apiserver"
I0625 00:52:23.818044     139 manifests.go:129] [control-plane] adding volume "etc-ca-certificates" for component "kube-apiserver"
I0625 00:52:23.818046     139 manifests.go:129] [control-plane] adding volume "k8s-certs" for component "kube-apiserver"
I0625 00:52:23.818048     139 manifests.go:129] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-apiserver"
I0625 00:52:23.818050     139 manifests.go:129] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0625 00:52:23.818464     139 manifests.go:158] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
I0625 00:52:23.818475     139 manifests.go:103] [control-plane] getting StaticPodSpecs
I0625 00:52:23.818583     139 manifests.go:129] [control-plane] adding volume "ca-certs" for component "kube-controller-manager"
I0625 00:52:23.818592     139 manifests.go:129] [control-plane] adding volume "etc-ca-certificates" for component "kube-controller-manager"
I0625 00:52:23.818595     139 manifests.go:129] [control-plane] adding volume "flexvolume-dir" for component "kube-controller-manager"
I0625 00:52:23.818597     139 manifests.go:129] [control-plane] adding volume "k8s-certs" for component "kube-controller-manager"
I0625 00:52:23.818598     139 manifests.go:129] [control-plane] adding volume "kubeconfig" for component "kube-controller-manager"
I0625 00:52:23.818600     139 manifests.go:129] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-controller-manager"
I0625 00:52:23.818602     139 manifests.go:129] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-controller-manager"
I0625 00:52:23.818981     139 manifests.go:158] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
I0625 00:52:23.818989     139 manifests.go:103] [control-plane] getting StaticPodSpecs
[control-plane] Creating static Pod manifest for "kube-scheduler"
I0625 00:52:23.819110     139 manifests.go:129] [control-plane] adding volume "kubeconfig" for component "kube-scheduler"
I0625 00:52:23.819465     139 manifests.go:158] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
I0625 00:52:23.819554     139 kubelet.go:68] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
I0625 00:52:24.210376     139 loader.go:395] Config loaded from file:  /etc/kubernetes/admin.conf
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet. This can take up to 4m0s
[kubelet-check] The kubelet is not healthy after 4m0.001385985s

Unfortunately, an error has occurred:
	The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' returned error: Get "http://localhost:10248/healthz": context deadline exceeded


This error is likely caused by:
	- The kubelet is not running
	- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
	- 'systemctl status kubelet'
	- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
	- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
	Once you have found the failing container, you can inspect its logs with:
	- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
couldn't initialize a Kubernetes cluster
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase.func1
	k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:110
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase
	k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:115
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
	k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:259
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
	k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
	k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
	k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:128
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/[email protected]/command.go:940
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/[email protected]/command.go:1068
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/[email protected]/command.go:992
k8s.io/kubernetes/cmd/kubeadm/app.Run
	k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:52
main.main
	k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25
runtime.main
	runtime/proc.go:271
runtime.goexit
	runtime/asm_amd64.s:1695
error execution phase wait-control-plane
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
	k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:260
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
	k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
	k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
	k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:128
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/[email protected]/command.go:940
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/[email protected]/command.go:1068
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/[email protected]/command.go:992
k8s.io/kubernetes/cmd/kubeadm/app.Run
	k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:52
main.main
	k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25
runtime.main
	runtime/proc.go:271
runtime.goexit
	runtime/asm_amd64.s:1695
make: *** [setup-kind] Error 1
```

Member Author:

I haven't seen this before, but have a few suggestions:

  1. Ensure Kubernetes is enabled in Docker and that Docker has enough resources allocated (screenshots omitted). I'm not sure all of these resources are necessary, but this is how mine is configured; it would be great if we could determine the minimum required. I suspect at least ~4 CPU and ~8GB RAM given the resource overrides in config.yaml.

  2. Try adding --verbosity 9 to the setup-kind command for more detailed log output: kind create cluster --name agship --verbosity 9

  3. Take a look at the Starship Docs for the primary source of truth and see if there's something I might've missed documenting.


```ts
test.before(async t => {
  const { deleteTestKeys, setupTestKeys, ...rest } = await commonSetup(t);
  // deleteTestKeys().catch();
```
Contributor:

question: Is this still needed?

Member Author:

In active development, if a test fails with an uncaught exception, the test.after block will not run and the key will not be deleted. This will cause the next test run to fail.

The only harm in leaving this is some extra unnecessary work in the happy path (attempting to delete keys that aren't there), so it may be best to include this.

I'm not sure the best pattern has been arrived at yet, so suggestions are welcome. It would be nice if we didn't need to rely on the keyring in the container and could use a new DirectSecp256k1HdWallet per run.
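
For reference, a fresh per-run signer with CosmJS could look like the sketch below. This only illustrates the idea mentioned above (the bech32 prefix and the funding step are assumptions); it is not something this PR does.

```ts
import { DirectSecp256k1HdWallet } from '@cosmjs/proto-signing';

const makeEphemeralWallet = async () => {
  // generate a throwaway 24-word mnemonic for this run (bech32 prefix assumed)
  const wallet = await DirectSecp256k1HdWallet.generate(24, { prefix: 'agoric' });
  const [{ address }] = await wallet.getAccounts();
  // the address would still need funding (faucet/provisioning) before it can
  // submit smart-wallet offers
  return { wallet, address };
};
```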

Member Author:

I've decided to keep this around, as it's harmless. Included a comment about why it's there.

```ts
const accounts = ['user1', 'user2'];

test.before(async t => {
  const { deleteTestKeys, setupTestKeys, ...rest } = await commonSetup(t);
```
Contributor:

question: Is it just me, or does commonSetup not seem to return deleteTestKeys or setupTestKeys?

Member Author:

They are part of makeKeyring, which is spread into commonSetup via keyring here.
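
A hypothetical sketch of that relationship; the shapes and names below are illustrative only, and the real helpers live in multichain-testing/tools:

```ts
// makeKeyring owns the key-management helpers...
const makeKeyring = async () => {
  const setupTestKeys = async (names: string[]) => {
    /* run `agd keys add <name>` in the container for each name */
  };
  const deleteTestKeys = async (names: string[]) => {
    /* run `agd keys delete <name>` in the container for each name */
  };
  return { setupTestKeys, deleteTestKeys };
};

// ...and commonSetup spreads them into its return value, which is why tests
// can destructure deleteTestKeys/setupTestKeys from commonSetup(t).
export const commonSetup = async (t: { log: (...args: unknown[]) => void }) => {
  const keyring = await makeKeyring();
  return { log: t.log, ...keyring };
};
```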

```ts
t.log(`${scenario.contractName} makeAccountInvitationMaker offer`);
const makeAccountofferId = `makeAccount-${Date.now()}`;

// FIXME we get payouts but not an offer result; it times out
```
Contributor:

question: Did you plan to address this in this PR or a separate PR?

Member Author:

This is the main blocker on the PR currently. Ideally, we can square this away here and read offer results from the Notifier / @agoric/casting instead of polling vstorage.

However, since we are still able to read offer results in vstorage, there is an opportunity to fix forward if we wish to land this faster.

Member Author:

Created #9643 so we can tackle this in a separate PR.

@0xpatrickdev force-pushed the 9042-stake-atom-e2e-rebased branch from 37f2bd3 to 3c1654c on July 1, 2024 23:23
@0xpatrickdev changed the title from "feat(multichain-testing): dynamic chain registry" to "feat(multichain-testing): stakeIca contract e2e test" on Jul 1, 2024
@0xpatrickdev force-pushed the 9042-stake-atom-e2e-rebased branch from ed73eb0 to 13aec9a on July 2, 2024 18:26
@0xpatrickdev marked this pull request as ready for review on July 2, 2024 18:39
@0xpatrickdev (Member Author):

I've updated the description so that this PR closes #8896. It was arguably already closed via #9462, but we now have a nicer setup that includes 1) a dynamic chain registry, and 2) tests with a real contract.

Reviewers - in addition to code review, please confirm you are able to execute the test suite locally.

@0xpatrickdev requested review from LuqiPan and turadg on July 2, 2024 18:42
@turadg (Member) left a comment:

Clear progress. I'm not certain it completes #8896, but I suppose any outstanding work is worth new tickets.

multichain-testing/Makefile (outdated, resolved)
multichain-testing/Makefile (outdated, resolved)
multichain-testing/README.md (outdated, resolved)
```diff
@@ -56,7 +60,8 @@
     "**/*.test.ts"
   ],
   "concurrency": 1,
-  "serial": true
+  "serial": true,
+  "timeout": "125s"
```
Member:

oddly specific :)

Member Author:

😅 the unbonding_period is 2min

```ts
import { makeAgdTools } from '../tools/agd-tools.js';
import { makeDeployBuilder } from '../tools/deploy.js';

async function main() {
```
Member:

I expect we'll DRY later

Member Author:

Agree for .js files in /tools, wrt #8963. For this file, I'm surprised to hear this feedback; can you point me to something similar?

Member:

Nothing similar, but I could see it being a regular CLI command. Though I didn't have any specific ideas in mind other than to ignore my DRY/factoring sniffer.

```sh
;;
*)
;;
-c | --config)
```
Member:

Member Author:

Not sure, doesn't happen for me, but I fixed up the diff

Member Author:

Only config.yaml changes (to single quotes) if we do:

```diff
diff --git a/package.json b/package.json
index 40bce1032..39be23458 100644
--- a/package.json
+++ b/package.json
@@ -53,7 +53,7 @@
     "create-agoric-cli": "node ./scripts/create-agoric-cli.cjs",
-    "format": "yarn prettier --write .github golang packages scripts",
+    "format": "yarn prettier --write .github golang packages scripts multichain-testing",
     "lint:format": "yarn prettier --check .github golang packages scripts",
```
This PR is already in the merge queue, but I think someone can pick up this change the next time they're contributing to this directory.

```diff
@@ -7,37 +7,41 @@ import { commonSetup } from '../support.js';

const test = anyTest as TestFn<Record<string, never>>;

test('create a wallet and get tokens', async t => {
const walletScenario = test.macro(async (t, scenario: string) => {
```
Member:

👍 macros
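
For readers unfamiliar with the pattern being praised here, an AVA macro parameterizes one test body over several scenarios. The sketch below uses placeholder scenario names and assertions rather than the PR's actual tests:

```ts
import anyTest, { type TestFn } from 'ava';

const test = anyTest as TestFn<Record<string, never>>;

// one parameterized test body, reused per chain scenario
const walletScenario = test.macro(async (t, scenario: string) => {
  t.log(`create a wallet and get tokens on ${scenario}`);
  t.pass(); // the real test funds the wallet and asserts balances
});

test('create a wallet and get tokens (osmosis)', walletScenario, 'osmosis');
test('create a wallet and get tokens (cosmoshub)', walletScenario, 'cosmoshub');
```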

```ts
),
{ stdio: ['ignore', 'pipe', 'ignore'] },
);
paths.forEach(path => {
```
Member:

`for (const path of paths) {`

I'm surprised the linter didn't complain.

```ts
execFileSync,
}: Pick<typeof import('child_process'), 'execFile' | 'execFileSync'>,
) => {
const bundleCache = makeNodeBundleCache(
```
Member:

I'm partial to:

`const bundleCache = unsafeMakeBundleCache('bundles');`

though I wonder why we even need any arguments.

@0xpatrickdev force-pushed the 9042-stake-atom-e2e-rebased branch from 13aec9a to fc52aef on July 3, 2024 20:47
- includes revise-chain-info.builder.js in CI to ensure agoricNames matches the Starship environment
@0xpatrickdev force-pushed the 9042-stake-atom-e2e-rebased branch from 79ca698 to 55f1896 on July 3, 2024 20:58
@0xpatrickdev added the automerge:rebase (Automatically rebase updates, then merge) label on Jul 3, 2024
@0xpatrickdev (Member Author) commented Jul 3, 2024

Here is a transcript of the ava test: https://github.com/Agoric/agoric-sdk/actions/runs/9784730113/job/27016269113#step:12:1

I wish the ava logs would appear at the same time as the console logs.

mergify bot merged commit bfa47d1 into master on Jul 3, 2024
86 checks passed
mergify bot deleted the 9042-stake-atom-e2e-rebased branch on July 3, 2024 21:51
Labels
automerge:rebase Automatically rebase updates, then merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Docker environment for testing cross-chain orchestration
4 participants