[CLI] Add txn stream to local testnet #10101

Merged 1 commit on Sep 25, 2023

Contributor

@banool banool commented Sep 18, 2023

Description

This PR adds a txn stream to the local testnet. It also makes a variety of other changes to improve the local testnet:

  • Enable the faucet by default.
  • Use tokio tasks rather than raw futures.
  • Add a more systematic way to run one service after another.
  • Add a more systematic way to do healthchecking after startup to ensure that everything is running.
  • Write logs from components that use tracing into specific directories in the testnet directory.
  • Add a new ready check server that exposes a unified endpoint for checking overall readiness of all local testnet components.
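
The "run one service after another, then health check" idea from the list above can be sketched roughly as follows. This is an illustrative, std-only sketch and not the code from this PR (which uses tokio tasks); the names `HealthChecker` and `wait_until_ready`, and the timeout and polling parameters, are assumptions made for the example.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical health checker: a service name plus a probe that returns
/// `true` once the service responds. Not the type used in the PR.
pub struct HealthChecker {
    pub name: String,
    pub probe: Box<dyn FnMut() -> bool>,
}

/// Polls each checker in order until its probe succeeds or `timeout`
/// elapses for that service. On failure, returns the name of the first
/// service that did not come up in time.
pub fn wait_until_ready(
    checkers: &mut [HealthChecker],
    timeout: Duration,
    poll_interval: Duration,
) -> Result<(), String> {
    for checker in checkers.iter_mut() {
        let deadline = Instant::now() + timeout;
        loop {
            if (checker.probe)() {
                println!("{} is running.", checker.name);
                break;
            }
            if Instant::now() >= deadline {
                return Err(checker.name.clone());
            }
            thread::sleep(poll_interval);
        }
    }
    Ok(())
}
```

A caller would build one checker per component (node API, faucet, transaction stream) and only print the final "all services are running" message once `wait_until_ready` returns `Ok(())`.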

Test Plan

Build a tools image locally:

docker/builder/docker-bake-rust-all.sh tools

I used the remote docker builder for this. I tagged the image as devnet so I could use it as the base network for the CLI E2E tests.

Run the CLI E2E tests where we use the local tools image:

cd crates/aptos/e2e
poetry install
poetry run python main.py -d --base-network devnet --test-cli-path ~/a/core/target/debug/aptos --image-repo-with-project aptos-core
2023-09-20 19:58:29,559 - INFO - All tests passed!

See that the txn stream comes up:

cargo run -p aptos -- node run-local-testnet --force-restart --assume-yes
Completed generating configuration:
        Log file: "/Users/dport/.aptos/testnet/validator.log"
        Test dir: "/Users/dport/.aptos/testnet"
        Aptos root key path: "/Users/dport/.aptos/testnet/mint.key"
        Waypoint: 0:f87e3e33347cfde3b61de0b531e6eeb3d5bbdb3204e21b87d3c4b58fe3c7e105
        ChainId: 4
        REST API endpoint: http://0.0.0.0:8080
        Metrics endpoint: http://0.0.0.0:9101/metrics
        Aptosnet fullnode network endpoint: /ip4/0.0.0.0/tcp/6181
        Indexer gRPC node stream endpoint: 0.0.0.0:50051

Aptos is running, press ctrl-c to exit

Readiness endpoint: http://0.0.0.0:8090/

Node API is starting, please wait...
Faucet is starting, please wait...
Transaction stream is starting, please wait...

Node API is running. Endpoint: http://0.0.0.0:8080/
Transaction stream is running. Endpoint: http://0.0.0.0:50051/
Faucet is running. Endpoint: http://0.0.0.0:8081/

All services are running, you can now use the local testnet!
grpcurl -plaintext -d '{ "starting_version": 0 }' 127.0.0.1:50051 aptos.indexer.v1.RawData/GetTransactions
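
The readiness endpoint printed above (port 8090) could also be polled from a script or test harness. The sketch below uses only the standard library. It assumes that the endpoint answers a plain HTTP GET on `/` and that a 200 status means every component is ready; neither detail is confirmed by this PR, so treat `readiness_status` and `parse_status_code` as hypothetical helpers.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;
use std::time::Duration;

/// Extracts the numeric status code from an HTTP/1.x status line,
/// e.g. "HTTP/1.1 200 OK" yields Some(200).
pub fn parse_status_code(status_line: &str) -> Option<u16> {
    let mut parts = status_line.split_whitespace();
    let version = parts.next()?;
    if !version.starts_with("HTTP/") {
        return None;
    }
    parts.next()?.parse().ok()
}

/// Sends a plain HTTP GET for `/` to `addr` (for example "127.0.0.1:8090")
/// and returns the status code of the response, if one could be parsed.
/// Assumption: the ready server speaks plain HTTP on this port.
pub fn readiness_status(addr: &str) -> std::io::Result<Option<u16>> {
    let mut stream = TcpStream::connect(addr)?;
    stream.set_read_timeout(Some(Duration::from_secs(5)))?;
    write!(
        stream,
        "GET / HTTP/1.1\r\nHost: {addr}\r\nConnection: close\r\n\r\n"
    )?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response.lines().next().and_then(parse_status_code))
}
```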

@banool banool force-pushed the banool/grpc-fullnode-local-testnet branch 2 times, most recently from b4cff83 to 6fe2002 Compare September 19, 2023 12:34
@banool banool changed the base branch from main to banool/indexer-tls-overhaul September 19, 2023 12:34
@banool banool force-pushed the banool/indexer-tls-overhaul branch 4 times, most recently from bf14d8c to 9c893ac Compare September 19, 2023 12:56
@banool banool changed the title banool/grpc fullnode local testnet [CLI] Add txn stream to local testnet Sep 19, 2023
@banool banool force-pushed the banool/indexer-tls-overhaul branch from f6093a9 to 38e10f9 Compare September 19, 2023 13:22
@banool banool force-pushed the banool/grpc-fullnode-local-testnet branch 2 times, most recently from 0a00cd8 to 625d487 Compare September 19, 2023 13:28
@banool banool force-pushed the banool/indexer-tls-overhaul branch from 9c893ac to f6093a9 Compare September 19, 2023 13:37
@banool banool force-pushed the banool/grpc-fullnode-local-testnet branch 4 times, most recently from ebb75e6 to ce7ae71 Compare September 19, 2023 16:30
@banool banool force-pushed the banool/indexer-tls-overhaul branch from 4dd7665 to e84e282 Compare September 19, 2023 16:34
@banool banool force-pushed the banool/grpc-fullnode-local-testnet branch from ce7ae71 to 6f61e93 Compare September 19, 2023 16:37
@banool banool force-pushed the banool/indexer-tls-overhaul branch 3 times, most recently from d488be2 to f6a48e6 Compare September 20, 2023 09:19
@banool banool force-pushed the banool/grpc-fullnode-local-testnet branch from 6f61e93 to e99ec6c Compare September 20, 2023 09:27
@banool banool force-pushed the banool/indexer-tls-overhaul branch from f6a48e6 to 0926fd9 Compare September 20, 2023 09:27
@banool banool force-pushed the banool/grpc-fullnode-local-testnet branch 4 times, most recently from 2fcecfa to 95f273d Compare September 20, 2023 18:11
@@ -0,0 +1,102 @@
// Copyright © Aptos Foundation
Contributor Author

None of this is strictly necessary; we could just write all our tracing output to a single file, but separating it by service makes it easier to understand.

Contributor

Why not use the already existing logging framework?

Contributor Author

@banool banool Sep 26, 2023

The txn stream services don't use the existing logger because we didn't want to bring in all those deps, so they use tracing instead.
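
To illustrate what "separating it by service" could look like on disk, here is a minimal std-only sketch. The layout `<test_dir>/<service>/tracing.log` and the helper names are assumptions for the example; the PR may use a different directory structure and it wires the files into tracing rather than opening them by hand.

```rust
use std::fs::{self, File, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical layout: `<test_dir>/<service>/tracing.log`.
pub fn service_log_path(test_dir: &Path, service: &str) -> PathBuf {
    test_dir.join(service).join("tracing.log")
}

/// Creates the per-service directory if needed and opens its log file in
/// append mode, so restarts add to the file instead of truncating it.
pub fn open_service_log(test_dir: &Path, service: &str) -> io::Result<File> {
    let path = service_log_path(test_dir, service);
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    OpenOptions::new().create(true).append(true).open(path)
}
```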

@banool banool marked this pull request as ready for review September 20, 2023 18:17
Comment on lines +66 to +83
let reflection_service = tonic_reflection::server::Builder::configure()
    // Note: It is critical that the file descriptor set is registered for every
    // file that the top level API proto depends on recursively. If you don't,
    // compilation will still succeed but reflection will fail at runtime.
    //
    // TODO: Add a test for this / something in build.rs, this is a big footgun.
    .register_encoded_file_descriptor_set(INDEXER_V1_FILE_DESCRIPTOR_SET)
    .register_encoded_file_descriptor_set(TRANSACTION_V1_TESTING_FILE_DESCRIPTOR_SET)
    .register_encoded_file_descriptor_set(UTIL_TIMESTAMP_FILE_DESCRIPTOR_SET)
    .build()
    .expect("Failed to build reflection service");

let reflection_service_clone = reflection_service.clone();

let tonic_server = Server::builder()
    .http2_keepalive_interval(Some(std::time::Duration::from_secs(60)))
    .http2_keepalive_timeout(Some(std::time::Duration::from_secs(5)))
    .add_service(reflection_service_clone);
Contributor

I'm confused lol, why do we need this now, and how come we didn't need it before?

Contributor Author

This is the new service that Larry added recently, so we didn't really have anything before. Without reflection you can't use grpcurl without "importing" all the proto files. It's just a nice UX win; we have this on some of our other gRPC servers.
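
The TODO in the snippet above asks for a test that catches a missing descriptor set. One way such a check could work is to walk the proto import graph from the top-level API file and report every reachable file that was not registered. The sketch below shows only that graph walk; the function name and the idea of feeding it an import map are assumptions, and extracting the real import map from the generated descriptors is left out.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Walks the import graph starting at `root` and returns every proto file
/// reachable from it (including `root`) that is not in `registered`,
/// in sorted order. An empty result means reflection has what it needs.
pub fn missing_descriptor_sets<'a>(
    root: &'a str,
    imports: &BTreeMap<&'a str, Vec<&'a str>>,
    registered: &BTreeSet<&'a str>,
) -> Vec<String> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![root];
    while let Some(file) = stack.pop() {
        // Skip files we have already visited (handles shared imports).
        if !seen.insert(file) {
            continue;
        }
        if let Some(deps) = imports.get(file) {
            stack.extend(deps.iter().copied());
        }
    }
    seen.into_iter()
        .filter(|file| !registered.contains(*file))
        .map(String::from)
        .collect()
}
```

With a chain of three files where only the first two are registered, the function returns the third, which is the kind of omission that the comment says compiles fine but fails at runtime.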

@banool banool enabled auto-merge (squash) September 25, 2023 19:10
@banool banool force-pushed the banool/grpc-fullnode-local-testnet branch from 1ab98dd to 37578a9 Compare September 25, 2023 19:48
@github-actions
Contributor

❌ Forge suite framework_upgrade failure on aptos-node-v1.5.1 ==> 37578a95778fad611a2dfcbc442baa62cceda217

Compatibility test results for aptos-node-v1.5.1 ==> 37578a95778fad611a2dfcbc442baa62cceda217 (PR)
Upgrade the nodes to version: 37578a95778fad611a2dfcbc442baa62cceda217
Test Failed: Waiting for pod aptos-node-0-validator-0

Stack backtrace:
   0: <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/result.rs:1961:27
      aptos_forge::backend::k8s::stateful_set::wait_stateful_set::{{closure}}
             at ./testsuite/forge/src/backend/k8s/stateful_set.rs:69:5
      aptos_forge::backend::k8s::stateful_set::scale_stateful_set_replicas::{{closure}}
             at ./testsuite/forge/src/backend/k8s/stateful_set.rs:243:6
   1: <aptos_forge::backend::k8s::node::K8sNode as aptos_forge::interface::node::Node>::start::{{closure}}
             at ./testsuite/forge/src/backend/k8s/node.rs:137:84
   2: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/future/future.rs:125:9
      <aptos_forge::backend::k8s::swarm::K8sSwarm as aptos_forge::interface::swarm::Swarm>::upgrade_validator::{{closure}}
             at ./testsuite/forge/src/backend/k8s/swarm.rs:262:27
   3: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/future/future.rs:125:9
      aptos_testcases::batch_update::{{closure}}
             at ./testsuite/testcases/src/lib.rs:51:60
      tokio::runtime::park::CachedParkThread::block_on::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.29.1/src/runtime/park.rs:283:63
      tokio::runtime::coop::with_budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.29.1/src/runtime/coop.rs:107:5
      tokio::runtime::coop::budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.29.1/src/runtime/coop.rs:73:5
      tokio::runtime::park::CachedParkThread::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.29.1/src/runtime/park.rs:283:31
   4: tokio::runtime::context::blocking::BlockingRegionGuard::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.29.1/src/runtime/context/blocking.rs:66:9
      tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.29.1/src/runtime/scheduler/multi_thread/mod.rs:87:13
      tokio::runtime::context::runtime::enter_runtime
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.29.1/src/runtime/context/runtime.rs:65:16
   5: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.29.1/src/runtime/scheduler/multi_thread/mod.rs:86:9
      tokio::runtime::runtime::Runtime::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.29.1/src/runtime/runtime.rs:313:50
   6: <aptos_testcases::framework_upgrade::FrameworkUpgrade as aptos_forge::interface::network::NetworkTest>::run
             at ./testsuite/testcases/src/framework_upgrade.rs:56:9
   7: aptos_forge::runner::Forge<F>::run::{{closure}}
             at ./testsuite/forge/src/runner.rs:598:42
      aptos_forge::runner::run_test
             at ./testsuite/forge/src/runner.rs:666:11
      aptos_forge::runner::Forge<F>::run
             at ./testsuite/forge/src/runner.rs:598:30
   8: forge::run_forge
             at ./testsuite/forge-cli/src/main.rs:410:11
      forge::main
             at ./testsuite/forge-cli/src/main.rs:336:21
   9: core::ops::function::FnOnce::call_once
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:250:5
      std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/sys_common/backtrace.rs:135:18
  10: std::rt::lang_start::{{closure}}
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:166:18
  11: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:284:13
      std::panicking::try::do_call
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:500:40
      std::panicking::try
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:464:19
      std::panic::catch_unwind
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panic.rs:142:14
      std::rt::lang_start_internal::{{closure}}
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:148:48
      std::panicking::try::do_call
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:500:40
      std::panicking::try
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:464:19
      std::panic::catch_unwind
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panic.rs:142:14
      std::rt::lang_start_internal
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:148:20
  12: main
  13: __libc_start_main
  14: _start
Trailing Log Lines:
      std::panicking::try
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:464:19
      std::panic::catch_unwind
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panic.rs:142:14
      std::rt::lang_start_internal
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:148:20
  12: main
  13: __libc_start_main
  14: _start


Swarm logs can be found here: See fgi output for more information.
{"level":"INFO","source":{"package":"aptos_forge","file":"testsuite/forge/src/backend/k8s/cluster_helper.rs:292"},"thread_name":"main","hostname":"forge-framework-upgrade-pr-10101-1695672437-aptos-node-v1-5-1","timestamp":"2023-09-25T20:17:03.448008Z","message":"Deleting namespace forge-framework-upgrade-pr-10101: Some(NamespaceStatus { conditions: None, phase: Some(\"Terminating\") })"}
{"level":"INFO","source":{"package":"aptos_forge","file":"testsuite/forge/src/backend/k8s/cluster_helper.rs:400"},"thread_name":"main","hostname":"forge-framework-upgrade-pr-10101-1695672437-aptos-node-v1-5-1","timestamp":"2023-09-25T20:17:03.448036Z","message":"aptos-node resources for Forge removed in namespace: forge-framework-upgrade-pr-10101"}

failures:
    framework_upgrade::framework-upgrade

test result: FAILED. 0 passed; 1 failed; 0 filtered out

Failed to run tests:
Tests Failed
Error: Tests Failed

Stack backtrace:
   0: aptos_forge::runner::Forge<F>::run
             at ./testsuite/forge/src/runner.rs:618:13
   1: forge::run_forge
             at ./testsuite/forge-cli/src/main.rs:410:11
      forge::main
             at ./testsuite/forge-cli/src/main.rs:336:21
   2: core::ops::function::FnOnce::call_once
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:250:5
      std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/sys_common/backtrace.rs:135:18
   3: std::rt::lang_start::{{closure}}
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:166:18
   4: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:284:13
      std::panicking::try::do_call
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:500:40
      std::panicking::try
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:464:19
      std::panic::catch_unwind
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panic.rs:142:14
      std::rt::lang_start_internal::{{closure}}
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:148:48
      std::panicking::try::do_call
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:500:40
      std::panicking::try
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:464:19
      std::panic::catch_unwind
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panic.rs:142:14
      std::rt::lang_start_internal
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/rt.rs:148:20
   5: main
   6: __libc_start_main
   7: _start
Debugging output:
NAME                                   READY   STATUS              RESTARTS   AGE
aptos-node-0-validator-0               0/1     ContainerCreating   0          5m8s
aptos-node-1-validator-0               1/1     Running             0          6m49s
aptos-node-2-validator-0               1/1     Running             0          6m49s
aptos-node-3-validator-0               1/1     Running             0          6m49s
genesis-aptos-genesis-eforge10-x2fgp   0/1     Completed           0          8m52s

@github-actions
Contributor

✅ Forge suite compat success on aptos-node-v1.6.2 ==> 37578a95778fad611a2dfcbc442baa62cceda217

Compatibility test results for aptos-node-v1.6.2 ==> 37578a95778fad611a2dfcbc442baa62cceda217 (PR)
1. Check liveness of validators at old version: aptos-node-v1.6.2
compatibility::simple-validator-upgrade::liveness-check : committed: 4201 txn/s, latency: 7175 ms, (p50: 6000 ms, p90: 9300 ms, p99: 27400 ms), latency samples: 180680
2. Upgrading first Validator to new version: 37578a95778fad611a2dfcbc442baa62cceda217
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1848 txn/s, latency: 15684 ms, (p50: 18500 ms, p90: 22000 ms, p99: 22300 ms), latency samples: 92400
3. Upgrading rest of first batch to new version: 37578a95778fad611a2dfcbc442baa62cceda217
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 1307 txn/s, submitted: 1332 txn/s, expired: 24 txn/s, latency: 15896 ms, (p50: 17700 ms, p90: 21900 ms, p99: 31100 ms), latency samples: 85002
4. upgrading second batch to new version: 37578a95778fad611a2dfcbc442baa62cceda217
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 3264 txn/s, latency: 9345 ms, (p50: 9400 ms, p90: 13200 ms, p99: 14100 ms), latency samples: 143620
5. check swarm health
Compatibility test for aptos-node-v1.6.2 ==> 37578a95778fad611a2dfcbc442baa62cceda217 passed
Test Ok

@github-actions
Contributor

✅ Forge suite realistic_env_max_load success on 37578a95778fad611a2dfcbc442baa62cceda217

two traffics test: inner traffic : committed: 6444 txn/s, latency: 6080 ms, (p50: 5800 ms, p90: 7500 ms, p99: 11600 ms), latency samples: 2790580
two traffics test : committed: 100 txn/s, latency: 2404 ms, (p50: 2300 ms, p90: 2800 ms, p99: 6000 ms), latency samples: 1780
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.234, avg: 0.212", "QsPosToProposal: max: 0.165, avg: 0.153", "ConsensusProposalToOrdered: max: 0.619, avg: 0.595", "ConsensusOrderedToCommit: max: 0.508, avg: 0.485", "ConsensusProposalToCommit: max: 1.124, avg: 1.080"]
Max round gap was 1 [limit 4] at version 1438472. Max no progress secs was 3.42073 [limit 10] at version 1438472.
Test Ok

@banool banool merged commit 9aa41b0 into main Sep 25, 2023
@banool banool deleted the banool/grpc-fullnode-local-testnet branch September 25, 2023 20:21
Comment on lines +100 to +104
/// This does nothing, we already run a faucet by default. We only keep this here
/// for backwards compatibility with tests. We will remove this once the commit
/// that added --no-faucet makes its way to the testnet branch.
#[clap(long, hide = true)]
with_faucet: bool,
Contributor

We will need to share this broadly before you fully remove this flag, so I suspect we will need to keep it for a while before we remove it.

Comment on lines -215 to -218
println!(
    "Faucet is running. Faucet endpoint: http://{}:{}",
    self.server_config.listen_address, self.server_config.listen_port
);
Contributor

So, this no longer prints out if you're running a faucet on your own?

Contributor Author

It does, I just moved this into the healthchecking logic of the CLI.

Comment on lines +10 to +12
### Updated
- The `--with-faucet` flag has been removed from `aptos node run-local-testnet`, we now run a faucet by default. To disable the faucet use the `--no-faucet` flag.
- When using `aptos node run-local-testnet` we now expose a transaction stream. Learn more about the transaction stream service here: https://aptos.dev/indexer/txn-stream/. Opt out of this with `--no-txn-stream`.
Contributor

Please note that this is a breaking change, and note the new default ports that will also be used.

Say someone was using this for CI, and then it starts failing because there's an extra port in use.

Contributor Author

Good point. Odds are low this is breaking for anyone but since it could be, I'll mention it.
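
For anyone worried about the CI scenario raised here, a pre-flight check for port conflicts is one option. The sketch below tries to bind each port on localhost and reports the ones that fail. The port numbers come from the startup output in the test plan; the constant and function names are hypothetical, and a bind failure can have causes other than the port being taken (for example, missing permissions).

```rust
use std::net::TcpListener;

/// Default ports taken from the startup output shown in the test plan.
pub const DEFAULT_PORTS: [(&str, u16); 4] = [
    ("Node API", 8080),
    ("Faucet", 8081),
    ("Readiness endpoint", 8090),
    ("Transaction stream", 50051),
];

/// Returns the entries whose port cannot currently be bound on 127.0.0.1.
/// Each probe listener is dropped immediately, so the check itself does
/// not hold any port open.
pub fn ports_in_use<'a>(ports: &[(&'a str, u16)]) -> Vec<(&'a str, u16)> {
    ports
        .iter()
        .copied()
        .filter(|&(_, port)| TcpListener::bind(("127.0.0.1", port)).is_err())
        .collect()
}
```

Running `ports_in_use(&DEFAULT_PORTS)` before starting the local testnet gives a quick list of services that would likely fail to start.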

Comment on lines +110 to +116
/// Do not run a transaction stream service alongside the node.
///
/// Note: In reality this is not the same as running a Transaction Stream Service,
/// it is just using the stream from the node, but in practice this distinction
/// shouldn't matter.
#[clap(long)]
no_txn_stream: bool,
Contributor

How much overhead does this create?

Mostly concerned since we've had to tune the local testnet several times, it can be a lot of overhead on local machines.

Contributor Author

I ran some tests and found the CPU overhead to be negligible. If you don't connect to the stream then this doesn't really do anything. I think maybe it enables some extra tracking in storage for tables (which will be moved out at some point).

Comment on lines +18 to +22
#[derive(Debug, Clone, Parser)]
pub struct ReadyServerConfig {
    #[clap(long, default_value_t = 8090)]
    pub ready_server_listen_port: u16,
}
Contributor

So, there's now yet another service just to check if the other services are running?

Could we shove this into an existing port?

So, now it looks like the local testnet node opens probably like 10 ports, between the node, the faucet, the grpc endpoint, the API, and the ready server.

Contributor Author

Why shove this into an existing service when it doesn't belong there? I don't think opening a bunch of ports is a big deal.
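
For context on what a separate ready server buys you, the core of a unified readiness endpoint is a small aggregation step: collect one boolean per component and report ready only when all of them are up. The sketch below shows that step in isolation. The `ReadinessSummary` type and the 200/503 mapping are assumptions for illustration, not the response format implemented in this PR.

```rust
use std::collections::BTreeMap;

/// Hypothetical summary a ready server could compute from per-service checks.
#[derive(Debug, PartialEq)]
pub struct ReadinessSummary {
    pub ready: Vec<String>,
    pub not_ready: Vec<String>,
}

impl ReadinessSummary {
    /// Overall readiness: true only if no service is still starting.
    pub fn all_ready(&self) -> bool {
        self.not_ready.is_empty()
    }

    /// Assumed mapping to an HTTP status: 200 when everything is up,
    /// 503 otherwise.
    pub fn http_status(&self) -> u16 {
        if self.all_ready() {
            200
        } else {
            503
        }
    }
}

/// Splits services into ready and not-ready lists, sorted by name.
pub fn summarize(statuses: &BTreeMap<String, bool>) -> ReadinessSummary {
    let mut summary = ReadinessSummary {
        ready: Vec::new(),
        not_ready: Vec::new(),
    };
    for (name, is_up) in statuses {
        if *is_up {
            summary.ready.push(name.clone());
        } else {
            summary.not_ready.push(name.clone());
        }
    }
    summary
}
```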

Comment on lines +20 to +21
#[clap(long, default_value_t = 8090)]
pub ready_server_listen_port: u16,
Contributor

This needs a doc comment about what the ready server is

banool added a commit that referenced this pull request Sep 26, 2023
banool added a commit that referenced this pull request Sep 26, 2023
Poytr1 pushed a commit to sentioxyz/aptos-core that referenced this pull request Oct 4, 2023
Poytr1 added a commit to sentioxyz/aptos-core that referenced this pull request Oct 4, 2023
* Update system-integrators-guide.md (aptos-labs#10087)

The old url file doesn't exist anymore. must be changed to the new one.

* [dashboards] sync grafana dashboards

* [typo*] (aptos-labs#9912)

* [typo*]Update delegation-pool-operations.md

* [typo*]Update run-a-fullnode-on-gcp.md

* [typo*]Update run-a-fullnode-on-gcp.md

* [typo*]Update glossary.md

* [Spec] Ensures for stake.move (aptos-labs#9700)

* 1

* cannot finish hp

* remove some wrong statements

* hp1-3

* rewrite hp2

* rewrite hp2 again

* hp1-3

* init

* fix

* fix ensures

* fix ensure in the update_sat

* fix comment

* fix md

* fix new comment

* update comment

* fix indent

* fix timeout

* fix timeout

* fix timeout

---------

Co-authored-by: chan-bing <[email protected]>

* Update fullnode-source-code-or-docker.md (aptos-labs#9914)

* [typo*]Update index.md (aptos-labs#9913)

* [typo*]Update index.md

* Update index.md

---------

Co-authored-by: Christian Sahar <[email protected]>

* [link*Update fullnode-source-code-or-docker.md (aptos-labs#9978)

* Update aptos-bitvec/src/lib.rs (aptos-labs#10045)

There is a mistake in the comment

* Update delegation-pool-operations.md (aptos-labs#10171)

remove deprecated function

* trivial: fix rocksdb property reporter

* forge: ability to override resource ask in Rust

* [CLI] Add txn stream to local testnet (aptos-labs#10101)

* [dashboards] sync grafana dashboards

* [TS SDK V2] Add `General` api class (aptos-labs#10185)

* add general api queries

* add general api queries

* [move] make vm clonable

* Pass down resolver to MoveVmExt::new()

In preparation for caching a warm VM

* Move VM warm up into adapter

* WarmVmCache

* Fix circular dependency

* [object] refactor burn/unburn (aptos-labs#9785)

* Fix Issue 9717 by breaking critical edges in CFG in move-compiler (v1) (aptos-labs#10064)

Fix issue aptos-labs#9717 by breaking critical edges in the CFG in move-compiler (v1).

While issue 9717 mentions "inline", this is a red herring as inlining a vector loop just makes it more likely to have a reference parameter on the stack, which leads to a failure to correctly drop dead references in move-compiler/src/cfgir/liveness/mod.rs. I've left the initial test case bug_9717.move and added bug_9717_looponly.move that just has a loop illustrating the problem, along with several variants on the loop (one of them, break2, also exhibiting the problem).

Anyway, the failure to properly place the reference drop happens when the constructed CFG has critical edges (edges from a node with multiple outgoing edges to a node with multiple ingoing edges). Such a situation is well-known in compiler literature to make it difficult to place instructions precisely based on certain analyses.

Fortunately, this seems to be easily fixed by adding a small pass to add a node on each such critical edge, so that the drop can be placed properly even in the case of a break.

Unfortunately, later passes can't deal with the resulting deep expression trees and chains of direct jumps that result in some cases, so I had to fix hlir/translate.rs to reduce stack depth for a given expression, and cfgir/optimize/inline_blocks.rs to properly remove the unneeded jumps.

New tests also reveal misplaced warnings/errors about unused variables in the presence of inlining, so that was also fixed.

* [Doc] Update build e2e dapp docs (aptos-labs#10165)

* Update build e2e dapp docs

* update js -> jsx

* update

* Update developer-docs-site/docs/tutorials/build-e2e-dapp/4-fetch-data-from-chain.md

* f

* f

---------

Co-authored-by: David Wolinsky <[email protected]>

* implement memoize high order function (aptos-labs#10110)

* [Executor] Metadata and exists support (in Block-STM/executor) (aptos-labs#10170)

* [move-vm] Refactor Loader (aptos-labs#9320)

* Rename variant

* [move-vm] Store ability in runtime type

* fixup! [move-vm] Store ability in runtime type

* fixup! fixup! [move-vm] Store ability in runtime type

* fixup! fixup! fixup! [move-vm] Store ability in runtime type

* [move_vm] Split loader into multiple files

* [move_vm] Cache struct type for modules

* [move_vm] Move depth cache to type cache

* [move-vm] Use name to replace type index

* [move-vm] Remove function index

* [move-vm] Inline functions

* [move-vm] Remove cached index from definition

* [move-vm] Check cross module linking before loading

* [move-vm] Remove global struct cache

* [move-vm] Inline struct name to avoid excessive memory allocation

* [move-vm] Split function in loader

* [move-vm] Split out module cache

* [e2e-test] Add randomized test for loader

* Fix Lint

* [move-vm] Cache signature resolution

* [move-vm] Removed unneeded signature token

* [move-vm] Arc-ed type argument

* Fix Zekun's comments

* fixup! [e2e-test] Add randomized test for loader

* add comments for the test strategy

* Rename struct name

* More renaming

* Addressing more comments

* [consensus] Dedicated channel for proposal buffering and batch process

* revert 9ed1da8 (aptos-labs#10256)

* update move run to include more params (aptos-labs#9932)

* [CLI] Update CLI binary doc site to include openssl3 instruction (aptos-labs#9964)

* Update CLI binary doc site to include openssl3 instruction

* Update developer-docs-site/docs/tools/aptos-cli/install-cli/download-cli-binaries.md

Co-authored-by: Christian Sahar <[email protected]>

* Update developer-docs-site/docs/tools/aptos-cli/install-cli/download-cli-binaries.md

Co-authored-by: Christian Sahar <[email protected]>

---------

Co-authored-by: David Wolinsky <[email protected]>
Co-authored-by: Christian Sahar <[email protected]>

* [dag] split notifier into Order and Proof Notifier

[dag] additional ledger info verification checks

[dag] separate out highest committed round provider

[dag] introduce a ledger info provider trait

* [CLI] Fix faucet component of local testnet without --force-restart (aptos-labs#10214)

* Update Docker images (aptos-labs#10208)

Co-authored-by: gedigi <[email protected]>

* [ts-sdk-v2] Add .npmrc to ensure pnpm lockfile is consistent (aptos-labs#10251)

* Update dependabot.yml to disable version update PRs (aptos-labs#10249)

* CLI version bump to 2.1.1 (aptos-labs#10272)

* Get Drand Move example to compile (aptos-labs#10196)

* change names of drand and veiledcoin

* drand compiles

* drand tests compile

* remove diff.txt

* add retrieve lottery winner function

* Update CLI dry run text (aptos-labs#10273)

* add partial transaction queries (aptos-labs#10275)

* [TSSDKv2][1/n] add Ed25519 classes (aptos-labs#10157)

* add PrivateKey, PublicKey, and Signature classes

* Add helper.ts and fixes based on comments

* add asymmetric_crypto file which includes all crypto base classes as abstract. And have all concrete crypto classes to extend from based abstract classes

* Update crypto classes name to include Ed25519 prefix

* [TS SDK V2] Add Indexer account queries (aptos-labs#10216)

* implement account indexer queries

* add indexer account api queries

* address feedback

* [TSSDKv2][3/n] Account and AuthenticationKey Classes (aptos-labs#10210)

* Add helper.ts and fixes based on comments

* add asymmetric_crypto file, which defines all crypto base classes as abstract, and have all concrete crypto classes extend those abstract base classes

* Update crypto classes name to include Ed25519 prefix

* fixes based on comments

* Add account and auth key classes

* update Account class to accept abstract PublicKey and PrivateKey

* update methods' comment

* [ts-sdk-v2] Add getTransactions to the sdk-v2 transaction API (aptos-labs#10288)

* Add getTransactions to the sdkv2 transaction API

* Update ecosystem/typescript/sdk_v2/src/internal/transaction.ts

Co-authored-by: Maayan <[email protected]>

* update after merge

---------

Co-authored-by: Greg Nazario <[email protected]>
Co-authored-by: Maayan <[email protected]>

* [aptos-stdlib] Cleanup error codes for divide by 0

* [marketplace-example] Fix royalties edge conditions

There were two bugs with royalties and listings.  One was that in
v1, royalties could have a denominator 0, or be greater than 100%
in legacy NFTs.  This means that it's possible that payouts don't
work correctly, or take more than expected.  Now, royalties are
bounded to 0-100%.

The second issue was that commission was taken after royalties,
but didn't consider that royalties could be 100%.  Now, royalties
are taken first, and commission is taken out of the remainder.
This does mean the marketplace may not have any commission if the
royalties are set to 100%.
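
The payout order described above can be sketched as follows. This is a minimal illustration in Python, not the Move code from the marketplace example. The function name `split_payment` and its parameters are hypothetical, and two behaviors are assumptions: a zero royalty denominator is treated as no royalty, and the commission is computed on the price and then capped by whatever remains after royalties.

```python
def split_payment(price, royalty_num, royalty_den, commission_num, commission_den):
    """Return (royalty, commission, seller_proceeds) for a sale at `price`.

    Hypothetical helper; all amounts are non-negative integers.
    """
    # Assumption: a zero denominator means "no royalty" instead of an abort.
    if royalty_den == 0:
        royalty = 0
    else:
        # Clamp so the royalty can never exceed 100% of the price.
        royalty = min(price * royalty_num // royalty_den, price)

    # Royalties are taken first; the commission comes out of the remainder.
    remainder = price - royalty
    if commission_den == 0:
        commission = 0
    else:
        # Assumption: commission is based on the price, capped by the remainder.
        commission = min(price * commission_num // commission_den, remainder)

    return royalty, commission, remainder - commission
```

With a 100% royalty the remainder is zero, so the commission is zero as well, which matches the caveat in the commit message.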

* [Spec] Fix spec (aptos-labs#10215)

* fix spec

* staking_contract spec

* [Rust] Upgrade to Rust version 1.72.1

* [Forge][Chaos] remove jitter, make inter-region BW 300 Mbps (aptos-labs#10277)

### Description

The changes are based on observations while measuring network performance and reading more into the dataset used.
* Jitter: My understanding is that re-ordering of packets should be pretty rare in the real world, while the previous jitter configs would introduce re-ordering quite frequently. Unless we have a strong belief that jitter is present in our networks, we shouldn't mess with this.
* Inter-region bitrate: The numbers this was based on are iperf with a single TCP stream. The results correlate strongly with RTT, which suggests that RTT is the limiting factor, so there's no real reason to constrain BW itself. For now, 300 Mbps is as fast as our network stack will go for 100+ ms RTT.
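
One common way to reason about the RTT dependence noted above is the window-per-round-trip bound for a single TCP stream. The sketch below is illustrative only: the function name is hypothetical and the window size in the usage note is an assumed value, not one measured in this work.

```python
def max_single_stream_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on one TCP stream's throughput in Mbps.

    A sender can have at most one window of data in flight per round trip,
    so throughput <= window / RTT.
    """
    bits_per_round_trip = window_bytes * 8
    round_trips_per_second = 1000.0 / rtt_ms
    return bits_per_round_trip * round_trips_per_second / 1_000_000
```

Under this model, an assumed 4 MiB window gives a bound of roughly 336 Mbps at 100 ms RTT and half that at 200 ms, so the achievable rate falls as RTT grows even when the link itself is unconstrained.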

* [Forge][PFN] remove epoch changes from some tests (aptos-labs#10278)

### Description

This is an effort to reduce noise in the PFN tests. Epoch changes are an obvious source of noise, although it's unclear how much they contribute to the noise difference between runs.

To still have coverage of epoch changes, the tests without chaos will still do 5 minute epoch changes.

### Test Plan

Run ad-hoc forge run, observe no epoch changes.

* [Forge][netbench] split into large and small messages for two region test (aptos-labs#10283)

### Description

We found that the throughput for large and small messages is very different at large latencies. We updated the test to run with large messages and split out a separate small-messages test. The large messages are useful for sanity checking the network setup; the small messages are something that we can hopefully improve upon.

* [crypto] add secp256k1 support to aptos-crypto

Pretty straightforward, but I think our crypto APIs are really not
great. The many traits should really be compressed down into a
single trait for PublicKey, PrivateKey, and Signature. This would make
maintaining and adding new libraries so much easier.

* [types] Add support for secp256k1 authenticator

with this we can now send secp256k1 signed transactions to the
blockchain...

I'm going to do some code refactoring in authenticators and transactions
before resuming the end-to-end testing and the feature gating of this feature.

* [openapi] update openapi with secp256k1

* [types] remove AuthenticationKeyPreimage

This was adding extra code and adding complexity to what is already a
complex space. If we aren't going to use the preimage, we have no need
to write the code.

We need to be more diligent about removing this type of unnecessary
code from the codebase, because it really impairs our ability to move
fast.

* [types] remove prefix and rename derived_address on AuthenticationKey

* AuthenticationKey and Address are 1:1, so all this code is legacy
  based upon some weird goal of trying to compress the account address
  into an insecure size back in Libra. Even the authors of this code
  have since moved to 32-bytes in their own blockchain.
* prefix is never used and removed.
* derived_address -> account_address because there's no derivation it is
  literally 1:1

* [api] end to end test for secp256k1 ecdsa

* [features] add secp256k1 ecdsa

* [Aptos Data Poller] Improve peer polling logic.

* [Aptos Data Client] Update tests for new poller.

* [exp] make it work for any upstream repo (aptos-labs#10016)

* [indexer-grpc] k6 loadtest (aptos-labs#9493)

* [Aptos Data Client] Improve selection for optimistic fetch and
subscriptions.

* [Aptos Data Client] Add new tests for peer selection logic.

* Make InMemoryStateCalculatorV2 work with state sync chunks. (aptos-labs#10263)

* [release-builder] Increase lockup before executing proposals

Increase the lockup before executing proposals

The release flow has started failing consistently because the lockup needs to be increased

This ensures that the lockup is sufficient before executing transactions

Test Plan: not sure how to test other than by running against testnet???

Please advise

* [Network] Replace RwLock with ArcSwap for trusted peers.

* [Network] Reduce lock contention for peer metadata using cache.

* replay-verify.yaml not reference workflows by @main

* replay-verify: not cancel other sub-jobs on first failure

* recalibrate single node benchmark for perf regression aptos-labs#10298

* [move-model] fix internal assertion violation in definition analysis (aptos-labs#10292)

Co-authored-by: Aalok Thakkar <[email protected]>

* [ts-sdk-v2] Rename endpoint to path in client

* [ts-sdk-v2] Rename originMethod to name for API requests

* [ts-sdk-v2] Simplify fullnode get requests

* [ts-sdk-v2] Make Post requests simpler

* [ts-sdk-v2] Update format and lint for SDK

This fixes most of the SDK lints except for the pieces around
the static functions for deserialize.

* [ts-sdk-v2] Fix broken tests

* [ts-sdk-v2] Replace "Generic error"

* [ts-sdk-v2] Add derivation path invalid test

* [ts-sdk-v2] Cleanup Account and crypto for code reuse

1. Allows deriving public key from private key
2. Detaches accounts from Single Ed25519
3. Adds some consistency to input naming

* [ts-sdk-v2] Unify authentication key creation

* [ts-sdk-v2] Add Authentication Key scheme enum

* [ts-sdk-v2] Add ability to derive authkeys from non-key schemes

* [ts-sdk-v2] Add docs to AuthenticationKey

* [ts-sdk-v2] Cleanup documentation on Ed25519

* [ts-sdk-v2] Add documentation to asymmetric crypto

* [ts-sdk-v2] Add docs, cleanup multi-ed25519

* [ts-sdk-v2] Move paginate with cursor to client

* [ts-sdk-v2] Cleanup unused lint ignores

* [ts-sdk-v2] Authentication key testing and message improvements

* [ts-sdk-v2] Rename some client inputs and add documentation

* [dag] epoch manager integration; dag is here

* [dashboards] sync grafana dashboards

* fix WarmVmCache

Natives are stateful so must be covered by the WarmVmId.

1. add TimedFeaturesBuilder and convert TimedFeatures to an array of
   booleans.
2. add SafeNativeBuilder::id_bytes() to fix the bug
3. rebuild vm (and hence all the natives) only when the builder id_bytes() change.
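
The idea in the three steps above, rebuilding only when the builder's identifying bytes change, can be sketched like this. This is a Python illustration and not the Rust implementation. The class `WarmVmCache` below, its `build_vm` callback, and the hashing of a configuration dict are all hypothetical stand-ins for the builder's `id_bytes()`.

```python
import hashlib


class WarmVmCache:
    """Cache a built VM and rebuild it only when the builder id changes."""

    def __init__(self, build_vm):
        self._build_vm = build_vm  # hypothetical factory: config -> vm
        self._cached_id = None
        self._cached_vm = None
        self.builds = 0  # counts rebuilds, for illustration only

    @staticmethod
    def _id_bytes(config: dict) -> bytes:
        # Assumption: every input that affects the natives is in `config`,
        # e.g. timed features represented as a tuple of booleans.
        return hashlib.sha256(repr(sorted(config.items())).encode()).digest()

    def get(self, config: dict):
        id_bytes = self._id_bytes(config)
        if id_bytes != self._cached_id:
            # The natives are stateful, so any change in the id forces a rebuild.
            self._cached_vm = self._build_vm(config)
            self._cached_id = id_bytes
            self.builds += 1
        return self._cached_vm
```

Calling `get` twice with the same configuration returns the same object; changing any field, such as one timed-feature flag, produces a fresh build.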

* use s5cmd for downloading files in replay-verify

The official awscli has been experiencing random crashes

* [e2e-tests] Fix compilation error (aptos-labs#10318)

* Drop frozen root after make checkpoint. (aptos-labs#10327)

* redistribute mainnet replay sub-job ranges

* [TS-SDK v2] Updating the `Deserializable` interface and making `Serializable` an abstract class (aptos-labs#10307)

* Removing export from Deserializable to facilitate using a static `deserialize` method, fixing error messages in unit tests for multi_ed25519, and changing Serializable to an abstract class that implements `bcsToBytes()`. Removed abstract deserialize from public/private key classes.

* Re-adding doc comments

* Adding `serialize` and `deserialize` and corresponding unit tests to the AccountAddress class

* Fixing multi_ed25519 error messages

* Updating doc comments for Deserializable to clarify what its purpose is.

* [move unit tests] Run extended checker as part of unit tests (aptos-labs#10309)

* [move unit tests] Run extended checker as part of unit tests

Closes aptos-labs#9251

This runs the extended checker as part of Aptos unit tests (either our own Rust integrated tests or from the CLI). It uses the same technique as we already used for native extensions specific to Aptos: a hook is defined where additional, move-model based validations can be run. This hook is then connected to the extended checker when running Aptos tests.

The implementation also optimizes the construction of the move model: if the model is already needed by abi generation (which is the default), it is not constructed a second time for the extended checker, both for the existing build step and the new test step. This should avoid one full additional compilation (source -> bytecode -> model run).

* Extended checks until now excluded test code, leading to wrong usage of entry functions and attributes marked as test-only. Because fixing this is a breaking change, this commit adds the behavior to check test code via a new CLI option `--check-test-code`. This flag should eventually become default behavior.

Also fixes some reviewer comments.

* implement forge links to axiom (aptos-labs#10330)

* [gas-calibration][simple] ignore terms with 0-coefficients (aptos-labs#9742)

* [indexer][api] update the metrics for api gateway consumption. (aptos-labs#10322)

* [dag] add various counters

* [TS-SDK v2] Adding `serializeVector` and `deserializeVector` to SDK v2 (aptos-labs#10347)

* Adding `serializeVector` and `deserializeVector` to serializer.ts and deserializer.ts as well as a unit test for each

* Adding documentation for each function

* Removing redundant line in doc comment

* Removing redundant line in deserializer doc comment too
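
For context, BCS encodes a vector as a ULEB128 length prefix followed by each element in order. The sketch below shows that layout in Python rather than in the SDK's TypeScript. The function names and the `(value, next_position)` convention for item deserializers are hypothetical and do not mirror the SDK's actual signatures.

```python
def uleb128(n: int) -> bytes:
    """Encode a non-negative integer as ULEB128 (7 bits per byte, low bits first)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def serialize_vector(items, serialize_item) -> bytes:
    """Length prefix, then every element serialized in order."""
    return uleb128(len(items)) + b"".join(serialize_item(x) for x in items)


def deserialize_vector(data: bytes, deserialize_item):
    """Inverse of serialize_vector; returns (items, bytes_consumed)."""
    length, shift, pos = 0, 0, 0
    while True:
        byte = data[pos]
        pos += 1
        length |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    items = []
    for _ in range(length):
        item, pos = deserialize_item(data, pos)
        items.append(item)
    return items, pos


def u64_to_bytes(x: int) -> bytes:
    # BCS integers are fixed-width little-endian.
    return x.to_bytes(8, "little")


def u64_from_bytes(data: bytes, pos: int):
    return int.from_bytes(data[pos:pos + 8], "little"), pos + 8
```

A round trip such as `deserialize_vector(serialize_vector([1, 2], u64_to_bytes), u64_from_bytes)` yields the original list together with the number of bytes consumed.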

* Allow using skip_index_and_usage on state sync code path. (aptos-labs#10303)

* [dag] add a few structured logging

* jin_fix_ed25519_derive_publickey (aptos-labs#10357)

* [ts-sdk-v2] Add MIME types, and convert input types from string to Enum (aptos-labs#10308)

* Remove expensive counters (aptos-labs#10188)

* [TS SDK V2] Port over transaction types (aptos-labs#10364)

* transaction types

* address comments

* [CLI] Restructure local testnet code (aptos-labs#10252)

* [release-builder] Fix increase lockup

The previous PR was converting the private key arg to string incorrectly...

ED25519 key type has a very silly to_string that returns debug output

I suppose this is to prevent dumping the key, but really i think it should just
not have a to_string and instead only have a debug method that does the same
thing, and a very explicit to hex method so that you dont accidentally dump these.

Test Plan: framework upgrade test succeeds, or at least doesnt fail on this
error (there is still some underlying flakiness in the forge test itself)

* [Sharded-Execution] Fix a race condition while fetching the state values on a shard from a remote stateview (aptos-labs#10320)

[Sharded-Execution] Fix a race condition while fetching the state values on a shard from a remote stateview

* [SDKv2] derive publickey unit test (aptos-labs#10358)

* jin_fix_ed25519_derive_publickey

* Add unit test for publicKey() derivation method

* [framework] Turn on test-only checking and fix errors (aptos-labs#10368)

This turns on running extended checks also on test-only code in the framework unit tests. A few errors discovered this way are fixed.

Changes to function declarations in this PR do not affect compatibility because only test-only functions are affected, and these are stripped before deployment.

Related to aptos-labs#10335, but more needs to be done to make this behavior the default. This cannot happen before the next framework release.

* Add CI to check for banned CLI deps (aptos-labs#10338)

* refactor: replace multiply_then_divide using math64::mul_div (aptos-labs#10047)

Co-authored-by: Kevin <[email protected]>

* [dag] hardening message verification

* [TS SDK V2] Port over account and transaction authenticator, signed transaction type (aptos-labs#10367)

* transaction types

* address comments

* authenticators

* previewnet flow in the benchmark (aptos-labs#10305)

* [ts sdk-v2] Mini get/post request refactor (aptos-labs#10380)

* [Sharded-Execution-GRPC] Add GRPC communication for sharded execution. (aptos-labs#10274)

[Sharded-Execution-GRPC] Add GRPC communication for sharded execution

In this commit we replace the existing socket based communication (that is message send and message recv) with GRPC.
Here we get the basic GRPC reliability.

More reliability and better performance to come in subsequent commits

* [forge] Fix fullnode override in forge (aptos-labs#10382)

### Description

Fixing a mistake made in a previous PR, that made fullnode configs not apply (and overwrite validator configs)

### Test Plan

Ran a fullnode test and manually checked the config that is logged.

* Update delegation-pool-operations.md (aptos-labs#10299)

reorganize page

* [indexer grpc] update the metrics for usage analysis. (aptos-labs#10383)

* Update staking-pool-operations.md (aptos-labs#10300)

* Update staking-pool-operations.md

reorganize page

* Update staking-pool-operations.md

add step for owner account

* Update staking-pool-operations.md

* Update staking-pool-operations.md

remove heading

---------

Co-authored-by: Christian Sahar <[email protected]>

* [Storage][Pruner] Set min_readable_version and metrics. (aptos-labs#10381)

* rename enum variants and transaction arguments (aptos-labs#10384)

* [indexer grpc] More fields to short connection metric (aptos-labs#10385)

* [Tutorials] Replacing the old "Your First NFT" tutorial entry function calls to use aptos_token.move and new indexer queries (aptos-labs#9424)

* Overwriting the old your-first-nft tutorial in the docs to replace the 0x3 contract calls with the aptos_token contract calls. Also added corresponding documentation tags to referenced indexer queries and the new token client calls.

* Changing simple-aptos-token.py to simple_aptos_token.py and updating the Makefile to include it as an example

* Changing `create_token` to `mint_token` in python tutorial and `createToken` in typescript to `mint`

* Updating typescript example to work correctly if indexer/fullnode chainId aren't in sync

* [Python SDK] Use AccountAddress.from_str_relaxed when parsing addresses from the node API

---------

Co-authored-by: rtmtree <[email protected]>
Co-authored-by: rustielin <[email protected]>
Co-authored-by: Vladislav ~ cryptomolot <[email protected]>
Co-authored-by: Zorrot Chen <[email protected]>
Co-authored-by: chan-bing <[email protected]>
Co-authored-by: Freesson ~ cryptomolot <[email protected]>
Co-authored-by: Christian Sahar <[email protected]>
Co-authored-by: Jiege <[email protected]>
Co-authored-by: michelle-aptos <[email protected]>
Co-authored-by: aldenhu <[email protected]>
Co-authored-by: Daniel Porteous (dport) <[email protected]>
Co-authored-by: Maayan <[email protected]>
Co-authored-by: Zekun Li <[email protected]>
Co-authored-by: Aaron <[email protected]>
Co-authored-by: Brian R. Murphy <[email protected]>
Co-authored-by: Oliver He <[email protected]>
Co-authored-by: David Wolinsky <[email protected]>
Co-authored-by: Rati Gelashvili <[email protected]>
Co-authored-by: runtianz <[email protected]>
Co-authored-by: Jin <[email protected]>
Co-authored-by: Balaji Arun <[email protected]>
Co-authored-by: Gerardo Di Giacomo <[email protected]>
Co-authored-by: gedigi <[email protected]>
Co-authored-by: Greg Nazario <[email protected]>
Co-authored-by: Michael Straka <[email protected]>
Co-authored-by: Teng Zhang <[email protected]>
Co-authored-by: Josh Lind <[email protected]>
Co-authored-by: Brian (Sunghoon) Cho <[email protected]>
Co-authored-by: Rustie Lin <[email protected]>
Co-authored-by: Guoteng Rao <[email protected]>
Co-authored-by: Perry Randall <[email protected]>
Co-authored-by: igor-aptos <[email protected]>
Co-authored-by: aalok-t <[email protected]>
Co-authored-by: Aalok Thakkar <[email protected]>
Co-authored-by: Matt <[email protected]>
Co-authored-by: Wolfgang Grieskamp <[email protected]>
Co-authored-by: Christian Theilemann <[email protected]>
Co-authored-by: Victor Gao <[email protected]>
Co-authored-by: larry-aptos <[email protected]>
Co-authored-by: Sital Kedia <[email protected]>
Co-authored-by: Manu Dhundi <[email protected]>
Co-authored-by: 0xbe1 <[email protected]>
Co-authored-by: Kevin <[email protected]>