Implement Agent service #744

Merged
merged 7 commits into from
Jun 15, 2023

XAMPPRocky
Collaborator

This PR adds a new service called "Agent". The goal of this service is to provide something better suited to relay setups by handling only configuration forwarding rather than also acting as a management server. It also runs a QCMP service that makes it pingable, so it can act as a pingable "beacon", as it were, in a given cluster.

Following this PR I intend to make a follow-up that removes the --relay flag from the management server (as it's not really needed anymore with agent), and moves the proxy's QCMP implementation to its own port, using the same task the agent uses, to bring those services in line with this one.

@XAMPPRocky
Collaborator Author

@markmandel For whenever you're back, I can't replicate this test failing locally, and I don't have permissions to retry the cloud build.

@markmandel
Member

If you ever want to retry one of your builds:

git commit --amend --date=now
git push --force-with-lease

Will restart the build with no changes to the code 👍🏻

But if you get a chance to fix the conflict, let's see if the error repeats. Might be a flake.

@markmandel
Member

Rebuilding, let's see what happens.

@markmandel
Member

Okay, looks like it's a pretty consistent failure.


     Running tests/qcmp.rs (target/build-image/debug/deps/qcmp-3ba9e4cd8ebd4580)

running 2 tests
test proxy_ping ... ok
thread 'agent_ping' panicked at 'called `Result::unwrap()` on an `Err` value: Elapsed(())', tests/qcmp.rs:61:10
stack backtrace:
   0: rust_begin_unwind
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:142:14
   2: core::result::unwrap_failed
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/result.rs:1785:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/result.rs:1107:23
   4: qcmp::ping::{{closure}}
             at ./tests/qcmp.rs:59:21
   5: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/future/mod.rs:91:19
   6: qcmp::agent_ping::{{closure}}
             at ./tests/qcmp.rs:47:20
   7: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/future/mod.rs:91:19
   8: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/future/future.rs:124:9
   9: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/future/future.rs:124:9
  10: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}::{{closure}}
             at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.28.2/src/runtime/scheduler/current_thread.rs:541:57
  11: tokio::runtime::coop::with_budget
             at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.28.2/src/runtime/coop.rs:107:5
  12: tokio::runtime::coop::budget
             at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.28.2/src/runtime/coop.rs:73:5
  13: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}
             at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.28.2/src/runtime/scheduler/current_thread.rs:541:25
  14: tokio::runtime::scheduler::current_thread::Context::enter
             at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.28.2/src/runtime/scheduler/current_thread.rs:350:19
  15: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}
             at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.28.2/src/runtime/scheduler/current_thread.rs:540:36
  16: tokio::runtime::scheduler::current_thread::CoreGuard::enter::{{closure}}
             at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.28.2/src/runtime/scheduler/current_thread.rs:615:57
  17: tokio::macros::scoped_tls::ScopedKey<T>::set
             at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.28.2/src/macros/scoped_tls.rs:61:9
  18: tokio::runtime::scheduler::current_thread::CoreGuard::enter
             at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.28.2/src/runtime/scheduler/current_thread.rs:615:27
  19: tokio::runtime::scheduler::current_thread::CoreGuard::block_on
             at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.28.2/src/runtime/scheduler/current_thread.rs:530:19
  20: tokio::runtime::scheduler::current_thread::CurrentThread::block_on
             at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.28.2/src/runtime/scheduler/current_thread.rs:154:24
  21: tokio::runtime::runtime::Runtime::block_on
             at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.28.2/src/runtime/runtime.rs:302:47
  22: qcmp::agent_ping
             at ./tests/qcmp.rs:47:5
  23: qcmp::agent_ping::{{closure}}
             at ./tests/qcmp.rs:38:7
  24: core::ops::function::FnOnce::call_once
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248:5
  25: core::ops::function::FnOnce::call_once
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
test agent_ping ... FAILED

failures:

failures:
    agent_ping

Next step will be for me to attempt to replicate locally
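
The `Elapsed(())` in the panic above is most likely the error from `tokio::time::timeout`: the test sent a ping and no reply arrived before the deadline. Below is a minimal std-only sketch of that failure mode, assuming a peer that receives but never answers. The helper names `ping_with_timeout` and `silent_peer` are hypothetical, and this is not the code in tests/qcmp.rs.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};
use std::time::Duration;

/// Sends one datagram to `target` and waits up to `wait` for any reply.
/// Hypothetical helper, not the `ping` function used by the real test.
fn ping_with_timeout(target: SocketAddr, wait: Duration) -> io::Result<usize> {
    let client = UdpSocket::bind("127.0.0.1:0")?;
    // Without a read timeout, `recv_from` would block forever on a silent peer.
    client.set_read_timeout(Some(wait))?;
    client.send_to(b"ping", target)?;
    let mut buf = [0u8; 64];
    let (len, _from) = client.recv_from(&mut buf)?;
    Ok(len)
}

/// Binds a socket that accepts datagrams but never replies, standing in for
/// a service whose responder task never started.
fn silent_peer() -> io::Result<UdpSocket> {
    UdpSocket::bind("127.0.0.1:0")
}
```

Calling `ping_with_timeout` against the address of a `silent_peer()` returns an `Err` once the wait expires. The error kind is platform dependent (`WouldBlock` or `TimedOut`). The caller sees only a timeout, which says nothing about why the other side stayed quiet.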

@markmandel
Member

Can replicate this locally as well.

❯ cargo test --package quilkin --test qcmp agent_ping
    Finished test [unoptimized + debuginfo] target(s) in 0.23s
     Running tests/qcmp.rs (target/debug/deps/qcmp-350391975c8925f8)

running 1 test
test agent_ping ... FAILED

failures:

---- agent_ping stdout ----
thread 'agent_ping' panicked at 'called `Result::unwrap()` on an `Err` value: Elapsed(())', tests/qcmp.rs:61:10
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    agent_ping

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 1 filtered out; finished in 1.00s

error: test failed, to rerun pass `-p quilkin --test qcmp`

I'll see if I can narrow down where this is failing, but what do you see on your end when you run the above command?

tests/qcmp.rs Outdated
};
let server_config = std::sync::Arc::new(quilkin::Config::default());
let (_, rx) = tokio::sync::watch::channel(());
tokio::spawn(async move { agent.run(server_config, rx).await });
Member

Suggested change
- tokio::spawn(async move { agent.run(server_config, rx).await });
+ tokio::spawn(async move { agent.run(server_config, rx).await.expect("Agent should run") });

This showed me the issue:

called `Result::unwrap()` on an `Err` value: Elapsed(())
thread 'agent_ping' panicked at 'Agones should run: Address already in use (os error 98)

Location:
    /home/mark/workspace/quilkin/src/cli/agent.rs:98:26', tests/qcmp.rs:50:14
stack backtrace:

Doesn't look like qcmp is starting.
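
The `.expect(...)` matters because the handle returned by `tokio::spawn` was being dropped, so the `Err` from `agent.run` was never inspected and the only visible symptom was the ping timing out. A rough std-only analogy follows. It uses `std::thread::spawn` in place of `tokio::spawn` and a hypothetical `run_service` in place of the agent; both are assumptions for illustration, not Quilkin code.

```rust
use std::io;
use std::net::UdpSocket;
use std::thread;

/// Hypothetical stand-in for a service's `run`: fails if the address is taken.
fn run_service(addr: &str) -> io::Result<()> {
    let _socket = UdpSocket::bind(addr)?;
    Ok(())
}

/// Dropping the handle discards the thread's `Result`, so a bind error is
/// never seen by the caller.
fn spawn_and_forget(addr: String) {
    let _ = thread::spawn(move || run_service(&addr));
}

/// Joining the handle hands the service's own `Result` back to the caller.
fn spawn_and_check(addr: String) -> io::Result<()> {
    let handle = thread::spawn(move || run_service(&addr));
    // `join` only fails if the thread panicked; the inner value is the
    // service's result.
    handle.join().expect("service thread panicked")
}
```

With a port that is already held, `spawn_and_forget` returns normally and reports nothing, while `spawn_and_check` surfaces the `AddrInUse` error. Panicking inside the spawned task, as the suggested change does, is another way to make the failure visible in test output.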

src/protocol.rs Outdated

tracing::info!(%port, "spawning IPv4 and IPv6 QCMP sockets");
for (address, socket) in [
(ipv4, tokio::net::UdpSocket::bind(ipv4).await?),
Member

Looks like this is the issue: binding both an IPv4 address and an IPv6 address to the same port fails, because the second bind finds the port already in use.
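
A small std-only sketch of the conflict, assuming plain `std::net::UdpSocket` rather than the tokio sockets in src/protocol.rs. The function name `bind_both` is hypothetical. It binds the first address on an OS-assigned port, then tries the second address on that same port while the first socket is still open.

```rust
use std::io;
use std::net::{IpAddr, UdpSocket};

/// Binds `first_ip` on an ephemeral port, then tries `second_ip` on the same
/// port while the first socket is still open.
fn bind_both(first_ip: IpAddr, second_ip: IpAddr) -> io::Result<(UdpSocket, UdpSocket)> {
    let first = UdpSocket::bind((first_ip, 0))?;
    let port = first.local_addr()?.port();
    // This is the step that can fail with "Address already in use".
    let second = UdpSocket::bind((second_ip, port))?;
    Ok((first, second))
}
```

Passing the same IPv4 address twice always fails with `AddrInUse`. Passing `Ipv4Addr::UNSPECIFIED` and `Ipv6Addr::UNSPECIFIED` is the case from this PR. The outcome there depends on the host: where IPv6 sockets are dual-stack by default (typically the case on Linux), the IPv6 wildcard also covers IPv4 and the second bind is expected to fail in the same way.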

If you want to do this, I expect you'll want to use / extend utils::net::socket_with_reuse so that you can reuse the port. You will want to listen to both sockets, but that should be handle-able with a select!
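
For the "listen to both sockets" part, here is a hedged std-only sketch. It uses one thread per socket feeding a single channel as a stand-in for `select!` over two tokio sockets; it does not use `select!` itself, and it does not cover the port-reuse side, which would need socket options that std does not expose. The name `fan_in` is hypothetical.

```rust
use std::net::{SocketAddr, UdpSocket};
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Forwards every datagram received on any of `sockets` into one channel,
/// so a single consumer can handle traffic from all of them.
fn fan_in(sockets: Vec<UdpSocket>) -> Receiver<(SocketAddr, Vec<u8>)> {
    let (tx, rx) = mpsc::channel();
    for socket in sockets {
        let tx = tx.clone();
        thread::spawn(move || {
            let mut buf = [0u8; 1500];
            // Runs until the socket errors or the receiver is dropped.
            while let Ok((len, from)) = socket.recv_from(&mut buf) {
                if tx.send((from, buf[..len].to_vec())).is_err() {
                    break;
                }
            }
        });
    }
    rx
}
```

The consumer then loops on `rx.recv()` and gets `(sender address, payload)` pairs regardless of which socket the datagram arrived on. An async version would poll both `recv_from` futures in one task instead of spawning threads.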

@markmandel
Member

markmandel commented Jun 9, 2023

I was messing about with this PR, and started taking a stab at a UdpSocket we could reuse that would work for both ipv6 and ipv4 local ports.

Let me know if you think it's a good idea, and I'll finish it off:

https://github.com/markmandel/quilkin/blob/a6d104ed89ea2d19be37d70ecc830d141142a8ea/src/utils/net.rs#L54-L87

Pretty sure it's what we'll need for #690

@markmandel
Member

I got it all working - filing a PR against this PR. I rebased against main to fix up conflicts so it's a bit messy, apologies for that - but it all works now.

@github-actions github-actions bot added size/l and removed size/m labels Jun 15, 2023
@quilkin-bot
Collaborator

Build Succeeded 🥳

Build Id: ce64722a-ae1d-46f1-9b3d-218861425ebe

The following development images have been built, and will exist for the next 30 days:

To build this version:

git fetch git@github.com:googleforgames/quilkin.git pull/744/head:pr_744 && git checkout pr_744
cargo build

@XAMPPRocky XAMPPRocky enabled auto-merge (squash) June 15, 2023 07:22
@XAMPPRocky XAMPPRocky merged commit 57244b4 into main Jun 15, 2023
@markmandel markmandel deleted the ep/agent-service branch June 15, 2023 16:45
@markmandel markmandel added kind/feature New feature or request area/user-experience Pertaining to developers trying to use Quilkin, e.g. cli interface, configuration, etc labels Jul 7, 2023