Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add arguments to quilkin run #574

Merged
merged 21 commits into from
Sep 22, 2022
Merged
10 changes: 0 additions & 10 deletions cloudbuild.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -46,16 +46,6 @@ steps:
args:
- build
id: build
# Run the built images for 5 seconds to make sure that the entrypoint and default config works out of the box
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replicated the issue as well with running the container from this PR. I'd recommend leaving in this test, as it also will replicate the breaking issue with the Agones integration tests, but output the logs within cloud build, and make it much easier to debug if there is a problem with the base image.

We may want to consider running the container in a few different common configuration options might also be useful in a few tests like the above, just to prove it all still runs even without any traffic flowing through.

- name: gcr.io/cloud-builders/docker
dir: ./build
entrypoint: bash
args:
- '-c'
- 'timeout --signal=INT --preserve-status 5s docker run --rm ${_REPOSITORY}quilkin:$(make version)'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than deleting, could make it (something like)

'timeout --signal=INT --preserve-status 5s docker run --rm ${_REPOSITORY}quilkin:$(make version)' run

To make it run in run mode.

Although one could argue the integration tests also test this (although this would provide a nicer error message if something basic was broken). I could go either way on this one.

id: test-quilkin-image
waitFor:
- build
- name: us-docker.pkg.dev/$PROJECT_ID/ci/make-docker
dir: ./build
args:
Expand Down
24 changes: 6 additions & 18 deletions docs/src/quickstart-netcat.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,31 +21,19 @@ This routes all UDP packets that `ncat` receives to the local `cat` process, whi

## 2. Start Quilkin

Next, let's configure Quilkin, with a static configuration that points at the udp echo service we just started.

Open a new terminal and copy the following to a file named `proxy.yaml`:

```yaml
version: v1alpha1
clusters:
default:
- endpoints:
- address: 127.0.0.1:8000
```

This configuration will start Quilkin on the default port of 7000, and it will redirect all incoming UDP traffic to
a single endpoint of 127.0.0.1, port 8000.

Let's start Quilkin with the above configuration:
Next let's configure Quilkin to with a static configuration that points at the
UDP echo service we just started.

```shell
quilkin --config proxy.yaml run
quilkin run --to localhost:8000
```

This configuration will start Quilkin on the default port of 7000, and it will
redirect all incoming UDP traffic to a single endpoint of 127.0.0.1, port 8000.

You should see an output like the following:

```shell
$ quilkin --config proxy.yaml run
{"msg":"Starting Quilkin","level":"INFO","ts":"2021-04-25T19:27:22.535174615-07:00","source":"run","version":"0.1.0-dev"}
{"msg":"Starting","level":"INFO","ts":"2021-04-25T19:27:22.535315827-07:00","source":"server::Server","port":7000}
{"msg":"Starting admin endpoint","level":"INFO","ts":"2021-04-25T19:27:22.535550572-07:00","source":"proxy::Admin","address":"[::]:9091"}
Expand Down
45 changes: 18 additions & 27 deletions docs/src/using.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,44 +5,35 @@ There are two choices for running Quilkin:
* Binary
* Container image

For each version there is both a release version, which is optimised for production usage, and a debug version that
has debug level logging enabled.

## Binary

The release binary can be downloaded from the
[Github releases page](https://github.com/googleforgames/quilkin/releases).

Quilkin needs to be run with an accompanying [configuration file](./proxy-configuration.md), like so:

`quilkin --config="configuration.yaml" run`

To view debug output, run the same command with the `quilkin-debug` binary.

You can also use the shorthand of `-c` instead of `--config` if you so desire.

## Container Image

For each release, there are both a release and debug container image built and hosted on Google Cloud
[Artifact Registry](https://cloud.google.com/artifact-registry) listed for
each [release](https://github.com/googleforgames/quilkin/releases).
For each release, there are both a release and debug container image built and
hosted on Google Cloud [Artifact Registry](https://cloud.google.com/artifact-registry)
listed for each [release](https://github.com/googleforgames/quilkin/releases).

The production release can be found under the tag:

`us-docker.pkg.dev/quilkin/release/quilkin:{version}`

Whereas, if you need debugging logging, use the following tag:

`us-docker.pkg.dev/quilkin/release/quilkin:{version}-debug`
```
us-docker.pkg.dev/quilkin/release/quilkin:{version}
```

Mount your [configuration file](./proxy-configuration.md) at `/etc/quilkin/quilkin.yaml` to configure the Quilkin
instance inside the container.
## Command-Line Interface

A [default configuration](https://github.com/googleforgames/quilkin/blob/main/image/quilkin.yaml)
is provided, such the container will start without a new configuration file, but it is configured to point to
`127.0.0.1:0` as a no-op configuration.
Quilkin provides a variety of different commands depending on your use-case.
The primary entrypoint of the process is `run`, which runs Quilkin as a reverse
UDP proxy. To see a basic usage of the command-line interface run through the
[netcat with Quilkin quickstart](./quickstart-netcat.md). For more advanced
usage, checkout the [`quilkin::Cli`] documentation.

What's next:
## Logging
By default Quilkin will log `INFO` level events, you can change this by setting
the `RUST_LOG` environment variable. See [`log` documentation][log-docs] for
more advanced usage.

* Run through the [netcat with Quilkin quickstart](./quickstart-netcat.md)
* Review our [example integration architectures](./integrations.md)
[log-docs]: https://docs.rs/env_logger/0.9.0/env_logger/#enabling-logging
[`quilkin::Cli`]: ../api/quilkin/struct.Cli.html
123 changes: 89 additions & 34 deletions src/cli.rs
Original file line number Diff line number Diff line change
Expand Up @@ -18,38 +18,50 @@ mod generate_config_schema;
mod manage;
mod run;

use std::path::PathBuf;
use std::{
path::{Path, PathBuf},
sync::Arc,
};

use tokio::{signal, sync::watch};

use crate::Config;

pub use self::{
generate_config_schema::GenerateConfigSchema,
manage::{Manage, Providers},
run::Run,
};

const VERSION: &str = env!("CARGO_PKG_VERSION");
const ETC_CONFIG_PATH: &str = "/etc/quilkin/quilkin.yaml";

/// The Command-Line Interface for Quilkin.
#[derive(clap::Parser)]
#[non_exhaustive]
pub struct Cli {
#[clap(
short,
long,
env = "QUILKIN_CONFIG",
default_value = "quilkin.yaml",
help = "The YAML configuration file."
)]
config: PathBuf,
#[clap(
short,
long,
env,
help = "Whether Quilkin will report any results to stdout/stderr."
)]
quiet: bool,
/// The path to the configuration file for the Quilkin instance.
#[clap(short, long, env = "QUILKIN_CONFIG", default_value = "quilkin.yaml")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually thinking of coming back around and fleshing out the help for the rest of the commands for a nice command line reference.

Curious if you didn't want a command line reference here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait - when I run the command, I see help everywhere,

Wait.... does clap take the help from the comment above it? If so, that's a super neat trick.

pub config: PathBuf,
/// The port to bind for the admin server
#[clap(long, env = "QUILKIN_ADMIN_ADDRESS")]
pub admin_address: Option<std::net::SocketAddr>,
/// Whether to spawn the admin server or not.
#[clap(long, env = "QUILKIN_NO_ADMIN")]
pub no_admin: bool,
/// Whether Quilkin will report any results to stdout/stderr.
#[clap(short, long, env)]
pub quiet: bool,
#[clap(subcommand)]
command: Commands,
pub command: Commands,
}

/// The various Quilkin commands.
#[derive(clap::Subcommand)]
enum Commands {
Run(run::Run),
GenerateConfigSchema(generate_config_schema::GenerateConfigSchema),
Manage(manage::Manage),
pub enum Commands {
Run(Run),
GenerateConfigSchema(GenerateConfigSchema),
Manage(Manage),
}

impl Cli {
Expand Down Expand Up @@ -78,25 +90,68 @@ impl Cli {
"Starting Quilkin"
);

let config = Arc::new(Self::read_config(self.config)?);
let _admin_task = if self.no_admin {
config.admin.remove();
None
} else {
if let Some(address) = self.admin_address {
config
.admin
.store(Arc::new(crate::config::Admin { address }));
}
Some(tokio::spawn(crate::admin::server(config.clone())))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure this is the offending line causing the double creation of the admin server, since both Proxy::run and ControlPlane::from_arc both create admin instances and run them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing some debugging (code), it looks like the Config that comes from this line:

markmandel@09de12a#diff-b2812f19576dd53d0c35b107a322f58a00fce6977f4b5976e1961853982af3ccR93

let config = Arc::new(Self::read_config(self.config)?);

Has a None admin field, but looking at the code I'm not sure why it's None when it should have the default admin port 🤔

You can see it in my debug logs:

{"timestamp":"2022-09-22T04:41:50.829061Z","level":"DEBUG","fields":{"message":"Main Config","no_admin":false,"config":"Config { admin: None, clusters: None, filters: None, management_servers: None, proxy: None, version: None, metrics: Metrics { active_clusters: GenericGauge { v: Value { desc: Desc { fq_name: \"cluster_active\", help: \"Number of currently active clusters.\", const_label_pairs: [], variable_labels: [], id: 6550223101314893, dim_hash: 9568000379080501117 }, val: AtomicI64 { inner: 0 }, val_type: Gauge, label_pairs: [] } }, active_endpoints: GenericGauge { v: Value { desc: Desc { fq_name: \"cluster_active_endpoints\", help: \"Number of currently active endpoints.\", const_label_pairs: [], variable_labels: [], id: 9175583387471564798, dim_hash: 12641526325508385022 }, val: AtomicI64 { inner: 0 }, val_type: Gauge, label_pairs: [] } } } }"},"target":"quilkin::cli"}

I'll keep digging into this tomorrow unless you get to it first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've found the issue. I'll send a PR to your PR today.

};

let (shutdown_tx, shutdown_rx) = watch::channel::<()>(());

#[cfg(target_os = "linux")]
let mut sig_term_fut = signal::unix::signal(signal::unix::SignalKind::terminate())?;

tokio::spawn(async move {
#[cfg(target_os = "linux")]
let sig_term = sig_term_fut.recv();
#[cfg(not(target_os = "linux"))]
let sig_term = std::future::pending();

let signal = tokio::select! {
_ = signal::ctrl_c() => "SIGINT",
_ = sig_term => "SIGTERM",
};

tracing::info!(%signal, "shutting down from signal");
// Don't unwrap in order to ensure that we execute
// any subsequent shutdown tasks.
shutdown_tx.send(()).ok();
});

match &self.command {
Commands::Run(runner) => runner.run(&self).await,
Commands::Manage(manager) => manager.manage(&self).await,
Commands::Run(runner) => runner.run(config, shutdown_rx.clone()).await,
Commands::Manage(manager) => manager.manage(config).await,
Commands::GenerateConfigSchema(generator) => generator.generate_config_schema(),
}
}

/// Searches for the configuration file, and panics if not found.
fn read_config(&self) -> Config {
std::fs::File::open(&self.config)
.or_else(|error| {
if cfg!(unix) {
std::fs::File::open("/etc/quilkin/quilkin.yaml")
} else {
Err(error)
fn read_config<A: AsRef<Path>>(path: A) -> Result<Config, eyre::Error> {
let path = path.as_ref();
let from_reader = |file| Config::from_reader(file).map_err(From::from);

match std::fs::File::open(path) {
Ok(file) => (from_reader)(file),
Err(error) if error.kind() == std::io::ErrorKind::NotFound => {
tracing::debug!(path=%path.display(), "provided path not found");
match cfg!(unix).then(|| std::fs::File::open(ETC_CONFIG_PATH)) {
Some(Ok(file)) => (from_reader)(file),
Some(Err(error)) if error.kind() == std::io::ErrorKind::NotFound => {
tracing::debug!(path=%path.display(), "/etc path not found");
Ok(Config::default())
}
Some(Err(error)) => Err(error.into()),
None => Ok(Config::default()),
}
})
.map_err(eyre::Error::from)
.and_then(|file| Config::from_reader(file).map_err(From::from))
.unwrap()
}
Err(error) => Err(error.into()),
}
}
}
21 changes: 8 additions & 13 deletions src/cli/generate_config_schema.rs
Original file line number Diff line number Diff line change
Expand Up @@ -14,21 +14,16 @@
* limitations under the License.
*/

/// Generates JSON schema files for known filters.
#[derive(clap::Args)]
pub struct GenerateConfigSchema {
#[clap(
short,
long,
default_value = ".",
help = "The directory to write configuration files."
)]
output_directory: std::path::PathBuf,
#[clap(
min_values = 1,
default_value = "all",
help = "A list of one or more filter IDs to generate or 'all' to generate all available filter schemas."
)]
filter_ids: Vec<String>,
/// The directory to write configuration files.
#[clap(short, long, default_value = ".")]
pub output_directory: std::path::PathBuf,
/// A list of one or more filter IDs to generate or 'all' to generate all
/// available filter schemas.
#[clap(min_values = 1, default_value = "all")]
pub filter_ids: Vec<String>,
}

impl GenerateConfigSchema {
Expand Down
40 changes: 20 additions & 20 deletions src/cli/manage.rs
Original file line number Diff line number Diff line change
Expand Up @@ -14,38 +14,38 @@
* limitations under the License.
*/

/// Runs Quilkin as a xDS management server, using `provider` as
/// a configuration source.
#[derive(clap::Args)]
pub struct Manage {
/// The configuration source for a management server.
#[clap(subcommand)]
provider: Providers,
pub provider: Providers,
}

/// The available xDS source providers.
#[derive(clap::Subcommand)]
enum Providers {
pub enum Providers {
/// Watches Agones' game server CRDs for `Allocated` game server endpoints,
/// and for a `ConfigMap` that specifies the filter configuration.
Agones {
#[clap(
short,
long,
default_value = "default",
help = "Namespace under which the proxies run."
)]
/// The namespace under which the configmap is stored.
#[clap(short, long, default_value = "default")]
config_namespace: String,
#[clap(
short,
long,
default_value = "default",
help = "Namespace under which the game servers run."
)]
/// The namespace under which the game servers run.
#[clap(short, long, default_value = "default")]
gameservers_namespace: String,
},

File,
/// Watches for changes to the file located at `path`.
File {
/// The path to the source config.
path: std::path::PathBuf,
},
}

impl Manage {
pub async fn manage(&self, cli: &crate::Cli) -> crate::Result<()> {
let config = std::sync::Arc::new(cli.read_config());

pub async fn manage(&self, config: std::sync::Arc<crate::Config>) -> crate::Result<()> {
let provider_task = match &self.provider {
Providers::Agones {
gameservers_namespace,
Expand All @@ -55,8 +55,8 @@ impl Manage {
config_namespace.clone(),
config.clone(),
)),
Providers::File => {
tokio::spawn(crate::config::watch::fs(config.clone(), cli.config.clone()))
Providers::File { path } => {
tokio::spawn(crate::config::watch::fs(config.clone(), path.clone()))
}
};

Expand Down
Loading