The pk secrets env command for meeting Development Environment Usecase #31

Closed
CMCDragonkai opened this issue Jul 9, 2021 · 28 comments · Fixed by #129
Labels
development Standard development r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management

Comments

@CMCDragonkai
Member

CMCDragonkai commented Jul 9, 2021

Specification

  1. The pk secrets env command should operate similarly to the Unix env command
  2. We can use pk secrets env to do double duty: inject environment variables into a new subcommand, or allow sourcing of environment variables into an existing subshell
  3. The task for the development environment usecase is specified in https://github.com/MatrixAI/Polykey-Desktop/issues/77
  4. The pk secrets env someprogram arg1 arg2 should do a proper process replacement when possible

Additional context

Copied excerpt:

Now when we want to start development on the project, we have 2 options:

  1. Open up a shell with Polykey sourcing the environment variables
  2. Use Polykey to inject environment variables into a sub-shell

Either result is the same, because you'll be running a shell with environment variables set so that your test runs of your software can inherit this environment configuration. Let's demonstrate both.

In the first case, we will reuse the source command and ask pk to construct the equivalent of the .env file, but entirely in-memory and only for this specific usage. We must also use process substitution <(...) so that the pk command output can be redirected into a temporary file descriptor that is read by the source command.

# one secret
source <(pk secrets env my-software-project:AWS_ACCESS_KEY_ID)

# multiple secrets
source <(pk secrets env my-software-project:AWS_ACCESS_KEY_ID my-software-project:GOOGLE_MAPS_API_KEY)

# globbing style (you can use globstar as well, this will only export immediate files)
source <(pk secrets env my-software-project:*)

# use the -e flag to export all variables
source <(pk secrets env -e my-software-project:*)
# use the -- to separate so you can export just one out of many
source <(pk secrets env -- my-software-project:AWS_ACCESS_KEY_ID -e my-software-project:GOOGLE_MAPS_API_KEY)
source <(pk secrets env -e -- my-software-project:AWS_ACCESS_KEY_ID my-software-project:GOOGLE_MAPS_API_KEY)

Now your shell has the relevant environment variables set by Polykey, and they will exist for as long as this current shell is alive.

In the second case, we will use the pk secrets env command to run a subshell. It can run any subprogram, but in the context of a development environment you usually want a shell.

# one secret
pk secrets env my-software-project:AWS_ACCESS_KEY_ID bash

# multiple secrets
pk secrets env my-software-project:AWS_ACCESS_KEY_ID my-software-project:GOOGLE_MAPS_API_KEY bash

# globbing style
pk secrets env my-software-project:* bash

# the -e flag has no effect when using it this way
# this is because the subprogram determines whether to export variable or not
# it usually exports the variable
pk secrets env -e my-software-project:* bash

Now you have a subshell with the environment variables configured, and it will automatically export them to child programs. When you have finished your development, you can just exit the shell, and the environment variables are gone!

Tasks

  1. Ensure that pk secrets env can be used to inject environment variables and run a subprocess
  2. Ensure that pk secrets env can be used to source environment variables and output something that can be used as a file to be sourced by a shell
  3. Deal with the fact that Node.js doesn't have an exec that can replace the current Node process (see "Ability to replace current Node process with another", nodejs/node#21664)
    • You can deal with this the way babel deals with it
@CMCDragonkai CMCDragonkai added the development Standard development label Jul 9, 2021
@CMCDragonkai
Member Author

Note that the kexec library has been abandoned but there's a recent PR: jprichardson/node-kexec#40 that updates it to be usable. We may need to bring in that source code into our own source tree to simplify usage if upstream doesn't update. This will introduce our own binary/C++ code here which can be useful to gain experience with writing native code in this form.

@scottmmorris

So would the kexec.cc file need to be rewritten in JS for it to be usable inside js-polykey?

@CMCDragonkai
Member Author

CMCDragonkai commented Jul 14, 2021 via email

@CMCDragonkai
Member Author

At this point, the main points of review are:

  • Globbing library
  • The usage of kexec
  • And UX of this library
  • Testing of env command

Testing of the env subcommand requires a separate process context because eventually it should be using a fork exec pattern. That means it should be replacing the current process context. Because it's running inside a testing process, that would break the tests.

So we will need to use a variant of cli utility to run a subprocess. But the exact way in which this is done is a bit complicated.

The other challenge with testing this, is that if you are testing a subshell like bash or sh, you'll need https://github.com/nodejitsu/nexpect.

However if you write a subprogram like this:

#!/usr/bin/env sh

echo "$INPUTVARIABLE"

During the testing:

pk secrets env vault1:INPUTVARIABLE ./myscript.sh

The value of $INPUTVARIABLE will be output to STDOUT. This allows you to test that the env variables are actually passed in.
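
The shape of such a test can be sketched without Polykey at all. In the sketch below, the standard env utility stands in for pk secrets env (an assumption made only so the example runs anywhere), and the variable value is made up:

```sh
# Stand-in for: pk secrets env vault1:INPUTVARIABLE ./myscript.sh
# Assumption: the standard `env` utility plays the role of pk here.
tmpdir=$(mktemp -d)

# The subprogram under test just echoes the injected variable.
cat > "$tmpdir/myscript.sh" <<'EOF'
#!/usr/bin/env sh
echo "$INPUTVARIABLE"
EOF

# Run it as a subprocess and capture STDOUT for the assertion.
actual=$(env INPUTVARIABLE=secret-value sh "$tmpdir/myscript.sh")
rm -rf "$tmpdir"

if [ "$actual" = "secret-value" ]; then
  echo "variable was passed to the subprogram"
else
  echo "variable was NOT passed to the subprogram"
fi
```

A real test would swap the env line for the pk invocation and keep the comparison on STDOUT unchanged.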

The above script uses sh, which is always available on Linux platforms but does not exist on Windows.

There are also platform requirements. For example on Windows, you'll need to use a powershell script.

Let's put the platform specific testing into a separate issue later.

@CMCDragonkai
Member Author

Put on hold until we have client-refactoring merged.

@CMCDragonkai
Member Author

I found my old comment from last year talking about this: MatrixAI/Polykey#55 (comment). It also has a "chording" idea that may be relevant.

@CMCDragonkai
Member Author

CMCDragonkai commented Jun 29, 2022

Example of this being done for HashiCorp Vault: https://github.com/channable/vaultenv. Note that process replacement doesn't exist on Windows, so there it will always be using a child process.

@CMCDragonkai CMCDragonkai added the r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management label Jul 24, 2022
@CMCDragonkai CMCDragonkai self-assigned this Jul 10, 2023
@CMCDragonkai
Member Author

Wanted to mention that this command will be essential to obsolete the usage of .env files.

The usage of .env files is quite dangerous due to how easy it is to leak these secrets during screen sharing.

Screen sharing can occur during meetings, recorded work sessions, or when requesting external help.

The .env might be open in an IDE somewhere, or just sitting in a terminal somewhere.

We've seen this happen a few times, and I think there was a major leak that occurred with this.

@CMCDragonkai
Member Author

MatrixAI/Polykey#505 was closed because this now has to be implemented in Polykey-CLI. However it can be reviewed for notes and design ideas.

@CMCDragonkai
Member Author

New update https://github.com/StorytellerCZ/node-kexec.

I think we should be able to integrate this into our own library js-exec.

We can incorporate Windows support with this understanding:

The closest equivalent to the exec system call in Linux, which replaces the current process image with a new one, in Windows is the CreateProcess API. Unlike exec, CreateProcess starts a new process rather than replacing the current one. However, you can work around this by terminating the original process after CreateProcess successfully launches the new one.

It's worth noting that the behavior of exec and CreateProcess are fundamentally different. In the Linux/Unix model, using exec in a process replaces the current process image with a new one but keeps the same process ID. Any file descriptors that are marked as close-on-exec are closed, otherwise they remain open. This allows for process chaining and replacement without process tree disruption.

In contrast, Windows' CreateProcess spawns a new process with a different process ID. It's more like a combination of fork and exec in Linux, but without the inherent cloning of the parent process. File handles are not automatically inherited unless specified. Thus, it creates a new entity in the process tree rather than replacing an existing one.

So, although CreateProcess is the closest native API Windows has to offer, it doesn't provide an exact one-to-one match with the exec system call in Linux.
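
The "same process ID" property of exec can be observed directly from a shell, since the exec builtin performs the same replacement. This is only an illustration of the Unix semantics described above, not of how js-exec or Polykey would call it:

```sh
# With exec: the inner shell replaces the outer one, so both report one PID.
replaced=$(sh -c 'echo $$; exec sh -c "echo \$\$"')

# Without exec: the inner shell is a child process with its own PID.
# The trailing ":" keeps the inner shell from being the last command,
# which some shells would otherwise exec implicitly.
spawned=$(sh -c 'echo $$; sh -c "echo \$\$"; :')

# Count how many distinct PIDs were printed in a run.
pid_count() { printf '%s\n' "$1" | sort -u | wc -l | tr -d ' '; }

echo "exec:  $(pid_count "$replaced") distinct PID(s)"
echo "spawn: $(pid_count "$spawned") distinct PID(s)"
```

The exec run should report a single PID and the spawn run two, which is the difference between the replacement style and the child-process fallback.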

@tegefaulkes
Contributor

On request I'm adding this issue to the #40 epic

@tegefaulkes
Contributor

Some notes about the scope.

  1. There are two styles in which this command would work. The first is the process replacement style, where the process of the env command is replaced with the command it's invoking. The second style is just a normal exec of the invoked command as a child process. The second style will only be used if replacement isn't supported. We'll be looking at https://github.com/StorytellerCZ/node-kexec for this.
  2. The main goal of the env command is to run the invoked command with environment variables taken from vaults.
  3. We'll need to work out how these environment variables will be structured within the vaults and secrets. I can think of two formats, the first being a Record<string, string> JSON object encoding the env key and value. Alternatively we can store the key:value mapping with the secrets where the secret name is the key and the contents the value. This will have to be worked out with consideration of how vaults schema will work. I'm thinking keeping to the structure of vaults and secrets without structure on top of that may work best here?
  4. We need a way to pick and choose what env variables we want to use. Also have some way of mixing them together arbitrarily by listing what envs we want from what vaults and possibly using globstar patterns.
  5. We can have an alternative usage where the env command spits out the environment variables without running a command, so it can be composed in other ways.

So there are a few components to this that need to be addressed and specced out.

  1. We need some way to list out the env variables we want to use from vaults using the chording method. We'll need the ability to reference multiple env names across multiple vaults, possibly with globbing/regex to avoid listing out every variable. Commander may have a way to specify a flag multiple times so we end up with an array of flag parameters. Or it's all passed in as one flag and we have a complex parser to handle it.
  2. We need to look into schemas or create a stopgap schema for storing env variables in vaults. Either way we need to spec out how to structure the envs in the vaults.
  3. We need a way to output the env variables in a structure that can be composed with other commands or in scripts.
  4. We need a way to exec a command we provide with the envs we want. On Linux-based platforms we want to replace the env command process with the invoked command. If that is not supported then we fall back to exec.

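
As a rough illustration of the first component, here is how the vault:secret notation used earlier in this issue could be split apart. The helper name, the second vault name, and the "no colon means the whole vault" rule are assumptions for the sketch, not Polykey's actual parser:

```sh
# Hypothetical helper, not Polykey's parser: splits "vault:secret/path"
# at the first colon and sets the variables `vault` and `secret`.
parse_selector() {
  case $1 in
    *:*)
      vault=${1%%:*}
      secret=${1#*:}
      ;;
    *)
      # Assumption: a bare vault name selects the whole vault.
      vault=$1
      secret=
      ;;
  esac
}

# Several selectors across several vaults, as they might arrive from argv.
for selector in my-software-project:AWS_ACCESS_KEY_ID other-vault:dir/TOKEN; do
  parse_selector "$selector"
  echo "vault=$vault secret=$secret"
done
```

Splitting at the first colon leaves any later colons inside the secret path, which seems the less surprising choice if secret names can contain them.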
@tegefaulkes
Contributor

I'll start with scaffolding the command.

@CMCDragonkai
Member Author

I think we should put some simple C++ or Rust code into it to expose the proper exec. For Windows it's not possible, so we won't bother. The Rust is likely far simpler than the C++ scaffold, with the bonus of being similar to js-quic and subsequently Deno.

@CMCDragonkai
Member Author

Note that @amydevs has experience with the setup of a partial native module.

@tegefaulkes
Contributor

Will we need to do any native code if we use https://github.com/StorytellerCZ/node-kexec ?

@tegefaulkes
Contributor

As for specifying the environment variables we want with the command, Commander provides a variadic option (https://github.com/tj/commander.js?tab=readme-ov-file#variadic-option). This will allow us some freedom in specifying the environment variables, which we will get as an array that can be parsed.

@CMCDragonkai
Member Author

I don't want to use other people's native code. It will be flexible. This is a simple enough function for our own js-exec.

@amydevs
Member

amydevs commented Feb 15, 2024

I think cpp would be better to use than Rust here. If we were to use Rust, we would need to bring in native bindings to apis and writing unsafe code anyways. Also memory safety shouldn't be hard with the limited complexity of it anyways.

@tegefaulkes
Contributor

I'd lean towards not deviating from how we do native bindings in other repos. If only to reduce entropy by having everything stick to a standard way of handling it.

@CMCDragonkai
Member Author

> I think cpp would be better to use than Rust here. If we were to use Rust, we would need to bring in native bindings to apis and writing unsafe code anyways. Also memory safety shouldn't be hard with the limited complexity of it anyways.

We should be migrating to Rust, for Deno compatibility. Plus js-quic has a much more modern package distribution than js-db. All new native code should be Rust.

@CMCDragonkai
Member Author

@tegefaulkes is there anything in the original issue spec that you're changing? As per this comment #31 (comment)

@tegefaulkes
Contributor

It's mostly the same except for the following.

  1. I'm not implementing regex or globbing when selecting secrets. But mixing and matching specific secrets and directories of secrets is allowed.
  2. I'm adding extra functionality to generate .env style output or a="secret" b="secret" style output.
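
To make the .env style output concrete, here is a minimal sketch of a formatter whose output is safe to feed to source or eval. The function name and the choice of single-quoted values are assumptions; the actual command may quote differently (for example with double quotes as in a="secret"):

```sh
# Hypothetical formatter: prints NAME='value' with embedded single quotes
# escaped as '\'' so that a POSIX shell reads the value back unchanged.
# Note: trailing newlines in the value are dropped by the substitution.
emit_env_line() {
  escaped=$(printf '%s' "$2" | sed "s/'/'\\\\''/g")
  printf "%s='%s'\n" "$1" "$escaped"
}

# Made-up secret values for illustration only.
emit_env_line AWS_ACCESS_KEY_ID "example-key"
emit_env_line GREETING "it's a secret"

# Round trip: evaluating the generated line restores the original value.
eval "$(emit_env_line GREETING "it's a secret")"
echo "$GREETING"
```

Single quotes are used here because nothing inside them is expanded by the shell, so secrets containing $ or backticks are not interpreted when sourced.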

@CMCDragonkai
Member Author

Is there a need for PR for PK too? Link them here too.

@CMCDragonkai
Member Author

It's important for us to be able to source the secrets too so we can use it inside the shell hook @brynblack.

@CMCDragonkai
Member Author

The js-exec repo on GitLab is configured. However the CI isn't correct: it needs to be set up to not build for Windows. Make sure to refer to the js-mdns that @amydevs set up.

@CMCDragonkai
Member Author

CMCDragonkai commented Feb 21, 2024

> We need to look into schemas or create a stopgap schema for storing env variables in vaults. Either way we need to spec out how to structure the envs in the vaults.

I don't think that's necessary yet, simply referring to the file path is enough.

> We need a way to output the env variables in a structure that can be composed with other commands or in scripts.

Yea that's just source <(pk secrets env my-software-project:AWS_ACCESS_KEY_ID).

Without the command, it can just print out consumable shell settings. I believe .env.example is the format.

That will set it to the current shell, but not necessarily export them.

To do that:

set -o allexport
source <(...)
set +o allexport

We do this already in our shellhook.
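
The difference is easy to see from a child process. In this sketch a temporary file stands in for the <(pk secrets env ...) process substitution so that it runs in any POSIX shell, and the variable name and value are made up:

```sh
# A temp file stands in for <(pk secrets env ...); name and value are made up.
unset PK_DEMO_SECRET
envfile=$(mktemp)
printf '%s\n' "PK_DEMO_SECRET='abc'" > "$envfile"

# Plain sourcing sets the variable in this shell only.
. "$envfile"
before=$(sh -c 'printf %s "${PK_DEMO_SECRET-unset}"')

# With allexport on, every assignment made while sourcing is exported.
set -o allexport
. "$envfile"
set +o allexport
after=$(sh -c 'printf %s "${PK_DEMO_SECRET-unset}"')

rm -f "$envfile"
echo "child after plain source: $before"
echo "child after allexport:    $after"
```

Only the second child shell should see the value, which is why the shell hook wraps the source call this way.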

Note that on Windows, the syntax is not compatible. So we would have to have a switch when outputting to powershell format: https://stackoverflow.com/a/51247258.

$variable = "abc"
$env:variable = "abc"

That's a bit more complicated because the $env has to be used to export them to the environment... And this would require a flag like --windows for this command or something. Or I guess powershell. Like --powershell to switch it around because CMD is actually different.
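
For comparison, the PowerShell variant of the output could be generated along these lines. This is only a sketch of the text a --powershell style switch might print; the function name is hypothetical, and the only PowerShell rule relied on is that a single quote inside a single-quoted string is written as two single quotes:

```sh
# Hypothetical formatter for PowerShell output: prints $env:NAME = 'value',
# doubling any single quote inside the value as PowerShell requires.
emit_powershell_line() {
  escaped=$(printf '%s' "$2" | sed "s/'/''/g")
  printf "\$env:%s = '%s'\n" "$1" "$escaped"
}

# Made-up values for illustration only.
emit_powershell_line variable "abc"
emit_powershell_line GREETING "it's a secret"
```

Using the $env: form for every line means the variables are exported to child processes without a separate export step.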

@CMCDragonkai
Member Author

CMCDragonkai commented Feb 21, 2024

Someone came up with a PS script that can interpret unix style syntax: https://stackoverflow.com/a/74839464.

That's another way, which is to simply provide that execution and embed the above content like a heredoc.

But still doesn't solve the ability to auto-export. It seems weird to have to always set the --export option, but that could be another flag that even enables export X=Y.

4 participants