Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Elasticsearch keystore passphrase for startup scripts #1

Conversation

williamrandolph
Copy link

This is a work in progress that builds on a branch in this repo. Once the relevant work gets merged to the main repo, I assume we can reject or repoint this PR.

The work here covers the tar, systemd, and docker use cases. It does not include any Windows work.

The approach here is as outlined in this comment: elastic#32691 (comment)

The bin/elasticsearch script handles reading a keystore password either from standard input, from a file, or from a FIFO, and provides the password to the Elasticsearch runtime via standard input. Elasticsearch's bootstrap code simply reads from standard input when it detects a password-protected keystore.

I found some bugs in our keystore CLI code that relate to reading standard input. The problem appeared when I was running commands like bin/elasticsearch-keystore add -x bootstrap.password with a password-protected keystore. In short, we were using several different BufferedReader objects to read from standard input, and whichever BufferedReader read first would slurp more input into its buffer than needed, leaving nothing for the later BufferedReader to read. My solution was to wrap a single BufferedReader in the SystemTerminal class to handle all the input in non-interactive cases. I added unit tests for these cases, but my approach may be overly intrusive (a TestSystemTerminal class). I would be open to simpler approaches.

I also ran into a classic bug on OSX where every CLI command took five seconds to initialize logging. This meant that I couldn't use bin/elasticsearch-keystore list to check whether a keystore required a password or not. Instead, I used dd to check the value of the particular byte that serves as the encryption flag in the keystore file. It looked like I could only parameterize startup scripts by rpm/tar/zip, rather than darwin/linux/windows, so I went ahead and used dd in all the tar builds.

I have some notes for testing this work that may be useful as documentation.

Relates elastic#32691, elastic#38498

This is initial and exporatory work for incorporating keystore passwords
in the Elasticsearch startup process. The "user interface" that prompts
for and reads a password is in the bash startup script. The bash script
uses existing command-line utilities to test whether the keystore exists
and a password is needed; if so, the password is given to the Java
startup process using a "here string" and a new command-line lag that
alerts Java to data coming in over standard input.

Known issues: Java code that reads from standard input is not
thread-safe, initial keystore verification is unacceptably slow.

Relates elastic#32691, elastic#38498
We now use the "Terminal" class, which encapsulates thread-safe access
to stdin.

Additionally, we now test for a particular "encrypted" byte flag in the
keystore file rather than running the slow keystore utility, and we
don't test the password before passing it to Elasticsearch, which avoids
a second call to the keystore utility.
In a discussion meeting, it was pointed out to me that the keystore
wapper object's hasPassword() method makes the standard input flag that
I added redundant, so I'm backing out that fairly intrusive change to
the Java code.
It's convenient to use the bin/elasticsearch-keystore list command to
test whether or not the keystore is encrypted. When no password is
supplied to an encrypted keystore, the command has a non-zero exit
status. However, many JVMs on OSX with default networking configuration
will take at least five seconds to run an elasticsearch-keystore
command. This is because we have to initialize log4j 2, and that
initialization code will call InetAddress.getLocalHost().getHostname(),
which is the bit of code with the bug. Since all our OSX distributions
come in tarballs, I've parameterized bin/elasticsearch so that we check
the encrypted flag byte with "dd". It might be nice to do this just for
Darwin distributions. For RPMs and Debian distros, we use the keystore
tool for the check.

I've also added a little bit more output to the bin/elasticsearch
command, and improved handling of the case where the script fails to
read any lines from standard input.
Java's classic buffered readers are efficient because they read more
than is strictly needed for an operation and store it in a buffer. The
implication of this is that you shouldn't use multiple buffered readers
on the same input stream. Unfortunately, our CLI tools were doing this
in a few places. Tests didn't catch this because our mock Terminal class
doesn't reveal the problem.

First, the SystemTerminal's readText method was creating a new
BufferedReader per invocation, which meant that it cannot reliably read
two lines in a row. There is a new TerminalTest that covers this case.
The code fix was to lazily initialize a new BufferedReader the first
time we try to read from standard input.

Second, the AddStringKeyStoreCommand class created its own
BufferedReader when the -x/--stdin option was supplied. I'm not sure
what this flag really accomplishes, other than perhaps not printing a
prompt. Without the flag, the input still comes in over stdin. At any
rate, I've replaced the BufferedReader with an invocation of the
Terminal and added a test in AddStringKeystoreCommandTests.
I've reworked the unit tests in accordance with gradlew check. Instead
of setting System.in, I've created a test class specifically for the
SystemTerminal. Unlike the MockTerminal we already have, it doesn't
verify output, but it does exercise some code paths that the
MockTerminal skips. I had to add a new protected method to the Terminal
class, which is less than ideal. Perhaps these tests should be
integration tests rather than unit tests.
This work makes it possible to mount a password-protected keystore in a
docker image and provide that keystore's password as a docker
environment variable. There is a little bit of trickery around dirty
output from the keystore-list command, and the control flow might have
gotten a little too tangled and nested, but I hope it works for a first
cut.
@williamrandolph
Copy link
Author

I'm closing this in order to re-issue it against the main Elasticsearch repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant