Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reset elastic password cli tool #74892

Merged
merged 22 commits into from
Jul 20, 2021

Conversation

jkakavas
Copy link
Member

@jkakavas jkakavas commented Jul 5, 2021

This change introduces a new CLI tool that allows users to reset
the password for the elastic user, setting it to a user defined
or an auto-generated value.

Resolves: #70113
Depends on: #74890

jkakavas added 5 commits July 2, 2021 21:16
This change introduces a CLI tool that can be used to create
enrollment tokens. It doesn't require credentials, but simply
write access to the local filesystem of a node. It uses an
auto-generated user in the file-realm with superuser role.

For this purpose, this change also introduces a base class for a
CLI tool that can be used by any CLI tool needs to perform actions
against an ES node as a superuser without requiring credentials
from the user. It is worth noting that this doesn't change our
existing thread model, because already an actor with write access
to the fs of an ES node, can become superuser (again, by
adding a superuser to the file realm, albeit manually).
This change introduces a new CLI tool that allows users to reset
the password for the elastic user, setting it to a user defined
or an auto-generated value.

Resolves: elastic#70113
@jkakavas jkakavas added >enhancement :Security/Authentication Logging in, Usernames/passwords, Realms (Native/LDAP/AD/SAML/PKI/etc) v8.0.0 labels Jul 5, 2021
@elasticmachine elasticmachine added the Team:Security Meta label for security team label Jul 5, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

* retries as the file realm might not have reloaded the users file yet in order to authenticate our
* newly created file realm user.
*/
private void checkClusterHealthWithRetries(Environment env, Terminal terminal, int retries) throws Exception {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were chatting around this with @albertzaharovits today. This seems reasonable as a base check but were wondering whether a more nuanced approach is necessary/helpful.

The concern is that for some tools that extend this command and have specialized purpose and needs, this check might be overly coarse grained. Point in case, for the reset password tool, we might not be interested if any data index is red( making the cluster health red ), as long as .security is green. On the other hand, we want this to be useful even if the .security index is not yet created so we'd need to check for the existence of the .security index before the health check so that it won't time out. We can't use the exists API in 8.0 for system indices, so we'd have to use _cat/indices or _cat/aliases and parse the response.

The question is whether we need this nuanced approach, and if we do, do we implement it here in the base class or do we abstract it and let the extending classes implement this as they see fit. Given in the two classes that extend this (CreateEnrollmentTokenTool and ResetElasticPasswordTool) so far, we only care about the security index, I lean on doing the former.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_cat/indices and/or _cat/aliases are not that friendly for parsing the output. Another idea would be to not attempt to do the health check up front, but depend on the error message that we would get from Elasticsearch in order to determine if the security index health is RED. There are a couple of issues with this too:

  • We don't consistently check the health of the security index in the SecurityIndexManager before making operations on it. Some methods do manual checks like isAvailable() and then call checkIndexVersionThenExecute, some call prepareIndexIfNeededThenExecute() directly which doesn't call isAvailable(). So we would need to depend on the behavior of the actual indexing of the document in the index.
  • Not all implementing classes of BaseRunAsSuperuser will be calling APIs with the CommandLineHttpClient themselves. Reset password does, but CreateEnrollmentTokenTool does not, instead depends on CreateEnrollmentToken methods that call the APIs internally and will wrap exceptions.

It should still be doable but some of it sounds like guesswork.

Is "if your cluster health is RED, you shouldn't be making additional operations on it (like using these tools) unless it's about resolving the issue, so the tool doesn't work if cluster health is RED" an appropriate stance or do we think that a nuanced approach ( i.e. each tool defines it's own health/index checks as it sees fit and the base class just validates file realm auto-generated credentials ) is helpful and necessary ?

Opening up the discussion, @tvernum I know from previous discussions that you have opinions on this type of behavior by our CLI tools, care to chime in?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disclaimer: Speaking in principle, haven't checked the code too carefully:
Any check that we do before the actual operation has limited usefulness because the condition can change in between the time of check and the operation time (time of check, time of use).
I think it's worth investing in some tests that "pin" the errors (ideally consistent across APIs) for an unavailable/red .security index.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We chatted with Tim a little bit about this and we came to a similar conclusion which was that until we can identify all the ways this can fail in an actionable and informative way, replicating the behavior of other tools ( warn if cluster health is red and allow to override ) is preferable to complicate the checks for handling some of the cases. I plan on moving forward as is and open an issue to track the effort to do what you suggest, if you don't think this is blocking for the current PR

protected final void execute(Terminal terminal, OptionSet options, Environment env) throws Exception {
validate(terminal, options, env);
KeyStoreWrapper keyStoreWrapper = keyStoreFunction.apply(env);
final Environment newEnv;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you're building an environment here.
There is the EnvironmentAwareCommand to handle that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BaseRunAsSuperuserCommand extends KeyStoreAwareCommand which extends EnvironmentAwareCommand already. What I need to do here is to to factor in the secure settings we might have read from the keystore , thus the new environment. Not sure how EnvironmentAwareCommand can handle that for me ,let me know if I'm missing something !

}
users = new HashMap<>(users);
users.put(username, hasher.hash(password));
FileUserPasswdStore.writeFile(users, passwordFile);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think guaranteeing that the user has roles assigned when it exists is important?

To make it appear atomic, maybe it would be better to first update the roles file and then the users file (otherwise there are states where the user exists but doesn't have a role, and we ought to handle this case as a retry).
In addition, we would also have to change the server code to ensure that if the roles file is changed before the users file, changes to the roles file are always considered when updating the users.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think guaranteeing that the user has roles assigned when it exists is important?

Yes!Thanks.

To make it appear atomic, maybe it would be better to first update the roles file and then the users file

I think that's a good idea nevertheless

In addition, we would also have to change the server code to ensure that if the roles file is changed before the users file, changes to the roles file are always considered when updating the users.

Not sure I follow this, maybe we can quickly chat in our sync

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the order of execution here and added 403 to the list of error codes we should retry upon

if (fileRealmSettings.size() > 1) {
throw new UserException(
ExitCodes.CONFIG,
"Multiple file realms are configured. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, lets factor out the code in the server code that tells the file realm configuration is correct and that a file realm is enabled and its order.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is already happening on server side

Copy link
Member Author

@jkakavas jkakavas Jul 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove the check if you think it's cleaner. The base class assumes that any tool that extends it already needs a running cluster and if a node attempts to start with a configuration file that has more than one file realms, then the node would fail to start because of that, so in theory we would never stumble upon this error in the tool.

CheckedFunction<Environment, KeyStoreWrapper, Exception> keyStoreFunction,
CheckedFunction<Environment, CreateEnrollmentToken, Exception> createEnrollmentTokenFunction
) {
super(clientFunction, keyStoreFunction, "Creates enrollment tokens for elasticsearch nodes and kibana instances");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand this supposed to be a help message (Command$printHelp)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description of the command is printed when the command is run with -h , yes. Can you elaborate on what your comment was about ? Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect a description of allowed parameters (like node/kibana)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect a description of allowed parameters (like node/kibana)

We don't put these in the description of the command. All of our CLI tools extend Command and it calls printHelp when the -h parameter is passed ( or on error ). printHelp prints the parameters and their description.

FWIW, the ouput of the -h is

Creates enrollment tokens for elasticsearch nodes and kibana instances

Option (* = required)  Description                                             
---------------------  -----------                                             
-E <KeyValuePair>      Configure a setting                                     
-f, --force            Use this option to force execution of the command       
                         against a cluster that is currently unhealthy.        
-h, --help             Show help                                               
* -s, --scope          The scope of this enrollment token, can be either "node"
                         or "kibana"                                           
-v, --verbose          Show verbose output        


Map<String, char[]> users = FileUserPasswdStore.parseFile(passwordFile, null, env.settings());
if (users == null) {
throw new UserException(ExitCodes.CONFIG, "File realm configuration file [" + passwordFile + "] is missing");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we try to clean UserRolesStore anyway?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes good idea. I originally thought that worst case scenario is that the roles files remain but reference a non-existent user which is ok. It is a bit sloppy and it's a similar comment to what Albert made on the original PR about trying to make these 2 changes atomic. I will rework this

users = new HashMap<>(users);
char[] passwd = users.remove(username);
if (passwd != null) {
// No need to overwrite, if the user was already removed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the password be null? If yes, the user will stay in the file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See java.util.Map#remove and FileUserPasswdStore#parseFile. In short, parseFile wlll not return a Map that has a null value because we would have bailed on parsing it. As such, if remove returns null is because the users Map did not contain an entry with the username key.

userRoles = new HashMap<>(userRoles);
String[] roles = userRoles.remove(username);
if (roles != null) {
// No need to overwrite, if the user was already removed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

protected ResetElasticPasswordTool(
Function<Environment, CommandLineHttpClient> clientFunction,
CheckedFunction<Environment, KeyStoreWrapper, Exception> keyStoreFunction) {
super(clientFunction, keyStoreFunction, "Resets the password or the elastic built-in user");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or => of
Also, I believe the description is supposed to be the help message

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I believe the description is supposed to be the help message

It is !

@jkakavas
Copy link
Member Author

I repeated some comments on Base class from the create enrollment token tool PR, pls ignore them if you address them there

Thanks ! Let's keep things contained in one of the PRs. Ideally, as discussed, we can review and merge CreateEnrollmentTokensTool first and then I'll rebase the changes here. No need to have the same discussion in multiple places.

@jkakavas jkakavas requested a review from BigPandaToo July 12, 2021 09:06
@jkakavas jkakavas requested a review from lockewritesdocs July 12, 2021 14:52
Copy link
Contributor

@BigPandaToo BigPandaToo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment about the help message, I may be wrong about my expectation from the -help, so I am leaving it to you
LGTM otherwise

@jkakavas
Copy link
Member Author

I added a comment about the help message, I may be wrong about my expectation from the -help, so I am leaving it to you
LGTM otherwise

I explained in #74892 (comment)

@lockewritesdocs
Copy link
Contributor

@elasticmachine run elasticsearch-ci/docs

Copy link
Contributor

@lockewritesdocs lockewritesdocs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a number of comments and suggested changes. A few options seemed missing, so I added those or left questions about them.

Copy link
Contributor

@albertzaharovits albertzaharovits left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

One minor question: the namespace for ResetElasticPasswordTool looks to me a bit generic. I would've assigned the same namespace as the SetupPasswordTool.

@jkakavas
Copy link
Member Author

the namespace for ResetElasticPasswordTool looks to me a bit generic. I would've assigned the same namespace as the SetupPasswordTool.

SGTM, I did the change, thanks!

@jkakavas jkakavas merged commit 13c8ee1 into elastic:master Jul 20, 2021
ywangd pushed a commit to ywangd/elasticsearch that referenced this pull request Jul 30, 2021
This change introduces a CLI tool that can be used to create
enrollment tokens. It doesn't require credentials, but simply
write access to the local filesystem of a node. It uses an
auto-generated user in the file-realm with superuser role.

For this purpose, this change also introduces a base class for a
CLI tool that can be used by any CLI tool needs to perform actions
against an ES node as a superuser without requiring credentials
from the user. It is worth noting that this doesn't change our
existing thread model, because already an actor with write access
to the fs of an ES node, can become superuser (again, by
adding a superuser to the file realm, albeit manually).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Security/Authentication Logging in, Usernames/passwords, Realms (Native/LDAP/AD/SAML/PKI/etc) Team:Security Meta label for security team v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

elastic user password reset CLI tool
7 participants