User Access in Presto with File-Based Authentication #797

Closed
wants to merge 1 commit into from

Member

@rupamk rupamk commented May 21, 2019

This PR adds file-based authentication: usernames and passwords are provided to Presto through a file that holds user credentials in a standard format, and users submitting queries are authenticated against it.

  • To enable it, follow these steps:
  1. Add to /etc/config.properties:

http-server.https.enabled=true
http-server.https.port=8443
http-server.https.keystore.path=<path-to-keystore>/<keystore>
http-server.https.keystore.key=<key>
http-server.authentication.type=PASSWORD

Note: For generating the keystore, refer here

  2. Create /etc/password-authenticator.properties containing:
    password-authenticator.name=file
    file.config-file=/usr/lib/presto/etc/file_auth
    file.refresh-period=10ms
    file.bcrypt.min-cost=8
    file.pbkdf2.min-iterations=1000
    file.auth-token-cache-max-size=1000

  3. Create /etc/file_auth with one "user:password" entry per line:
    userPBKDF2:1000:5b4240333032306164:f38d165fce8ce42f59d366139ef5d9e1ca1247f0e06e503ee1a611dd9ec40876bb5edb8409f5abe5504aab6628e70cfb3d3a18e99d70357d295002c3d0a308a0
    userBCrypt:$2y$10$BqTb8hScP5DfcpmHo5PeyugxHz5Ky/qf3wrpD7SNm8sWuA3VlGqsa

  4. After starting the Presto server, execute:

presto-cli/target/presto-cli-*-executable.jar --server https://127.0.0.1:8443 --user userBCrypt --password

  • With the correct password, “show catalogs” runs without error.

  • With a wrong password, “show catalogs” yields:

Error running command: Authentication failed: Access Denied: Invalid credentials

  • With a blank password (pressing Enter at the password prompt), “show catalogs” yields:

Error running command: Authentication failed: Malformed decoded credentials
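
The PBKDF2 entry in step 3 appears to follow a `user:iterations:salt:hash` layout with hex-encoded salt and hash. A minimal sketch of producing and checking an entry of that shape with the JDK follows. It assumes `PBKDF2WithHmacSHA256` with a 256-bit key, which this thread does not confirm (the sample entry carries a longer hash), so entries produced here are illustrative and may not be accepted by the plugin. `Pbkdf2Entry` and its methods are hypothetical names.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper, not part of the PR.
final class Pbkdf2Entry
{
    private static final String ALGORITHM = "PBKDF2WithHmacSHA256"; // assumption
    private static final int KEY_BITS = 256; // assumption

    private Pbkdf2Entry() {}

    // Builds "user:iterations:saltHex:hashHex" with a fresh random 16-byte salt.
    static String create(String user, char[] password, int iterations)
    {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        byte[] hash = derive(password, salt, iterations);
        return user + ":" + iterations + ":" + toHex(salt) + ":" + toHex(hash);
    }

    // Re-derives the hash from the stored parameters and compares in constant time.
    static boolean matches(String entry, char[] password)
    {
        String[] parts = entry.split(":");
        if (parts.length != 4) {
            return false;
        }
        int iterations = Integer.parseInt(parts[1]);
        byte[] expected = fromHex(parts[3]);
        byte[] actual = derive(password, fromHex(parts[2]), iterations);
        return MessageDigest.isEqual(expected, actual);
    }

    private static byte[] derive(char[] password, byte[] salt, int iterations)
    {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance(ALGORITHM).generateSecret(spec).getEncoded();
        }
        catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is unavailable", e);
        }
        finally {
            spec.clearPassword();
        }
    }

    private static String toHex(byte[] bytes)
    {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(Character.forDigit((b >> 4) & 0xf, 16));
            hex.append(Character.forDigit(b & 0xf, 16));
        }
        return hex.toString();
    }

    private static byte[] fromHex(String hex)
    {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return bytes;
    }
}
```

For bcrypt entries, the htpasswd program with the -B flag can generate the line directly, as noted later in this thread.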

  • Also added product test: PrestoCliFileAuthTests:

presto-product-tests/bin/run_on_docker.sh fileauth -t io.prestosql.tests.cli.PrestoCliFileAuthTests

2019-10-11 06:14:14 INFO: Completed 6 tests
2019-10-11 06:14:14 INFO: 6 SUCCEEDED / 0 FAILED / 0 SKIPPED
2019-10-11 06:14:14 INFO: Tests execution took 1 minutes and 3 seconds

  • Added Unit tests: TestFileAuthConfig

  • Modifications in the design:

--> Cache Implementation:
The password file is reloaded periodically, so administrators don't need to restart the server when modifying the user list. The interval is configurable through file.refresh-period in /etc/password-authenticator.properties.
For both hashing algorithms, the number of rounds or iterations must be chosen. A larger number makes the hash more expensive to compute and thus slows down an attacker who is cracking the password file. Currently, file.bcrypt.min-cost and file.pbkdf2.min-iterations set the minimum values enforced for the two hashing algorithms.

An LRU cache reuses decoded passwords for a limited time, bounded by file.refresh-period, since the entries are cleared anyway when the file reloads. To keep the cache from growing without bound, its size is capped at 1000 entries; beyond that, the least recently used entries (typically the negative entries) are evicted.
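
The size-bounded LRU behavior described above can be sketched with the JDK's `LinkedHashMap` in access order. This is an illustration only: the PR's actual cache class is not shown in this thread, and `AuthTokenCache` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a bounded LRU cache; not the PR's implementation.
final class AuthTokenCache<K, V>
{
    private final Map<K, V> entries;

    AuthTokenCache(int maxSize)
    {
        // accessOrder=true keeps entries ordered from least to most recently accessed.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true)
        {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest)
            {
                // Called after each insertion; evicts the least recently used entry
                // once the map grows past the configured bound.
                return size() > maxSize;
            }
        };
    }

    synchronized V get(K key)
    {
        return entries.get(key);
    }

    synchronized void put(K key, V value)
    {
        entries.put(key, value);
    }

    // Intended to be called whenever the password file is reloaded.
    synchronized void clear()
    {
        entries.clear();
    }

    synchronized int size()
    {
        return entries.size();
    }
}
```

Clearing the cache on every reload keeps a changed or removed password from being honored longer than one refresh period.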

** Overall changes summary **

  1. Maintain a Supplier that reloads the password file after every file.refresh-period, so that the file can be modified or updated without restarting the Presto server.
  2. Maintain an LRU cache of authentication tokens (to avoid hashing the input password on every request). Its lifetime is bounded by that of the FileBasedAuthenticator instance; ideally, every time (1) refreshes, (2) is cleared as well.
  3. The only two hashing algorithms used are bcrypt and PBKDF2.
  4. When FileBasedAuthenticator is invoked, it reads the file and creates an authentication map of {username: User object}.
  5. While each User object is created, two things happen: (a) the hashing algorithm is determined, and (b) the hashing restrictions are checked. For example, a bcrypt password with a cost below 8 (configurable) raises an exception and the server fails; for PBKDF2, the minimum number of iterations allowed for password creation is 1000 (configurable).
  6. Note that Users.readUsers has been updated to use Files.stream().
  7. When PasswordAuthenticator initiates the authenticate call, a new User object is created. As a first check, the authentication map built in step 4 is consulted to see whether the username exists; if it does, the credentials are verified via User.authenticate.
  8. Unit tests added to verify each piece of functionality.
  9. Product tests updated to account for the current changes.
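
Steps 3 and 5 above can be sketched as follows. The format detection (a `$2` prefix for bcrypt, `iterations:salt:hash` for PBKDF2) is inferred from the sample entries in this description, and `HashingPolicy` is a hypothetical name rather than a class from the PR.

```java
// Hypothetical sketch of algorithm detection and minimum-strength checks.
final class HashingPolicy
{
    enum Algorithm { BCRYPT, PBKDF2 }

    private final int bcryptMinCost;
    private final int pbkdf2MinIterations;

    HashingPolicy(int bcryptMinCost, int pbkdf2MinIterations)
    {
        this.bcryptMinCost = bcryptMinCost;
        this.pbkdf2MinIterations = pbkdf2MinIterations;
    }

    // Accepts the part of a password-file line that follows "user:".
    Algorithm check(String password)
    {
        if (password.startsWith("$2")) {
            // bcrypt layout: $2<variant>$<cost>$<salt and hash>
            String[] parts = password.split("\\$");
            if (parts.length != 4) {
                throw new IllegalArgumentException("Malformed bcrypt hash");
            }
            int cost = Integer.parseInt(parts[2]);
            if (cost < bcryptMinCost) {
                throw new IllegalArgumentException("bcrypt cost " + cost + " is below minimum " + bcryptMinCost);
            }
            return Algorithm.BCRYPT;
        }
        // Assumed PBKDF2 layout: <iterations>:<salt hex>:<hash hex>
        String[] parts = password.split(":");
        if (parts.length == 3) {
            int iterations = Integer.parseInt(parts[0]);
            if (iterations < pbkdf2MinIterations) {
                throw new IllegalArgumentException("PBKDF2 iterations " + iterations + " is below minimum " + pbkdf2MinIterations);
            }
            return Algorithm.PBKDF2;
        }
        throw new IllegalArgumentException("Unsupported password hash format");
    }
}
```

With the minimums from the configuration above (8 and 1000), both sample entries in the password file pass the check, while a bcrypt hash with cost 5 is rejected.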

Addresses part of #1065

@cla-bot

cla-bot bot commented May 21, 2019

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Rupam Kundu.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cla-bot

cla-bot bot commented May 21, 2019

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please submit the signed CLA to [email protected]. For more information, see https://github.com/prestosql/cla.

@rupamk rupamk force-pushed the FileBasedAuthenticatorPlugin branch from 0de3905 to b0413f6 Compare May 21, 2019 04:10
@electrum
Member

Thanks for the contribution.

One initial thought is that we shouldn't support insecure password hashing options. Anything that only hashes the password once is not secure as it is vulnerable to brute force cracking attacks. The primary secure options for password hashing are bcrypt and PBKDF2. The former seems to be more popular, and possibly have a slight security advantage, while the latter is included with the JDK and is a NIST standard. Supporting either or both of these is reasonable. This Java bcrypt implementation looks good.

The password file format you chose is good, since it's the same format as the Apache HTTP Server and thus can be generated using the htpasswd command line program (with the -B flag for bcrypt). Unfortunately, htpasswd doesn't support PBKDF2, so supporting bcrypt is required for this user convenience.

For either of these algorithms, the number of rounds or iterations must be chosen. A larger number means it is more expensive to compute the hash, and thus slows down an attacker that is cracking the password file. We should probably enforce a minimum value for security purposes. The htpasswd program uses 5 rounds by default, which seems woefully inadequate, as it only takes about 0.1 milliseconds.

On the flip side, this means it is more expensive for Presto to validate the password, which adds latency to requests and increases CPU load on the coordinator. We will probably want caching, since with the current protocol design, authentication is performed for every request.

An unrelated consideration is that we should periodically reload the password file, so that administrators don't need to restart the server when modifying the user list. This is easily accomplished by using a cache with a configurable expiration time (maybe default to 5 seconds).
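
The reload-by-expiration idea in the previous paragraph can be sketched as a supplier that re-invokes its loader once a configurable time-to-live has elapsed. `ExpiringSupplier` is a hypothetical name, and the clock is injected only to make the behavior easy to test; Guava's `Suppliers.memoizeWithExpiration` offers similar behavior off the shelf.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch: caches the loaded value and reloads it after ttlNanos.
final class ExpiringSupplier<T>
        implements Supplier<T>
{
    private final Supplier<T> loader;
    private final long ttlNanos;
    private final LongSupplier nanoClock;

    private T value;
    private long loadedAt;
    private boolean loaded;

    ExpiringSupplier(Supplier<T> loader, long ttlNanos, LongSupplier nanoClock)
    {
        this.loader = loader;
        this.ttlNanos = ttlNanos;
        this.nanoClock = nanoClock;
    }

    @Override
    public synchronized T get()
    {
        long now = nanoClock.getAsLong();
        // Load on first use, then again whenever the cached value is older than the TTL.
        if (!loaded || now - loadedAt >= ttlNanos) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

In a server, the loader would parse the password file and the clock would be `System::nanoTime`; a 5 second TTL corresponds to `5_000_000_000L` nanoseconds.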

@kokosing
Member

> An unrelated consideration is that we should periodically reload the password file, so that administrators don't need to restart the server when modifying the user list. This is easily accomplished by using a cache with a configurable expiration time (maybe default to 5 seconds).

See how it was implemented for file based access control: io.prestosql.security.FileBasedSystemAccessControl.Factory#create(java.util.Map<java.lang.String,java.lang.String>)

@rupamk
Member Author

rupamk commented Jun 6, 2019

@electrum I have addressed the other changes, and @kokosing, thanks a lot for your suggestion. Before I update the PR, I have a question about the comment "We should probably enforce a minimum value for security purposes." regarding the number of hashing rounds. Currently, org.apache.commons.codec.digest.Md5Crypt uses 1000 rounds by default. Is that enough?

@electrum
Member

electrum commented Jun 6, 2019

@rupamk Thanks for updating the PR. Allow me to restate my previous objections:

  • We should only allow bcrypt and PBKDF2. The other algorithms are insecure.
  • For bcrypt, we should use this library which is more modern.

I'd like to hear others' opinions about minimum iterations. For bcrypt, a work factor of 5 is the absolute minimum we should allow. However, to prevent users from making mistakes, enforcing 8 as the minimum seems better. I got the decimal point wrong previously. On my laptop, 5 takes 1ms, 6 takes 3ms, 7 takes 5ms, and 8 takes 11ms. If we implement caching, we can mitigate the CPU increase on the coordinator of repeatedly validating the same passwords. We should apply the same policy to PBKDF2.
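
The timings above grow roughly twofold per step because the bcrypt work factor is the base-2 logarithm of the number of key-expansion rounds. A small sketch of that arithmetic (`BcryptCost` is a hypothetical helper, not a library class):

```java
// Hypothetical helper illustrating how the bcrypt work factor scales.
final class BcryptCost
{
    private BcryptCost() {}

    // Number of key-expansion rounds for a given work factor: 2^cost.
    static long rounds(int cost)
    {
        if (cost < 4 || cost > 31) {
            throw new IllegalArgumentException("bcrypt cost must be between 4 and 31");
        }
        return 1L << cost;
    }
}
```

Raising the minimum from 5 to 8 therefore multiplies the hashing work by `rounds(8) / rounds(5)`, which is 8, in line with the measured jump from about 1ms to about 11ms.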

Member

@electrum electrum left a comment

While it's good to have product tests, and thanks for adding them, we should also have unit tests for the plugin. The product tests should cover basic end-to-end integration, answering the question: does the plugin actually work? The unit tests should cover the full functionality, since they are closer to the code and easier to run.

@cla-bot cla-bot bot added the cla-signed label Jun 24, 2019
@rupamk rupamk force-pushed the FileBasedAuthenticatorPlugin branch from f85acfa to 1644591 Compare June 24, 2019 22:47
@rupamk rupamk force-pushed the FileBasedAuthenticatorPlugin branch 4 times, most recently from 27eeead to 8c3eb4d Compare June 29, 2019 01:45
Member

@kokosing kokosing left a comment

Just some comments, review in progress...

@rupamk rupamk force-pushed the FileBasedAuthenticatorPlugin branch 5 times, most recently from c6ca106 to 4b92c7c Compare August 21, 2019 19:25
@rupamk
Member Author

rupamk commented Aug 23, 2019

@electrum @kokosing Thanks again for your detailed comments. I have addressed all of them.

Member

@electrum electrum left a comment

This is looking much better and is nearly ready to merge. The only major concern is the way refresh is handled in FileBasedAuthenticatorFactory. Otherwise, lots of minor comments.

@electrum electrum self-assigned this Oct 9, 2019
@electrum
Member

@rupamk I notice some of these comments say "done" but I don't see any new commits. Please let me know when the latest changes are pushed.

@electrum electrum removed their assignment Oct 10, 2019
@rupamk rupamk force-pushed the FileBasedAuthenticatorPlugin branch 2 times, most recently from 431ba9a to aae846c Compare October 11, 2019 00:21
@rupamk
Member Author

rupamk commented Oct 11, 2019

@electrum I have pushed my changes.

@rupamk rupamk force-pushed the FileBasedAuthenticatorPlugin branch 2 times, most recently from 2b613ba to d0424a0 Compare October 11, 2019 04:40
Add the FileBasedAuthentication plugin: usernames and passwords are provided to Presto through a file that holds user credentials in a standard format, and users submitting queries are authenticated against it.

** Overall changes summary **

1. Maintain a Supplier that reloads the password file after every configurable refresh_period, so that the file containing the user:password entries can be modified or updated without restarting the Presto server.

2. Maintain an LRU cache of authentication tokens (to avoid hashing the input password on every request). Its lifetime is bounded by that of the FileBasedAuthenticator instance; ideally, every time (1) refreshes, (2) is cleared as well.

3. The only two hashing algorithms used are bcrypt and PBKDF2.

4. When FileBasedAuthenticator is invoked, it reads the file and creates an authentication map that holds one User object per unique username. This map is wrapped in a Supplier so that the user profiles expire after the provided refresh_period.

5. While each User object is created, two things happen: (a) the hashing algorithm is determined and stored for that user profile, and (b) the hashing restrictions are checked. For example, a bcrypt password with a cost below 8 (configurable) raises an exception and the server fails; for PBKDF2, the minimum number of iterations allowed for password creation is 1000 (configurable).

6. During query execution, when the authenticate call is initiated with a (username, password) pair, the authentication map built in step 4 is first checked to see whether the username exists; if it does, the credentials are verified against the stored password.

7. Unit tests added to verify each piece of functionality.

8. Product tests updated to account for the current changes.
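
The lookup-then-verify flow in step 6 can be sketched as below. All names are hypothetical: the map stands in for the authentication map of step 4, and the verifier stands in for the bcrypt or PBKDF2 check, neither of which is shown in this thread.

```java
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Supplier;

// Hypothetical sketch of the authenticate flow; not the PR's classes.
final class AuthenticateSketch
{
    // Supplies username -> stored hash; may reload the file behind the scenes.
    private final Supplier<Map<String, String>> users;
    // Tests (storedHash, suppliedPassword); stands in for the real hash check.
    private final BiPredicate<String, String> verifier;

    AuthenticateSketch(Supplier<Map<String, String>> users, BiPredicate<String, String> verifier)
    {
        this.users = users;
        this.verifier = verifier;
    }

    boolean authenticate(String username, String password)
    {
        String storedHash = users.get().get(username);
        if (storedHash == null) {
            // Unknown user: reject without running the expensive hash.
            return false;
        }
        return verifier.test(storedHash, password);
    }
}
```

Checking the map first means a request for an unknown username never pays the hashing cost.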

* Recent Changes *
1. Minor changes
2. Refresh Logic is now moved inside FileBasedAuthenticator.class
@rupamk rupamk force-pushed the FileBasedAuthenticatorPlugin branch from d0424a0 to c62927d Compare October 14, 2019 09:26
@rupamk rupamk requested a review from electrum October 17, 2019 20:43
@oneonestar
Member

I suggest adding user documentation for this new feature.

@rupamk
Member Author

rupamk commented Oct 21, 2019

@electrum How should I create the documentation? For example, write it up in a doc and share it here?

@oneonestar
Member

@rupamk You may refer to these:
7f4629a#diff-edb3cc046cfd52edd599530b6ef34deeR16
7f4629a#diff-4b87c64d82de81b33ca3406105e4376a

@electrum
Member

Replaced by #1912

4 participants