This document describes a simple API for authenticating users over HTTP and (strongly preferred) HTTPS, using JSON to exchange messages.
This API was heavily inspired by the SCRAM SASL specification and takes into account the security implications outlined there.
This API follows the rationale of SCRAM and replaces the four binary messages described there with two HTTP request and response pairs.
This API also extends SCRAM by introducing the ability to use different key derivation functions, by allowing the extension of the requests and responses exchanged between client and server, and by adding support for time-based or counter-based one-time passwords as defined by RFC-6238 (TOTP) and RFC-4226 (HOTP).
- Terminology ==============
The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC-2119.
For the purpose of this specification, the term JSON always refers to the data interchange format standardized by RFC-4627 and ECMA-404 and described at JSON.ORG.
Also, the term BASE-64-URL encoding always refers to the URL-safe and filename-safe Base64 encoding described in RFC-4648, Section 5, with the (non URL-safe) = padding characters omitted, as permitted by the same RFC's Section 3.2.
Finally, the terms JWS and JWE refer, respectively, to the structures proposed in the JSON Web Signature and JSON Web Encryption standard drafts.
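The BASE-64-URL encoding defined above can be sketched in a few lines of Python. This is only an illustration: the helper names `b64url_encode` and `b64url_decode` are hypothetical and not part of this specification.

```python
import base64


def b64url_encode(data: bytes) -> str:
    """Encode bytes as URL-safe Base64 with the '=' padding removed."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode unpadded URL-safe Base64, restoring the padding first."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


# The five bytes "sa\0lt" (the salt of the last RFC-6070 test vector)
print(b64url_encode(b"sa\x00lt"))  # c2EAbHQ
```

Decoding needs the padding restored because Python's standard decoder rejects input whose length is not a multiple of four.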
- Core Definitions ===================
This specification relies on a few core definitions, as follows:
- version: The version described by this specification is number 1.
- user: A user is identified by a non-empty sequence of Unicode characters. Individual implementations of this specification are free to determine how each user can be mapped to a known entity to authenticate (for example, case normalization, Unicode normalization or whitespace stripping).
- password: A password is a sequence of Unicode characters. This specification dictates that its binary representation for hashing and key derivation is the non-normalized UTF-8 encoding of such sequence.
- CONCAT(param1, param2, ...): This defines the concatenation of the byte sequences represented by the various param1, param2, ..., in order.
- XOR(param1, param2): This defines the exclusive OR operation of the two byte sequences represented by param1 and param2, which MUST be of equal length.
- HASH(data): See section 4.1.
- HMAC(key, message): See section 4.2.
- KDF(password, ...): See section 5.
- Request content types ========================
Server implementations MUST accept both application/json and application/x-www-form-urlencoded as request content types.
As this specification does not define complex JSON structures in its requests (no objects nested within objects), it is defined that a document like:
{
"key1": 1,
"key2": [ "value2A", "value2B" ],
"key3": true
}
can be represented in its URL-encoded format as:
key1=1&key2=value2A&key2=value2B&key3=true
Or more formally:
- strings: should be encoded in UTF-8 and then URL-encoded. For example user@example.org will be represented as user%40example.org while the Japanese name 山田太郎 will be represented as %E5%B1%B1%E7%94%B0%E5%A4%AA%E9%83%8E.
- numbers: should be represented precisely as in JSON. For example 1, -123.456 and 123e45 are all valid numbers.
- booleans: should be identified as true or false.
- arrays: the same key should be transmitted multiple times, as in the example JSON object above.
Additionally, while it might be possible to specify parameters encoded in application/x-www-form-urlencoded in the query part of the request URLs, this method MUST NOT be supported: request URLs are normally logged by servers and proxies, which could inadvertently disclose sensitive information.
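The mapping from a flat JSON document to its URL-encoded form described above can be sketched as follows. The function name `to_form` is hypothetical, and the sketch assumes that numbers and booleans are rendered with Python's `json.dumps`, which yields valid JSON literals but not necessarily the exact spelling used in the source document.

```python
import json
from urllib.parse import urlencode


def to_form(document: dict) -> str:
    """Flatten a non-nested JSON object into a form-encoded request body."""
    pairs = []
    for key, value in document.items():
        # Arrays repeat the same key once per element
        items = value if isinstance(value, list) else [value]
        for item in items:
            # Strings are sent as-is; numbers and booleans as JSON literals
            text = item if isinstance(item, str) else json.dumps(item)
            pairs.append((key, text))
    # urlencode percent-encodes using UTF-8 by default
    return urlencode(pairs)


print(to_form({"key1": 1, "key2": ["value2A", "value2B"], "key3": True}))
# key1=1&key2=value2A&key2=value2B&key3=true
```

The resulting string belongs in the request body only, never in the query part of the URL.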
- Hashing ==========
Throughout this document we will refer to hashing algorithms.
Hashing algorithms are identified by their name, case insensitive, and this specification recognizes the following names:
- MD5: from RFC-1321
- SHA1: from RFC-3174
- SHA224, SHA256, SHA384 and SHA512: from RFC-4634
- SHA3-224, SHA3-256, SHA3-384, SHA3-512: from FIPS 202.
Other hashing functions MAY be supported by clients and/or servers, but their identification is left unspecified in this document.
In the Session Creation phase of the algorithm the client and server negotiate a hashing algorithm to perform the steps necessary in the Session Authentication.
All client and server implementations MUST support SHA256 and SHA512 as exchange hashes, while MD5 and SHA1 SHOULD NOT be used, due to their relative vulnerabilities which led to their deprecation by NIST and other sources.
The HASH(data) function used throughout this document identifies the hashing function associated with the algorithm negotiated between client and server during the Session Creation phase.
The HMAC(key, message) function used throughout the document is described by RFC-2104 and its underlying hashing function is identified by the exchange hash negotiated between client and server during the Session Creation phase and described above.
The two parameters key and message respectively identify the K and text variables outlined in RFC-2104's section 2.
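As an illustration of the HASH and HMAC functions described above, the following sketch resolves a negotiated algorithm name (case insensitive) to Python's `hashlib` and `hmac` modules. The lookup table and the function names `resolve_hash`, `HASH` and `HMAC` are assumptions made for this example; taking the exchange hash as an explicit first argument is a convenience, not something this specification mandates.

```python
import hashlib
import hmac

# Names recognized by this specification mapped to hashlib algorithm names
_HASHLIB_NAMES = {
    "MD5": "md5", "SHA1": "sha1",
    "SHA224": "sha224", "SHA256": "sha256",
    "SHA384": "sha384", "SHA512": "sha512",
    "SHA3-224": "sha3_224", "SHA3-256": "sha3_256",
    "SHA3-384": "sha3_384", "SHA3-512": "sha3_512",
}


def resolve_hash(name: str) -> str:
    """Map a case-insensitive algorithm name to its hashlib name."""
    try:
        return _HASHLIB_NAMES[name.upper()]
    except KeyError:
        raise ValueError("unsupported hash: " + name) from None


def HASH(exchange_hash: str, data: bytes) -> bytes:
    return hashlib.new(resolve_hash(exchange_hash), data).digest()


def HMAC(exchange_hash: str, key: bytes, message: bytes) -> bytes:
    # key and message are the K and text variables of RFC-2104
    return hmac.new(key, message, resolve_hash(exchange_hash)).digest()


print(HASH("sha256", b"abc").hex())
```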
- Key Derivation Functions ===========================
SCRAM forces the use of PBKDF2, and while this specification does not define the way in which credentials and passwords are hashed, it defines the formal way in which some of the most used functions should be represented.
The details of how a password was originally hashed by the server MUST be transmitted to the client in the Session Creation phase.
We will identify this as a KDF Specification and this document defines it to be a JSON structure containing at least the key function, whose value (case insensitive) uniquely identifies the Key Derivation Function to use and its parameters.
Additionally, the salt key (optional, but always specified in the three examples below) is defined to be some random data that was used as additional input to the function when hashing the password.
The KDF(password, salt) function used throughout this document identifies the key derivation function associated with the specification transmitted by the server during the Session Creation phase (this assumes that all key derivation functions require a salt).
Depending on the KDF algorithm, the KDF Specification JSON MAY include additional keys, and this specification outlines three such methods.
For PBKDF2 the formal definition of its KDF Specification is as follows:
- function: The string PBKDF2 (case insensitive).
- hash: The hashing function used to derive the key, as one of the hashing algorithms described in section 4.
- salt: The random data that was used as additional input to the function when hashing the password, encoded in BASE-64-URL.
- iterations: The number of iterations employed by the function.
- derived_key_length: The number of bytes of the derived key.
For example, the representation of the last test vector described in RFC-6070 would be:
"kdf_specification": {
"function": "PBKDF2",
"hash": "SHA1",
"salt": "c2EAbHQ",
"iterations": 4096,
"derived_key_length": 16
}
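To show how a client might turn such a KDF Specification into a derived key, here is a minimal sketch built on Python's `hashlib.pbkdf2_hmac`. The function name `kdf_pbkdf2` is hypothetical, and the conversion of the hash name to the `hashlib` spelling is an assumption of this example.

```python
import base64
import hashlib


def kdf_pbkdf2(password: str, spec: dict) -> bytes:
    """Derive a key from a PBKDF2 KDF Specification (illustrative only)."""
    if spec["function"].upper() != "PBKDF2":
        raise ValueError("not a PBKDF2 specification")
    # The salt is BASE-64-URL encoded without padding
    salt_text = spec["salt"]
    salt = base64.urlsafe_b64decode(salt_text + "=" * (-len(salt_text) % 4))
    # "SHA1" -> "sha1", "SHA3-256" -> "sha3_256"
    hash_name = spec["hash"].lower().replace("-", "_")
    return hashlib.pbkdf2_hmac(
        hash_name,
        password.encode("utf-8"),  # non-normalized UTF-8, as in section 2
        salt,
        spec["iterations"],
        spec["derived_key_length"],
    )


spec = {
    "function": "PBKDF2",
    "hash": "SHA1",
    "salt": "c2EAbHQ",
    "iterations": 4096,
    "derived_key_length": 16,
}
# Password of the last RFC-6070 test vector: "pass\0word"
print(kdf_pbkdf2("pass\0word", spec).hex())
```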
For SCrypt the formal definition of its KDF Specification is as follows:
- function: The string SCRYPT (case insensitive).
- hash: The hashing function used to derive the key, as one of the hashing algorithms described in section 4.
- salt: The random data that was used as additional input to the function when hashing the password, encoded in BASE-64-URL.
- cost: The CPU and memory cost parameter N.
- block_size: The block size parameter R.
- parallelization: The parallelization parameter P.
- derived_key_length: The number of bytes of the derived key.
For example, the representation of the last test vector described in its specification would be:
"kdf_specification": {
"function": "SCRYPT",
"hash": "SHA256",
"salt": "U29kaXVtQ2hsb3JpZGU",
"cost": 1048576,
"block_size": 8,
"parallelization": 1,
"derived_key_length": 64
}
For BCrypt the formal definition of its KDF Specification is as follows:
- function: The string BCRYPT (case insensitive).
- hash: (optional) The hashing function applied to the password before it is passed to the BCrypt function, as one of the hashing algorithms described in section 4. Defaults to no pre-hashing.
- salt: The random (128 bit long) data that was used as additional input to the function when hashing the password, encoded in BASE-64-URL.
- cost: A number representing BCrypt's cost parameter.
The hash parameter in this particular case is optional, but widely used in several implementations, as the BCrypt function limits its password input to either 56 or 72 bytes.
For example:
"kdf_specification": {
"function": "BCRYPT",
"salt": "st3dXjLkbOzhbPWFxDvf9g",
"cost": 10
}
- Digital Signatures and Encryption ====================================
SCRAM relies on channel binding for additional verification of the server (in other words, the whole exchange can be verified using the SSL exchange of the transport).
In HTTPS this might be unrealistic, as API requests might be proxied (think for example of CDN deployments, load balancers, reverse proxies, ...) and the server performing authentication might not have access to the private key securing the connection to the client.
This specification, therefore, relies on the proposed standard for JSON Web Signatures and JSON Web Encryption in order to validate (or encrypt) the exchange between server and clients.
Servers implementing this specification SHOULD always sign the required values using a proper alg from the JSON Web Algorithms specification under section 3.1 (for example RS256, or ES256), while clients MAY use the none algorithm to identify that they are not configured to sign the values exchanged (see Unsecured JWS).
With regard to encryption, this specification does not describe how encryption keys can be exchanged between server and clients. Such an exchange is out of the scope of this API.
The typ parameter in the scope of this specification shall always be the string json or application/json, representing the nature of the payload that will be signed.
// TODO
Finally, the kid parameter shall represent the SHA1 fingerprint of the X.509 (DER) encoding of the public key which can be used to verify the JWS.
The payload to be signed shall always be a JSON object, and its UTF-8 encoding MUST be used as the binary input for signature generation.
- Server Configuration =======================
It is assumed that the server providing authentication is configured with the following data:
- exchange_hash: The hashing algorithm that clients should use when authenticating, as outlined in section 4.1.
- shared_key: A sequence of bytes used by the clients to authenticate themselves; this document recommends a random sequence of at least the same number of bytes as produced by the exchange_hash function. The SCRAM specification defines this to be the hard-coded constant Client Key.
- signing_key: A sequence of bytes used by the clients to validate responses from the server; this document recommends a random sequence of at least the same number of bytes as produced by the exchange_hash function. The SCRAM specification defines this to be the hard-coded constant Server Key.
- private_key and public_key: A keypair used to sign the response by the server, as outlined in section 6.
Please note that exchange_hash and shared_key are transmitted by the server to the client through this API.
The signing_key is never transmitted by the API, but SHOULD be configured in clients to perform server proof verification.
In the same way, the public_key used to validate signatures SHOULD also be configured into clients. The server only transmits its SHA1 fingerprint, as outlined in section 6.
The private_key should never be exposed beyond the realms of the server.
- Password Storage ===================
In order to protect the original plain-text passwords, the server MUST only store cryptographically secure values derived from them.
In addition to this, as the result of the Key Derivation Function over the plain-text password could be used as a password equivalent in the SCRAM negotiation, the server SHOULD NOT store this value directly, but rather only keep values derived from it.
Assuming that the server is configured to use specific hashing and key derivation functions, and has generated a random salt value, it should proceed as follows:
salted_password := KDF ( password, salt )
client_key := HMAC ( salted_password, shared_key )
stored_key := HASH ( client_key )
server_key := HMAC ( salted_password, signing_key )
The server can then simply store the stored_key, which will be sufficient for authentication of clients, and the server_key, which will be used to prove to clients its knowledge of the original password.
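The storage steps above can be sketched in Python as follows. This is an illustration under stated assumptions: it picks PBKDF2 as the KDF, uses the same algorithm for the KDF and for the exchange hash, and the function name `derive_stored_values` and the iteration count are hypothetical choices, not requirements of this specification.

```python
import hashlib
import hmac
import os


def derive_stored_values(password, salt, shared_key, signing_key,
                         exchange_hash="sha256", iterations=4096):
    """Return (stored_key, server_key); the salted password is discarded."""
    # salted_password := KDF ( password, salt )  (PBKDF2 assumed here)
    salted_password = hashlib.pbkdf2_hmac(
        exchange_hash, password.encode("utf-8"), salt, iterations)
    # client_key := HMAC ( salted_password, shared_key )
    client_key = hmac.new(salted_password, shared_key, exchange_hash).digest()
    # stored_key := HASH ( client_key )
    stored_key = hashlib.new(exchange_hash, client_key).digest()
    # server_key := HMAC ( salted_password, signing_key )
    server_key = hmac.new(salted_password, signing_key, exchange_hash).digest()
    return stored_key, server_key


# Hypothetical server configuration and a fresh random salt
shared_key, signing_key = os.urandom(32), os.urandom(32)
stored_key, server_key = derive_stored_values(
    "correct horse", os.urandom(16), shared_key, signing_key)
print(len(stored_key), len(server_key))
```

Only the two returned values (plus the salt and KDF parameters) would need to be persisted; neither the password nor the salted password is kept.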
- API Overview ===============
The first HTTP request and response pair (later called Session Creation) is the equivalent of the client-first and server-first message exchange in SCRAM.
The second HTTP request and response pair (later called Session Authentication) is the equivalent of the client-final and server-final message exchange in SCRAM.
During this first interaction, the client informs the server of its desire to perform an authentication operation by sending a POST to a well-known URL, transmitting the following:
- version: Always 1, the version number of this specification.
- request: A JWS or JWE structure enclosing and signing or encrypting a JSON object containing:
  - user: A unique identifier for the user to be authenticated as outlined in section 2.
  - client_nonce: A random sequence of at least 32 bytes generated by the client that will be used to derive a key protecting the exchange, encoded in BASE-64-URL.
  - x-...: (optional) Any extra information the client needs to transfer to the server in order to perform authentication, as described in section 11.
The response from the server MUST be one of:
- 201 Created: A session was created by the server and the client shall continue attempting authentication using the URL specified in the Location header of the response. This is known as the Session URL.
- 400 Bad Request: A required parameter was not specified or was not valid.
- 401 Unauthorized: The server failed to verify the signature of the JWS or decrypt the JWE request token.
- 405 Method Not Allowed: The HTTP method was not POST.
- 503 Service Unavailable: The server is rejecting the session creation operation, for example when rate limiting is in place. A Retry-After header MAY be included in the response to inform the client of such time restrictions.
The body of a successful 201 Created response includes the following:
- version: Always 1, the version number of this specification.
- response: A JWS or JWE structure enclosing and signing or encrypting a JSON object containing:
  - exchange_hash: The hashing function to be used, outlined in section 4.1.
  - kdf_specification: The specification for a one-way function that hashes a password as typed by an end user (a Key Derivation Function) as outlined in section 5.
  - server_nonce: A random sequence of at least the same number of bytes produced by the exchange_hash function (or 32 bytes, whichever is greater) generated by the server that will be used to derive a key protecting the exchange, encoded in BASE-64-URL.
  - shared_key: The key used by the server to protect the stored information, outlined in section 7, encoded in BASE-64-URL.
  - require_otp: (optional) A boolean value (defaults to false) indicating that the server requires the additional validation of a one-time password as detailed in section 10. If false or not present, no such requirement exists.
  - x-...: (optional) Any extra information the server needs to transfer to the client in order to perform authentication, as described in section 11.
The server SHOULD issue a 400 Bad Request response only in the following scenarios:
- The version was not specified, or was not 1.
- The request token was not a valid JWS or JWE.
- The user was not specified, or was an empty string.
- The client_nonce was not specified, or was not properly encoded in BASE-64-URL, or it was less than 32 bytes.
In order not to disclose potentially sensitive information, the server SHOULD respond with a 201 Created response (with placeholder or invalid data) if the user was not recognized by the server.
A client Session Creation request will look like:
POST /login HTTP/1.1
Content-Type: application/json; charset=UTF-8
{
"version": 1,
"request": "...(the JWS/JWE structure for session creation request)..."
}
The content enclosed and signed by/encrypted in the request
JWS/JWE:
{
"user": "...(the user identifier)...",
"client_nonce": "...(at least 32 bytes encoded in BASE-64-URL)...",
"x-...": "...(any additional information to be transmitted to the server)..."
}
A valid response from the server:
HTTP/1.1 201 Created
Location: /login/session/...(the unique identifier of the session)...
Content-Type: application/json; charset=UTF-8
{
"version": 1,
"response": "...(the JWS/JWE structure for session creation response)..."
}
The content enclosed and signed by/encrypted in the response
JWS/JWE:
{
"exchange_hash": "...(the hash to use for validation)...",
"kdf_specification": {
"function": "...(the kdf function used to hash the password)...",
"salt": "...(the salt used to hash the password, if needed)...",
"...": "...(any other parameter to drive key derivation)..."
},
"server_nonce": "...(at least N bytes encoded in BASE-64-URL)...",
"shared_key": "...(the shared_key known by the server, encoded in BASE-64-URL)...",
"x-...": "...(any additional information to be transmitted to the client)..."
}
Once a session is initialized, and the client has retrieved the required exchange_hash, server_nonce and kdf_specification, and obtained a password from the user, the client should compute the following values:
# The "user" and "client_nonce" were transmitted during session creation
# The "server_nonce" was included in the server's reply to session creation
auth_message := CONCAT ( user, client_nonce, server_nonce )
# The resulting hashed password, applying a key derivation function
salted_password := KDF ( password, salt )
# The client key (and derived stored key) the server will use to authenticate
client_key := HMAC ( salted_password, shared_key )
derived_stored_key := HASH ( client_key )
# Per-session masking of the derived stored key
client_signature := HMAC ( derived_stored_key, auth_message )
client_proof := XOR ( client_key, client_signature )
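A minimal Python sketch of the client-side computation above follows. It assumes the salted_password has already been derived through the negotiated KDF, that the user is fed into CONCAT as its UTF-8 encoding, and that the nonces are used as raw (decoded) bytes; these encodings, like the function name `compute_client_proof`, are assumptions of this example.

```python
import hashlib
import hmac


def compute_client_proof(salted_password, shared_key, user,
                         client_nonce, server_nonce,
                         exchange_hash="sha256"):
    """Compute the client_proof to send in Session Authentication."""
    # auth_message := CONCAT ( user, client_nonce, server_nonce )
    auth_message = user.encode("utf-8") + client_nonce + server_nonce
    # client_key := HMAC ( salted_password, shared_key )
    client_key = hmac.new(salted_password, shared_key, exchange_hash).digest()
    # derived_stored_key := HASH ( client_key )
    derived_stored_key = hashlib.new(exchange_hash, client_key).digest()
    # client_signature := HMAC ( derived_stored_key, auth_message )
    client_signature = hmac.new(
        derived_stored_key, auth_message, exchange_hash).digest()
    # client_proof := XOR ( client_key, client_signature )
    return bytes(a ^ b for a, b in zip(client_key, client_signature))


proof = compute_client_proof(
    b"\x01" * 32, b"\x02" * 32, "user@example.org", b"c" * 32, b"s" * 32)
print(len(proof))  # 32
```

Both XOR operands are digests of the same exchange hash, so they always have equal length, as XOR requires.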
The client_proof is the value that will need to be transmitted to the server, and to do so the client will prepare a JSON object containing:
- version: Always 1, the version number of this specification.
- request: A JWS or JWE structure enclosing and signing or encrypting a JSON object containing:
  - user: The same identifier specified in the Session Creation phase.
  - client_nonce: The same value specified in the Session Creation phase.
  - server_nonce: The same server_nonce value as received from the server in the Session Creation phase.
  - client_proof: The proof derived by the client from the original (or salted) password, encoded in BASE-64-URL.
  - client_otp_proof: (optional) The proof derived from a one-time password, as outlined in section 10.
  - x-...: (optional) Any extra information the client needs to transfer to the server in order to perform authentication, as described in section 11, including all the extra keys that were specified in the Session Creation phase.
The server SHOULD validate the appropriateness of the Session URL at which the request was received (see section 9.4 below), and the correctness of the request JWS or JWE structure specified by the client.
After retrieving the stored_key from its underlying password storage as outlined in section 8, it should compute the following in order to authenticate the session:
# The "user" and "client_nonce" were received during session creation
# The "server_nonce" was included in the server's reply to session creation
auth_message := CONCAT ( user, client_nonce, server_nonce )
# The "stored_key" is known by the server
server_signature := HMAC ( stored_key, auth_message )
# The "client_proof" was received from the client in session authentication
derived_client_key := XOR ( client_proof, server_signature )
# The derived stored key to match for authentication
derived_stored_key := HASH ( derived_client_key )
If the calculated derived_stored_key exactly matches the stored_key known by the server, the server can conclude that the client correctly derived (or otherwise knew) the salted_password associated with the user.
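The server-side check described above can be sketched as follows. The function name `verify_client_proof` is hypothetical, and the constant-time comparison via `hmac.compare_digest` is a precaution added by this example rather than something the text mandates.

```python
import hashlib
import hmac


def verify_client_proof(stored_key, client_proof, auth_message,
                        exchange_hash="sha256"):
    """Return True if client_proof matches the stored_key for this session."""
    # server_signature := HMAC ( stored_key, auth_message )
    server_signature = hmac.new(
        stored_key, auth_message, exchange_hash).digest()
    # XOR requires operands of equal length
    if len(client_proof) != len(server_signature):
        return False
    # derived_client_key := XOR ( client_proof, server_signature )
    derived_client_key = bytes(
        a ^ b for a, b in zip(client_proof, server_signature))
    # derived_stored_key := HASH ( derived_client_key )
    derived_stored_key = hashlib.new(
        exchange_hash, derived_client_key).digest()
    # Compare without leaking timing information
    return hmac.compare_digest(derived_stored_key, stored_key)
```

A proof of the wrong length is rejected up front, since XOR is only defined for byte sequences of equal length.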
If the authentication is successful, the server MAY need to compute a server_proof, which the client might require in order to trust that the server had (at one point) access to the same salted password the client calculated. It therefore retrieves the server_key outlined in section 8 and calculates:
server_proof := HMAC ( server_key, auth_message )
At this point, the server prepares a response, which can be one of:
- 200 Ok: The session was authenticated by the server.
- 400 Bad Request: A required parameter was not specified or was not valid.
- 401 Unauthorized: The session could not be authenticated, either because of an invalid JWS signature/JWE encryption, or an invalid client_proof, or because the Session URL was unknown to (or could not be verified by) the server.
The body of a successful 200 Ok response will be a JSON object including:
- version: Always 1, the version number of this specification.
- response: A JWS or JWE structure enclosing and signing or encrypting a JSON object containing:
  - server_proof: (optional) The calculated proof informing the client that the server had (at one point) access to the salted password.
  - server_otp_proof: (optional) The calculated proof informing the client that the server can generate the same one-time password as the client.
  - x-...: (optional) Any extra information the server needs to transfer to the client, as described in section 11.
The server only needs to transmit the server_proof and server_otp_proof if it knows the client requires them (for example, after validating the request's JWS signature).
The server SHOULD issue a 400 Bad Request response only in the following scenarios:
- The version was not specified, or was not 1.
- The request token was not a valid JWS or JWE.
- The user key was not specified, or its value was an empty string.
- One of the required client_nonce, server_nonce, or client_proof parameters was not specified, or was not properly encoded in BASE-64-URL.
A 404 Not Found SHOULD NOT be used as a response for invalid sessions, as such a response could be used to harvest sensitive information from the server.
A client Session Authentication request will look like:
POST /login/session/...(the unique identifier of the session)... HTTP/1.1
Content-Type: application/json; charset=UTF-8
{
"version": 1,
"request": "...(the JWS/JWE structure for session authentication request)..."
}
The content enclosed and signed by/encrypted in the request
JWS/JWE:
{
"user": "...(the user identifier from session creation)...",
"client_nonce": "...(the client nonce from session creation)...",
"server_nonce": "...(the same nonce received from the server)...",
"client_proof": "...(the proof calculated by the client in BASE-64-URL)...",
"client_otp_proof": "...(the optional client one time password proof)...",
"x-...": "...(any additional information to be transmitted to the server)..."
}
A valid response from the server:
HTTP/1.1 200 Ok
Content-Type: application/json; charset=UTF-8
{
"version": 1,
"response": "...(the JWS/JWE structure for session authentication response)..."
}
The content enclosed and signed by/encrypted in the response
JWS/JWE:
{
"server_proof": "...(the proof calculated by the server in BASE-64-URL)...",
"server_otp_proof": "...(the one time password proof calculated by the server)...",
"x-...": "...(any additional information to be transmitted to the client)..."
}
Clients willing to do so MAY verify the authenticity of the server_proof transmitted by the server.
This step requires the client's knowledge of the signing_key, which is not transmitted by this API but should be exchanged via a different channel.
The client can generate a derived_server_proof using said signing_key and the salted_password calculated in section 9.2:
derived_server_key := HMAC ( salted_password, signing_key )
derived_server_proof := HMAC ( derived_server_key, auth_message )
If the derived_server_proof matches the server_proof received from the server, the client can be sure that at one point the server had access to the calculated salted_password and signing_key.
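For clients that are configured with the signing_key, the check above might be sketched like this. The function name `verify_server_proof` is hypothetical, and the constant-time comparison is a choice made by this example.

```python
import hmac


def verify_server_proof(salted_password, signing_key, auth_message,
                        server_proof, exchange_hash="sha256"):
    """Return True if server_proof is consistent with the client's values."""
    # derived_server_key := HMAC ( salted_password, signing_key )
    derived_server_key = hmac.new(
        salted_password, signing_key, exchange_hash).digest()
    # derived_server_proof := HMAC ( derived_server_key, auth_message )
    derived_server_proof = hmac.new(
        derived_server_key, auth_message, exchange_hash).digest()
    return hmac.compare_digest(derived_server_proof, server_proof)
```

Here auth_message is the same byte sequence the client built when computing its own client_proof.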
Please note that, in order to improve security, the Session URLs returned by the server SHOULD NOT be predictable (in other words, they should not be derived from a timestamp or a sequential counter).
Session URLs SHOULD also be valid only for a restricted amount of time, after which they MUST be considered invalid.
Server implementations MAY choose to return Session URLs containing a verifiable fingerprint of the exchange, as all the keys transmitted by the client during this phase, and the returned server_nonce, will be retransmitted verbatim during the Session Authentication phase.
If servers choose to use such an approach, they should also employ a way to time out sessions after a pre-determined amount of time and to blacklist sessions at once after a successful (or failed) authentication is performed against them.
For example:
session_secret := ... a secret known only to the server...
session_expiration := ... a timestamp when the session expires...
# The "user", "client_nonce" and "server_nonce" will be returned by the client,
# while "session_expiration" will be encoded in the session URL below
session_id := CONCAT ( user, client_nonce, server_nonce, session_expiration )
session_signature := HMAC ( session_secret, session_id )
# Combine the values to return a URL
session_url := CONCAT ( "/login/session/", session_expiration, ".", BASE-64-URL(session_signature) )
The server could trivially verify the validity of such a Session URL and would only need to blacklist the session_signature value (for example in a cache) until its expiration.
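The scheme above can be sketched in Python as follows. Several details are assumptions of this example: HMAC-SHA256 for the signature, the expiration expressed as an integer number of seconds, the UTF-8 encoding of the user, and the function names `make_session_url` and `verify_session_url`. The blacklist of already-used signatures is deliberately left out.

```python
import base64
import hashlib
import hmac


def make_session_url(session_secret, user, client_nonce, server_nonce,
                     session_expiration):
    """Build a Session URL carrying its expiration and an HMAC fingerprint."""
    expiration = str(session_expiration)
    session_id = (user.encode("utf-8") + client_nonce + server_nonce
                  + expiration.encode("ascii"))
    signature = hmac.new(session_secret, session_id, hashlib.sha256).digest()
    encoded = base64.urlsafe_b64encode(signature).rstrip(b"=").decode("ascii")
    return "/login/session/" + expiration + "." + encoded


def verify_session_url(url, session_secret, user, client_nonce, server_nonce,
                       now):
    """Check expiration, then recompute the URL and compare it."""
    token = url.rsplit("/", 1)[-1]
    expiration_text = token.partition(".")[0]
    try:
        session_expiration = int(expiration_text)
    except ValueError:
        return False
    if session_expiration < now:
        return False  # the session timed out
    expected = make_session_url(
        session_secret, user, client_nonce, server_nonce, session_expiration)
    return hmac.compare_digest(expected.encode("utf-8"), url.encode("utf-8"))
```

Because the user and both nonces are retransmitted in the Session Authentication request, a server following this approach would not need to keep per-session state other than the blacklist of consumed signatures.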
- One-Time Password Support =============================
When a server requires the additional verification of a one-time password, by specifying a true (boolean) value for the require_otp key of the Session Creation response, the client should obtain such a value.
This specification defines the otp_password as the UTF-8 encoding of the string produced by the TOTP or HOTP algorithms (that is, a sequence of digits of a pre-agreed length, derived from a shared secret).
Before sending its Session Authentication request, the client computes the following values:
# The client key (and derived stored key) the server will use to authenticate
client_otp_key := HMAC ( otp_password, shared_key )
# Per-session masking of the derived stored key
client_otp_signature := HMAC ( client_otp_key, auth_message )
client_otp_proof := XOR ( client_otp_key, client_otp_signature )
This mirrors the computation of the client_proof, with the otp_password taking the place of the salted_password (note that, as shown above, no HASH step is applied to the client_otp_key).
The client_otp_proof, encoded in BASE-64-URL, is then added to the JSON enclosed in the request JWS or JWE as previously described in section 9.2.
The server then determines the original value of the one-time password using the same mechanism and secret used by the end user, generating one (or multiple) possible values.
Multiple values MAY be generated by the server in order to accommodate time drift (in case of TOTP) or counter drift (in case of HOTP).
For each of those values (otp_password below) it then computes the following:
# Performed for each otp password, the shared key is the same used for passwords
server_otp_key := HMAC ( otp_password, shared_key )
server_otp_signature := HMAC ( server_otp_key, auth_message )
derived_client_otp_key := XOR ( client_otp_proof, server_otp_signature )
If one of the computed derived_client_otp_key values matches the corresponding server_otp_key, then the server can be sure that the client had access to the original otp_password.
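The server-side matching over several candidate one-time passwords can be sketched as below. How the candidates are produced (TOTP time steps or HOTP counter look-ahead) is outside this example; the function name `match_otp` and the use of a constant-time comparison are assumptions.

```python
import hmac


def match_otp(candidates, shared_key, auth_message, client_otp_proof,
              exchange_hash="sha256"):
    """Return the candidate otp_password matching the proof, or None."""
    for otp_password in candidates:
        # server_otp_key := HMAC ( otp_password, shared_key )
        server_otp_key = hmac.new(
            otp_password.encode("utf-8"), shared_key, exchange_hash).digest()
        # server_otp_signature := HMAC ( server_otp_key, auth_message )
        server_otp_signature = hmac.new(
            server_otp_key, auth_message, exchange_hash).digest()
        # XOR requires operands of equal length
        if len(client_otp_proof) != len(server_otp_signature):
            return None
        # derived_client_otp_key := XOR ( client_otp_proof, signature )
        derived_client_otp_key = bytes(
            a ^ b for a, b in zip(client_otp_proof, server_otp_signature))
        if hmac.compare_digest(derived_client_otp_key, server_otp_key):
            return otp_password
    return None
```

Returning the matching candidate lets the caller reuse it when computing the server_otp_proof.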
With regard to the server_otp_proof, once the server has found the correct otp_password among the multiple ones it may have generated, the proof is calculated as follows:
signed_otp_key := HMAC ( otp_password, signing_key )
server_otp_proof := HMAC ( signed_otp_key, auth_message )
Any client configured with the signing_key as outlined previously in section 9.3 can apply the same calculation and validate that the server_otp_proof received from the server matches.
- Non-Standard Extensions ===========================
As outlined above for the Session Creation and Session Authentication phases, request and response entities MAY include additional x-... prefixed keys.
While this specification does not determine what kind of information may be exchanged in those values, it permits their presence as non-standard extensions.
Client and server implementations are free to exchange any additional piece of information as long as the JSON keys identifying them in the request and response entities are prefixed by x-.