liliakai edited this page Mar 24, 2013 · 10 revisions

The TextSecure encrypted messaging protocol is a derivative of "OTR Messaging." The major differences are the use of ECC keys instead of standard DSA and the compression of some data structure formats.

Common Structures

struct {
  opaque version[1];
} TextSecure_Version;

struct {
  opaque key_id[3];
  opaque key[33];
} TextSecure_Key;
  • "version" is an 8-bit value that carries both the version of the current message and the maximum protocol version the sending client supports. The top 4 bits represent the version of the message that follows, and the bottom 4 bits represent the maximum protocol version the sending client speaks.
  • "key_id" is the key ID number.
  • "key" is an ECC key from the P-256 curve, encoded down to 33 bytes using point compression.
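
As an illustration of the "version" layout described above, here is a minimal Python sketch that packs and unpacks the two 4-bit halves. The function names are hypothetical and not part of the protocol.

```python
from typing import Tuple


def pack_version(current: int, max_supported: int) -> bytes:
    """Pack two 4-bit version numbers into the single version byte."""
    if not (0 <= current <= 0x0F and 0 <= max_supported <= 0x0F):
        raise ValueError("each version must fit in 4 bits")
    # Top 4 bits: version of this message. Bottom 4 bits: maximum supported.
    return bytes([(current << 4) | max_supported])


def unpack_version(version: bytes) -> Tuple[int, int]:
    """Return (current, max_supported) from the one-byte version field."""
    if len(version) != 1:
        raise ValueError("the version field is exactly one byte")
    return version[0] >> 4, version[0] & 0x0F
```

For example, a client speaking version 1 that also supports version 2 would send the byte 0x12.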

Key Exchange Message Format

struct {
  TextSecure_Version version;
  TextSecure_Key key;
} TextSecure_KeyExchange;

Data Message Format

struct {
  TextSecure_Version version;
  opaque sender_key_id[3];
  opaque recipient_key_id[3];
  TextSecure_Key next_key;
  opaque counter[3];
  opaque encrypted_message[...];
  opaque mac[10];
} TextSecure_Message; 
  • "sender_key_id" The ID of the sender's asymmetric key being used for this message.
  • "recipient_key_id" The ID of the recipient's asymmetric key being used for this message.
  • "next_key" An ID and key for the sender's next public key.
  • "counter" The top half of the counter to use for CTR mode encryption of this message. This value should increase and be unique for each (sender key ID, recipient key ID) tuple used.
  • "encrypted_message" The contents of the message, encrypted using AES-128 in CTR mode. The plaintext is always padded out with 0x00 bytes such that the total packet is an exact multiple of the maximum SMS size.
  • "mac" HMAC-SHA1 of all the previous fields, truncated to 10 bytes. Note that this is the encrypt-then-authenticate construction: the MAC is computed over the ciphertext, not the plaintext.
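
The "mac" field can be sketched in Python as follows. This is an illustration only: it assumes that truncation keeps the first 10 bytes of the HMAC-SHA1 output, which the text above does not state, and the function names are hypothetical.

```python
import hashlib
import hmac

MAC_LENGTH = 10  # bytes kept from the 20-byte HMAC-SHA1 output


def compute_mac(mac_key: bytes, authenticated_fields: bytes) -> bytes:
    """HMAC-SHA1 over every field preceding the MAC, truncated to 10 bytes.

    Assumption: the truncation keeps the leading bytes of the digest.
    """
    digest = hmac.new(mac_key, authenticated_fields, hashlib.sha1).digest()
    return digest[:MAC_LENGTH]


def append_mac(mac_key: bytes, message_without_mac: bytes) -> bytes:
    """Return the serialized message followed by its truncated MAC."""
    return message_without_mac + compute_mac(mac_key, message_without_mac)


def verify_and_strip_mac(mac_key: bytes, message: bytes) -> bytes:
    """Check the trailing MAC and return the fields that precede it."""
    if len(message) <= MAC_LENGTH:
        raise ValueError("message too short to contain a MAC")
    body, received = message[:-MAC_LENGTH], message[-MAC_LENGTH:]
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(received, compute_mac(mac_key, body)):
        raise ValueError("MAC check failed")
    return body
```

Verifying before decrypting means a tampered message is rejected without its ciphertext ever being processed.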

Transport Format

It is not possible to rely on the multipart message facility provided by the SMS UDH in order to send long messages. It does not work on CDMA networks, so we have to provide for message chaining ourselves. Messages are fragmented using the following format:

struct {
  opaque identifier[3];
  TextSecure_Version version;
  opaque fragment_count[1];
  opaque multipart_message_id[1]; (optional)
  TextSecure_Message message;
} TextSecure_Fragment;
  • "identifier" Everything following this field is appended to an identifier (the string "?TSK" for a key exchange message or the string "?TSM" for a data message) and hashed iteratively with SHA1 for 1000 iterations. The result is then truncated to 3 bytes.
  • "fragment_count" The top 4 bits represent the index of this fragment (0-based) and the bottom 4 bits represent the total number of fragments.
  • "multipart_message_id" This is a unique identifier to group message fragments together. It is only present if there is more than one message fragment in total.
  • "message" The TextSecure_Message, fragmented at the SMS size boundary. In order to avoid wasting space in the first fragment, when the complete encrypted and MAC'd application-level message is handed down to the transport layer, the version number is stripped out of the TextSecure_Message and put on the front of each transport-level fragment piece. Once the fragments are re-assembled, the version number is put back on the front of the TextSecure_Message and passed along to the application-layer as a fully reconstructed message.
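
The "identifier" and "fragment_count" fields described above can be sketched in Python. The sketch makes several assumptions that the text does not settle: the type string is prepended to the rest of the fragment, each iteration hashes the previous digest, truncation keeps the first 3 bytes, and the caller decides whether the input is raw or already encoded. The function names are hypothetical.

```python
import hashlib
from typing import Tuple


def compute_identifier(prefix: bytes, rest_of_fragment: bytes,
                       iterations: int = 1000) -> bytes:
    """Iterated SHA-1 over prefix + fragment contents, truncated to 3 bytes.

    Assumption: round 1 hashes prefix || contents, and each later round
    hashes the previous 20-byte digest.
    """
    digest = prefix + rest_of_fragment
    for _ in range(iterations):
        digest = hashlib.sha1(digest).digest()
    return digest[:3]


def pack_fragment_count(index: int, total: int) -> bytes:
    """Top 4 bits: 0-based fragment index. Bottom 4 bits: total fragments."""
    if not (1 <= total <= 0x0F and 0 <= index < total):
        raise ValueError("index and total must fit in 4 bits, index < total")
    return bytes([(index << 4) | total])


def unpack_fragment_count(field: bytes) -> Tuple[int, int]:
    """Return (index, total) from the one-byte fragment_count field."""
    if len(field) != 1:
        raise ValueError("fragment_count is exactly one byte")
    return field[0] >> 4, field[0] & 0x0F
```

Because each half of "fragment_count" is 4 bits wide, this layout allows at most 15 fragments per message.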

The entire TextSecure_Fragment is Base64 encoded, without trailing padding.
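
A minimal Python sketch of Base64 encoding without trailing padding follows. It assumes the standard Base64 alphabet, which the text does not specify, and the function names are hypothetical.

```python
import base64


def encode_fragment(raw: bytes) -> str:
    """Base64-encode a fragment and drop the trailing '=' padding."""
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def decode_fragment(text: str) -> bytes:
    """Restore the padding that was stripped, then Base64-decode."""
    padding = "=" * (-len(text) % 4)
    return base64.b64decode(text + padding)
```

Dropping the padding saves up to two characters per fragment; the receiver can always restore it because the encoded length determines how many '=' characters were removed.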

Sending A Message

When composing a message, the complete data message (with headers added, encrypted, and MAC'd) is handed to the transport layer, which then splits it into fragments small enough to fit into individual SMS messages. Once the receiving end has received all the fragments, it reassembles them into one message again before verifying the MAC and decrypting.

Some knowledge of the transport layer sizing requirements is needed by the application layer, since it is responsible for padding the message out. What is handed to the transport layer should be sized such that when split into fragments and fully Base64 encoded, it will exactly fill the maximum available payload space in each SMS message (140 bytes, or 160 7-bit characters).

In the end, there are 60 characters available in the first fragment, and 115 in each subsequent fragment.

Verifying And Decrypting A Message

A shared secret is generated by doing ECDH using the keys corresponding to the key ids in the message. All of the ECC parameters are chosen from the NSA Suite B specification.

From the shared secret (S0) we need two 128 bit AES keys (C1, C2) and two 160 bit HMAC-SHA1 keys (H1, H2).

Each client determines whether it is the "low" client or the "high" client by comparing the numerical values of their two public keys.

The two AES keys are computed by taking bytes from SHA256(S0 + 0x00) and SHA256(S0 + 0x01). The "low" client uses the former as its send key, whereas the "high" client uses the latter as its send key. The receive keys are set vice versa.

The MAC keys are H1 = SHA1(C1) and H2 = SHA1(C2).
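
The derivation above can be sketched in Python. Several details are assumptions rather than statements from this page: "+" is read as byte concatenation, each AES key is the first 16 bytes of its SHA-256 digest, the "low" client is the one whose encoded public key is the smaller big-endian integer, and each direction's MAC key is the SHA-1 of that direction's AES key. The names SessionKeys and derive_session_keys are hypothetical.

```python
import hashlib
from typing import NamedTuple


class SessionKeys(NamedTuple):
    send_cipher_key: bytes  # 128-bit AES key for outgoing messages
    recv_cipher_key: bytes  # 128-bit AES key for incoming messages
    send_mac_key: bytes     # 160-bit HMAC-SHA1 key for outgoing messages
    recv_mac_key: bytes     # 160-bit HMAC-SHA1 key for incoming messages


def derive_session_keys(shared_secret: bytes, our_public_key: bytes,
                        their_public_key: bytes) -> SessionKeys:
    """Derive send/receive AES and MAC keys from the ECDH shared secret S0."""
    ours = int.from_bytes(our_public_key, "big")
    theirs = int.from_bytes(their_public_key, "big")
    if ours == theirs:
        raise ValueError("public keys must differ to assign low/high roles")

    # Assumption: the AES key is the first 16 bytes of each digest.
    low_send_key = hashlib.sha256(shared_secret + b"\x00").digest()[:16]
    high_send_key = hashlib.sha256(shared_secret + b"\x01").digest()[:16]

    if ours < theirs:  # we are the "low" client
        send_key, recv_key = low_send_key, high_send_key
    else:              # we are the "high" client
        send_key, recv_key = high_send_key, low_send_key

    return SessionKeys(
        send_cipher_key=send_key,
        recv_cipher_key=recv_key,
        send_mac_key=hashlib.sha1(send_key).digest(),
        recv_mac_key=hashlib.sha1(recv_key).digest(),
    )
```

Both clients run the same function with the key arguments swapped, so one side's send keys equal the other side's receive keys without any further negotiation.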

Starting And Verifying A Session

Either client can begin a session by sending a key exchange message. When keys have been exchanged in both directions, a SHA1 fingerprint of each entire key exchange message is displayed, which users can verify over the phone.

A session rolls forward just like an OTR session, where only two keys for the sender and recipient are maintained at any time. Keys previous to that are forgotten.

Local Encryption

Should clients choose to have "non-ephemeral" conversations, which is to say that the messages they send and receive are stored to disk, they need to be re-encrypted since the original keys used to encrypt them may vanish as the protocol rolls forward.

Clients generate a 128-bit AES key and a 160-bit MAC key at install time, which are hashed and written to disk using PBKDF2.

Locally encrypted messages are then encrypted using AES-CBC$ with HMAC-SHA1 in the encrypt-then-authenticate paradigm. So the format is:

struct {
  opaque random_iv[16];
  opaque encrypted_message[...];
  opaque mac[20];
} TextSecure_LocalMessage;
  • "random_iv" is a random IV.
  • "encrypted_message" is the ciphertext of AES-CBC$(plaintext).
  • "mac" is HMAC-SHA1 of the ciphertext and IV.
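
The framing and authentication of a TextSecure_LocalMessage can be sketched in Python. The AES-CBC step itself is left out because it needs a cipher library, so the functions below take and return ciphertext. The sketch assumes the MAC input is the IV followed by the ciphertext, in struct order, which the text does not specify; the function names are hypothetical.

```python
import hashlib
import hmac
from typing import Tuple

IV_LENGTH = 16    # random_iv
MAC_LENGTH = 20   # full, untruncated HMAC-SHA1 output
BLOCK_SIZE = 16   # AES block size; CBC ciphertext is a multiple of this


def seal_local_message(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Frame an already-encrypted message as random_iv || ciphertext || mac."""
    if len(iv) != IV_LENGTH:
        raise ValueError("IV must be 16 bytes")
    if not ciphertext or len(ciphertext) % BLOCK_SIZE != 0:
        raise ValueError("CBC ciphertext must be a non-empty multiple of 16 bytes")
    # Assumption: the MAC covers IV || ciphertext, in that order.
    mac = hmac.new(mac_key, iv + ciphertext, hashlib.sha1).digest()
    return iv + ciphertext + mac


def open_local_message(mac_key: bytes, blob: bytes) -> Tuple[bytes, bytes]:
    """Verify the MAC and return (iv, ciphertext), ready for AES-CBC decryption."""
    if len(blob) < IV_LENGTH + BLOCK_SIZE + MAC_LENGTH:
        raise ValueError("local message too short")
    authenticated, received = blob[:-MAC_LENGTH], blob[-MAC_LENGTH:]
    expected = hmac.new(mac_key, authenticated, hashlib.sha1).digest()
    if not hmac.compare_digest(received, expected):
        raise ValueError("MAC check failed")
    return authenticated[:IV_LENGTH], authenticated[IV_LENGTH:]
```

As with the wire format, the MAC is checked before any decryption is attempted.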