Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

indexeddb: fix bugs in serialization improvement #3651

Merged
merged 7 commits into from
Jul 5, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion crates/matrix-sdk-indexeddb/CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
# UNRELEASED

- Improve the efficiency of objects stored in the crypto store. ([#3645](https://github.com/matrix-org/matrix-rust-sdk/pull/3645))
- Improve the efficiency of objects stored in the crypto store.
([#3645](https://github.com/matrix-org/matrix-rust-sdk/pull/3645), [#3651](https://github.com/matrix-org/matrix-rust-sdk/pull/3651))

- Add new method `IndexeddbCryptoStore::open_with_key`. ([#3423](https://github.com/matrix-org/matrix-rust-sdk/pull/3423))

Expand Down
145 changes: 102 additions & 43 deletions crates/matrix-sdk-indexeddb/src/crypto_store/indexeddb_serializer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ use base64::{
engine::{general_purpose, GeneralPurpose},
Engine,
};
use gloo_utils::format::JsValueSerdeExt;
use matrix_sdk_crypto::CryptoStoreError;
use matrix_sdk_store_encryption::{EncryptedValueBase64, StoreCipher};
use serde::{de::DeserializeOwned, Deserialize, Serialize};
Expand Down Expand Up @@ -177,7 +178,7 @@ impl IndexeddbSerializer {
// - `MaybeEncrypted::Encrypted` becomes a JS object with properties {`version`,
// `nonce`, `ciphertext`}.
//
// - `MaybeEncrypted::Unencrypted` becomes a JS string.
// - `MaybeEncrypted::Unencrypted` becomes a JS string containing base64 text.
//
// Otherwise, it probably uses our old serialization format:
//
Expand All @@ -186,30 +187,73 @@ impl IndexeddbSerializer {
// array of JSON bytes. Net result is a JS array.
//
// - Unencrypted values were serialized to JSON, then deserialized into a
// javascript object. (Hopefully not one that can be turned into a
// `MaybeEncrypted::Encrypted`.)
// javascript object/string/array/bool.
//
// Note that there are several potential ambiguities here:
//
// - A JS string could either be a legacy unencrypted value, or a
// `MaybeEncrypted::Unencrypted`. However, the only thing that actually got
// stored as a string under the legacy system was `backup_key_v1`, and that is
// special-cased not to use this path — so if we can convert it into a
// `MaybeEncrypted::Unencrypted`, then we assume it is one.
//
// - A JS array could be either a legacy encrypted value or a legacy unencrypted
// value. We can tell the difference by whether we have a `cipher`.
//
// - A JS object could be either a legacy unencrypted value or a
// `MaybeEncrypted::Encrypted`. We assume that no legacy JS objects have the
// properties to be successfully decoded into a `MaybeEncrypted::Encrypted`.

// First check if it looks like a `MaybeEncrypted`, of either type.
if let Ok(maybe_encrypted) = serde_wasm_bindgen::from_value(value.clone()) {
return Ok(self.maybe_decrypt_value(maybe_encrypted)?);
}

// Check for legacy encrypted format.
if let (true, Some(cipher)) = (value.is_array(), &self.store_cipher) {
// `value` is a JS-side array containing the byte values. Turn it into a
// rust-side Vec<u8>.
let value: Vec<u8> = serde_wasm_bindgen::from_value(value)?;
// Otherwise, fall back to the legacy deserializer.
self.deserialize_legacy_value(value)
}

return Ok(cipher.decrypt_value(&value).map_err(CryptoStoreError::backend)?);
}
/// Decode a value that was encoded with an old version of
/// `serialize_value`.
///
/// This should only be used on values from an old database which are known
/// to be serialized with the old format.
pub fn deserialize_legacy_value<T: DeserializeOwned>(
&self,
value: JsValue,
) -> Result<T, IndexeddbCryptoStoreError> {
match &self.store_cipher {
Some(cipher) => {
if !value.is_array() {
return Err(IndexeddbCryptoStoreError::CryptoStoreError(
CryptoStoreError::UnpicklingError,
));
}

// Looks like legacy encrypted format.
//
// `value` is a JS-side array containing the byte values. Turn it into a
// rust-side Vec<u8>, then decrypt.
let value: Vec<u8> = serde_wasm_bindgen::from_value(value)?;
Ok(cipher.decrypt_value(&value).map_err(CryptoStoreError::backend)?)
}

// Check for legacy unencrypted format
if value.is_object() && self.store_cipher.is_none() {
return Ok(serde_wasm_bindgen::from_value(value)?);
None => {
// Legacy unencrypted format could be just about anything; just try
// JSON-serializing the value, then deserializing it into the
// desired type.
//
// Note that the stored data was actually encoded by JSON-serializing it, and
// then deserializing the JSON into Javascript objects — so, for
// example, `HashMap`s are converted into Javascript Objects
// (whose keys are always strings) rather than Maps (whose keys
// can be other things). `serde_wasm_bindgen::from_value` will complain about
// such things. The correct thing to do is to go *back* to JSON
// and then deserialize into Rust again, which is what `JsValue::into_serde`
// does.
Ok(value.into_serde()?)
}
}

// Can't figure out what this is.
Err(IndexeddbCryptoStoreError::CryptoStoreError(CryptoStoreError::UnpicklingError))
}

/// Decode a value that was previously encoded with
Expand Down Expand Up @@ -250,8 +294,9 @@ impl IndexeddbSerializer {

#[cfg(all(test, target_arch = "wasm32"))]
mod tests {
use std::sync::Arc;
use std::{collections::BTreeMap, sync::Arc};

use gloo_utils::format::JsValueSerdeExt;
use matrix_sdk_store_encryption::StoreCipher;
use matrix_sdk_test::async_test;
use serde::{Deserialize, Serialize};
Expand Down Expand Up @@ -293,33 +338,26 @@ mod tests {
/// an old implementation of `serialize_value`, when a cipher is in use.
#[async_test]
async fn test_deserialize_old_serialized_value_with_cipher() {
// An example of an object which was serialized using the old-format
// `serialize_value`.
let serialized = [
123, 34, 118, 101, 114, 115, 105, 111, 110, 34, 58, 49, 44, 34, 99, 105, 112, 104, 101,
114, 116, 101, 120, 116, 34, 58, 91, 52, 53, 44, 49, 49, 54, 44, 49, 54, 57, 44, 49,
57, 49, 44, 49, 49, 54, 44, 57, 53, 44, 49, 50, 54, 44, 52, 51, 44, 54, 49, 44, 50, 52,
55, 44, 50, 50, 52, 44, 57, 54, 44, 49, 49, 51, 44, 49, 50, 50, 44, 49, 57, 50, 44, 52,
55, 44, 50, 51, 52, 44, 50, 52, 51, 44, 55, 44, 49, 55, 55, 44, 49, 57, 44, 50, 48, 57,
44, 50, 52, 50, 44, 49, 48, 56, 44, 49, 52, 54, 44, 49, 52, 44, 49, 52, 52, 44, 50, 54,
44, 50, 52, 56, 44, 49, 54, 55, 44, 49, 53, 44, 50, 51, 48, 44, 49, 52, 48, 44, 56, 51,
44, 50, 48, 52, 44, 51, 55, 44, 49, 56, 51, 44, 55, 50, 93, 44, 34, 110, 111, 110, 99,
101, 34, 58, 91, 49, 52, 48, 44, 49, 50, 56, 44, 50, 50, 52, 44, 50, 49, 52, 44, 56,
54, 44, 49, 53, 51, 44, 50, 50, 52, 44, 49, 55, 56, 44, 49, 54, 54, 44, 49, 48, 49, 44,
48, 44, 54, 54, 44, 49, 57, 55, 44, 53, 55, 44, 50, 57, 44, 50, 52, 51, 44, 57, 53, 44,
50, 49, 53, 44, 56, 54, 44, 50, 48, 54, 44, 54, 53, 44, 49, 53, 54, 44, 49, 49, 57, 44,
53, 57, 93, 125,
]
.into_iter()
.map(JsValue::from)
.collect::<js_sys::Array>()
.into();

let serializer = IndexeddbSerializer::new(Some(Arc::new(test_cipher())));
let cipher = test_cipher();
let obj = make_test_object();

// Follow the old format for encoding:
// 1. Encode as JSON, in a Vec<u8> of bytes
// 2. Encrypt
// 3. JSON-encode to another Vec<u8>
// 4. Turn the Vec into a Javascript array of numbers.
let data = serde_json::to_vec(&obj).unwrap();
let data = cipher.encrypt_value_data(data).unwrap();
let data = serde_json::to_vec(&data).unwrap();
let serialized = JsValue::from_serde(&data).unwrap();

// Now, try deserializing with `deserialize_value`, and check we get the right
// thing.
let serializer = IndexeddbSerializer::new(Some(Arc::new(cipher)));
let deserialized: TestStruct =
serializer.deserialize_value(serialized).expect("could not deserialize");

assert_eq!(make_test_object(), deserialized);
assert_eq!(obj, deserialized);
}

/// Test that `deserialize_value` can decode a value that was encoded with
Expand All @@ -328,7 +366,8 @@ mod tests {
async fn test_deserialize_old_serialized_value_no_cipher() {
// An example of an object which was serialized using the old-format
// `serialize_value`.
let serialized = js_sys::JSON::parse(&(json!({"id":0,"name":"test"}).to_string())).unwrap();
let json = json!({ "id":0, "name": "test", "map": { "0": "test" }});
let serialized = js_sys::JSON::parse(&json.to_string()).unwrap();

let serializer = IndexeddbSerializer::new(None);
let deserialized: TestStruct =
Expand All @@ -337,6 +376,21 @@ mod tests {
assert_eq!(make_test_object(), deserialized);
}

/// Test that `deserialize_value` can decode an array value that was encoded
/// with an old implementation of `serialize_value`, when no cipher is
/// in use.
#[async_test]
async fn test_deserialize_old_serialized_array_no_cipher() {
let json = json!([1, 2, 3, 4]);
let serialized = js_sys::JSON::parse(&json.to_string()).unwrap();

let serializer = IndexeddbSerializer::new(None);
let deserialized: Vec<u8> =
serializer.deserialize_value(serialized).expect("could not deserialize");

assert_eq!(vec![1, 2, 3, 4], deserialized);
}

/// Test that `deserialize_value` can decode a value encoded with
/// `maybe_encrypt_value`, when a cipher is in use.
#[async_test]
Expand Down Expand Up @@ -399,10 +453,15 @@ mod tests {
struct TestStruct {
id: u32,
name: String,

// A map, whose keys are not strings. This is an edge-case we previously got wrong. Maps
// are represented differently in JSON from Javascript objects, and that particularly
// matters when their keys are not strings.
map: BTreeMap<u8, String>,
}

fn make_test_object() -> TestStruct {
TestStruct { id: 0, name: "test".to_owned() }
TestStruct { id: 0, name: "test".to_owned(), map: BTreeMap::from([(0, "test".to_owned())]) }
}

/// Build a [`StoreCipher`] using a hardcoded key.
Expand Down
106 changes: 90 additions & 16 deletions crates/matrix-sdk-indexeddb/src/crypto_store/migrations/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ use crate::{

mod old_keys;
mod v0_to_v5;
mod v10_to_v11;
mod v5_to_v7;
mod v7;
mod v7_to_v8;
Expand Down Expand Up @@ -106,8 +107,12 @@ pub async fn open_and_upgrade_db(
v8_to_v10::schema_delete(name).await?;
}

if old_version < 11 {
v10_to_v11::data_migrate(name, serializer).await?;
}

// Open and return the DB (we know it's at the latest version)
Ok(IdbDatabase::open_u32(name, 10)?.await?)
Ok(IdbDatabase::open_u32(name, 11)?.await?)
}

async fn db_version(name: &str) -> Result<u32, IndexeddbCryptoStoreError> {
Expand Down Expand Up @@ -178,12 +183,16 @@ mod tests {
use matrix_sdk_store_encryption::StoreCipher;
use matrix_sdk_test::async_test;
use ruma::{room_id, OwnedRoomId, RoomId};
use serde::Serialize;
use tracing_subscriber::util::SubscriberInitExt;
use web_sys::console;

use super::{v0_to_v5, v7::InboundGroupSessionIndexedDbObject2};
use crate::{
crypto_store::{keys, migrations::*, InboundGroupSessionIndexedDbObject},
crypto_store::{
indexeddb_serializer::MaybeEncrypted, keys, migrations::*,
InboundGroupSessionIndexedDbObject,
},
IndexeddbCryptoStore,
};

Expand Down Expand Up @@ -472,7 +481,7 @@ mod tests {
fetched_backed_up_session: InboundGroupSession,
) {
let db = IdbDatabase::open(&db_name).unwrap().await.unwrap();
assert_eq!(db.version(), 10.0);
assert!(db.version() >= 10.0);
let transaction = db.transaction_on_one("inbound_group_sessions3").unwrap();
let raw_store = transaction.object_store("inbound_group_sessions3").unwrap();
let key = store.serializer.encode_key(
Expand Down Expand Up @@ -559,19 +568,9 @@ mod tests {
serializer.encode_key(old_keys::INBOUND_GROUP_SESSIONS_V1, (room_id, session_id));
let pickle = session.pickle().await;

let serialized_session = if let Some(cipher) = &store_cipher {
// Old-style serialization/encryption. First JSON-serialize into a byte array...
let data = serde_json::to_vec(&pickle).unwrap();
// ... then encrypt...
let encrypted = cipher.encrypt_value_data(data).unwrap();
// ... then JSON-serialize into another byte array ...
let value = serde_json::to_vec(&encrypted).unwrap();
// and finally, turn it into a javascript array.
JsValue::from_serde(&value).unwrap()
} else {
JsValue::from_serde(&pickle).unwrap()
};

// Serialize the session with the old style of serialization, since that's what
// we used at the time.
let serialized_session = serialize_value_as_legacy(&store_cipher, &pickle);
sessions.put_key_val(&key, &serialized_session).unwrap();
}
txn.await.into_result().unwrap();
Expand All @@ -581,8 +580,83 @@ mod tests {
db.close();
}

/// Test migrating `backup_keys` data from store v10 to latest,
/// on a store with encryption disabled.
#[async_test]
async fn test_v10_v11_migration_unencrypted() {
test_v10_v11_migration_with_cipher("test_v10_migration_unencrypted", None).await
}

/// Test migrating `backup_keys` data from store v10 to latest,
/// on a store with encryption enabled.
#[async_test]
async fn test_v10_v11_migration_encrypted() {
let cipher = StoreCipher::new().unwrap();
test_v10_v11_migration_with_cipher("test_v10_migration_encrypted", Some(Arc::new(cipher)))
.await;
}

/// Helper function for `test_v10_v11_migration_{un,}encrypted`: test
/// migrating `backup_keys` data from store v10 to store v11.
async fn test_v10_v11_migration_with_cipher(
db_prefix: &str,
store_cipher: Option<Arc<StoreCipher>>,
) {
let _ = make_tracing_subscriber(None).try_init();
let db_name = format!("{db_prefix:0}::matrix-sdk-crypto");

// delete the db in case it was used in a previous run
let _ = IdbDatabase::delete_by_name(&db_name);

// Given a DB with data in it as it was at v5
let db = create_v5_db(&db_name).await.unwrap();

let txn = db
.transaction_on_one_with_mode(keys::BACKUP_KEYS, IdbTransactionMode::Readwrite)
.unwrap();
let store = txn.object_store(keys::BACKUP_KEYS).unwrap();
store
.put_key_val(
&JsValue::from_str(old_keys::BACKUP_KEY_V1),
&serialize_value_as_legacy(&store_cipher, &"1".to_owned()),
)
.unwrap();
db.close();

// When I open a store based on that DB, triggering an upgrade
let store =
IndexeddbCryptoStore::open_with_store_cipher(&db_prefix, store_cipher).await.unwrap();

// Then I can read the backup settings
let backup_data = store.load_backup_keys().await.unwrap();
assert_eq!(backup_data.backup_version, Some("1".to_owned()));
}

async fn create_v5_db(name: &str) -> std::result::Result<IdbDatabase, DomException> {
v0_to_v5::schema_add(name).await?;
IdbDatabase::open_u32(name, 5)?.await
}

/// Emulate the old behaviour of [`IndexeddbSerializer::serialize_value`].
///
/// We used to use an inefficient format for serializing objects in the
/// indexeddb store. This replicates that old behaviour, for testing
/// purposes.
fn serialize_value_as_legacy<T: Serialize>(
store_cipher: &Option<Arc<StoreCipher>>,
value: &T,
) -> JsValue {
if let Some(cipher) = &store_cipher {
// Old-style serialization/encryption. First JSON-serialize into a byte array...
let data = serde_json::to_vec(&value).unwrap();
// ... then encrypt...
let encrypted = cipher.encrypt_value_data(data).unwrap();
// ... then JSON-serialize into another byte array ...
let value = serde_json::to_vec(&encrypted).unwrap();
// and finally, turn it into a javascript array.
JsValue::from_serde(&value).unwrap()
} else {
JsValue::from_serde(&value).unwrap()
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,9 @@
// See the License for the specific language governing permissions and
// limitations under the License.

#[cfg(doc)]
use crate::crypto_store::keys::BACKUP_VERSION_V1;

/// Old format of the `inbound_group_sessions` store which lacked indexes or
/// a sensible structure
pub const INBOUND_GROUP_SESSIONS_V1: &str = "inbound_group_sessions";
Expand All @@ -20,3 +23,6 @@ pub const INBOUND_GROUP_SESSIONS_V1: &str = "inbound_group_sessions";
/// JSON-encoding and arrays of ints instead of base64.
/// Also lacked the `backed_up_to` property+index.
pub const INBOUND_GROUP_SESSIONS_V2: &str = "inbound_group_sessions2";

/// An old name for [`BACKUP_VERSION_V1`].
pub const BACKUP_KEY_V1: &str = "backup_key_v1";
Loading
Loading