
Prevent invalid implementation of PrimaryKey and Prefixer outside of storage-plus #10

Open
hashedone opened this issue Sep 15, 2021 · 9 comments

@hashedone
Contributor

hashedone commented Sep 15, 2021

Implementing PrimaryKey and Prefixer outside of storage-plus doesn't seems scary at all (you just want to use your type as a key, why not to?), but there is an issue - if done improperly it may break things. Trivial example is making a type like:

struct FancyKeyType {
  secondary: U32Key,
  primary: U32Key,
}

With an Ord implementation which compares first on primary, then on secondary. Then when calling .range(..) on such a key, I would expect the items to be sorted according to my Ord implementation, but they come back in a different order, because they are sorted based on their internal binary representation (basically bytewise lexicographical order is used). And even worse, when using Bounds on such a key I would get seemingly random items.
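A minimal standalone sketch of the mismatch (hypothetical code, not storage-plus itself; plain u32 fields stand in for U32Key, and storage_bytes() is a simplified stand-in for the actual joined-key encoding):

use std::cmp::Ordering;

#[derive(PartialEq, Eq)]
struct FancyKeyType {
    secondary: u32,
    primary: u32,
}

// Custom Ord: compare on `primary` first, then `secondary`.
impl Ord for FancyKeyType {
    fn cmp(&self, other: &Self) -> Ordering {
        self.primary
            .cmp(&other.primary)
            .then(self.secondary.cmp(&other.secondary))
    }
}

impl PartialOrd for FancyKeyType {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl FancyKeyType {
    // Simplified stand-in for the joined key bytes: big-endian bytes of `secondary`
    // followed by big-endian bytes of `primary` (the field order of the struct).
    fn storage_bytes(&self) -> Vec<u8> {
        let mut out = self.secondary.to_be_bytes().to_vec();
        out.extend_from_slice(&self.primary.to_be_bytes());
        out
    }
}

fn main() {
    let a = FancyKeyType { secondary: 2, primary: 1 };
    let b = FancyKeyType { secondary: 1, primary: 2 };
    // The custom Ord puts `a` first (primary 1 < 2)...
    assert!(a < b);
    // ...but the stored bytes put `b` first (secondary 1 < 2), so range()/Bounds
    // iterate in an order that disagrees with Ord.
    assert!(a.storage_bytes() > b.storage_bytes());
}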

I have three proposals to work around it:

  • Making PrimaryKey and Prefixer unsafe - this would not prevent implementing them externally, but it would draw attention to the reasons why they are marked as not safe to implement. The documentation should then additionally explain how keys are compared and how to implement them properly.
  • Making PrimaryKey and Prefixer sealed traits - trivial to do, but it prevents implementing them outside the crate at all (even though external implementations might be useful for some custom keys when done correctly). A sketch of both patterns follows this list.
  • Making internal comparisons use the actual Ord, but that would require reversing the conversion of keys to &[u8], so it is blocked by Better return values from range/prefix cw-plus#198.
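For illustration, a minimal sketch of the first two options (trait and module names are illustrative, not the actual storage-plus code):

// Option 2 (sealed): the trait has a public supertrait in a private module,
// so downstream crates can use PrimaryKey but cannot implement it.
mod private {
    pub trait Sealed {}
    impl Sealed for u32 {}
    impl Sealed for String {}
}

pub trait PrimaryKey: private::Sealed {
    fn joined_key(&self) -> Vec<u8>;
}

impl PrimaryKey for u32 {
    fn joined_key(&self) -> Vec<u8> {
        self.to_be_bytes().to_vec()
    }
}

impl PrimaryKey for String {
    fn joined_key(&self) -> Vec<u8> {
        self.as_bytes().to_vec()
    }
}

// Option 1 (unsafe) would instead be:
//     pub unsafe trait PrimaryKey { fn joined_key(&self) -> Vec<u8>; }
// and implementors would have to write `unsafe impl PrimaryKey for MyKey { ... }`,
// explicitly acknowledging the byte-ordering contract spelled out in the docs.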
@maurolacy
Contributor

I like bullet point three, but it can be difficult to implement. I'll take a stab at CosmWasm/cw-plus#198 soon, by the way.

A fourth option could be to discourage / forbid using multi-field structs as keys. But I think we don't want that, right?

@hashedone
Contributor Author

@maurolacy it won't help, consider:

struct MyWorkaround {
  secondary: String,
  primary: String,
}

struct MyKey(MyWorkaround);

Now you have a single-field struct as a key ;) Not to mention that String is itself multi-field. And TBH I have no idea how to forbid such a thing, or whether it is even possible. You can discourage whatever you want, but this kind of arbitrary discouragement leads to a situation where no one knows why something is discouraged, and it loses its purpose. If you really want to discourage something because you strongly rely on the underlying implementation, then you may mark it as unsafe and have a proper description in the docs of why it is unsafe (so the reasons are not forgotten).

@ethanfrey
Member

I would just add some docs on this. Not try to prevent it.

@ethanfrey ethanfrey added the documentation Improvements or additions to documentation label Dec 14, 2021
@uint uint transferred this issue from CosmWasm/cw-plus Oct 17, 2022
@0xForerunner
Contributor

0xForerunner commented Dec 15, 2022

Adding onto this because I think it's relevant. It would be great if there were docs or examples on how to implement Prefixer, PrimaryKey, Bounder and KeyDeserialize for an enum. If someone here can confirm that this is the correct implementation, I would be happy to add something like this to the docs and submit a PR. @ethanfrey @hashedone

use cosmwasm_std::{Addr, StdError, StdResult};
use cw_storage_plus::{Bound, Bounder, Key, KeyDeserialize, Prefixer, PrimaryKey};
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Clone, Debug, Hash, PartialEq, Eq, JsonSchema, PartialOrd, Ord)]
#[serde(rename_all = "snake_case")]
#[repr(u8)]
pub enum AssetInfo {
    Token { contract_addr: Addr },
    NativeToken { denom: String },
}

impl<'a> PrimaryKey<'a> for AssetInfo {
    type Prefix = String;
    type SubPrefix = ();
    type Suffix = u8;
    type SuperSuffix = Self;

    fn key(&self) -> Vec<Key> {
        match self {
            Self::Token { contract_addr: addr } => {
                vec![Key::Ref(addr.as_bytes()), Key::Val8([0])]
            }
            Self::NativeToken { denom } => {
                vec![Key::Ref(denom.as_bytes()), Key::Val8([1])]
            }
        }
    }
}

impl<'a> Prefixer<'a> for AssetInfo {
    fn prefix(&self) -> Vec<Key> {
        match self {
            Self::Token { contract_addr: addr } => {
                vec![Key::Ref(addr.as_bytes()), Key::Val8([0])]
            }
            Self::NativeToken { denom } => {
                vec![Key::Ref(denom.as_bytes()), Key::Val8([1])]
            }
        }
    }
}

impl<'a> From<AssetInfo> for Bound<'a, AssetInfo> {
    fn from(val: AssetInfo) -> Self { Bound::exclusive(val) }
}

impl<'a> Bounder<'a> for AssetInfo {
    fn inclusive_bound(self) -> Option<Bound<'a, Self>> { Some(Bound::inclusive(self)) }

    fn exclusive_bound(self) -> Option<Bound<'a, Self>> { Some(Bound::exclusive(self)) }
}

impl KeyDeserialize for AssetInfo {
    type Output = Self;

    #[inline(always)]
    fn from_vec(mut value: Vec<u8>) -> StdResult<Self::Output> {
        // The joined key bytes are: 2-byte length prefix of the first element,
        // then the addr/denom bytes, then the trailing discriminant byte.
        // Drop the length prefix, then pop the discriminant off the end.
        let mut split = value.split_off(2);

        match split.pop().unwrap() {
            0 => Ok(Self::Token { contract_addr: Addr::from_vec(split)? }),
            1 => Ok(Self::NativeToken { denom: String::from_vec(split)? }),
            _ => Err(StdError::GenericErr { msg: "Failed deserializing.".into() }),
        }
    }
}

This implementation is working A1, but it would be great if someone with better knowledge on the subject could confirm.

@0xForerunner
Contributor

@ethanfrey @hashedone Would be awesome to get some closure on this! If either of you are able to confirm that the above is a reasonable implementation then I'd love to add it to the docs!

@maurolacy
Contributor

maurolacy commented Jan 5, 2023

This is certainly a possible implementation of PrimaryKey and related traits for an enum. Some comments below.

#[derive(Serialize, Deserialize, Clone, Debug, Hash, PartialEq, Eq, JsonSchema, PartialOrd, Ord)]
#[serde(rename_all = "snake_case")]
#[repr(u8)]
pub enum AssetInfo {
    Token { contract_addr: Addr },
    NativeToken { denom: String },
}

impl<'a> PrimaryKey<'a> for AssetInfo {
    type Prefix = String;
    type SubPrefix = ();
    type Suffix = u8;
    type SuperSuffix = Self;

    fn key(&self) -> Vec<Key> {
        match self {
            Self::Token { contract_addr: addr } => {
                vec![Key::Ref(addr.as_bytes()), Key::Val8([0])]
            }
            Self::NativeToken { denom } => {
                vec![Key::Ref(denom.as_bytes()), Key::Val8([1])]
            }

What I don't like here is that you're putting the enum identifier at the end. I think this will mix both enum variants (depending on the lexicographical order of the values) when doing a range query. It would be better IMO to put this id at the beginning. That way you get an iterator that lists all the assets of the first type first, then all of the 2nd type, etc.

That will also allow proper prefixing of the enum types. With a custom convention, of course. You can add some helper constants to make that clear (const ASSETINFO_TOKEN: u8 = 0, const ASSETINFO_NATIVE: u8 = 1). Using those constants in the code looks like a good idea too.

        }
    }
}

impl<'a> Prefixer<'a> for AssetInfo {
    fn prefix(&self) -> Vec<Key> {
        match self {
            Self::Token { contract_addr: addr } => {
                vec![Key::Ref(addr.as_bytes()), Key::Val8([0])]
            }
            Self::NativeToken { denom } => {
                vec![Key::Ref(denom.as_bytes()), Key::Val8([1])]
            }
        }

I think this follows the strategy / pattern of Prefixer for tuples. So, it is correct.

    }
}

impl<'a> From<AssetInfo> for Bound<'a, AssetInfo> {
    fn from(val: AssetInfo) -> Self { Bound::exclusive(val) }
}

This can be a little obscure, and is not really needed.

impl<'a> Bounder<'a> for AssetInfo {
    fn inclusive_bound(self) -> Option<Bound<'a, Self>> { Some(Bound::inclusive(self)) }

    fn exclusive_bound(self) -> Option<Bound<'a, Self>> { Some(Bound::exclusive(self)) }
}

impl KeyDeserialize for AssetInfo {
    type Output = Self;

    #[inline(always)]
    fn from_vec(mut value: Vec<u8>) -> StdResult<Self::Output> {
        let mut split = value.split_off(2);

        match split.pop().unwrap() {
            0 => Ok(Self::Token { contract_addr: Addr::from_vec(split)? }),
            1 => Ok(Self::NativeToken { denom: String::from_vec(split)? }),
            _ => Err(StdError::GenericErr { msg: "Failed deserializing.".into() }),
        }
    }
}


This implementation is working A1, but it would be great if someone with better knowledge on the subject could confirm.

Looks good. What I would like to see is extensive tests for it, to confirm it works as intended.

@0xForerunner
Contributor

@maurolacy Thanks for taking the time to write that up! Really appreciate it. Glad to have a second set of eyes on that. I chose to put the enum discriminant as a trailing value simply because of the nicer .pop() semantics when deserializing, but what you mention about ordering is definitely more important than that.

That will also allow proper prefixing of the enum types. With a custom convention, of course. You can add some helper constants to make that clear (const ASSETINFO_TOKEN: u8 = 0, const ASSETINFO_NATIVE: u8 = 1).

That's a good suggestion, will definitely make that change.
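For illustration, a hedged sketch of what the discriminant-first layout could look like (not the author's confirmed revision), reusing the AssetInfo enum and imports from the snippet above. NativeToken is given 0 here so that native tokens sort first, matching the test later in the thread; the Prefixer impl would be updated the same way.

pub const ASSETINFO_NATIVE: u8 = 0;
pub const ASSETINFO_TOKEN: u8 = 1;

impl<'a> PrimaryKey<'a> for AssetInfo {
    // With the discriminant leading, the natural prefix is the variant tag
    // and the suffix is the address/denom string (the reverse of the original).
    type Prefix = u8;
    type SubPrefix = ();
    type Suffix = String;
    type SuperSuffix = Self;

    fn key(&self) -> Vec<Key> {
        match self {
            Self::NativeToken { denom } => {
                vec![Key::Val8([ASSETINFO_NATIVE]), Key::Ref(denom.as_bytes())]
            }
            Self::Token { contract_addr } => {
                vec![Key::Val8([ASSETINFO_TOKEN]), Key::Ref(contract_addr.as_bytes())]
            }
        }
    }
}

impl KeyDeserialize for AssetInfo {
    type Output = Self;

    fn from_vec(mut value: Vec<u8>) -> StdResult<Self::Output> {
        // The joined key is now: 2-byte length prefix (0x00 0x01) | discriminant | rest.
        if value.len() < 3 {
            return Err(StdError::generic_err("AssetInfo key too short"));
        }
        let rest = value.split_off(3);
        match value[2] {
            ASSETINFO_NATIVE => Ok(Self::NativeToken { denom: String::from_vec(rest)? }),
            ASSETINFO_TOKEN => Ok(Self::Token { contract_addr: Addr::from_vec(rest)? }),
            _ => Err(StdError::generic_err("Invalid AssetInfo key discriminant")),
        }
    }
}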

Looks good. What I would like to see is extensive tests for it. To confirm it works as intended.

I have been using it in my pre-production code base for a few months without any issue and already have some preexisting tests I'd love to share. Is this a good place to share that? Perhaps I should start a PR for the docs?

@maurolacy
Contributor

maurolacy commented Jan 5, 2023

Regarding tests, I said it just as a suggestion. You can share the link to your repo if it's public code.

Feel free to submit some doc changes, and we'll consider adding it to the docs.

@0xForerunner
Contributor

0xForerunner commented Jan 7, 2023

Some basic tests here that all pass! I implemented the changes you suggested, and sorting is indeed working correctly.

    #[test]
    fn test_key_serialize_deserialize() {
        let mut owned_deps = mock_dependencies();
        let deps = owned_deps.as_mut();
        pub const ASSETS: cw_storage_plus::Map<&AssetInfo, String> = cw_storage_plus::Map::new("assets");

        let native_token_1 = AssetInfo::NativeToken { denom: "utest1".into() };
        let native_token_2 = AssetInfo::NativeToken { denom: "utest2".into() };
        let token_1 = AssetInfo::Token { contract_addr: Addr::unchecked("my_address1") };
        let token_2 = AssetInfo::Token { contract_addr: Addr::unchecked("my_address2") };

        ASSETS.save(deps.storage, &token_1, &"token_1".into()).unwrap();
        ASSETS.save(deps.storage, &token_2, &"token_2".into()).unwrap();
        ASSETS.save(deps.storage, &native_token_1, &"native_token_1".into()).unwrap();
        ASSETS.save(deps.storage, &native_token_2, &"native_token_2".into()).unwrap();

        assert_eq!(ASSETS.load(deps.storage, &native_token_1).unwrap(), "native_token_1");
        assert_eq!(ASSETS.load(deps.storage, &native_token_2).unwrap(), "native_token_2");
        assert_eq!(ASSETS.load(deps.storage, &token_1).unwrap(), "token_1");
        assert_eq!(ASSETS.load(deps.storage, &token_2).unwrap(), "token_2");

        // read_map is a helper from the author's codebase (presumably a full range()
        // over the map, collected into (key, value) pairs in key order).
        let list = read_map(deps.as_ref(), None, None, ASSETS).unwrap();
        assert_eq!(list.len(), 4);
        // native tokens have a discriminant of 0, so they sort first
        assert_eq!(list[0].0, native_token_1);
        assert_eq!(list[1].0, native_token_2);
        assert_eq!(list[2].0, token_1);
        assert_eq!(list[3].0, token_2);
    }

Will work on some changes to the docs when I've got some more tests written.

@uint uint added this to the 3.0 milestone Sep 27, 2024