-
-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(cache,model)!: parse image hashes #1405
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Parse image hashes as 128 bit integers. Image hashes from Discord are base 16, 32 character length strings, and so we can reliably parse and fit them into a 128 bit integer, with an additional bit (padded to 8 bits) to store whether the image hash is animated, totalling 17 bytes to store an image hash. This is an alternative over storing image hashes as heap allocated Strings, which are stored as a pointer (usize), a length (usize), and capacity (usize), totalling 24 bytes on most systems, in addition to the value on the heap. This will give us a win in terms of memory usage. In terms of CPU usage, we do lose out a little by parsing image hashes compared to simply allocating the content: ``` imagehash string time: [191.64 ns 191.94 ns 192.26 ns] imagehash parse time: [271.75 ns 271.84 ns 271.96 ns] ``` This can likely be improved in a later PR by improving optimizations in the primary parsing loop. In total, this removes 45 Strings from the cache and model crates, effectively removing about 1/7th of Strings in the model crate alone. The API of this looks like: ```rust /// Efficient storage for Discord image hashes. #[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)] struct ImageHash { /// Instantiate a new hash from its raw parts. const fn new(bytes: [u8; 16], animated: bool) -> Self; /// Parse an image hash into an efficient integer-based storage. const fn parse(value: &[u8]) -> Result<Self, ImageHashParseError>; /// Efficient packed bytes of the hash. const fn bytes(self) -> [u8; 16]; /// Whether the hash is for an animated image. const fn is_animated(self) -> bool; /// Create an iterator over the nibbles of the hexadecimal image hash. const fn nibbles(self) -> Nibbles; } impl<'de> Deserialize<'de> for ImageHash; impl Display for ImageHash; impl FromStr for ImageHash; impl Serialize for ImageHash; impl TryFrom<&'_ [u8]> for ImageHash; impl TryFrom<&'_ str> for ImageHash; /// Iterator over the nibbles of a hexadecimal image hash. #[derive(Debug)] struct Nibbles; impl DoubleEndedIterator for Nibbles; impl ExactSizeIterator for Nibbles; impl Iterator for Nibbles; ``` BREAKING CHANGES: cache and model fields and methods that are or return image hashes are now `twilight_model::util::ImageHash`es instead of Strings. Signed-off-by: Zeyla Hellyer <[email protected]>
Signed-off-by: Zeyla Hellyer <[email protected]>
Signed-off-by: Zeyla Hellyer <[email protected]>
Signed-off-by: Zeyla Hellyer <[email protected]>
Signed-off-by: Zeyla Hellyer <[email protected]>
Signed-off-by: Zeyla Hellyer <[email protected]>
zeylahellyer
added
c-cache
Affects the cache crate
c-model
Affects the model crate
m-breaking change
Breaks the public API.
t-feature
Addition of a new feature
labels
Dec 31, 2021
Signed-off-by: Zeyla Hellyer <[email protected]>
Signed-off-by: Zeyla Hellyer <[email protected]>
Erk-
previously requested changes
Dec 31, 2021
Signed-off-by: Zeyla Hellyer <[email protected]>
Signed-off-by: Zeyla Hellyer <[email protected]>
Signed-off-by: Zeyla Hellyer <[email protected]>
Signed-off-by: Zeyla Hellyer <[email protected]>
Signed-off-by: Zeyla Hellyer <[email protected]>
Signed-off-by: Zeyla Hellyer <[email protected]>
github-actions
bot
added
c-all
Affects all crates or the project as a whole
c-command-parser
c-gateway
Affects the gateway crate
c-http
Affects the http crate
c-lavalink
Affects the lavalink crate
c-standby
Affects the standby crate
c-util
Affects the util crate
c-validate
Affects the validate crate
t-ci
Anything to do with CI.
labels
Dec 31, 2021
Signed-off-by: Zeyla Hellyer <[email protected]>
github-actions
bot
removed
c-http
Affects the http crate
c-command-parser
c-util
Affects the util crate
c-lavalink
Affects the lavalink crate
c-standby
Affects the standby crate
c-gateway
Affects the gateway crate
c-validate
Affects the validate crate
labels
Dec 31, 2021
AEnterprise
approved these changes
Jan 1, 2022
BlackHoleFox
approved these changes
Jan 1, 2022
7596ff
added a commit
that referenced
this pull request
Jan 23, 2022
Changes All types and method signatures have been updated to use the new `Id<T>` syntax ([#1260] - [@zeylahellyer]). The MSRV has been updated to 1.57 ([#1402] - [@zeylahellyer]). Methods that return an image hash, such as `CachedGuild::banner`, now return an `ImageHash` instead of a string ([#1405] - [@zeylahellyer]). The Rust edition has been updated to 2021 ([#1412] - [@vilgotf]). `UpdateCache` trait is now sealed ([#1431] - [@vilgotf]). [#1260]: #1260 [#1402]: #1402 [#1405]: #1405 [#1412]: #1412 [#1431]: #1431 [@vilgotf]: https://github.com/vilgotf [@zeylahellyer]: https://github.com/zeylahellyer
7596ff
added a commit
that referenced
this pull request
Jan 23, 2022
`Id<T>` IDs are now a unified type (`Id`) with marker generics (`ApplicationMarker`, ...) ([#1260] - [@zeylahellyer]). The new type implements all of what each type used to implement, as well as `FromStr`, `TryFrom<u64>`, and `TryFrom<i64>`, and others. `Id::cast` aids in converting between IDs without copying. See the PR and the documentation for more details. Additions Support scheduled events ([#1347] - [@7596ff]). Adds the following types: `EntityMetadata`, `EntityType`, `GuildScheduledEventUser`, `GuildScheduledEvent`, `PrivacyLevel`, and `Status`. Changes All types and method signatures have been updated to use the new `Id<T>` syntax ([#1260] - [@zeylahellyer]). The MSRV has been updated to 1.57 ([#1402] - [@zeylahellyer]). Image hashes are now parsed and stored in a more efficient struct, rather than deserializing as a `String` ([#1405] - [@zeylahellyer]). The Rust edition has been updated to 2021 ([#1412] - [@vilgotf]). `StageInstance::discoverable_disabled` and `PrivacyLevel::Public` have been removed, as public stage instances are no longer supported ([#1479] - [@itohatweb]). `GuildWidget::channel_id` is now optional ([#1480] - [@itohatweb]). [#1260]: #1260 [#1402]: #1402 [#1405]: #1405 [#1412]: #1412 [#1479]: #1479 [#1480]: #1480 [@7596ff]: https://github.com/7596ff [@itohatweb]: https://github.com/itohatweb [@vilgotf]: https://github.com/vilgotf [@zeylahellyer]: https://github.com/zeylahellyer
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Parse image hashes as 128 bit integers. Image hashes from Discord are base 16, 32 character length strings, and so we can reliably parse and fit them into a 128 bit integer, with an additional bit (padded to 8 bits) to store whether the image hash is animated, totalling 17 bytes to store an image hash. This is an alternative over storing image hashes as heap allocated Strings, which are stored as a pointer (usize), a length (usize), and capacity (usize), totalling 24 bytes on most systems, in addition to the value on the heap. This will give us a win in terms of memory usage.
In terms of CPU usage, we do lose out a little by parsing image hashes compared to simply allocating the content:
This can likely be improved in a later PR by improving optimizations in the primary parsing loop.
In total, this removes 45 Strings from the cache and model crates, effectively removing about 1/7th of Strings in the model crate alone.
The API of this looks like: