Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix emoji unified/non-qualified version for sending & parsing #4840

Merged
merged 20 commits into from
Oct 7, 2023

Conversation

pajlada
Copy link
Member

@pajlada pajlada commented Sep 24, 2023

Description

  • fix: use unified emoji for sending, & use both unified & nonQualified for parsing
  • refactor: rename parsed shortCodes to shortNames to avoid name clashes
  • refactor: mark emojis name as unused in the emoji set loading
  • refactor: use contains(..) instead of count(..)
  • refactor: const auto the emoji
  • refactor: move anon namespace out to flatten in
  • refactor: constify & rename TONE_NAMES
  • refactor: parseEmoji make shortCode const ref
  • move map include

Fixes #4792

This has gone an extra step to make sure that this version of Chatterino can parse emojis that are sent from old versions of chatterino & from users of web chat

this means slightly higher memory usage which can be resolved by splitting EmojiData into an Emoji and EmojiData, where we mostly only store Emoji (with its value, nonUnified, and Emote values).
We'd just need to figure out how to change emoji sets

@pajlada pajlada mentioned this pull request Sep 24, 2023
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

src/providers/emoji/Emojis.cpp Outdated Show resolved Hide resolved
src/providers/emoji/Emojis.cpp Show resolved Hide resolved
src/providers/emoji/Emojis.cpp Outdated Show resolved Hide resolved
@pajlada
Copy link
Member Author

pajlada commented Sep 25, 2023

this breaks some emojis, gotta test this more

@Felanbird
Copy link
Collaborator

notes:

as of this comment all known broken emojis send correctly
image

heart and comet render correctly, the others do not
image

other emotes such as 😂 render incorrectly now
image

src/providers/emoji/Emojis.cpp Outdated Show resolved Hide resolved
src/providers/emoji/Emojis.cpp Outdated Show resolved Hide resolved
src/providers/emoji/Emojis.cpp Outdated Show resolved Hide resolved
src/providers/emoji/Emojis.cpp Outdated Show resolved Hide resolved
src/providers/emoji/Emojis.cpp Outdated Show resolved Hide resolved
src/providers/emoji/Emojis.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

src/providers/emoji/Emojis.cpp Show resolved Hide resolved
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

benchmarks/src/Emojis.cpp Outdated Show resolved Hide resolved
benchmarks/src/Emojis.cpp Show resolved Hide resolved
benchmarks/src/Emojis.cpp Show resolved Hide resolved
benchmarks/src/Emojis.cpp Show resolved Hide resolved
benchmarks/src/Emojis.cpp Show resolved Hide resolved
benchmarks/src/Emojis.cpp Outdated Show resolved Hide resolved
src/providers/emoji/Emojis.cpp Show resolved Hide resolved
Qt 6.5.3 results:
2023-10-07T11:48:49+02:00
Running ./bin/chatterino-benchmark
Run on (32 X 5500 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x16)
  L1 Instruction 32 KiB (x16)
  L2 Unified 2048 KiB (x16)
  L3 Unified 36864 KiB (x1)
Load Average: 5.93, 5.29, 2.99
--------------------------------------------------------------------------------
Benchmark                                      Time             CPU   Iterations
--------------------------------------------------------------------------------
BM_EmojiParsing                            10639 ns        10634 ns        65687
BM_EmojiParsing                            10627 ns        10622 ns        65687
BM_EmojiParsing                            10623 ns        10614 ns        65687
BM_EmojiParsing_mean                       10630 ns        10623 ns            3
BM_EmojiParsing_median                     10627 ns        10622 ns            3
BM_EmojiParsing_stddev                      8.70 ns         10.1 ns            3
BM_EmojiParsing_cv                          0.08 %          0.09 %             3
BM_EmojiParsing2/one_emoji                  1525 ns         1524 ns       457581
BM_EmojiParsing2/one_emoji                  1562 ns         1558 ns       457581
BM_EmojiParsing2/one_emoji                  1555 ns         1553 ns       457581
BM_EmojiParsing2/one_emoji_mean             1547 ns         1545 ns            3
BM_EmojiParsing2/one_emoji_median           1555 ns         1553 ns            3
BM_EmojiParsing2/one_emoji_stddev           19.6 ns         18.4 ns            3
BM_EmojiParsing2/one_emoji_cv               1.27 %          1.19 %             3
BM_EmojiParsing2/two_emoji                  2837 ns         2832 ns       246593
BM_EmojiParsing2/two_emoji                  2836 ns         2832 ns       246593
BM_EmojiParsing2/two_emoji                  2832 ns         2831 ns       246593
BM_EmojiParsing2/two_emoji_mean             2835 ns         2832 ns            3
BM_EmojiParsing2/two_emoji_median           2836 ns         2832 ns            3
BM_EmojiParsing2/two_emoji_stddev           2.46 ns        0.778 ns            3
BM_EmojiParsing2/two_emoji_cv               0.09 %          0.03 %             3
BM_EmojiParsing2/many_emoji               137515 ns       137179 ns         5090
BM_EmojiParsing2/many_emoji               137757 ns       137260 ns         5090
BM_EmojiParsing2/many_emoji               137436 ns       137216 ns         5090
BM_EmojiParsing2/many_emoji_mean          137570 ns       137218 ns            3
BM_EmojiParsing2/many_emoji_median        137515 ns       137216 ns            3
BM_EmojiParsing2/many_emoji_stddev           167 ns         40.7 ns            3
BM_EmojiParsing2/many_emoji_cv              0.12 %          0.03 %             3
BM_EmojiParsing2New/one_emoji               1528 ns         1526 ns       457211
BM_EmojiParsing2New/one_emoji               1530 ns         1529 ns       457211
BM_EmojiParsing2New/one_emoji               1532 ns         1530 ns       457211
BM_EmojiParsing2New/one_emoji_mean          1530 ns         1528 ns            3
BM_EmojiParsing2New/one_emoji_median        1530 ns         1529 ns            3
BM_EmojiParsing2New/one_emoji_stddev        1.72 ns         1.89 ns            3
BM_EmojiParsing2New/one_emoji_cv            0.11 %          0.12 %             3
BM_EmojiParsing2New/two_emoji               2834 ns         2830 ns       246832
BM_EmojiParsing2New/two_emoji               2843 ns         2839 ns       246832
BM_EmojiParsing2New/two_emoji               2829 ns         2827 ns       246832
BM_EmojiParsing2New/two_emoji_mean          2835 ns         2832 ns            3
BM_EmojiParsing2New/two_emoji_median        2834 ns         2830 ns            3
BM_EmojiParsing2New/two_emoji_stddev        7.28 ns         6.16 ns            3
BM_EmojiParsing2New/two_emoji_cv            0.26 %          0.22 %             3
BM_EmojiParsing2New/many_emoji            137688 ns       137630 ns         5095
BM_EmojiParsing2New/many_emoji            137594 ns       137443 ns         5095
BM_EmojiParsing2New/many_emoji            137601 ns       137541 ns         5095
BM_EmojiParsing2New/many_emoji_mean       137628 ns       137538 ns            3
BM_EmojiParsing2New/many_emoji_median     137601 ns       137541 ns            3
BM_EmojiParsing2New/many_emoji_stddev       52.5 ns         93.7 ns            3
BM_EmojiParsing2New/many_emoji_cv           0.04 %          0.07 %             3

Qt 5.12.12 results:
2023-10-07T11:44:36+02:00
Running ./bin/chatterino-benchmark
Run on (32 X 5500 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x16)
  L1 Instruction 32 KiB (x16)
  L2 Unified 2048 KiB (x16)
  L3 Unified 36864 KiB (x1)
Load Average: 1.41, 1.37, 1.35
--------------------------------------------------------------------------------
Benchmark                                      Time             CPU   Iterations
--------------------------------------------------------------------------------
BM_EmojiParsing                            12179 ns        12161 ns        57711
BM_EmojiParsing                            12170 ns        12147 ns        57711
BM_EmojiParsing                            12232 ns        12223 ns        57711
BM_EmojiParsing_mean                       12193 ns        12177 ns            3
BM_EmojiParsing_median                     12179 ns        12161 ns            3
BM_EmojiParsing_stddev                      33.4 ns         40.2 ns            3
BM_EmojiParsing_cv                          0.27 %          0.33 %             3
BM_EmojiParsing2/one_emoji                  1774 ns         1771 ns       394534
BM_EmojiParsing2/one_emoji                  1772 ns         1769 ns       394534
BM_EmojiParsing2/one_emoji                  1774 ns         1773 ns       394534
BM_EmojiParsing2/one_emoji_mean             1773 ns         1771 ns            3
BM_EmojiParsing2/one_emoji_median           1774 ns         1771 ns            3
BM_EmojiParsing2/one_emoji_stddev           1.22 ns         2.07 ns            3
BM_EmojiParsing2/one_emoji_cv               0.07 %          0.12 %             3
BM_EmojiParsing2/two_emoji                  3320 ns         3318 ns       210619
BM_EmojiParsing2/two_emoji                  3316 ns         3311 ns       210619
BM_EmojiParsing2/two_emoji                  3308 ns         3302 ns       210619
BM_EmojiParsing2/two_emoji_mean             3315 ns         3311 ns            3
BM_EmojiParsing2/two_emoji_median           3316 ns         3311 ns            3
BM_EmojiParsing2/two_emoji_stddev           6.31 ns         7.98 ns            3
BM_EmojiParsing2/two_emoji_cv               0.19 %          0.24 %             3
BM_EmojiParsing2/many_emoji               146008 ns       145938 ns         4786
BM_EmojiParsing2/many_emoji               146694 ns       146459 ns         4786
BM_EmojiParsing2/many_emoji               146554 ns       146329 ns         4786
BM_EmojiParsing2/many_emoji_mean          146418 ns       146242 ns            3
BM_EmojiParsing2/many_emoji_median        146554 ns       146329 ns            3
BM_EmojiParsing2/many_emoji_stddev           362 ns          271 ns            3
BM_EmojiParsing2/many_emoji_cv              0.25 %          0.19 %             3
BM_EmojiParsing2New/one_emoji               1775 ns         1774 ns       394159
BM_EmojiParsing2New/one_emoji               1772 ns         1772 ns       394159
BM_EmojiParsing2New/one_emoji               1775 ns         1774 ns       394159
BM_EmojiParsing2New/one_emoji_mean          1774 ns         1773 ns            3
BM_EmojiParsing2New/one_emoji_median        1775 ns         1774 ns            3
BM_EmojiParsing2New/one_emoji_stddev        1.45 ns         1.45 ns            3
BM_EmojiParsing2New/one_emoji_cv            0.08 %          0.08 %             3
BM_EmojiParsing2New/two_emoji               3308 ns         3307 ns       210242
BM_EmojiParsing2New/two_emoji               3316 ns         3314 ns       210242
BM_EmojiParsing2New/two_emoji               3304 ns         3302 ns       210242
BM_EmojiParsing2New/two_emoji_mean          3309 ns         3308 ns            3
BM_EmojiParsing2New/two_emoji_median        3308 ns         3307 ns            3
BM_EmojiParsing2New/two_emoji_stddev        6.05 ns         6.07 ns            3
BM_EmojiParsing2New/two_emoji_cv            0.18 %          0.18 %             3
BM_EmojiParsing2New/many_emoji            146275 ns       146050 ns         4790
BM_EmojiParsing2New/many_emoji            146473 ns       146406 ns         4790
BM_EmojiParsing2New/many_emoji            146438 ns       146219 ns         4790
BM_EmojiParsing2New/many_emoji_mean       146396 ns       146225 ns            3
BM_EmojiParsing2New/many_emoji_median     146438 ns       146219 ns            3
BM_EmojiParsing2New/many_emoji_stddev        106 ns          178 ns            3
BM_EmojiParsing2New/many_emoji_cv           0.07 %          0.12 %             3
@pajlada
Copy link
Member Author

pajlada commented Oct 7, 2023

image

@pajlada pajlada changed the title Fix emoji unified/non qualified use Fix emoji unified/non-qualified version for sending & parsing Oct 7, 2023
@pajlada pajlada enabled auto-merge (squash) October 7, 2023 10:07
@pajlada pajlada merged commit 774eaa1 into master Oct 7, 2023
15 checks passed
@pajlada pajlada deleted the fix/emoji-unified-vs-non-qualified branch October 7, 2023 10:21
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@@ -55,3 +55,120 @@ static void BM_ShortcodeParsing(benchmark::State &state)
}

BENCHMARK(BM_ShortcodeParsing);

static void BM_EmojiParsing(benchmark::State &state)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: invalid case style for function 'BM_EmojiParsing' [readability-identifier-naming]

Suggested change
static void BM_EmojiParsing(benchmark::State &state)
eParsing);bmEmojiParsing

benchmarks/src/Emojis.cpp:136:

- 
- sing);
+ 
+ sing);bmEmojiParsing

emojiMap.tryGet("1F427", penguin);
auto penguinEmoji = penguin->emote;

std::vector<TestCase> tests{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: variable 'tests' of type 'std::vector' can be declared 'const' [misc-const-correctness]

benchmarks/src/Emojis.cpp:75:

- ase> tests{
+ ase> const tests{

BENCHMARK(BM_EmojiParsing);

template <class... Args>
static void BM_EmojiParsing2(benchmark::State &state, Args &&...args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: invalid case style for function 'BM_EmojiParsing2' [readability-identifier-naming]

Suggested change
static void BM_EmojiParsing2(benchmark::State &state, Args &&...args)
sing);bmEmojiParsing2

benchmarks/src/Emojis.cpp:166:

- ualNumEmojis;
- ng2, one_emoji, "foo 🐧 bar", 1);
- ing2, two_emoji, "foo 🐧 bar 🐧", 2);
+ ualNumEmojis;
+ ng2, one_emoji, "foo 🐧 bar", 1);
+ ing2bmEmojiParsing2o 🐧 bar 🐧", 2);bmEmojiParsing2

benchmarks/src/Emojis.cpp:170:

- APTURE(
+ APTURE(bmEmojiParsing2

continue;
}
// look in emoji->value
bool match = QStringView{emoji->value}.mid(1) ==
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: variable 'match' of type 'bool' can be declared 'const' [misc-const-correctness]

Suggested change
bool match = QStringView{emoji->value}.mid(1) ==
bool const match = QStringView{emoji->value}.mid(1) ==

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Emoji Inconsistancies
3 participants