Avoid overwrite lookup if no overwrites are set
janlelis committed Jan 4, 2023
1 parent 751823f commit 99b6987
Showing 2 changed files with 45 additions and 6 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -2,7 +2,10 @@

 ## 2.4.2 (unreleased)

+More performance improvements:
+
 - Optimize lookup of first 4096 codepoints
+- Avoid overwrite lookup if no overwrites are set

 ## 2.4.1

48 changes: 42 additions & 6 deletions lib/unicode/display_width.rb
@@ -10,16 +10,51 @@ class DisplayWidth
     FIRST_4096 = decompress_index(INDEX[0][0], 1)

     def self.of(string, ambiguous = 1, overwrite = {}, options = {})
-      # Optimization for ASCII-only strings without certain control symbols
-      if overwrite.empty? && string.ascii_only?
-        if string.match?(ASCII_NON_ZERO_REGEX)
-          res = string.gsub(ASCII_NON_ZERO_REGEX, "").size - string.count("\b")
-          return res < 0 ? 0 : res
+      if overwrite.empty?
+        # Optimization for ASCII-only strings without certain control symbols
+        if string.ascii_only?
+          if string.match?(ASCII_NON_ZERO_REGEX)
+            res = string.gsub(ASCII_NON_ZERO_REGEX, "").size - string.count("\b")
+            res < 0 ? 0 : res
+          else
+            string.size
+          end
         else
-          return string.size
+          width_no_overwrite(string, ambiguous, options)
         end
+      else
+        width_all_features(string, ambiguous, overwrite, options)
       end
+    end

+    def self.width_no_overwrite(string, ambiguous, options = {})
+      # Sum of all chars widths
+      res = string.codepoints.sum{ |codepoint|
+        if codepoint > 15 && codepoint < 161 # very common
+          next 1
+        elsif codepoint < 0x1001
+          width = FIRST_4096[codepoint]
+        else
+          width = INDEX
+          depth = INITIAL_DEPTH
+          while (width = width[codepoint / depth]).instance_of? Array
+            codepoint %= depth
+            depth /= 16
+          end
+        end
+
+        width == :A ? ambiguous : (width || 1)
+      }
+
+      # Subtract emoji error
+      res -= emoji_extra_width_of(string, ambiguous) if options[:emoji]
+
+      # Return result + prevent negative lengths
+      res < 0 ? 0 : res
+    end
+
+    # Same as .width_no_overwrite - but with applying overwrites for each char
+    def self.width_all_features(string, ambiguous, overwrite, options)
       # Sum of all chars widths
       res = string.codepoints.sum{ |codepoint|
         next overwrite[codepoint] if overwrite[codepoint]
@@ -47,6 +82,7 @@ def self.of(string, ambiguous = 1, overwrite = {}, options = {})
       res < 0 ? 0 : res
     end

+
     def self.emoji_extra_width_of(string, ambiguous = 1, overwrite = {}, _ = {})
       require "unicode/emoji"

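Reviewer note: the core of this change is that `overwrite.empty?` is now checked once per call, so the per-codepoint `overwrite[codepoint]` lookup only runs when overwrites were actually passed. A minimal sketch of that dispatch follows. It is not the gem's code: the `ToyWidth` module and its tiny `WIDTHS` table are hypothetical stand-ins for the real compressed index, and the ASCII and emoji handling is left out.

```ruby
# Hypothetical stand-in for Unicode::DisplayWidth, for illustration only.
module ToyWidth
  # Toy width table (assumed values): one wide and one zero-width codepoint.
  WIDTHS = { 0x4E00 => 2, 0x0301 => 0 }.freeze

  module_function

  # Decide once per call which code path to use.
  def of(string, overwrite = {})
    if overwrite.empty?
      width_no_overwrite(string)
    else
      width_all_features(string, overwrite)
    end
  end

  # Fast path: the overwrite hash is never consulted.
  def width_no_overwrite(string)
    string.codepoints.sum { |codepoint| WIDTHS.fetch(codepoint, 1) }
  end

  # Full path: every codepoint checks the overwrite hash first.
  def width_all_features(string, overwrite)
    string.codepoints.sum { |codepoint| overwrite[codepoint] || WIDTHS.fetch(codepoint, 1) }
  end
end

ToyWidth.of("abc")                # => 3
ToyWidth.of("abc", { 0x61 => 2 }) # => 4
```

Both paths duplicate the summing loop, which mirrors the trade-off in the diff: some repeated code in exchange for no hash lookup per codepoint on the common path.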

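Reviewer note: the `while (width = width[codepoint / depth]).instance_of? Array` loop in `width_no_overwrite` walks a nested 16-way index until it reaches a leaf. The sketch below reproduces that walk on a two-level toy index covering codepoints 0 to 255 only. `ToyIndex`, its `INITIAL_DEPTH` of 16, and all stored widths are assumptions for illustration and do not reflect the gem's real data.

```ruby
# Hypothetical two-level index, for illustration only.
module ToyIndex
  INITIAL_DEPTH = 16 # assumed: 16 top-level slots, each covering 16 codepoints

  # Second-level row for 0xA0..0xAF; nil entries mean "default width 1".
  ROW_A = Array.new(16).tap do |row|
    row[0x1] = :A # toy ambiguous-width entry at 0xA1
    row[0xD] = 0  # toy zero-width entry at 0xAD
  end.freeze

  # A top-level slot holds either a leaf (nil, Integer, :A) or a nested Array.
  INDEX = Array.new(16).tap do |top|
    top[0x0] = 0     # whole block 0x00..0x0F collapsed into a single leaf
    top[0xA] = ROW_A # block 0xA0..0xAF needs per-codepoint resolution
  end.freeze

  module_function

  def width_of(codepoint, ambiguous = 1)
    width = INDEX
    depth = INITIAL_DEPTH
    # Descend while the current slot is a nested Array.
    while (width = width[codepoint / depth]).instance_of?(Array)
      codepoint %= depth # keep only the offset inside the chosen block
      depth /= 16        # the next level is 16 times finer
    end
    width == :A ? ambiguous : (width || 1)
  end
end

ToyIndex.width_of(0x41)    # => 1 (nil leaf, default width)
ToyIndex.width_of(0xA1, 2) # => 2 (ambiguous entry, caller's choice)
ToyIndex.width_of(0xAD)    # => 0
```

Collapsing a uniform block into a single leaf, as slot `0x0` does here, is what lets the walk stop early without descending to the last level.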