Performance optimization of text-width computation #1379

kbrandwijk · 2017-12-19T20:57:52Z

Currently, pdfkit is used for text-width measurements. I would suggest using fontkit directly instead (pdfkit already uses it internally), to cut down on overhead and improve performance. See: https://github.com/devongovett/fontkit, method: font.layout().


PyvesB · 2017-12-19T23:04:00Z

Hey there, I could look into this, unless you would be interested in submitting a pull request? 😉

kbrandwijk · 2017-12-20T03:23:42Z

@PyvesB please do, I'm new to the project, and very limited in time.

espadrine · 2017-12-20T18:37:33Z

Thanks for the suggestion!

In my own testing, pdfkit's overhead over fontkit (i.e., its word-caching mechanism) turns out to be a net benefit: calling fontkit directly raises the average time spent in makeBadge from 0.461492 ms to 0.563666 ms for the following benchmark:

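# Start the server in the background, logging to ./log
node server 1111 >log &
sleep 2
# Request 10,000 distinct badges sequentially
for ((i=0;i<10000;i++)); do
  curl -s http://localhost:1111/badge/coverage-"$i"%-green.svg >/dev/null
done
kill $(jobs -p)
# Average the timings reported on the 'makeBadge total' log lines
<log grep 'makeBadge total' | \
  grep -Eo '[0-9.]+' | \
  awk '{s+=$1;n++} END {print s/n}'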

If we tweak the benchmark to call /badge/coverage"$i"-"$i"%-green.svg to break the pdfkit cache, I get 0.360752 ms for pdfkit, against 0.315269 ms for fontkit.

kbrandwijk · 2017-12-20T19:31:42Z

Wow, I totally didn't expect that.

paulmelnikow · 2017-12-21T22:37:55Z

Continuation of #1377 (comment):

Something else we could think about here is multiprocess parallelism: basically doing the badge-width computation in one or more separate processes that don't block. That won't help with a sequential-test metric, but might help if we simulate a production-like load.

Also, and more simply, perhaps a bigger cache, or even a static cache, which contains precomputed widths for lots of common values, like in the example badge values, version numbers, and static badges we know we see a lot.

This might be a naive question: does kerning prevent us from computing the width character by character? Would it be possible to extract kerning rules from the font, so we could reimplement the computation in an even faster way?

kbrandwijk · 2017-12-21T22:54:37Z

With fontkit, and font.layout(), you get information for every glyph, including kerning info, so you could basically run that once, and store those values? That way text-width calculation becomes a static sum? You would have to cache that glyph info per font of course (is it even possible to use a different font with the hosted version, or is that always just Verdana?).

Also, what's wrong with just caching every value you ever come across? It's only text and a measurement, so it doesn't take up any space, even with potentially tens of thousands of entries.
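To make the "cache every value" idea concrete, here is a minimal sketch of a memoizing wrapper around a width function. The names `memoizeWidth`, `maxEntries`, and `expensiveMeasure` are hypothetical and not part of the project, and the stand-in measurement is invented; the size cap is an assumption added so the cache cannot grow without bound.

```javascript
// Wrap any text -> width function with an in-memory cache.
// `measure` stands in for the real (slow) measurement, e.g. via pdfkit.
function memoizeWidth(measure, maxEntries = 50000) {
  const cache = new Map();
  return function cachedWidth(text) {
    if (cache.has(text)) {
      return cache.get(text);
    }
    const width = measure(text);
    if (cache.size >= maxEntries) {
      // A Map iterates in insertion order, so the first key is the oldest entry.
      cache.delete(cache.keys().next().value);
    }
    cache.set(text, width);
    return width;
  };
}

// Usage with a made-up measurement function (7 units per character).
const expensiveMeasure = (text) => text.length * 7;
const cachedWidth = memoizeWidth(expensiveMeasure);
console.log(cachedWidth('coverage')); // measured on the first call
console.log(cachedWidth('coverage')); // served from the cache
```

A cache like this only helps when the same strings recur, which is why a benchmark using a fresh value on every request sees no benefit from it.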

paulmelnikow · 2017-12-22T17:26:58Z

@espadrine It looks like I need to add some logging code for the snippet in #1379 (comment) to work. What would I need to change?

I made a naive implementation and it seems to be working…

espadrine · 2017-12-23T21:44:31Z

@paulmelnikow Here's a diff:

diff --git a/server.js b/server.js
index 990c111..949322d 100644
--- a/server.js
+++ b/server.js
@@ -7468,7 +7468,9 @@ function(data, match, end, ask) {
     if (isValidStyle(data.style)) {
       badgeData.template = data.style;
     }
+    console.time('makeBadge total');
     const svg = makeBadge(badgeData);
+    console.timeEnd('makeBadge total');
     makeSend(format, ask.res, end)(svg);
   } catch(e) {
     log.error(e.stack);

You can add other console.time calls around the parts you want to analyse.
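As an alternative to grepping console.time output from the log, timings could be accumulated in-process. This is only a sketch: `makeTimer`, `wrap`, and `averageMs` are hypothetical helper names, not part of the server, and the wrapped function here is a placeholder for `makeBadge`.

```javascript
// Accumulate wall-clock time across calls to a synchronous function.
function makeTimer() {
  let totalNs = 0n;
  let callCount = 0;
  return {
    // Return a function that behaves like `fn` but records its duration.
    wrap(fn) {
      return (...args) => {
        const start = process.hrtime.bigint();
        try {
          return fn(...args);
        } finally {
          totalNs += process.hrtime.bigint() - start;
          callCount++;
        }
      };
    },
    calls() {
      return callCount;
    },
    // Average duration per call in milliseconds (0 if never called).
    averageMs() {
      return callCount === 0 ? 0 : Number(totalNs) / callCount / 1e6;
    },
  };
}

// Usage with a placeholder workload standing in for makeBadge.
const timer = makeTimer();
const timedWork = timer.wrap((text) => text.toUpperCase());
for (let i = 0; i < 1000; i++) {
  timedWork('coverage-' + i);
}
console.log(timer.calls(), 'calls, average', timer.averageMs(), 'ms');
```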

espadrine · 2017-12-23T21:51:19Z

> basically doing the badge-width computation in one or more separate processes that don't block

Quick note: the OVH offer we are on, VPS SSD 1, has a single vcore.

https://www.ovh.com/us/vps/vps-ssd.xml

paulmelnikow · 2017-12-24T00:09:09Z

> Quick note: the OVH offer we are on, VPS SSD 1, has a single vcore.

Ah, gotcha… thanks.

paulmelnikow · 2017-12-24T02:00:50Z

I opened #1390 with the approach I took.

Ref: #1379

This takes a naive approach to font-width computation, the most compute-intensive part of rendering badges.

1. Add the widths of the individual characters.
   - These widths are measured on startup using PDFKit.
2. For each character pair, add a kerning adjustment.
   - The difference between the width of each character pair, and the sum of the characters' separate widths.
   - These are computed for each character pair on startup using PDFKit.
3. For a string with characters outside the printable ASCII character set, fall back to PDFKit.

This branch averaged 0.041 ms in `makeBadge`, compared to 0.144 ms on master, a speedup of 73%. That was on a test of 10,000 consecutive requests (using the `benchmark-performance.sh` script, now checked in).

The speedup applies to badges containing exclusively printable ASCII characters. It wouldn't be as dramatic on non-ASCII text. Though, we could add some frequently used non-ASCII characters to the cached set.
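The three steps described above can be sketched roughly as follows. This is not the code from #1390: `slowMeasure` is a made-up stand-in for the PDFKit measurement (its widths and its single kerning pair are invented), and `buildTables` and `makeQuickWidth` are hypothetical names used only for illustration.

```javascript
// Hypothetical stand-in for the slow, exact measurement (PDFKit in the PR).
// Narrow letters are 3 units, everything else 7, and "AV" is kerned by -1.
function slowMeasure(text) {
  let width = 0;
  for (let i = 0; i < text.length; i++) {
    width += text[i] === 'i' || text[i] === 'l' ? 3 : 7;
    if (i > 0 && text[i - 1] === 'A' && text[i] === 'V') width -= 1;
  }
  return width;
}

const FIRST_PRINTABLE = 0x20;
const LAST_PRINTABLE = 0x7e;

// Run once at startup: measure every printable ASCII character and
// derive a kerning adjustment for every ordered pair of characters.
function buildTables(measure) {
  const chars = [];
  for (let code = FIRST_PRINTABLE; code <= LAST_PRINTABLE; code++) {
    chars.push(String.fromCharCode(code));
  }
  const charWidths = new Map();
  for (const ch of chars) charWidths.set(ch, measure(ch));
  const kerning = new Map();
  for (const a of chars) {
    for (const b of chars) {
      // Adjustment = pair width minus the sum of the separate widths.
      const adjustment = measure(a + b) - charWidths.get(a) - charWidths.get(b);
      if (adjustment !== 0) kerning.set(a + b, adjustment);
    }
  }
  return { charWidths, kerning };
}

function makeQuickWidth(measure) {
  const { charWidths, kerning } = buildTables(measure);
  return function quickWidth(text) {
    let width = 0;
    for (let i = 0; i < text.length; i++) {
      const charWidth = charWidths.get(text[i]);
      // Step 3: anything outside the precomputed set falls back to `measure`.
      if (charWidth === undefined) return measure(text);
      width += charWidth; // Step 1: per-character width
      if (i > 0) width += kerning.get(text[i - 1] + text[i]) || 0; // Step 2
    }
    return width;
  };
}

const quickWidth = makeQuickWidth(slowMeasure);
console.log(quickWidth('AVAIL') === slowMeasure('AVAIL'));
```

The sketch assumes that kerning depends only on adjacent pairs, which is the same simplification the description makes; features such as ligatures would need the fallback path.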

RedSparr0w · 2018-01-15T00:59:35Z

@paulmelnikow Do you think this could be closed now that #1390 is merged?

PyvesB mentioned this issue Dec 20, 2017

Badges are unavailable (for GitHub) #1377

Closed

paulmelnikow added the core Server, BaseService, GitHub auth, Shared helpers label Dec 20, 2017

paulmelnikow changed the title ~~Use fontkit instead of pdfkit for text measurements (performance)~~ Performance optimization of text-width computation Dec 21, 2017

paulmelnikow added the performance-improvement Related to performance or throughput of the badge servers label Dec 24, 2017

paulmelnikow mentioned this issue Dec 24, 2017

Speed up font-width computation in most cases #1390

Merged

paulmelnikow closed this as completed Jan 15, 2018

techtonik mentioned this issue Sep 1, 2018

Using pdfkit adambisek/string-pixel-width#1

Open
