Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance optimization of text-width computation #1379

Closed
kbrandwijk opened this issue Dec 19, 2017 · 12 comments
Closed

Performance optimization of text-width computation #1379

kbrandwijk opened this issue Dec 19, 2017 · 12 comments
Labels
core Server, BaseService, GitHub auth, Shared helpers performance-improvement Related to performance or throughput of the badge servers

Comments

@kbrandwijk
Copy link

Currently, pdfkit is used for text width measurements. I would advise to use fontkit instead (it's also used internally by pdfkit), to cut down on the overhead and increase performance. See: https://github.com/devongovett/fontkit, method: font.layout().

@PyvesB
Copy link
Member

PyvesB commented Dec 19, 2017

Hey there, I could look into this, unless you would be interested in submitting a pull request? 😉

@kbrandwijk
Copy link
Author

@PyvesB please do, I'm new to the project, and very limited in time.

@paulmelnikow paulmelnikow added the core Server, BaseService, GitHub auth, Shared helpers label Dec 20, 2017
@espadrine
Copy link
Member

Thanks for the suggestion!

Personal testing makes me feel like pdfkit's overhead (ie, its word caching mechanism) over fontkit is mostly positive; calling fontkit directly changes the time spent in makeBadge from 0.461492ms to 0.563666ms average for the following benchmark:

node server 1111 >log &
sleep 2
for ((i=0;i<10000;i++)); do
  curl -s http://localhost:1111/badge/coverage-"$i"%-green.svg >/dev/null
done
kill $(jobs -p)
<log grep 'makeBadge total' | \
  grep -Eo '[0-9\.]+' | \
  awk '{s+=$1;n++} END {print s/n}'

If we tweak the benchmark to call /badge/coverage"$i"-"$i"%-green.svg to break the pdfkit cache, I get 0.360752ms, against 0.315269ms for fontkit.

@kbrandwijk
Copy link
Author

Wow, I totally didn't expect that.

@paulmelnikow paulmelnikow changed the title Use fontkit instead of pdfkit for text measurements (performance) Performance optimization of text-width computation Dec 21, 2017
@paulmelnikow
Copy link
Member

Continuation of #1377 (comment):

Something else we could think about here is multiprocess parallelism: basically doing the badge-width computation in one or more separate processes that don't block. That won't help with a sequential-test metric, but might help if we simulate a production-like load.

Also, and more simply, perhaps a bigger cache, or even a static cache, which contains precomputed widths for lots of common values, like in the example badge values, version numbers, and static badges we know we see a lot.

This might be a naive question: does kerning prevent us from computing the width character by character? Would it be possible to extract kerning rules from the font, so we could reimplement the computation in an even faster way?

@kbrandwijk
Copy link
Author

kbrandwijk commented Dec 21, 2017

With fontkit, and font.layout(), you get information for every glyph, including kerning info, so you could basically run that once, and store those values? That way text-width calculation becomes a static sum? You would have to cache that glyph info per font of course (is it even possible to use a different font with the hosted version, or is that always just Verdana?).

Also, what's wrong with just caching every value you ever come across? It's only text and a measurement, so it doesn't take up any space, even with potentially tens of thousands of entries.

@paulmelnikow
Copy link
Member

@espadrine It looks like I need to add some logging code for the snippet in #1379 (comment) to work. What would I need to change?

I made a naive implementation and it seems to be working…

@espadrine
Copy link
Member

@paulmelnikow Here's a diff:

diff --git a/server.js b/server.js
index 990c111..949322d 100644
--- a/server.js
+++ b/server.js
@@ -7468,7 +7468,9 @@ function(data, match, end, ask) {
     if (isValidStyle(data.style)) {
       badgeData.template = data.style;
     }
+    console.time('makeBadge total');
     const svg = makeBadge(badgeData);
+    console.timeEnd('makeBadge total');
     makeSend(format, ask.res, end)(svg);
   } catch(e) {
     log.error(e.stack);

You can add other console.time's for analysis' sake.

@espadrine
Copy link
Member

espadrine commented Dec 23, 2017

basically doing the badge-width computation in one or more separate processes that don't block

Quick note: the OVH offer we are on, VPS SSD 1, has a single vcore.

https://www.ovh.com/us/vps/vps-ssd.xml

@paulmelnikow
Copy link
Member

Quick note: the OVH offer we are on, VPS SSD 1, has a single vcore.

Ah, gotcha… thanks.

@paulmelnikow paulmelnikow added the performance-improvement Related to performance or throughput of the badge servers label Dec 24, 2017
@paulmelnikow
Copy link
Member

I opened #1390 with the approach I took.

paulmelnikow added a commit that referenced this issue Dec 27, 2017
Ref: #1379

This takes a naive approach to font-width computation, the most compute-intensive part of rendering badges.

1. Add the widths of the individual characters.
    - These widths are measured on startup using PDFKit.
2. For each character pair, add a kerning adjustment
    - The difference between the width of each character pair, and the sum of the characters' separate widths.
    - These are computed for each character pair on startup using PDFKit.
3. For a string with characters outside the printable ASCII character set, fall back to PDFKit.

This branch averaged 0.041 ms in `makeBadge`, compared to 0.144 ms on master, a speedup of 73%. That was on a test of 10,000 consecutive requests (using the `benchmark-performance.sh` script, now checked in).

The speedup applies to badges containing exclusively printable ASCII characters. It wouldn't be as dramatic on non-ASCII text. Though, we could add some frequently used non-ASCII characters to the cached set.
@RedSparr0w
Copy link
Member

Do you think this could be closed now that we have #1390 merged?
@paulmelnikow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Server, BaseService, GitHub auth, Shared helpers performance-improvement Related to performance or throughput of the badge servers
Projects
None yet
Development

No branches or pull requests

5 participants