Use pdfkit instead of canvas #496

ForbesLindesay · 2015-07-21T10:33:24Z

This removes the dependency on the native "canvas" module and instead uses pdfkit. I did this because I was unable to get canvas installed on my machine. I'm not sure if this will perform better or worse, but it seems like something that may be worth benchmarking.

espadrine · 2015-07-21T21:57:56Z

I definitely know the struggle of getting it to work! It is one of the secret reasons to switch away from Heroku (although installing libcairo has been decently easy on raw Linux so far).

It's great to have an experiment to play with, thanks a lot!

$ node time-pdfkit
avg: 36682.7770700637
median: 18485
$ node time-canvas
avg: 91703.15923566879
median: 77127

Not sure how that is possible. Will make more tests.

Of course, we'll need to have comparisons of actual badge images, with extremes (little text, huge chunk of text, unicode…)

espadrine · 2015-07-21T22:06:06Z

I also had this idea to avoid the problem of sending literal text in badges which will always be seen wrong with people that don't have the right font, by converting it to an SVG path, which I had implemented in node-canvas. But I never got around to doing the last step in shields, so I guess it isn't that valuable at this point.

ForbesLindesay · 2015-07-22T10:57:49Z

That looks like a massive performance impact! Am I reading those numbers correctly?

I would think converting them to paths would make the file size a fair bit larger for very little real gain. Most browsers have a sufficient set of fonts and the fallbacks won't usually look too bad. I also would think that the literal text could be better handled by screen readers (although I don't know if any screen readers actually do that)

espadrine · 2015-07-22T11:29:17Z

That looks like a massive performance impact

It's in nanoseconds. An average of 100ms isn't the bottleneck anyway, but I expected the C implementation to be faster than the JS implementation.

Unfortunately, it doesn't fix the bug we have with th, as in: . However, it does seem to deal with unicode well. (Edit: maybe it is a matter of fallback font; does pdfkit have the concept?)

It isn't as accurate for long strings of text, at least on Linux:

PDFKIT	NODE-CANVAS

ForbesLindesay · 2015-07-22T14:04:52Z

I'm not surprised, well optimised JavaScript modules are usually faster than their C counterparts when they are called from JavaScript unless they are very long running because of the large amount of time spent crossing the boundary between languages. (for example, the JavaScript MySQL driver consistently outperforms C bindings).

I could implement this as a fail over (i.e. use canvas if available, otherwise fall back to pdfkit) if the accuracy on long strings is too much of an issue? Another alternative would be to just allow slightly larger tolerances for very long strings (which risks having excessive margins but would prevent things being cut of)

For th we could special case it in the measure function and just add an adjustment factor? With measureTextWidth factored out into a separate module that should be relatively easy to do.

espadrine · 2015-07-22T15:09:46Z

I could implement this as a fail over (i.e. use canvas if available, otherwise fall back to pdfkit) if the accuracy on long strings is too much of an issue?

Right now, my feeling is that long strings are uncommon enough that the benefit of removing the cairo dependency outweighs the issue. But it feels like an IEEE754 bug that might be fixable in pdfkit. Otherwise, a hack for long strings (such as adding a few pixels to the result of the computation) might work.

For th we could special case it in the measure function and just add an adjustment factor?

Given that it isn't a regression, it isn't necessary for this pull request. I wonder if I'm right about fallback fonts though.

ForbesLindesay · 2015-07-22T16:19:04Z

Not sure about fallback fonts in PDFkit. It errors if the ttf doesn't exist, and if you were actually using it to generate a PDF, it would embed the font in the pdf.

I'll add support for falling back to a default font if the ttf file doesn't exist so that we can run tests on travis again.

espadrine · 2015-07-22T16:45:23Z

It errors if the ttf doesn't exist

Right, but what happens if the glyph isn't defined in the font?

ForbesLindesay · 2015-08-20T15:45:50Z

I'm not sure, that may be an issue

espadrine · 2015-12-12T03:10:03Z

While reading this interesting implementation of OTF-based font measurement, I realized that we may have been mistaken. I think that perhaps PDFKit's widthOfString does not return pixels, but, as is extensively used in the documentation, points. Fortunately, those units are proportional to each other, which means that we merely have to decide of a factor to multiply the result of widthOfString to, and get pixels out. In browsers on screen, it seems hardcoded by CSS3 to 4/3 (~1.33).

Trying to figure out the exact unit used for widthOfString is a bit complex: text calls font calls widthOfGlyph, so it should be of unit widthOfGlyph * fontSize / 1000 and widthOfGlyph is hmtx * 1000 / unitsPerEm. Is it hmtx * point / em * em / FUnit? point is a fixed distance, em is the width and height of the master grid, so fontSize = point / em. unitsPerEm are set from the head table which are documented by Apple as:

This value should be a power of 2. Its range is from 64 through 16384.

Chapter 1 unveils that it reflects the density of the master grid on which curve points are set. The more FUnits there are in an em, the more precise typographs can be when placing curve points.

hmtx is defined in Apple's documentation as a longHorMetric, which is unhelpful, as it isn't defined anywhere. However, Microsoft's documentation indicates that it holds font design units (FUnits).

As a result, after cancellation of units in the original formula, I believe widthOfString is of unit point.

espadrine · 2016-03-12T23:35:21Z

I am not deeply knowledgeable about Microsoft fonts, but I accidentally noticed today that the md5 of my local Verdana font, from the Ubuntu package (3ba52ab1fa0cd726e7868e9c6673902c, 137K) is distinct from that of my Mac (4ba636f1f53078cdd483de8f7630600f, 182K). PDFkit seems to have been built with a Mac. None of the issues I reported are there with the Mac version of Verdana.ttf (well, except for the ᵗʰ thing), which I copied to my Linux machine. I think we can move forward with this patch, assuming I keep using the Mac version going forward, which awkwardly I cannot provide publicly for licensing reasons.

Maybe Linux' Verdana was always broken and Cairo (or is it Pango?) used tricks around that. Maybe Mac's Verdana includes hinting information that PDFkit uses and that is absent from Linux' version, which assumes you read data from a field which PDFkit doesn't currently support. We may never know.

espadrine · 2016-03-13T00:00:14Z

As an aside to get back to in the future, I'd like to note another possibility, which we might consider if font choice becomes necessary. This library makes use of opentype.js, which allows using font information to convert text to an SVG path, as seen here. As it is, it produces a bloated image, with huge paths even for simple letters. Glyphs could be optimized by omitting needless points and caching the result.

Use pdfkit instead of canvas

e2fbb58

Support fallback font when Verdana isn't available

36b664f

espadrine mentioned this pull request Nov 10, 2015

Support NodeJS 4.x - Upgrade Canvas #556

Closed

espadrine mentioned this pull request Feb 14, 2016

Moving text width measure to it's own function, with cache #469

Closed

espadrine merged commit 36b664f into badges:master Mar 13, 2016

paulmelnikow mentioned this pull request Apr 17, 2017

error while installing "gh-badges" #653

Closed

snyk-bot mentioned this pull request Oct 15, 2022

[Snyk] Upgrade fast-xml-parser from 4.0.7 to 4.0.10 wittyphantom333/shields#5

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use pdfkit instead of canvas #496

Use pdfkit instead of canvas #496

ForbesLindesay commented Jul 21, 2015

espadrine commented Jul 21, 2015

espadrine commented Jul 21, 2015

ForbesLindesay commented Jul 22, 2015

espadrine commented Jul 22, 2015

ForbesLindesay commented Jul 22, 2015

espadrine commented Jul 22, 2015

ForbesLindesay commented Jul 22, 2015

espadrine commented Jul 22, 2015

ForbesLindesay commented Aug 20, 2015

espadrine commented Dec 12, 2015

espadrine commented Mar 12, 2016

espadrine commented Mar 13, 2016

Use pdfkit instead of canvas #496

Use pdfkit instead of canvas #496

Conversation

ForbesLindesay commented Jul 21, 2015

espadrine commented Jul 21, 2015

espadrine commented Jul 21, 2015

ForbesLindesay commented Jul 22, 2015

espadrine commented Jul 22, 2015

ForbesLindesay commented Jul 22, 2015

espadrine commented Jul 22, 2015

ForbesLindesay commented Jul 22, 2015

espadrine commented Jul 22, 2015

ForbesLindesay commented Aug 20, 2015

espadrine commented Dec 12, 2015

espadrine commented Mar 12, 2016

espadrine commented Mar 13, 2016