Enable embedding of OpenType fonts #11

jbowtie · 2017-02-13T04:13:39Z

This is a fairly large pull request that enables the embedding of OpenType fonts, along with basic support for kerning and ligatures. Future pull requests will refine this functionality.

There are also a tiny number of correctness fixes.

This low-level functionality is needed for more advanced typography and layout.

The API now allows you to register a font with an alias. The font is embedded when the PDF is written out.
The tests directory now has a few OFL-licensed fonts for testing purposes.

Enabled OpenType features get applied. At the moment only standard ligatures and pair positioning for the default Latin script have been fully implemented and tested.

The immediate future work involves:

Enhancing the API so OpenType features can be turned off and on. (DONE)
Implementing the parsing and application of the full GPOS and GSUB tables.
Falling back to the "kern" table for basic kerning.
Enabling a wider set of features by default. (DONE)
Handling fonts that use AAT for layout.

Known issues:

Times out when embedding large fonts (most CJK fonts, for instance). This will be fixed when subsetting is properly supported. (FIXED performance-wise, though subsetting is not properly supported yet.)
Some readers corrupt the copy/paste information. This is probably an issue with serializing the ToUnicode map.

kpandya3 · 2017-02-15T17:11:03Z

Hey @jbowtie, thanks for submitting the PR! We will review it by this weekend and get back to you.

tyre · 2017-02-18T06:09:36Z

I've started reviewing this and will be doing some refactoring as I go. @jbowtie how happy are you with the test coverage for this feature?

tyre · 2017-02-18T06:36:44Z

@jbowtie could you please enable collaboration on the PR? Then we can all work together on this :)

https://github.com/blog/2247-improving-collaboration-with-forks

jbowtie · 2017-02-18T19:46:03Z

@tyre collaboration is enabled. I don't know that there's much value in increasing test coverage until more of the GSUB/GPOS features are implemented, but I'll happily maintain any tests that get added.

jbowtie · 2017-03-05T22:52:43Z

@tyre Is there any progress on reviewing/refactoring this? I appreciate that it's a LOT to take in -- I'm happy to provide orientation and/or discuss refactoring in the comments here.

tyre · 2017-03-18T08:14:35Z

@jbowtie Working on it this weekend! Have gotten the basics worked out and refactored; now doing some of the auxiliary headers. I want to make sure we have something that can reasonably scale to other font types.

jbowtie · 2017-04-10T07:38:25Z

I've now implemented a large enough chunk of font processing to better express a reasonable workflow and division of responsibilities.

The script can be autodetected via Unicode properties, or this (along with an optional language) can be specified.
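
As a rough illustration of that fallback order (explicit script first, then autodetection, then a default), here is a minimal Python sketch. It is not code from this pull request: the range table is a deliberately tiny, hypothetical stand-in for real Unicode script properties, and `detect_script` is an invented name. The four-letter values are standard OpenType script tags.

```python
# Hypothetical, heavily abbreviated range table: (first, last, OpenType script tag).
# A real implementation would consult the Unicode Script property instead.
SCRIPT_RANGES = [
    (0x0041, 0x005A, "latn"),  # Basic Latin uppercase
    (0x0061, 0x007A, "latn"),  # Basic Latin lowercase
    (0x00C0, 0x024F, "latn"),  # Latin-1 letters and Latin Extended-A/B
    (0x0370, 0x03FF, "grek"),
    (0x0400, 0x04FF, "cyrl"),
    (0x0590, 0x05FF, "hebr"),
    (0x0600, 0x06FF, "arab"),
]


def script_of(ch):
    """Return the script tag for one character, or None if it is script-neutral here."""
    cp = ord(ch)
    for first, last, tag in SCRIPT_RANGES:
        if first <= cp <= last:
            return tag
    return None


def detect_script(text, explicit=None):
    """Prefer a caller-supplied script; otherwise use the first character with a known script."""
    if explicit is not None:
        return explicit
    for ch in text:
        tag = script_of(ch)
        if tag is not None:
            return tag
    return "DFLT"  # OpenType's default script tag


print(detect_script("123 مرحبا"))
```

Digits and spaces carry no script in this table, so they are skipped and the first Arabic letter decides the result.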

A script-specific shaper can tag individual characters with OpenType properties -- by default this would be detecting fraction numerator/denominator if 'frac' feature is active, positional shaping (initial/medial/final/isolated) for scripts such as Arabic, marking leftmost/rightmost characters if optical bounds feature is active, etc.

Now we do font-specific things:

Characters get converted to initial glyphs one-for-one via font cmap -- tags get transferred one-for-one.

Glyph substitution takes place according to OpenType rules -- this takes care of things like positional shaping, ligatures, mirroring, contextual replacements, etc. If there's no GSUB table this does nothing.

Positions are initialized to the advance width declared in the font metrics.

Glyph positioning takes place according to OpenType rules -- this takes care of mark placement, kerning, cursive alignment, etc. If there's no GPOS table, do kerning (using 'kern' table) or nothing.
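
The font-specific steps above can be sketched end to end with a toy font. This is a simplified Python illustration, not the implementation in this pull request: the glyph ids, advance widths, and tables are made up, ligature substitution is reduced to fixed pairs, and positioning is reduced to pair kerning.

```python
# Toy font data; every value here is invented for illustration.
TOY_FONT = {
    "cmap": {"f": 1, "i": 2, "A": 3, "V": 4},
    "ligatures": {(1, 2): 5},            # GSUB-like: f + i -> single "fi" glyph
    "advances": {1: 300, 2: 250, 3: 650, 4: 650, 5: 520},
    "kern_pairs": {(3, 4): -80},         # GPOS/kern-like pair adjustment, font units
}


def to_glyphs(text, font):
    """Map characters to glyph ids one-for-one; glyph 0 stands for a missing glyph."""
    return [font["cmap"].get(ch, 0) for ch in text]


def substitute(glyphs, font):
    """Replace known glyph pairs with their ligature glyph, scanning left to right."""
    out, i = [], 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in font["ligatures"]:
            out.append(font["ligatures"][pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def position(glyphs, font):
    """Start from the declared advance widths, then apply pair kerning to the first glyph of each pair."""
    advances = [font["advances"].get(g, 0) for g in glyphs]
    for i in range(len(glyphs) - 1):
        advances[i] += font["kern_pairs"].get((glyphs[i], glyphs[i + 1]), 0)
    return advances


def layout(text, font):
    glyphs = substitute(to_glyphs(text, font), font)
    return glyphs, position(glyphs, font)


print(layout("fiAV", TOY_FONT))
```

Keeping substitution and positioning as separate passes mirrors the ordering described above: positions are only computed once the final glyph sequence is known.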

At this point we have some output glyphs and their final positioning data. This can be tested for correctness! HarfBuzz tests lots of complex script layout scenarios against various fonts for instance.

When writing out to PDF, we need to scale the font metrics so they match the 1000-units-per-em glyph space that PDF uses. The same applies to positioning values.
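
The scaling itself is a single multiplication. A minimal Python sketch, assuming the font's units-per-em value has already been read from its `head` table; the function names are hypothetical:

```python
def to_pdf_units(value, units_per_em):
    """Scale a font-unit value (advance, kerning, offset) to 1000 units per em."""
    return value * 1000 / units_per_em


def scale_widths(advances, units_per_em):
    """Scale a list of advance widths in font units for use in a PDF widths array."""
    return [to_pdf_units(a, units_per_em) for a in advances]


# A 2048-units-per-em font: an advance of 1024 font units is half an em.
print(to_pdf_units(1024, 2048))
```

Fonts that already use 1000 units per em pass through unchanged, so the same code path can serve every font.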

This pull request just positions individual glyphs. Ideally you do that only for the subset that requires it; elsewhere you use the TJ syntax if you have kerning adjustment or omit positioning entirely (relying on the standard glyph widths).
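
For the kerning-only case, a TJ array interleaves glyph strings with numeric adjustments. Here is a small Python sketch of building one. It assumes a composite font whose glyph ids are written as two-byte hex codes and adjustments that are already in 1000-per-em units; `tj_operator` is a hypothetical helper, not part of this pull request. Numbers in a TJ array are subtracted from the current position, so a tightening (negative) kern is written as a positive number.

```python
def tj_operator(glyphs, adjustments):
    """Build a TJ operator string.

    glyphs: glyph ids, written as 4-digit hex codes (assumes a two-byte encoding).
    adjustments: per-glyph adjustment applied after that glyph, in 1/1000 em;
                 negative values tighten, positive values loosen.
    """
    parts, run = [], ""
    for gid, adj in zip(glyphs, adjustments):
        run += f"{gid:04X}"
        if adj != 0:
            parts.append(f"<{run}>")
            # TJ subtracts the number from the position, hence the sign flip.
            parts.append(f"{-adj:g}")
            run = ""
    if run:
        parts.append(f"<{run}>")
    return "[" + " ".join(parts) + "] TJ"


print(tj_operator([3, 4], [-80, 0]))
```

Glyphs with no adjustment between them are merged into one string, which keeps the content stream smaller than positioning every glyph individually.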

So what does that imply?

The layout_text function needs to know:

font
active OpenType features (can have default based on spec)
script (can be autodetected by some Unicode module or macro)
language (can default to 'dflt' per spec)
optional width/height constraints (if we ever implement line wrapping/alignment)
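
One way to picture these inputs is as a single options value with defaults. The Python sketch below is purely illustrative: `LayoutOptions`, its field names, and the default feature set (`liga` and `kern`) are assumptions, not the API of this pull request.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class LayoutOptions:
    font: str                                          # alias the font was registered under
    features: frozenset = frozenset({"liga", "kern"})  # assumed default feature set
    script: Optional[str] = None                       # None means "autodetect"
    language: str = "dflt"                             # default language system tag
    max_width: Optional[float] = None                  # reserved for future line wrapping
    max_height: Optional[float] = None

    def with_feature(self, tag, enabled):
        """Return a copy with one OpenType feature switched on or off."""
        if enabled:
            features = self.features | {tag}
        else:
            features = self.features - {tag}
        return replace(self, features=features)


opts = LayoutOptions(font="body").with_feature("liga", False)
print(sorted(opts.features))
```

Making the value immutable means a per-call override, such as disabling ligatures for one run of text, cannot leak into later layout calls.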

Parser module responsible for converting the font binary to something useful.
Positioning module and Substitution module responsible for their respective areas (calling out to the parser as needed).
Shaping module responsible for contextual analysis and tagging of characters. I believe AAT fonts have a built-in shaper.
Unicode module for access to Unicode database properties (script detection, joining type, etc.)
PdfEmbedding module to handle writing out required objects
Text module needs to be smart enough to select correct PDF operators based on positioning type. Ideally we also write out /ActualText (needed for complex scripts that re-order characters but also handy when more advanced layout otherwise confuses a PDF reader).

jbowtie · 2017-05-10T11:47:45Z

Since @tyre hasn't been in a position to push his refactoring and I've made substantial progress on my arabic branch, I'll begin pushing a series of smaller PRs based on my personal refactoring.

I've created #12 and #13 to address the CI build and byte_size bug respectively. Once those are merged I'll have a solid basis for a series of much easier-to-review pull requests to flesh out the functionality.

whossname · 2017-06-24T09:41:43Z

@jbowtie has there been any progress recently? Based on this conversation it looks like this project is dead. This is a shame, I was thinking about implementing tables using this and if the resulting code was useful, offering to add it to the code base.

jbowtie · 2017-06-24T21:57:17Z

@whossname I don't know how active @tyre is as a maintainer. I have a fork I'm maintaining in the meantime -- the default opentype branch is where I'm merging the smaller pull requests as I produce them and the arabic branch is a more advanced (and more correct) implementation of this pull request.

whossname · 2017-06-25T03:19:56Z

Ok, I'm mostly interested in refactoring and adding to the Geometry stuff. When I get to this (still a month from when I need it) I might look at contributing to your fork instead of this one seeing as yours is being actively worked on.

jbowtie added 28 commits November 1, 2016 11:15

Update dependencies to latest

bd5e5af

Fix link to helpful PDF

1322805

Add basic TrueType parser and supporting test

3958e5e

Add support to register fonts with context and (in theory) write out a composite font

04aea66

Use byte_size for correct lengths and offsets

863af16

Tweak serialization for clarity

43ceb5a

Register and embed fonts

6207871

Generate correct glyph widths

36f2ba8

Tweak layout text flow to enable future implementation of GSUB/GPOS features

101f574

Add very rough ligature support

c6aa453

Start looking at GPOS parsing

597509c

Add Noto Italic used to visually inspect ligature output

a7164e4

Move OpenType fonts into a GenServer

79b74d4

Eliminate compiler warnings

4ea84a8

Partial cleanup of some hard-coded bits; more GSUB formats

70e2918

Cleanup and improve existing GSUB/GPOS lookups

b9c814d

Move write_positioned_glyphs into correct module

be6b3f5

Implement chaining context substitution

e4824d1

Bump depedencies

536617f

Build correct ToUnicode map to enable basic copy/paste

4c0b3a6

Better test names

5ebbd27

Embed the whole OTF font instead of just the CFF table

755e433

Capture familyClass from OS/2 table

0fe96a7

Implement class-based kerning

39a74ab

Fix parsing and application of kerning

c936e7e

Update Travis config to use OTP 17.4 or later

62d5e8c

Do better and specify elixir language release (which will select appropriate OTP release)

4c9154a

Use sFamilyClass, change embed subtype

400a8fc

Only parse one cmap and one set of names

4f6c089

jbowtie added 4 commits February 16, 2017 21:35

Make things fast enough to handle a CJK font

b4a0db9

Use actual leading value instead of hardcoded test value

b7191f3

Implement GPOS type 1 (single glyph adjustment)

b197c10

Allow script and lang to be passed in with sensible fallback policy

0ea1d4e

jbowtie added 2 commits February 19, 2017 12:22

Add API to enable/disable individual OpenType features

e9e1721

Add support for parsing GDEF and handling mark positioning

162319f
