As I understand it from Behdad's talk, HarfBuzz does, intentionally, leave this decision up to the application: the app decides which bundle of codes to send to HarfBuzz as a unit, and that decision is both script- and language-dependent. So, yes, the word processor (or whatever) would need to know when an o-e sequence should be rendered as one ligature and when it shouldn't. HarfBuzz only needs to know how to form the ligature *when* it is asked to do so.
Whether or not any particular application does a good job of making that call is an issue for the application project's team; it's out of scope for HarfBuzz.
Posted May 23, 2012 16:31 UTC (Wed) by gioele (subscriber, #61675)
> Whether or not any particular application does a good job of making that call is an issue for the application project's team; it's out of scope for HarfBuzz.
Yet I suppose that applications would love to have some kind of application-independent ligature database à la ICU, instead of each one having to create and distribute a database of that kind itself: something that you can query with ligatures_for("coeur", "fra-Latn") = {["œ", 1..2]}.
Ligature database
Posted May 24, 2012 0:12 UTC (Thu) by nix (subscriber, #2304)
It should be doable with suitable per-language general rules plus a list of exceptions. You know, like TeX has been doing for hyphenation for decades now.
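For what it's worth, here is a rough sketch of how "general rules plus a list of exceptions" might look for the ligatures_for() query suggested above. Everything in it is an assumption made for illustration: the LIGATURE_DATA table, its layout, and the handful of French entries are hypothetical, and no such database or API exists in HarfBuzz or ICU as far as this sketch is concerned. Positions are 0-based and inclusive, to match the 1..2 in the "coeur" example.

```python
# Hypothetical per-language data: general rules plus whole-word exceptions.
# The entries are illustrative only, not a real linguistic database.
LIGATURE_DATA = {
    "fra-Latn": {
        "rules": {"oe": "œ"},
        "exceptions": {"coefficient", "coexister", "moelle"},
    },
}


def ligatures_for(word, lang):
    """Return (ligature, start, end) tuples, with 0-based inclusive indices."""
    data = LIGATURE_DATA.get(lang)
    lowered = word.lower()
    # Unknown language, or a word listed as an exception: no ligatures.
    if data is None or lowered in data["exceptions"]:
        return []
    found = []
    for sequence, ligature in data["rules"].items():
        start = lowered.find(sequence)
        while start != -1:
            found.append((ligature, start, start + len(sequence) - 1))
            start = lowered.find(sequence, start + len(sequence))
    return sorted(found, key=lambda item: item[1])


print(ligatures_for("coeur", "fra-Latn"))
print(ligatures_for("moelle", "fra-Latn"))
```

A real database would presumably need pattern-based exceptions rather than whole words, much as TeX's hyphenation patterns do, but the lookup order (exceptions first, general rules second) would stay the same.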