W3C

– DRAFT –
Web Fonts Working Group Teleconference

06 June 2023

Attendees

Present
bberning, Garret, JHudson, skef, Vlad
Regrets
-
Chair
Garret
Scribe
skef

Meeting minutes

<Garret> Skef: re: alternate substitutions

<Garret> Skef: a mechanism for choosing substitutions.

<Garret> Skef: eg. aalt

<Garret> Skef: they often have the kitchen sink; they're not really normal features. When you turn them on you're not really turning them on.

<Garret> Skef: there's a selection phase and a use phase. You might present a user with a UI showing the alternates for a glyph.

<Garret> Skef: in the use phase you put in notation saying I want a substitution for a codepoint at this index.

<Garret> Skef: this is slightly complicated. Sometimes there are duplicate tables

<Garret> Skef: same feature name, separate tables.

<Garret> Skef: question is, when we are supporting these, do we require you to subset with the kitchen sink, or do we have the option of picking the particular index?

<Garret> John: I can give additional background.

<Garret> John: aalt and palt are not a good example of this. Not user facing.

<Garret> John: one we should look at are the character variant features.

<Garret> John: where each feature is associated with a base character and some alternate variants of it.

<Garret> John: e.g. the font Athena Ruby has 25 different forms of the Greek alpha. Do we send all of them, or do we try to find the specific variant?

<Garret> Skef: I think it's true that aalt/palt usage is close to zero, but since we're making a protocol that has general use we need to consider how they're generally used across applications.

<Garret> Skef: probably the largest problem in terms of use and size.

<Garret> Skef: can increase the size significantly.

<Garret> John: my point was that there are non-zero cases of user features.

<Garret> John: we need a general protocol principle for type 3 lookups.

<Garret> Skef: I have some evidence that, as features, everything works by virtue of the fact that the subsetter keeps everything.

<Garret> Skef: harfbuzz will happily include all of those.

<Garret> Skef: but that can be a big set. Is it worth providing the option in the protocol to set an index, either requiring it or giving an option to the server side to restrict the subset?

<Garret> Skef: and what that means in terms of the tables that will be generated. You'd fake the earlier substitutions and have the correct set of values.

<Garret> Skef: for earlier values do an identity substitution.
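
A minimal sketch of the identity-substitution idea described above, assuming 1-based alternate indices and an alternate set modeled as a plain list of glyph names. The function name `restrict_alternate_set` and the glyph names are hypothetical and chosen only for illustration; this is not the subsetter's actual behavior.

```python
def restrict_alternate_set(base_glyph, alternates, wanted_index):
    """Return a reduced alternate set that keeps only the wanted alternate.

    Slots before wanted_index (1-based, an assumption of this sketch) are
    filled with the base glyph, i.e. an identity substitution, so the wanted
    alternate stays at its original index. Later alternates are dropped.
    """
    if not 1 <= wanted_index <= len(alternates):
        raise ValueError("wanted_index is outside the alternate set")
    padding = [base_glyph] * (wanted_index - 1)
    return padding + [alternates[wanted_index - 1]]


# Hypothetical glyph names for a base glyph with four alternates.
original = ["alpha.alt1", "alpha.alt2", "alpha.alt3", "alpha.alt4"]
restricted = restrict_alternate_set("alpha", original, 3)
print(restricted)  # ['alpha', 'alpha', 'alpha.alt3']
```

In this sketch the subset only needs the glyphs for `alpha` and `alpha.alt3`, while a client that remembered index 3 still gets the same alternate.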

<Garret> Garret: from a technical perspective: the first request doesn't know what the substitution indices are and so must grab everything (for the requested codepoints).

<Garret> Skef: the first request may match a font for which the indices are already known.

<Garret> Skef: with these tables there may be some earlier point where you've requested the full subset.

<Garret> Skef: with character variants it's possible that's not true.

<Garret> Skef: they may have a specification for the font that says what they are.

<Garret> Garret: another issue is that the indices are implementation details. They may change when the font changes.

<Garret> Skef: indices are somewhat of a convention.

<Garret> John: indices are into the lookup tables.

<Garret> Skef: so when you're using one of these things and you want to render the glyph, the user picks from a list, you remember the index, and you use it in subsequent calls.

<Garret> Skef: these indices, low positive integers, are the interface.

<Garret> John: CSS did define ways to address these.

<Garret> John: not sure if they are still in there.

<Garret> Garret: would be great to find it and review.

<Garret> Vlad: earlier we had discussions on glyph indices vs codepoints as the basis.

<Garret> Vlad: which addresses this, but the argument against is that it requires a mandatory roundtrip to grab layout data in order to grab glyph indices.

<Garret> Vlad: and that roundtrip kills the benefit of reducing extra data.

<JHudson> https://www.w3.org/TR/css-fonts-4/#ref-for-character-variant

<Garret> Vlad: we might run into the same problem, by saving time we might require an additional round trip.

<JHudson> [Link for CSS Fonts Module details on character variants]

<Garret> Skef: I don't think so, we'd have to have a way of saying we want the full table. The reality of these tables is that they are not commonly used and they're unusual in the way that you use them. The best practice for storing state is to start with the codepoints and set of features, because that allows you to move between versions of fonts.

<Garret> Skef: the best practice with these is to keep track of codepoints + features + index.

<Garret> Vlad: when a client starts rendering a page, what it has is the character set used on the page and which features are used. There is no font information.

<Garret> Vlad: so right now the client asks the server and it gives anything that's relevant.

<Garret> Vlad: the rest of the implementation would be up to the server.

<Garret> John: the number of fonts that have alternate substitution lookups is still relatively small.

<Garret> John: we're looking at pretty small benefits. The difference between 2 and 4 glyphs.

<Garret> John: to Skef's point, you can use CSS to turn on lower-level OpenType features. If someone did turn on the aalt feature I'm not sure what it would do to the text. It's a dumping ground for all the substitutions in their UI.

<Garret> John: a lot of tools automatically build it.

<Garret> John: it could involve a lot of glyphs being downloaded.

<Garret> John: my inclination is if someone makes a bad decision in CSS we don't need to worry about it too much at the protocol level in IFT.

<Garret> Vlad: depending on the byte savings, this may not give us noticeable improvements.

<Garret> John: because CSS does provide a mechanism targeting character variants.

<Garret> Garret: It's good to see the CSS feature for this, that addresses some of the concerns as it provides the indices up front.

<Garret> Garret: also it might be reasonable to just keep range request as the option for dealing with fonts with large alternate sets.

<JHudson_> Link for GSUB type 3 test font (Athena Ruby, unclear license) : https://www.doaks.org/resources/athena-ruby

<Garret> Vlad: might be worth verifying the protocol can communicate the feature sets.

<Garret> Garret: also worth noting these features are optional and not sent by default, so the standard use cases won't get them.

Update on noise privacy

privacy simulations

(Addresses concern about codepoint sets revealing content)

Added to the simulation: Making the additional codepoints added proportional to the needed number of codepoints

The variable metric works better than the fixed metric in the simulation

Roughly doubles the size (with Latin) but is still well under the size of a conventional font

Also tested some additional languages, including Arabic

Also worked well: didn't need to add too many codepoints to match many pages

Similar with Devanagari and Japanese (although the page set for the former was small)

Chinese and Korean results were similar to Japanese

Question: To what extent will noise reduce the need for future requests?

<Garret> Skef: one concerning thing is the impact this has on caching. This would break the ability to do regional caching. We likely want to have the randomization seeded by the content.

<Vlad> Re: noise - specifically because the noise is frequency weighted

<Garret> Garret: Agreed, and this may actually be better from a privacy perspective since you won't have multiple requests adding new noise to the same content.

<Garret> Garret: I'll look into this more and see if we should include it.
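
A minimal sketch of content-seeded noise, combining the two ideas discussed above: the number of added codepoints is proportional to the number needed, and the randomization is seeded by the content so identical content always yields the same request. Everything here is an assumption made for illustration: the function name `add_seeded_noise`, the `ratio` parameter, using the needed codepoint set as a stand-in for "the content", and uniform sampling (the simulation's noise is frequency weighted, which this sketch does not model).

```python
import hashlib
import math
import random


def add_seeded_noise(needed, candidates, ratio=1.0):
    """Return needed codepoints plus deterministic noise codepoints.

    The seed is derived from the needed set itself, so two clients with the
    same content produce the same noisy set (which keeps requests cacheable).
    The number of noise codepoints is ceil(ratio * len(needed)), capped by
    the size of the candidate pool.
    """
    needed = set(needed)
    canonical = ",".join(str(cp) for cp in sorted(needed)).encode("ascii")
    seed = int.from_bytes(hashlib.sha256(canonical).digest()[:8], "big")
    rng = random.Random(seed)
    pool = sorted(set(candidates) - needed)
    count = min(len(pool), math.ceil(ratio * len(needed)))
    return needed | set(rng.sample(pool, count))


needed = {ord(c) for c in "hello"}       # 4 distinct codepoints
candidates = range(0x20, 0x7F)           # printable ASCII as a toy pool
first = add_seeded_noise(needed, candidates)
second = add_seeded_noise(needed, candidates)
print(first == second)  # True: same content, same noise
```

Because the noise is a pure function of the content, repeated requests for the same content do not add fresh noise each time.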

caching [issues #119](https://github.com/w3c/IFT/issues/119)

<Garret> Garret: started a TAG review. There was a comment that there were still some open issues from the HTTP review. I clarified only patch subset is in scope for the review and caching concerns are the only open issue on the patch subset side. Planning to add additional spec text to explain caching concerns/implementation approaches to enable caching server side, client side, and with a CDN in the middle.

<Garret> Vlad: next issue is about the expiring brotli spec.

<Garret> Garret: I'll check in on that and see if we can get it renewed.

<Garret> Garret: we have a fallback of copying the relevant paragraph from the specification if we need to.

Minutes manually created (not a transcript), formatted by scribe.perl version 210 (Wed Jan 11 19:21:32 2023 UTC).

Diagnostics

Maybe present: Question

All speakers: Question

Active on IRC: bberning, Garret, JHudson, JHudson_, skef, Vlad