Web Fonts WG – 19 October 2021

Meeting minutes

<chris> https://github.com/w3c/IFT/blob/main/RangeRequest.bs

Garret: If you guys hadn't seen, the HarfBuzz subsetting API hit "stable" status. That's a cool milestone. It's ready for actual use.

Vlad: Great. Thank you.

<Vlad> Agenda: list of issues https://github.com/w3c/IFT/issues

Vlad: Let's go starting from the top, one by one

Vlad: Chris, is there any way to record this review to reference later?

chris: There is, but I'm doing it. I've got an issue with the meta-review. I'm linking to it. I'll take care of the bookkeeping.

Vlad: Are there any specific resolutions we need to make?

chris: no.

I18n checklist

chris: I filled this in. The answer is basically "no" to most things. It would be good to get more eyes on it.

chris: There was only one thing which was possibly applicable. I don't think we do that in the way that they mean, but I'll let them be the judge.

Vlad: Text is a sequence of characters. When we ask for separate characters, we don't preserve order, so it isn't even qualified as text

chris: I agree.

Garret: I reviewed the checklist as well. Nothing jumped out at me. It all looked okay.

Vlad: If all the questions are "no" then there's no action to take.

chris: This is just something I filled in so they can review our answers. If they have questions they can ask them here

Self-Review Questionnaire: Security and privacy

chris: This asked about temporary identifiers. We make 64-bit checksums but they don't convey security information and they can't be used for tracking because every time you update a font the checksum will change.
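
A minimal sketch of why such a checksum does not act as a stable identifier: any change to the font bytes yields a different value. The hash used here (BLAKE2b truncated to 64 bits) is an assumption for illustration only; the minutes do not say which checksum function the spec uses, and the font bytes are made up.

```python
import hashlib


def checksum64(font_bytes: bytes) -> int:
    """Return a 64-bit checksum of the font data.

    BLAKE2b truncated to 8 bytes is a stand-in, not the spec's function.
    """
    digest = hashlib.blake2b(font_bytes, digest_size=8).digest()
    return int.from_bytes(digest, "big")


# Two made-up font payloads that differ only by a small update.
font_v1 = b"\x00\x01\x00\x00" + b"glyph data, version 1"
font_v2 = b"\x00\x01\x00\x00" + b"glyph data, version 2"

print(hex(checksum64(font_v1)))
print(hex(checksum64(font_v2)))
```

The two printed values differ, so a checksum recorded against one version of a font says nothing about the next version.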

Garret: The other thing: Most browsers implement caching on a domain basis; those checksums won't leave a particular domain.

chris: It would be useful to edit my answer to say that.

Garret: Okay, I can do that.

Range requests and preflights

Garret: They're noting the development of another spec which may be interesting to us.

Vlad: It sounds like we will be working on a range request specification. Can we use it, Myles?

myles: i dunno. presumably.

Garret: It's a change to range requests that avoids CORS in certain situations. Hopefully we can be in those situations

Early wide review of IFT

chris: This is the meta-issue where I track all the other issues. A11y is happy so I checked them off. We need to do privacy though. I don't know about security though. TAG review is blocked on those.

FAST Checklist review for IFT

Vlad: In addition to what chris indicated, there was a question about whether something we do affects content presentation. We might affect content presentation. If a font is requested but not available, and the content relies on a unique character set, it will not be rendered.

myles: PUA characters?

Vlad: Not necessarily. For example, you might have content in a unique language that is archaic that is not universally supported anywhere. If you want to quote something in that language, that text might not render.

myles: that's the same for all fonts I think

Vlad: For some languages, the expectation might be if you don't have a web font loaded, there will be some other font available to display the content. But it's not universally true for every language and character set

chris: I think myles is right. This is general about webfonts. Webfonts are well-accepted. We only need to cover the differences between normal webfonts

Garret: In terms of what content will and will not be there, there's no difference comparing a font with IFT and a font without IFT.

Vlad: Many browsers have certain rules in place about for example if you wait for a font to load and if it doesn't you abandon it. Those rules will never be an issue in IFT. Hopefully

<chris> https://drafts.csswg.org/css-font-loading-3/

myles: well, the rules will still have to exist, they just might be triggered less often.

myles: we have to figure out what to do with the font loading timeline, though. it's a big open issue.

Garret: That should be done in CSS

myles: OK

Privacy section is entirely missing

Garret: My opinion is this is important and we should call it out, but maybe we should leave most of the details up to the implementations. We may get it wrong, and once it's specified it's set in stone. Implementations might want to iterate to get to the right level of privacy

Vlad: I realized I started typing a comment but I forgot to hit enter. Please refresh it

Vlad: I agree that we shouldn't require something unless we're certain we want it and it has a benefit. I commented earlier saying that I'm not sure I agree with the blanket request to add unnecessary glyphs just to fuzz the content. It seems like overkill for many languages.

Vlad: We discussed something in the past about using statistical distributions to optimize subsets. If someone asks for a subset, we might as well include additional characters for performance

Vlad: We could mention this in the spec. I'm reluctant to make it mandatory. It's best to be left to the implementors.

Vlad: They can choose and add value.

Garret: There are 2 other mitigating factors: 1. We require HTTPS. 2. The caching is on the domain level, so your current state is silo'ed into that one domain.

myles: I disagree. Protecting privacy is important. I agree that the spec shouldn't be overly prescriptive, but the spec shouldn't say nothing either. We're arguing about where within the spectrum the spec should lie; maybe instead of arguing about generalities we should write example spec text and argue about the specifics of that text.

Garret: We don't want the spec-text to list every script. We don't have the expertise to get that right

jpamental: Is it reasonable in the spec to suggest adding some percentage of additional glyphs?

Garret: Yes. There are probably quite a few scripts for which this is unnecessary, like Latin. We could say "here's a way to classify scripts into 'needs it' and 'doesn't need it'". I don't know what that would exactly look like.
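
The "percentage of additional glyphs" idea could be sketched as below. This is only an illustration: the function name, the frequency-ranked list, and the 50% figure are all hypothetical, and nothing here reflects agreed spec text.

```python
import math


def pad_request(requested, frequency_ranked, extra_fraction):
    """Return the requested codepoints plus some common extra ones.

    requested:        set of codepoints the content actually needs
    frequency_ranked: codepoints for the script, most frequent first
                      (hypothetical data supplied by the implementation)
    extra_fraction:   how many extras to add, as a fraction of the request
    """
    budget = math.ceil(len(requested) * extra_fraction)
    padded = set(requested)
    for codepoint in frequency_ranked:
        if budget == 0:
            break
        if codepoint not in padded:
            padded.add(codepoint)
            budget -= 1
    return padded


needed = {ord("q"), ord("z")}
ranked = [ord(ch) for ch in "etaoin"]  # made-up ranking for illustration
print(sorted(chr(cp) for cp in pad_request(needed, ranked, 0.5)))
```

Choosing the most frequent characters as padding also serves the performance goal Vlad mentions, since those characters are the ones most likely to be needed later.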

myles: We can provide minimums. We can also characterize scripts by number of characters.

Vlad: We can include something to hint that there is a freedom here to add value and do more.

myles: There have to be minimums

Vlad: The spec can say "it may be useful, but not everything needs it"

Garret: I can come up with some proposed spec text

chris: I would be more comfortable with a more data-driven approach.

chris: I would like to know, for a script, if we have a corpus of 100 documents, with what accuracy you can tell which one is being requested
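
One simplified way to frame that measurement, as a sketch only: assume an observer sees the full set of codepoints a document uses and guesses uniformly among the corpus documents with exactly that set. Real requests depend on cached state and any padding, which this ignores; the corpus below is a toy stand-in.

```python
from collections import Counter


def codepoint_set(text):
    """The set of codepoints a document would need from the font."""
    return frozenset(ord(ch) for ch in text)


def expected_accuracy(corpus):
    """Expected chance of naming the right document from its codepoint set.

    If k documents share the same set, the observer is right 1/k of the time.
    """
    sets = [codepoint_set(doc) for doc in corpus]
    counts = Counter(sets)
    return sum(1 / counts[s] for s in sets) / len(sets)


toy_corpus = ["abc", "cba", "xyz", "hello"]
print(expected_accuracy(toy_corpus))
```

Running the same function over padded requests would give a before-and-after comparison for any proposed mitigation.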

myles: we might be able to do that with the corpus we've already provided

Garret: We also have character frequency data.

Vlad: Would also be an example

Garret: If you have characters which are low frequency, their inclusion carries more information than a common one
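
This corresponds to the usual notion of self-information: a character with probability p carries -log2(p) bits. A small sketch, with frequencies invented purely for illustration:

```python
import math

# Hypothetical relative frequencies; not measured data.
FREQ = {"e": 0.12, "t": 0.09, "z": 0.0007}


def surprisal_bits(ch, freq=FREQ):
    """Information (in bits) revealed by learning that `ch` was requested."""
    return -math.log2(freq[ch])


for ch in FREQ:
    print(ch, round(surprisal_bits(ch), 2))
```

Under these made-up numbers a request for "z" reveals several more bits than a request for "e", which is why rare characters matter most for any padding scheme.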

chris: I'm not sure how much this is a problem

myles: Also we have to be aware that the font server can be part of the attack vector.

Garret: Sure.

Vlad: Okay. Once we have specific proposed text we can discuss it then.

myles: what's the rationale for HTTPS-only?

Garret: This is more privacy-sensitive than general font transfers

Garret: I think it would be nice to encourage using the more secure technology these days. In general, HTTP is not recommended

myles: We don't generally agree. HTTPS and webfonts are pretty unrelated

myles: If we get a good solution for this privacy stuff, we might be able to relax the HTTPS-only requirement

Garret: We'd have to have a pretty good solution, but yes theoretically

Garret: Fonts are attack vectors. MITM attacks can compromise systems.

chris: I agree. Some systems are more or less sensitive than others

Garret: This is more than theoretical

Method negotiation has potential time wastes in it

myles: I need time to page this back into my mind

Garret: I made a PR to require patch-subset servers also support the range-request method

Garret: For the second issue, I think that first request won't be large, so it won't be a big deal.

myles: it won't be large???

Garret: I'd expect the worst case to be 400 bytes

Vlad: In response to this discussion, do we expect spec changes?

Garret: For problem 1, there's a PR. For problem 2, no

Vlad: Please write spec text for problem 2 if you want changes

Garret: <describes PR>

myles: I'll make proposed spec text for problem 2

Considerations for multiple distinct servers in patch-subset method

Garret: This is a tricky one. I don't think we need anything in the protocol to help with this. I put a comment on here stating how I think we might do it. I'm open to other perspectives though.

Garret: There's probably a spectrum of people who might be running this technology. Their needs will be different. A smaller site won't have this problem. A step further from that, with a small number of instances, this isn't a problem either. It only is an issue when you have a large number of instances across the world.

Garret: From Google Fonts's perspective, we would need to solve this, but it wouldn't require protocol changes.

Vlad: Would we hit a critical compat issue where a request comes in that the server cannot respond to?

chris: The spec says you start again from fresh in that case

Garret: If the server can't reproduce the state, then it just gives you a new replacement font

the concern is that if you have a fleet of half new version and half old version, and they disagree, then the client could jump back and forth between the two different versions

Garret: We would solve it by sticky load balancing. One client, for a specific font, they'd go to the same backend. The upgrade for a particular user would be atomic
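
A minimal sketch of what sticky routing could look like, assuming the balancer has some stable client key available; the minutes do not say how a client would be identified, and all names below are hypothetical.

```python
import hashlib


def pick_backend(client_key, font_id, backends):
    """Map a (client, font) pair to one backend, the same one every time.

    client_key and font_id are hypothetical identifiers; how a real
    deployment identifies a client is not covered here.
    """
    key = f"{client_key}\x00{font_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["backend-a", "backend-b", "backend-c"]
print(pick_backend("client-123", "ExampleSans", backends))
print(pick_backend("client-123", "ExampleSans", backends))  # same backend again
```

Plain modulo hashing remaps many clients whenever the backend list changes; a deployment that resizes often would probably want consistent hashing instead.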

Garret: Even if we had that token, it wouldn't be all that helpful from the backend perspective. We can't run two versions of the subsetter binary on the server.

myles: sounds like we need a note to note the concern, and that's it

Garret: okay

Garret: once the subsetter is stable, it will probably be rare that we will change the font's binaries

Garret: Where there are changes, it will be limited to corner cases probably. We probably won't see changes that affect all fonts

myles: really?

Garret: It will be infrequent.

Garret: Subsetting for most tables in most fonts is well-defined. There's not much to change. But there are places where we might be making optimizations. But that will affect a small number of fonts.

Font Collection support

myles: WebKit supports TTCs

Vlad: I don't know what kind of considerations you had in mind

Vlad: IFT produces WOFF2 files, and WOFF2 supports collections.

Vlad: What we put in is the output of the subsetter. It's up to the subsetter

Garret: The current subsetters do support collections in a way: you can say "given a collection, subset a single font inside", but the current subsetter implementations can't subset the collection as a whole. This is a challenging problem to implement around the shared tables.

Garret: We probably do want something here.

Garret: Having this technology, we might not need collections. With subsets, we might not need shared cmaps. If you have this collection, and you have subsets inside the collection, you won't be able to share the cmap. That once-shared cmap will become duplicated.

myles: that's reasonable. We should at least describe what happens when the input is a TTC.

Vlad: We should make sure that if the request comes for the TTC portion that we have enough <missed>

Garret: I had 2 proposals for how to incorporate TTC. 1. Add a single font index field into the request: you make a separate request for each font in the collection, and the server opens up the TTC and only considers a single font. 2. The alternative is that we make changes to the request message, to specify parallel data for every item inside the collection. In HTTP/2, you can do parallel requests .... but you can't use that
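
The first proposal might look roughly like the sketch below. The TTC header layout ('ttcf' tag, two 16-bit version fields, a 32-bit font count, big-endian) comes from the OpenType font collection format; the request shape and the `font_index` field name are hypothetical, since no field had been specified at this point.

```python
import struct


def ttc_font_count(data: bytes) -> int:
    """Read the number of fonts from a TTC header."""
    tag, _major, _minor, num_fonts = struct.unpack_from(">4sHHI", data, 0)
    if tag != b"ttcf":
        raise ValueError("not a font collection")
    return num_fonts


def per_font_requests(data: bytes, codepoints):
    """Build one request per member font, each carrying a font index.

    "font_index" is a made-up field name for illustration.
    """
    return [
        {"font_index": index, "codepoints": sorted(codepoints)}
        for index in range(ttc_font_count(data))
    ]


# A fabricated header with two placeholder table directory offsets.
fake_ttc = struct.pack(">4sHHI", b"ttcf", 1, 0, 2) + struct.pack(">II", 20, 400)
for request in per_font_requests(fake_ttc, {0x41, 0x42}):
    print(request)
```

Each request can then be issued independently, which is what allows the parallel fetching Garret mentions.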

myles: I think I need to confer with my team about how these TTCs are being used

Garret: Even if all the members are used, requesting in parallel might actually end up being faster than one big request

Garret: next steps: wait for more information

Vlad: Don't we have what we need already?

Garret: If we wanted to support collections, the bare minimum is a single new field. Or a third option: Subset all the fonts in the collection to the same set. That would require no changes to the spec today

Garret: There's one small benefit: The table sharing would still work if you were subsetting all the fonts to the same subset. But the additional waste might counteract that.

shown (where?) to give the smallest encodings

Vlad: This is editorial

Vlad: I'm not sure how much more we can do about it

Vlad: Garret may have had a PR in place for this?

Garret: I don't have a PR for this, but I plan on making one. This is based on the simulations I did for the compression sizes. The solution is to just publish those results and link to them from this spec, just to back up the recommendation.

Vlad: Do you have a timeline?

Garret: Next week or two

myles: Presumably publish in this repo

Garret: Or the analysis one

myles: sure

Progress update

Garret: We had the prototype client and server implementation, those were based on protobufs. In the last couple weeks we converted to use CBOR. Client and server are almost spec-compliant. In the near future we should have a spec-compliant reference implementation of patch-subset.

Vlad: Do you have links we can share?

Garret: Yes. We moved it to a W3C repo.

<Garret> https://github.com/w3c/patch-subset-incxfer

Conformance tests

Garret: I don't have a ton of experience. We have an implementation as a black box and we make tests against it.

chris: It depends how black it is

Vlad: When we did it for WOFF2, we checked if rendered results are different

Garret: So you load it in a browser?

Vlad: Everything that was "must" in the spec would have a font

Garret: We'll probably split the conformance suite into two parts - client and server. For server, it will send HTTP and check the response. For client tests, we'll probably need HTML tests.
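
As a sketch of the "send HTTP and check the response" style for the server half, here is one possible check for the range-request method, written so that it needs no network access. The helper names are hypothetical, the headers are assumed to be a plain dict with exact-case keys, and an actual suite would have to follow whatever the spec ends up requiring.

```python
import re


def range_header(start, end):
    """Build the request header asking for bytes start..end inclusive."""
    return {"Range": f"bytes={start}-{end}"}


def check_partial_response(status, headers, body, start, end):
    """Return True if the response looks like a correct 206 for the range."""
    if status != 206:
        return False
    match = re.fullmatch(
        r"bytes (\d+)-(\d+)/(\d+|\*)", headers.get("Content-Range", "")
    )
    if match is None:
        return False
    return (
        int(match.group(1)) == start
        and int(match.group(2)) == end
        and len(body) == end - start + 1
    )


print(range_header(0, 99))
print(check_partial_response(206, {"Content-Range": "bytes 0-99/1000"}, b"x" * 100, 0, 99))
```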

myles: How do we know if it was incrementally transferred?

chris: you can use unicode-range

myles: i'm not sure that works

<chris> I wasn't suggesting unicode-range btw

Garret: The only requirement is that the client sends at least what was requested. A naive implementation would pass that requirement.

Garret: Where should the tests live? separate repo?

chris: Yes

chris: Do we want separate repos for client tests and server tests? We might want to put the client ones into WPT

myles: The client tests may actually need a server (!!)

myles: It sort of depends what the browser API will look like. We might tie it into the CSS Font Loading API, in which case the JS could do its own loading

Garret: I could prototype it

Vlad: In the old workflow, we had a stylesheet to map different parts of the spec to <missed>. Do we have anything similar with bikeshed?

chris: I don't think so in bikeshed. It was just based on classes, right? We can put the classes into the bikeshed and they will get passed through

Vlad: ok

Vlad: If we define classes, we could re-use the same stylesheet we used to have?

chris: Yeah, something like that

Garret: I'm officially transferring over to Google Canada, but I have to use up my vacation, so I'm taking the next few weeks off

Next call: November 16
