Meeting minutes
<chris> https://
Garret: If you guys hadn't seen, the HarfBuzz subsetting API hit "stable" status. That's a cool milestone. It's ready for actual use
Vlad: Great. Thank you.
<Vlad> Agenda: list of issues https://
Vlad: Let's go starting from the top, one by one
Vlad: Chris, is there any way to record this review to reference later?
chris: There is, but I'm doing it. I've got an issue with the meta-review. I'm linking to it. I'll take care of the bookkeeping.
Vlad: Are there any specific resolutions we need to make?
chris: no.
I18n checklist
chris: I filled this in. The answer is basically "no" to most things. It would be good to get more eyes on it.
chris: There was only one thing which was possibly applicable. I don't think we do that in the way that they mean, but I'll let them be the judge.
Vlad: Text is a sequence of characters. When we ask for separate characters, we don't preserve order, so it isn't even qualified as text
chris: I agree.
Garret: I reviewed the checklist as well. Nothing jumped out at me. It all looked okay.
Vlad: If all the questions are "no" then there's no action to take.
chris: This is just something I filled in so they can review our answers. If they have questions they can ask them here
Self-Review Questionnaire: Security and privacy
chris: This asked about temporary identifiers. We make 64-bit checksums but they don't convey security information and they can't be used for tracking because every time you update a font the checksum will change.
Garret: The other thing: Most browsers implement caching on a domain basis; those checksums won't leave a particular domain.
chris: It would be useful to edit my answer to say that
Garret: Okay, I can do that.
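To illustrate chris's point that the checksum changes whenever the font changes, here is a minimal sketch. It uses 64-bit FNV-1a purely as a stand-in; the minutes do not say which checksum function the spec uses, and the font byte strings are hypothetical placeholders.

```python
# Stand-in 64-bit checksum (FNV-1a). ASSUMPTION: the real IFT checksum
# function is not named in these minutes; this only illustrates the idea.
FNV_OFFSET_BASIS = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def checksum64(data: bytes) -> int:
    """Return a 64-bit FNV-1a hash of the given bytes."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK_64
    return h


# Hypothetical font contents: any update to the bytes yields a new checksum,
# so the value identifies a font version rather than a user.
font_v1 = b"hypothetical font bytes, version 1"
font_v2 = b"hypothetical font bytes, version 2"
changed = checksum64(font_v1) != checksum64(font_v2)
```

The value is derived only from the font data, which is why it carries no information about the user beyond which font version they hold.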
Range requests and preflights
Garret: They're noting the development of another spec which may be interesting to us.
Vlad: It sounds like we will be working on range request specification. Can we use it, Myles?
myles: i dunno. presumably.
Garret: It's a change to range requests that avoids CORS in certain situations. Hopefully we can be in those situations
Early wide review of IFT
chris: This is the meta-issue where I track all the other issues. A18y is happy so I checked them off. We need to do privacy though. I don't know about security though. TAG review is blocked on those
FAST Checklist review for IFT
Vlad: In addition to what chris indicated, there was a question about whether something we do affects content presentation. We might affect content presentation. If a font is requested but not available, and the content relies on a unique character set, it will not be rendered.
myles: PUA characters?
Vlad: Not necessarily. For example, you might have content in a unique language that is archaic that is not universally supported anywhere. If you want to quote something in that language, that text might not render.
myles: that's the same for all fonts i think
Vlad: For some languages, the expectation might be if you don't have a web font loaded, there will be some other font available to display the content. But it's not universally true for every language and character set
chris: I think myles is right. This is general about webfonts. Webfonts are well-accepted. We only need to cover the differences between normal webfonts
Garret: In terms of what content will and will not be there, there's no difference comparing a font with IFT and a font without IFT.
Vlad: Many browsers have certain rules in place about, for example, how long you wait for a font to load before you abandon it. Those rules will never be an issue in IFT. Hopefully
<chris> https://
myles: well, the rules will still have to exist, they just might be triggered less often.
myles: we have to figure out what to do with the font loading timeline, though. it's a big open issue.
Garret: That should be done in CSS
myles: OK
Privacy section is entirely missing
Garret: My opinion is this is important and we should call it out, but maybe we should leave most of the details up to the implementations. We may get it wrong, and once it's specified it's set in stone. Implementations might want to iterate to get to the right level of privacy
Vlad: I realized I started typing a comment but I forgot to hit enter. Please refresh it
Vlad: I agree that making a request to do something shouldn't be something we do unless we're certain we want to do it and it has benefit. I commented earlier saying that I'm not sure I agree with the blanket request to add unnecessary glyphs just to fuzz the content up. It seems like overkill for many languages.
Vlad: We discussed something in the past about using statistical distributions to optimize subsets. If someone asks for a subset, we might as well include additional characters for performance
Vlad: We could mention this in the spec. I'm reluctant to make it mandatory. It's best to be left to the implementors.
Vlad: They can choose and add value.
Garret: There are 2 other mitigating factors: 1. We require HTTPS. 2. The caching is on the domain level, so your current state is silo'ed into that one domain.
myles: I disagree. Protecting privacy is important. I agree that the spec shouldn't be overly prescriptive, but the spec shouldn't say nothing either. We're arguing about where within the spectrum the spec should lie; maybe instead of arguing about generalities we should write example spec text and argue about the specifics of that text.
Garret: We don't want the spec-text to list every script. We don't have the expertise to get that right
jpamental: Is it reasonable in the spec to suggest adding some percentage of additional glyphs?
Garret: Yes. There are probably quite a few scripts for which this is unnecessary - like Latin. We could say "here's a way to classify scripts in 'needs it' and 'doesn't need it'" I don't know what that would exactly look like
myles: We can provide minimums. We can also characterize scripts by number of characters
Vlad: We can include something to hint that there is a freedom here to add value and do more.
myles: There has to be minimums
Vlad: The spec can say "it may be useful, but not everything needs it"
Garret: I can come up with some proposed spec text
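The idea of padding a request by some percentage of additional glyphs, with a minimum, could be sketched as follows. This is only an illustration of the discussion above, not proposed spec text: the function name, the 25% default, the minimum of 2, and the frequency-ordered candidate list are all assumptions.

```python
import math


def pad_request(needed, candidates_by_frequency,
                extra_fraction=0.25, minimum_extra=2):
    """Return `needed` plus extra code points drawn from a frequency list.

    ASSUMPTIONS: the parameter names and default values are hypothetical;
    the minutes only suggest "some percentage" and "minimums".
    """
    target_extra = max(minimum_extra, math.ceil(len(needed) * extra_fraction))
    padded = set(needed)
    for codepoint in candidates_by_frequency:
        if len(padded) - len(needed) >= target_extra:
            break
        padded.add(codepoint)  # no effect if the code point is already needed
    return padded


# Hypothetical example: two needed code points, padded from a made-up
# list of common code points ordered from most to least frequent.
needed = {0x4E00, 0x4E8C}
common = [0x7684, 0x4E00, 0x662F, 0x4E0D, 0x4E86]
padded = pad_request(needed, common, extra_fraction=0.5, minimum_extra=2)
```

Drawing the padding from high-frequency characters also matches Vlad's earlier point that extra characters can double as a performance optimization.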
chris: I would be more comfortable with a more data-driven approach.
chris: I would like to know, for a script, if we have a corpus of 100 documents, if you can tell with what accuracy which one is being requested
myles: we might be able to do that with the corpus we've already provided
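The measurement chris describes could be sketched roughly like this: for each document in a corpus, treat its character set as the observed request and count how many documents could have produced it. The matching rule (a document is a candidate if all its characters appear in the request) and the tiny corpus are assumptions for illustration only.

```python
def candidates(observed, corpus):
    """Names of documents whose characters are all in the observed request."""
    return [name for name, text in corpus.items() if set(text) <= observed]


def unique_identification_rate(corpus):
    """Fraction of documents that an observer could pin down exactly."""
    hits = 0
    for name, text in corpus.items():
        if candidates(set(text), corpus) == [name]:
            hits += 1
    return hits / len(corpus)


# Hypothetical four-document corpus. "hell" uses a subset of the characters
# of both "hello" and "help", so those two requests are ambiguous.
corpus = {"a": "hello", "b": "help", "c": "world", "d": "hell"}
rate = unique_identification_rate(corpus)
```

Running the same measurement with and without padded requests would give the kind of data-driven comparison chris asks for.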
Garret: We also have character frequency data.
Vlad: Would also be an example
Garret: If you have characters which are low frequency, their inclusion carries more information than a common one
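Garret's point can be made concrete with self-information: a character that occurs with probability p carries -log2(p) bits. The sketch below is illustrative only, and the two probabilities are made-up values, not real frequency data.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Self-information, in bits, of an event with the given probability."""
    return -math.log2(probability)


# Hypothetical frequencies: a very common character versus a rare one.
common_bits = surprisal_bits(0.5)       # common character: little information
rare_bits = surprisal_bits(1 / 1024)    # rare character: much more information
```

Under this view, a request containing a rare character narrows down the possible source content far more than one containing only common characters.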
chris: I'm not sure how much this is a problem
myles: Also we have to be aware that the font server can be part of the attack vector.
Garret: Sure.
Vlad: Okay. Once we have specific proposed text we can discuss it then.
myles: what's the rationale for HTTPS-only?
Garret: This is more privacy-sensitive than general font transfers
Garret: I think it would be nice to encourage using the more secure technology these days. In general, HTTP is not recommended
myles: We don't generally agree. HTTPS and webfonts are pretty unrelated
myles: If we get a good solution for this privacy stuff, we might be able to relax the HTTPS-only requirement
Garret: We'd have to have a pretty good solution, but yes theoretically
Garret: Fonts are attack vectors. MITM attacks can compromise systems.
chris: I agree. Some systems are more or less sensitive than others
Garret: This is more than theoretical
Method negotiation has potential time wastes in it
myles: I need time to page this back into my mind
Garret: I made a PR to require patch-subset servers also support the range-request method
Garret: For the second issue, I think that first request won't be large, so it won't be a big deal.
myles: it won't be large???
Garret: I'd expect the worst case to be 400 bytes
Vlad: In response to this discussion, do we expect spec changes?
Garret: For problem 1, there's a PR. For problem 2, no
Vlad: Please write spec text for problem 2 if you want changes
Garret: <describes PR>
myles: I'll make proposed spec text for problem 2
Considerations for multiple distinct servers in patch-subset method
Garret: This is a tricky one. I don't think we need anything in the protocol to help with this. I put a comment on here stating how I think we might do it. I'm open to other perspective though
Garret: There's probably a spectrum of people who might be running this technology. Their needs will be different. A smaller site won't have this problem. Further along from that, with a small number of instances, this isn't a problem for them either. It only is an issue when you have a large number of instances across the world
Garret: From Google Fonts's perspective, we would need to solve this, but it wouldn't require protocol changes.
Vlad: Would we hit a critical compat issue where a request comes in that the server cannot respond to?
chris: The spec says you start again from fresh in that case
Garret: If the server can't reproduce the state, then it just gives you a new replacement font
the concern is that if you have a fleet of half new version and half old version, and they disagree, then the client could jump back and forth between the two different versions
Garret: We would solve it by sticky load balancing. One client, for a specific font, they'd go to the same backend. The upgrade for a particular user would be atomic
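A minimal sketch of the sticky load balancing Garret describes, assuming the balancer hashes a client identifier together with a font identifier to pick a backend. The function name, the identifiers, and the backend names are all hypothetical; the minutes do not describe how Google Fonts would actually implement this.

```python
import hashlib


def pick_backend(client_id: str, font_id: str, backends: list[str]) -> str:
    """Map a (client, font) pair to one backend, deterministically."""
    # The NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
    key = f"{client_id}\x00{font_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Hypothetical fleet: the same client and font always reach the same
# backend, so that client never alternates between subsetter versions.
backends = ["backend-0", "backend-1", "backend-2"]
first = pick_backend("client-123", "example-font", backends)
second = pick_backend("client-123", "example-font", backends)
```

A plain modulo like this remaps many clients whenever the backend list changes; a production setup would more likely use consistent hashing, which is outside what was discussed here.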
Garret: Even if we had that token, it wouldn't be all that helpful from the backend perspective. We can't run two versions of the subsetter binary on the server.
myles: sounds like we need a note to note the concern, and that's it
Garret: okay
Garret: once the subsetter is stable, it will probably be rare that we will change the font's binaries
Garret: Where there are changes, it will be limited to corner cases probably. We probably won't see changes that affect all fonts
myles: really?
Garret: It will be infrequent.
Garret: Subsetting for most tables in most fonts is well-defined. There's not much to change. But there are places where we might be making optimizations. But that will affect a small number of fonts.
Font Collection support
myles: WebKit supports TTCs
Vlad: I don't know what kind of considerations you had in mind
Vlad: IFT produces WOFF2 files, and WOFF2 supports collections.
Vlad: What we put in is the output of the subsetter. It's up to the subsetter
Garret: The current subsetters do support collections in a way - you can say "given a collection, subset a single font inside" but the current subsetter implementations can't subset the collection as a whole. This is a challenging problem to implement around the shared tables.
Garret: We probably do want something here.
Garret: Having this technology, we might not need collections. With subsets, we might not need shared cmaps. If you have this collection, and you have subsets inside the collection, you won't be able to share cmap. That once-shared cmap will become duplicated.
myles: that's reasonable. We should at least describe what happens when the input is a TTC.
Vlad: We should make sure that if the request comes for the TTC portion that we have enough <missed>
Garret: I had 2 proposals for how to incorporate TTC. 1. Add a single font index field into the request - you make a separate request for each in the collection, and the server opens up the TTC and only considers a single font. The alternative is that we make changes to the request message, to specify parallel data for every item inside the collection. In HTTP2, you can do parallel requests .... but you can't use that
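Garret's first proposal, a single font index field with one request per collection member, might look roughly like the sketch below. The class and field names are hypothetical and do not reflect the actual patch-subset request message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubsetRequest:
    """Hypothetical request shape, for illustration only."""
    codepoints_needed: frozenset
    # ASSUMPTION: name and semantics of this field are invented here.
    # None means the font file is not a collection.
    font_index: Optional[int] = None


def requests_for_collection(num_fonts, codepoints):
    """Build one request per member of a collection (proposal 1)."""
    needed = frozenset(codepoints)
    return [SubsetRequest(needed, index) for index in range(num_fonts)]


# Hypothetical three-font TTC: each member gets its own request, and the
# server would consider only the font at the given index.
reqs = requests_for_collection(3, {0x41, 0x42})
```

Because each request is independent, the members could be fetched in parallel, which is the case Garret expects may beat one big request.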
myles: I think I need to confer with my team about how these TTCs are being used
Garret: Even if all the members are used, requesting in parallel might actually end up being faster than one big request
Garret: next steps: wait for more information
Vlad: Don't we have what we need already?
Garret: If we wanted to support collections, the bare minimum is a single new field. Or a third option: Subset all the fonts in the collection to the same set. That would require no changes to the spec today
Garret: There's one small benefit: The table sharing would still work if you were subsetting all the fonts to the same subset. But the additional waste might counteract that.
shown (where?) to give the smallest encodings
Vlad: This is editorial
Vlad: I'm not sure how much more we can do about it
Vlad: Garret may have had a PR in place for this?
Garret: I don't have a PR for this, but I plan on making one. This is based on the simulations I did for the compression sizes. The solution is to just publish those results and link to them from this spec. Just to back up the recommendation.
Vlad: Do you have a timeline?
Garret: Next week or two
myles: Presumably publish in this repo
Garret: Or the analysis one
myles: sure
Progress update
Garret: We had the prototype client and server implementation, those were based on protobufs. In the last couple weeks we converted to use CBOR. Client and server are almost spec-compliant. In the near future we should have a spec-compliant reference implementation of patch-subset.
Vlad: Do you have links we can share?
Garret: Yes. We moved it to a W3C repo.
Conformance tests
Garret: I don't have a ton of experience. We have an implementation as a black box and we make tests against it.
chris: It depends how black it is
Vlad: When we did it for WOFF2, we checked if rendered results are different
Garret: So you load it in a browser?
Vlad: Everything that was "must" in the spec would have a font
Garret: We'll probably split the conformance suite into two parts - client and server. For server, it will send HTTP and check the response. For client tests, we'll probably need HTML tests.
myles: How do we know if it was incrementally transferred?
chris: you can use unicode-range
myles: i'm not sure that works
<chris> I wasn't suggesting unicode-range btw
Garret: The only requirement is that the client sends at least what was requested. a naive implementation would pass that requirement
Garret: Where should the tests live? separate repo?
chris: Yes
chris: Do we want separate repos for client tests and server tests? We might want to put the client ones into WPT
myles: The client tests may actually need a server (!!)
myles: It sort of depends what the browser API will look like. We might tie it into the CSS Font Loading API, in which case the JS could do its own loading
Garret: I could prototype it
Vlad: In the old workflow, we had a stylesheet to map different parts of the spec to <missed>. Do we have anything similar with bikeshed?
chris: I don't think so in bikeshed. It was just based on classes, right? We can put the classes into the bikeshed and they will get passed through
Vlad: ok
Vlad: If we define classes, we could re-use the same stylesheet we used to have?
chris: Yeah, something like that
Garret: I'm officially transferring over to Google Canada, but I have to use up my vacation, so I'm taking the next few weeks off
Next call: November 16