<jpamental> can you share the link? Garret and I are on the old zoom...
Myles: two updates. One to the
optimizer and one to the enrichment repository.
... the update to the optimizer:
... previously I had been re-ordering the glyphs and putting
glyf/CFF at the end,
... but I wasn't putting the glyph data at the end of the CFF table, so other
info ended up after the glyph data. I've fixed that.
... and the other update is that I flattened the fonts. For
glyf tables with composite glyphs, those were flattened into a
single set of contours so there are no glyph dependencies, and I
flattened CFF subroutines.
... that's a requirement for an optimized font.
... all glyphs must be independent and glyph data is at the end
of the file.
... those are the two hard requirements.
... also updated the W3C pull request.
... first, removed the objc dependency. Just uses fontTools
now. So there's no problem with fork/exec issue that I
discussed before.
... it's also much faster now.
... and the other piece that I updated is that rather than pulling
out the size, I pull out the actual data for each glyph and
gzip it to get a more realistic compression ratio.
... so what's in the PR now matches what a browser would
do.
... so that's it for my update. I've done everything that has to
happen before running on the big data set. Next I need to
optimize a whole lot of fonts, then run on the data set.
... right after that want to dig into the data set and
characterize things like which languages and characters are in
use.
... also want to look into the cost function.
... I think that within a week I will have some numbers on running
on the big Google data set.
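The per-glyph measurement Myles describes (compressing each glyph's actual data instead of reading only its size) could look roughly like the sketch below. This is not the code in the pull request: the function name `compressed_glyph_sizes` and the byte strings are hypothetical placeholders, and real input would be the glyph records taken from the glyf or CFF table.

```python
import gzip

def compressed_glyph_sizes(glyph_data):
    """Map each glyph ID to (raw_size, gzip_size) in bytes.

    glyph_data: dict of glyph ID -> raw bytes for that glyph (hypothetical input).
    """
    sizes = {}
    for glyph_id, raw in glyph_data.items():
        # Compress each glyph on its own, as if it were transferred separately.
        sizes[glyph_id] = (len(raw), len(gzip.compress(raw)))
    return sizes

# Made-up byte strings standing in for real outline data.
example = {
    7: bytes(range(256)) * 4,  # 1024 bytes with some repetition
    13: b"\x00" * 1024,        # 1024 highly repetitive bytes
}
sizes = compressed_glyph_sizes(example)
```

Note that gzip adds a fixed header and trailer to every stream, so very small glyphs can come out larger than their raw size when compressed individually.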
Vlad: two follow-up questions. One is about desubroutinizing data. Have you tested the fonts to see how they work after optimization?
Myles: I tested for correctness. I made sure after flattening that the glyphs were actually flattened and tested it on a webpage visually.
Vlad: when we were developing WOFF2 we looked at CFF data. When it has subroutine calls and you de-subroutinize it, you get 10-15% extra data, but Brotli will reduce it to less than the subroutinized version.
Myles: so you were comparing de-subroutinized fonts with Brotli to subroutinized fonts with Brotli?
Vlad: the end result in both cases was the
output from Brotli.
... it turned out that de-subroutinizing increases the data going into Brotli
but the end result after compression is smaller, by 2-3% vs. the
subroutinized version.
... so recommendation was to not use subroutinization.
... are you seeing the same dynamics here? Likewise with
flattening glyph data.
... at Monotype when we produce composites from flat designs
there are big savings.
... I wonder if the inability to handle composite glyphs is a
test limitation or a limitation of the method?
Myles: I didn't do any specific
investigation about it. Say you have dependencies in glyphs, i.e.
to use 7 you also need 13. If you don't know that until you've
got 7, that's a problem because it means there's yet another
round trip.
... this is according to the numbers in the W3C repository. The way the
algorithm works: all the requests that can go in parallel do,
and they share the startup cost. As soon as you need a new frontier
in this graph, that means you have one more startup cost.
... each additional frontier is very expensive vs. the price of
the glyph.
... I think that the way the numbers work out now, it'll
be cheaper to flatten.
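The trade-off Myles describes (one startup cost per frontier of the dependency graph, versus larger flattened glyphs fetched in a single round) can be sketched as follows. This is a simplified illustration, not the cost function in the W3C repository: the function names, the glyph sizes and the cost constants are all made-up assumptions.

```python
def download_rounds(wanted, deps):
    """Return the request rounds (frontiers) needed to fetch `wanted`.

    deps: dict of glyph ID -> set of glyph IDs it references (hypothetical).
    A glyph's dependencies are only discovered once the glyph itself arrives.
    """
    have = set()
    frontier = set(wanted)
    rounds = []
    while frontier:
        rounds.append(sorted(frontier))
        have |= frontier
        # Dependencies revealed by this round that have not been fetched yet.
        frontier = {d for g in frontier for d in deps.get(g, ())} - have
    return rounds

def estimated_cost(rounds, sizes, startup_cost, cost_per_byte):
    """One startup cost per round, plus a per-byte cost for every glyph."""
    total_bytes = sum(sizes[g] for r in rounds for g in r)
    return len(rounds) * startup_cost + total_bytes * cost_per_byte

# Composite case: glyph 7 references glyph 13 (arbitrary cost units).
composite_rounds = download_rounds([7], {7: {13}})
composite_cost = estimated_cost(composite_rounds, {7: 200, 13: 300}, 1000, 1)

# Flattened case: glyph 7 carries its own copy of the contours.
flat_rounds = download_rounds([7], {})
flat_cost = estimated_cost(flat_rounds, {7: 500}, 1000, 1)
```

With these invented numbers the composite font needs two rounds and the flattened font one, so flattening wins whenever the startup cost dominates the per-byte cost of the duplicated outline data.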
Vlad: if that dependency can be flagged where the composites are.
Myles: so another solution is a
dependency table.
... there's a couple of other places where this could be
valuable.
... hoping to not create another file format.
Vlad: intuitively, doubling the original data set doesn't seem efficient, but I can see how that might work out.
Garret: tables can be added to the format without breaking renderers.
Vlad: that's right, custom tables won't be exposed outside of this environment.
Myles: I think that there are two
pieces. I'd be worried about tools dropping tables that they
don't know about.
... I'm envisioning a world where lots of websites have these
optimized fonts. A company could come up with a custom
optimization tool. Any solution to streamable fonts involves an
owner making changes to their website.
... in order to aid developers we want to make their lives as easy
as possible. We want to minimize that set of complicated parts.
However if that's a requirement to get good performance may
have to use them.
... I'm worried someone creates an optimized font, then some
time later someone runs a tool on that font and it drops the
tables.
Vlad: I guess we'll cross that
bridge when we get there and have the results of the
performance analysis.
... we can easily describe it as part of the
recommendation.
Myles: one additional point: if the author messes up and makes a font that claims to be optimized but isn't, the browser can download the whole font.
Vlad: composite fonts are in wide use in many environments.
Myles: we can try it both ways.
Jason: one of our goals is to make sure it works on most webservers. If there's a step to pre-process the fonts before they can be used that doesn't seem unreasonable if there's an easy to follow recipe for that.
Vlad: if we decide to go that way
and the tools required are publicly available, I don't see any
disadvantages. The whole process is described in the spec and
there's a tool.
... you mentioned in the beginning you need another week to
finalize things. Do you think you'll have something ready in a
week?
... could you take an action item to have something ready for
next Monday?
Myles: I think I will have run
the analysis on the data, but not sure I'll be confident that
the code in the analysis will be right by then.
... for example, the assumption that all parallel requests download in
parallel is not quite right. Also, in successive requests
network parameters from the first requests can be used.
... decisions on how to group glyphs are based on network
parameters.
... think I'll need to investigate more after getting initial
numbers.
... one way to solve the network parameters issue is to make
them available to the method.
Vlad: the byte range that needs
to be downloaded could be derived from these numbers.
... if you know the bandwidth and RTT you could figure out how
many bytes to download in a single request.
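One possible reading of Vlad's suggestion is to size each request by the bandwidth-delay product, i.e. the number of bytes the link can carry in one round trip. The sketch below is only that heuristic, not something agreed in the meeting; the function name `bytes_per_request` and the `rtt_multiple` parameter are hypothetical.

```python
def bytes_per_request(bandwidth_bytes_per_s, rtt_ms, rtt_multiple=1):
    """Bytes that can be transferred in `rtt_multiple` round trips.

    Integer arithmetic (RTT in milliseconds) keeps the result exact.
    """
    return bandwidth_bytes_per_s * rtt_ms * rtt_multiple // 1000

# Example: a 10 Mbit/s link (1,250,000 bytes/s) with a 100 ms round trip.
one_rtt = bytes_per_request(1_250_000, 100)
two_rtt = bytes_per_request(1_250_000, 100, rtt_multiple=2)
```

Requests much smaller than this value spend most of their time waiting on the round trip rather than transferring data, which is one reason grouping decisions depend on the network parameters.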
Garret: I've implemented code to do codepoint prediction. It looks at frequency data for characters in that script and adds additional codepoints to the font that comes back,
... in the hope that those will be needed in later requests. I finished the code to do that; now I just need to re-run the analysis with the data set to see how much it saves. That's my plan for this week: get some numbers for a week from today.
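A minimal sketch of frequency-based codepoint prediction as Garret describes it, assuming a per-script frequency table is available. This is not Garret's implementation: the function name, the `max_extra` cutoff and the frequency values are illustrative assumptions.

```python
def predict_extra_codepoints(requested, frequency, have=frozenset(), max_extra=3):
    """Pick the most frequent codepoints the client has neither requested nor received.

    frequency: dict of codepoint -> relative frequency within the script
               (hypothetical table, not derived from any user's data).
    """
    known = set(requested) | set(have)
    candidates = [cp for cp in frequency if cp not in known]
    # Highest frequency first; break ties by codepoint for a stable order.
    candidates.sort(key=lambda cp: (-frequency[cp], cp))
    return candidates[:max_extra]

# Illustrative frequencies only.
frequency = {ord("e"): 0.127, ord("t"): 0.091, ord("a"): 0.082, ord("z"): 0.001}
extra = predict_extra_codepoints({ord("t"), ord("h")}, frequency, max_extra=2)
```

Here the server would add "e" and "a" to the response for a request that asked for "t" and "h".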
Myles: Are you considering privacy or just performance?
Garret: Just performance. The privacy question is separate. We'll need to add that in later (for all methods).
... I don't think codepoint prediction will decrease privacy, because it uses a global frequency table.
Vlad: Should we include privacy considerations as part of the analysis? Or just say "at this point we only care about performance"? And say the effects of privacy will be the same for all methods?
Myles: consider a complex script
where the browser wants to add a letter to improve privacy. So
it adds the letter, but because of that the server has to send
another hundred characters. That's bad.
... but for something like range request, if an additional
glyph is added there's no closure added. So that doesn't come
with a bunch of extra stuff.
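The closure effect Myles refers to can be illustrated with a simple reachability walk. This is a deliberate simplification: real glyph closure depends on the font's substitution rules, and the glyph names and graph below are invented for the example.

```python
def glyph_closure(start, reachable):
    """All glyphs reachable from `start` by following `reachable` edges.

    reachable: dict of glyph name -> glyphs it can pull in (hypothetical graph).
    """
    seen = set()
    stack = list(start)
    while stack:
        glyph = stack.pop()
        if glyph in seen:
            continue
        seen.add(glyph)
        stack.extend(reachable.get(glyph, ()))
    return seen

# Invented graph: one extra letter drags in several related glyphs.
graph = {"ka": ["ka_virama", "k_ssa"], "ka_virama": ["ka_half"]}
closure_based = glyph_closure({"ka"}, graph)  # must ship the whole closure
range_based = {"ka"}                          # only the glyph that was added
```

In this toy graph a closure-based response grows by four glyphs for one added letter, while a range-request style response grows by one.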
Present: Vlad, jpamental, Garret, Myles
Date: 06 Jul 2020