IRC log of webmachinelearning on 2024-03-07

Timestamps are in UTC.

14:47:12 [RRSAgent]
RRSAgent has joined #webmachinelearning
14:47:16 [RRSAgent]
logging to https://www.w3.org/2024/03/07-webmachinelearning-irc
14:47:16 [Zakim]
RRSAgent, make logs Public
14:47:17 [Zakim]
please title this meeting ("meeting: ..."), anssik
14:47:17 [anssik]
Meeting: WebML WG Teleconference – 7 March 2024
14:47:21 [anssik]
Chair: Anssi
14:47:26 [anssik]
Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2024-03-07-wg-agenda.md
14:47:45 [anssik]
Scribe: Anssi
14:47:49 [anssik]
scribeNick: anssik
14:47:56 [anssik]
gb, this is webmachinelearning/webnn
14:47:56 [gb]
anssik, OK.
14:48:01 [anssik]
Present+ Anssi_Kostiainen
14:48:05 [anssik]
Regrets+ Ningxin_Hu
14:48:15 [anssik]
RRSAgent, draft minutes
14:48:16 [RRSAgent]
I have made the request to generate https://www.w3.org/2024/03/07-webmachinelearning-minutes.html anssik
14:51:47 [grgustaf]
grgustaf has joined #webmachinelearning
14:54:24 [anssik]
Present+ Geoff_Gustafson
14:56:10 [anssik]
Present+ Michael_McCool
14:56:43 [anssik]
Present+ Etienne_Noel
14:58:59 [jsbell]
jsbell has joined #webmachinelearning
14:59:14 [anssik]
Present+ Joshua_Bell
14:59:26 [anssik]
Present+ Rachel_Yager
14:59:31 [Rach]
Rach has joined #webmachinelearning
15:00:17 [anssik]
Present+ Joshua_Lochner
15:00:30 [Joshua_Lochner]
Joshua_Lochner has joined #webmachinelearning
15:00:40 [McCool]
McCool has joined #webmachinelearning
15:01:03 [etiennenoel]
etiennenoel has joined #webmachinelearning
15:01:17 [anssik]
Present+ Zoltan_Kis
15:01:19 [asully]
asully has joined #webmachinelearning
15:01:36 [anssik]
Present+ Sudeep_Divakaran
15:01:44 [anssik]
Present+ Dwayne_Robinson
15:01:53 [anssik]
Present+ Ilya_Rezvov
15:01:59 [anssik]
Present+ Bryan_Bernhart
15:02:27 [anssik]
RRSAgent, draft minutes
15:02:28 [RRSAgent]
I have made the request to generate https://www.w3.org/2024/03/07-webmachinelearning-minutes.html anssik
15:03:04 [anssik]
anssik: Welcome to a new participant, Ilya Rezvov from Google
15:03:52 [anssik]
Ilya: I work for Google on Wasm mostly, recently looking at ML-related efforts, fp16 in Wasm specifically and want to extend my interests to other areas of ML on the web
15:04:29 [anssik]
Topic: Call for Consensus: WebNN API CR Snapshot
15:04:45 [anssik]
anssik: On 28 Feb 2024 we issued a Call for Consensus (CfC) to publish the Web Neural Network API as a new Candidate Recommendation Snapshot (CRS).
15:04:55 [anssik]
-> CfC to publish WebNN API Candidate Recommendation Snapshot - review by 7 Mar 2024
15:04:55 [anssik]
https://lists.w3.org/Archives/Public/public-webmachinelearning-wg/2024Feb/0006.html
15:04:59 [anssik]
anssik: the WG has received explicit support for this CfC and no concerns have been raised
15:05:08 [anssik]
... our CR readiness tracker #240 shows all green except for the WIP TAG delta review
15:05:08 [gb]
https://github.com/webmachinelearning/webnn/issues/240 -> Issue 240 Candidate Recommendation readiness tracker (by anssiko) [process]
15:05:31 [anssik]
... the TAG delta review in flight may raise some questions at the transition time so we should be prepared for that
15:05:57 [anssik]
... that said, I hope we can address this in flight by noting this publication in fact explicitly addresses earlier TAG review feedback by removing support for synchronous execution
15:06:21 [anssik]
... I will handle CRS transition logistics with Dom and will ask the WG for further information as needed
15:06:29 [Deepti]
Deepti has joined #webmachinelearning
15:06:41 [anssik]
... we can still merge the currently open PRs before branching for this release, it is important that after each commit the spec remains in a cohesive state
15:07:03 [anssik]
... transition request processing is expected to take a week, so earliest publication date would be on the week 18-22 March 2024
15:07:22 [anssik]
Present+ Deepti_Gandluri
15:07:55 [anssik]
Topic: Hybrid AI exploration
15:08:25 [anssik]
anssik: As you recall, we have a sister WebML Community Group responsible for incubating new ideas for future work, the CG works closely with this WG, sharing many participants
15:08:32 [anssik]
... the WebML CG has received a new proposal called "Hybrid AI exploration":
15:08:35 [anssik]
-> https://github.com/webmachinelearning/proposals/issues/5
15:08:36 [gb]
https://github.com/webmachinelearning/proposals/issues/5 -> Issue 5 Hybrid AI Exploration (by grgustaf)
15:08:59 [anssik]
... so I wanted to invite WebML CG participants Michael, Geoff, Sudeep to present this proposal to solicit input from this WG and inform the direction this exploration should take
15:09:09 [anssik]
... the timebox is ~20 minutes including Q&A
15:09:23 [anssik]
... any concrete proposals from this exploration that may impact Web APIs are expected to be incubated in an applicable Community Group first
15:09:30 [anssik]
Slideset: https://lists.w3.org/Archives/Public/www-archive/2024Mar/att-0000/WebML.Discussion.-.Hybrid.AI.for.the.Web.-.Slides.pdf
15:09:53 [anssik]
[slide 1]
15:10:23 [anssik]
Michael: Proposal is titled "Hybrid AI for the Web", probably a bit mistitled, we're looking at general model management too
15:10:51 [anssik]
... started work to improve the fit of WebNN on the client and want to make sure we look at the right problems, we're not proposing solutions at this stage
15:10:57 [anssik]
[slide 2]
15:11:42 [anssik]
Michael: first going through the general status as we understand it, specific issues, goals and requirements we see, prioritization, closing with questions for this group
15:11:56 [zkis]
zkis has joined #webmachinelearning
15:11:57 [anssik]
[slide 3]
15:12:11 [anssik]
Michael: looked at WebNN use cases, client AI execution clearly in focus
15:12:39 [AramZS]
AramZS has joined #webmachinelearning
15:12:45 [anssik]
... we found some problems, e.g. language translation requires large models, long download times, need to figure out client capabilities
15:12:53 [anssik]
... startup time may be significant
15:13:16 [anssik]
... if two different web sites use the same model, it has to be downloaded twice,
15:14:03 [anssik]
... clients vary in capabilities and over time; clients grow rapidly in performance and we want to avoid a least-common-denominator approach
15:14:12 [anssik]
[slide 4]
15:14:16 [anssik]
Michael: Specific issues, three broader categories:
15:14:30 [anssik]
... 1) Model Management
15:14:52 [anssik]
... - Large models cannot be reused across origins
15:15:04 [anssik]
... - Model storage and management opaque to the user
15:15:27 [anssik]
... - Cache eviction may not match user preferences
15:15:47 [anssik]
... 2) Elasticity through Hybrid AI
15:16:13 [anssik]
... - Distributing work between client and server
15:16:32 [anssik]
... - Difficult to predict performance on a client
15:17:04 [anssik]
... - Sharing detailed client capabilities a privacy risk
15:17:27 [anssik]
... (noting possible overlap with PWA caching mechanisms)
15:17:33 [anssik]
... 3) User Experience
15:17:36 [anssik]
... - Privacy behaviour unclear, may not match user preferences
15:17:40 [anssik]
... - Managing latency of model downloads
15:17:44 [anssik]
[slide 5]
15:17:49 [anssik]
Michael: Goals and Requirements
15:17:59 [anssik]
... Maximize ease of use for the end user
15:18:19 [anssik]
... - minimize load times and meet latency targets
15:18:32 [anssik]
... Portability and elasticity
15:19:09 [anssik]
... - minimize costs, support varying client capabilities, adapt based on resource availability
15:21:16 [anssik]
... Data privacy
15:21:29 [anssik]
... - Personal and business data, support user choice and control
15:22:10 [anssik]
... last but not least, developer ease of use and consistency
15:22:30 [anssik]
[slide 6]
15:22:35 [anssik]
Michael: Questions for Discussion
15:22:43 [anssik]
... How to:
15:22:48 [anssik]
... handle model download latency and storage?
15:22:55 [anssik]
... match model reqs to client capabilities?
15:22:59 [anssik]
... choose among model fidelity levels?
15:23:03 [anssik]
... support progressive transmission of models?
15:23:08 [anssik]
... partition single models, support separate models, both?
15:23:20 [anssik]
... Questions to the group:
15:23:25 [anssik]
... - what should be the priorities?
15:23:28 [anssik]
... - Specific use cases for Hybrid AI?
15:23:32 [anssik]
[slide 7]
15:23:35 [anssik]
Michael: Proposed Next Steps
15:23:42 [anssik]
... 1) Make sure we solve the right problem
15:23:47 [anssik]
... We welcome your feedback via the GH issue submitted to the proposals repo:
15:23:50 [anssik]
-> https://github.com/webmachinelearning/proposals/issues/5
15:23:51 [gb]
https://github.com/webmachinelearning/proposals/issues/5 -> Issue 5 Hybrid AI Exploration (by grgustaf)
15:23:57 [anssik]
Michael: 2) Build a prototype implementation
15:24:02 [anssik]
... e.g. using the Model Loader API from the CG as a basis, we have some ideas to test
15:24:06 [anssik]
-> Model Loader API https://webmachinelearning.github.io/model-loader/
15:24:20 [anssik]
Michael: 3) Bring results back to the group to discuss further
15:24:33 [anssik]
RRSAgent, draft minutes
15:24:34 [RRSAgent]
I have made the request to generate https://www.w3.org/2024/03/07-webmachinelearning-minutes.html anssik
15:24:57 [RafaelCintron]
RafaelCintron has joined #webmachinelearning
15:25:06 [RafaelCintron]
q+
15:25:10 [anssik]
Present+ Rafael_Cintron
15:25:26 [anssik]
ack RafaelCintron
15:25:52 [anssik]
RafaelCintron: is there a solution in mind?
15:26:31 [anssik]
Michael: caching strategy on the computational graph, negotiating model requirements and client capabilities are a few ideas
15:27:33 [Joshua_Lochner]
q+
15:27:35 [anssik]
anssik: we have discussed Storage APIs for caching large models in this group earlier
15:27:37 [anssik]
ack Joshua_Lochner
15:27:51 [anssik]
anssik: I recall Joshua Lochner has prototyped solutions to cross-site sharing of models with a browser extension
15:28:31 [anssik]
Joshua_Lochner: from Transformers.js perspective we're fortunate that the browser caching API is pretty performant, can do 1.5B param model and refresh the page and it loads from the cache
15:29:09 [anssik]
... however, the issue emerges when I go to another web site on a different origin, my extension idea works but it requires the user to download a random extension, extra effort for the user, not a standards-based feature
15:29:51 [anssik]
... for the size issue, I'm focused on smaller models that can perform in Wasm environment, soon WebGPU
15:30:09 [anssik]
... storage and issues related to exceptionally big models have not been the main focus; 50M to 250M parameters has been the focus of Transformers.js, the sweet spot
15:30:14 [anssik]
... due to the cache issues
15:30:40 [anssik]
... the main API is Web Cache API, models loaded as HTTP request-response pair from HuggingFace Hub
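The caching approach Joshua describes can be sketched roughly as follows. The cache name, URL layout, and function names here are illustrative, not what Transformers.js actually uses; it assumes a browser context where the Cache API is available:

```javascript
// Sketch of model caching with the browser Cache API, storing the
// HTTP request/response pair so a page refresh loads from cache.

// Build the URL of a model file on the HuggingFace Hub (illustrative layout).
function modelFileUrl(repoId, fileName) {
  return `https://huggingface.co/${repoId}/resolve/main/${fileName}`;
}

// Fetch a model file, serving it from the cache when possible.
async function fetchModelFile(repoId, fileName) {
  const url = modelFileUrl(repoId, fileName);
  const cache = await caches.open('model-cache-v1');
  const cached = await cache.match(url);
  if (cached) return cached.arrayBuffer();       // cache hit: no network
  const response = await fetch(url);
  if (!response.ok) throw new Error(`Download failed: ${response.status}`);
  await cache.put(url, response.clone());        // store request/response pair
  return response.arrayBuffer();
}
```

As noted in the discussion, this cache is per-origin, which is exactly why a second site using the same model triggers a second download.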
15:30:41 [anssik]
q?
15:30:53 [anssik]
Michael: are you caching serializations of the models, ONNX files?
15:31:03 [anssik]
... any ideas re adapters?
15:31:32 [anssik]
Joshua_Lochner: caching serializations, single .onnx files, in the future the graph and weights will be in separate files
15:31:44 [anssik]
... for adapters, haven't thought about it yet
15:32:11 [anssik]
... ONNX RT is rather limited in this sense, if we want to use an adapter we need to export the whole model
15:32:18 [anssik]
... MMS text-to-speech model was one example where an adapter is at the end
15:32:58 [anssik]
... what we have been able to do is split up the models, only works if the backend is identical, e.g. text-gen model, chop off the head
15:33:16 [anssik]
... that's one way to share weights, not exactly the way you're proposing
15:33:48 [anssik]
Michael: the topology is a smaller chunk; we're concerned with weight caching at this stage
15:34:22 [anssik]
... progressive transmission creates questions regarding the API if it's read-only; the best we can do now is create zero nodes and add weights later
15:34:34 [anssik]
... to progressively enhance the model we need to build it from scratch
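A minimal sketch of that "zero weights now, rebuild later" pattern. `createModel` is a hypothetical app-level helper, not a WebNN API: because a built graph is read-only, progressively enhancing a model currently means rebuilding it when better weights arrive:

```javascript
// Start inference with placeholder (zero) weights, then rebuild the model
// from scratch once the full weights have downloaded.

function createModel(weights) {
  // Stand-in for real graph construction (e.g. with MLGraphBuilder).
  return {
    weights,
    // Placeholder inference: a dot product with the current weights.
    predict: (x) => x.reduce((sum, v, i) => sum + v * weights[i], 0),
  };
}

// Zero weights let the app respond immediately, at low fidelity.
let model = createModel(new Float32Array(2));

// Swap in the fully downloaded weights by rebuilding the model.
function onWeightsDownloaded(fullWeights) {
  model = createModel(fullWeights);
  return model;
}
```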
15:34:35 [anssik]
q?
15:34:48 [anssik]
q?
15:35:48 [anssik]
Michael: the prototype will likely be similar to the HuggingFace solution; for good UX it may need some standardization work in the future
15:35:50 [anssik]
q?
15:36:24 [anssik]
Michael: what use cases can make use of different levels of fidelity?
15:36:48 [anssik]
... big-little mapping to server-client, any specific use cases for Hybrid AI?
15:37:16 [Joshua_Lochner]
q+
15:37:19 [anssik]
q?
15:37:22 [anssik]
ack Joshua_Lochner
15:38:01 [anssik]
Joshua_Lochner: I guess some form of a personalization model that learns things over time, continuous training, a model where you can update the weights over time, private personalization preference learning model
15:38:20 [anssik]
... e.g. you're on Twitter and want to block certain things
15:39:00 [anssik]
... you're probably referring to much larger things, multiple LoRAs on top of Llama
15:39:42 [anssik]
Michael: fine-tuning on the client, split out the LoRAs so we can select them, a lot of these optimizations are relevant to big models mainly, smaller models are faster to download as is
15:39:43 [anssik]
q?
15:40:58 [anssik]
Joshua_Lochner: another use case, some form of underlying embeddings adapting, speech-to-text, text-to-speech, base model stays the same
15:41:03 [anssik]
q?
15:41:16 [anssik]
https://github.com/webmachinelearning/proposals/issues/5
15:41:17 [gb]
https://github.com/webmachinelearning/proposals/issues/5 -> Issue 5 Hybrid AI Exploration (by grgustaf)
15:42:03 [anssik]
Topic: Open issues and PRs
15:42:07 [anssik]
anssik: as usual, let's discuss open issues and review PRs based on your feedback
15:42:11 [anssik]
-> All open issues https://github.com/webmachinelearning/webnn/issues
15:42:16 [anssik]
-> All pull requests https://github.com/webmachinelearning/webnn/pulls
15:42:23 [anssik]
-> Triage guidance https://github.com/webmachinelearning/webnn/blob/main/docs/IssueTriage.md
15:43:28 [anssik]
Subtopic: Core operator set
15:43:50 [zkis]
https://github.com/webmachinelearning/webnn/issues/573
15:43:51 [gb]
https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset]
15:43:53 [anssik]
anssik: issue #573
15:43:53 [gb]
https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset]
15:43:55 [anssik]
jsbell: we're working on an additional contribution
15:44:38 [anssik]
... in principle, if there are emerging standards from the ecosystem, we should look at them as well to inform our work, describe them separately
15:46:28 [anssik]
Subtopic: Consider using label to allow better error handling for async errors
15:46:34 [anssik]
anssik: issue #585
15:46:35 [gb]
https://github.com/webmachinelearning/webnn/issues/585 -> Issue 585 Consider using `label` to allow better error handling for async errors. (by philloooo) [feature request]
15:47:37 [anssik]
jsbell: there's a difference between sync and async errors; in the initial implementation a lot of validation happens synchronously in the renderer process because XNNPACK is also running in that process
15:48:09 [dwayner]
dwayner has joined #webmachinelearning
15:48:36 [anssik]
... sync errors can be moved earlier, async errors are hard for developers and also frameworks to handle
15:49:06 [anssik]
... thinking how to report those errors, with promise rejection, how to know which node is responsible for the error?
15:49:12 [anssik]
... the proposed solution is to follow WebGPU's practice: define an MLObjectBase with a label field for MLOperand to extend from
15:49:16 [anssik]
-> GPUObjectBase.label https://www.w3.org/TR/webgpu/#gpuobjectbase
15:49:41 [anssik]
jsbell: when an async error is raised the developer has useful information about the reason
15:50:05 [anssik]
... more interesting if decomp or fusion is done
15:50:22 [anssik]
... Zoltan proposed we could auto-gen these labels
15:50:34 [anssik]
... we're interested in any feedback
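A speculative sketch of how the proposed `label` field from issue #585 could be used. `label` on MLOperand is not in the WebNN spec; this only illustrates how a developer-assigned, WebGPU-style label could make an async build rejection actionable:

```javascript
// Pure helper: attach the failing operand's label to an error message.
function describeBuildError(err, label) {
  return label ? `${err.message} (operand: "${label}")` : err.message;
}

// Hypothetical usage with a WebNN graph builder.
async function buildGraph(builder, input) {
  const act = builder.relu(input);
  act.label = 'encoder/relu0';           // proposed field, not yet in spec
  try {
    return await builder.build({ output: act });
  } catch (err) {
    // With labels, the rejection can point at the responsible node.
    throw new Error(describeBuildError(err, act.label));
  }
}
```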
15:50:36 [RafaelCintron]
q+
15:51:11 [anssik]
zkis: I just agree with the latest comment from Josh
15:51:14 [anssik]
ack RafaelCintron
15:51:31 [anssik]
RafaelCintron: wanted to say I'm in favour of labels and of sync validation
15:51:40 [anssik]
... I was also a proponent of .label in WebGPU
15:52:13 [anssik]
anssik: jsbell Google folks interested in implementing this?
15:52:21 [anssik]
jsbell: yes
15:52:50 [anssik]
... what happens if developers call build, and while the async build is happening code modifies the label of an operand? does the build step snapshot all the labels?
15:52:58 [anssik]
q?
15:53:26 [anssik]
Dwayne: for debugging this is very helpful
15:54:06 [anssik]
... what is the format of labels? it would be helpful to raise the errors sooner rather than later; I wonder if you can know all the backend capabilities early and do pre-validation; overall I like the idea
15:54:08 [anssik]
q?
15:54:29 [anssik]
q?
15:54:31 [zkis]
q+
15:54:34 [anssik]
ack zkis
15:54:55 [anssik]
zkis: we could keep generated labels separate from user-provided labels
15:55:02 [anssik]
... to be discussed in the issue
15:55:03 [anssik]
q?
15:55:46 [anssik]
Subtopic: Rename inputSize variables as inputRank in algorithms
15:55:50 [anssik]
anssik: issue #588
15:55:51 [gb]
https://github.com/webmachinelearning/webnn/issues/588 -> Issue 588 Rename inputSize variables as inputRank in algorithms (by inexorabletash) [conventions]
15:56:07 [anssik]
jsbell: this is very simple, comments welcome
15:56:12 [anssik]
Subtopic: Consider alternate styling/linking for method argument definitions
15:56:17 [anssik]
anssik: issue #574
15:56:18 [gb]
https://github.com/webmachinelearning/webnn/issues/574 -> Issue 574 Consider alternate styling/linking for method argument definitions (by inexorabletash) [question] [editorial] [conventions]
15:56:26 [anssik]
... question regarding styling method arguments with three alternatives:
15:56:31 [anssik]
... Alternative 1: Make args into definitions
15:56:34 [anssik]
... Alternative 2: Style as definition list
15:56:39 [anssik]
... Alternative 3: Auto-generated table
15:56:41 [anssik]
q?
15:57:13 [anssik]
q?
15:57:32 [anssik]
RRSAgent, draft minutes
15:57:33 [RRSAgent]
I have made the request to generate https://www.w3.org/2024/03/07-webmachinelearning-minutes.html anssik
16:01:47 [anssik]
s/works is/works if
16:02:22 [anssik]
s/chunck/chunk
16:07:19 [anssik]
s/… the WG/anssik: the WG
16:07:19 [anssik]
RRSAgent, draft minutes
16:07:20 [RRSAgent]
I have made the request to generate https://www.w3.org/2024/03/07-webmachinelearning-minutes.html anssik
16:09:07 [anssik]
s/the WG has/anssik: the WG has
16:09:09 [anssik]
RRSAgent, draft minutes
16:09:10 [RRSAgent]
I have made the request to generate https://www.w3.org/2024/03/07-webmachinelearning-minutes.html anssik
18:01:34 [Zakim]
Zakim has left #webmachinelearning