Meeting minutes
anssik: Welcome to our new participants Yang Gu, Jiajia Qin and Shaobo Yan from Microsoft
… also warm welcome to Yuichiro Tachibana from Hugging Face joining as a guest!
Yuichiro: I met W3C members at a conference discussing Transformers.js, thanks for inviting me, interested in client-side ML execution
WebML Working Group Charter update
Repository: w3c/machine-learning-charter
anssik: Working Group charter is up for renewal in 2025
… today we'll review the current charter, discuss and solicit input on proposed changes, triage open charter issues, and confirm the timeline
anssik: the key principle for WG rechartering is "if it ain't broken don't change it"
… this concerns the scope expansion specifically
… if the scope expands, existing participants need to re-join
Current WG Charter
anssik: Motivation and Background is informational
… Scope is an important section and is future-proofed with "well-known model architectures" and "major platform APIs"
… Out of Scope section includes training, hardware features and hardware algorithms
… Deliverables are WebNN API, Model Loader API as tentative, Ethical Principles WG Note
… Success Criteria is the standard one, two independent interoperable implementations
… Coordination enumerates groups we usually work with, but does not prevent us from working with other groups or projects outside this list
… the rest is standard charter boilerplate
Dom: good overview Anssi, Scope and Deliverables are the most important
… think no need to change Scope or add new Deliverables
anssik: Dom submitted PR #39 to align with the latest charter template, thanks!
<gb> Pull Request 39 Align with latest charter template (by dontcallmedom)
Diff
… any questions or comments about the current charter or Dom's proposed changes?
anssik: next we'll look at open charter issues
Model Loader API, keep as tentative or remove from scope?
anssik: issue #38
<gb> Issue 38 Model Loader API, keep as tentative or remove from scope? (by anssiko)
anssik: Model Loader API incubation has been paused since Feb 2023 and its known implementation was removed from Chromium after the experimentation phase
… unless there's interest, I'd propose to remove this from the WG Charter and revive this work in the WebML CG as appropriate
jsbell: not currently being pursued, supportive of removing
Core operator set, scope and coordination
anssik: issue #37
<gb> Issue 37 Core operator set, scope and coordination (by anssiko)
anssik: the current WG charter scope section is abstract enough to not warrant a revision to allow work on core op set to happen
… we could update the informative list of major platform APIs if there are changes:
"The APIs in scope of this group are not tied to any particular platform and are implementable on top of existing major platform APIs, such as Android Neural Networks API, Windows DirectML, and macOS/iOS Metal Performance Shaders and Basic Neural Network Subroutines."
… this is an open ended list, not being on this list does not exclude any platform APIs
anssik: related, in Coordination we note StableHLO, while we also consider MLIR Linalg, PyTorch Prims IR, TOSA for our compositional fundamentals research
… we probably don't want to enumerate all possible projects (MLIR, PyTorch, TOSA), so perhaps it is more balanced to remove the StableHLO reference?
Dom: being open-ended and informative, the only thing is this can be useful as a reminder of communities to seek wide review from
… perhaps under that lens, consider removing
Christian: seems good to remove, that seems like quite a specific one
jsbell: we normally list WHATWG in charters, most specs have dependencies on WebIDL, Infra and other specs and standards
Dom: we usually do that when there's a specific requirement
… e.g. new types for WebIDL required, any browser WG has a WHATWG relationship, would not put that as a hard requirement, can put it in if wanted
Task-based APIs and Prompt API
anssik: issue #36
<gb> Issue 36 Task-based APIs and Prompt API (by anssiko)
anssik: Task-based APIs and Prompt API were adopted into the WebML Community Group earlier this month
… these APIs are now incubated in the WebML CG
… proposal is to check for readiness for WebML WG adoption from time to time considering implementation experience, end user feedback
Dom: we could list them as tentative if we feel they are a likely target for adoption, concern is this space is very active with interesting IP questions
… bringing into charter might create friction in the AC review and require everyone to re-join
… proposal is to wait and see the traction
Christian: personally, would support adding the APIs, but understand and appreciate Dom's perspective
… compared to Model Loader API these APIs seem compatible
Dom: we are free to charter again 6 months from now when the incubations have made more progress
… Model Loader API has different IPR scope from task-based APIs
jsbell: given what Dom said, it has a very specific API called out, maybe in 6 months from now we would be in a better position to see what would make sense to capture as Tentative then
Dom: as part of our communication to AC, I will make sure we mention these APIs have been adopted in the CG and are under consideration for a future revision of the WG charter
christianliebel: what Dom says makes sense, having publicly visible commitment we are looking at these new APIs, it makes sense to not add to the WG charter right now to not create friction
Speech Synthesis
anssik: issue #31
<gb> Issue 31 Speech synthesis and machine learning (by r12a) [deferred]
anssik: suggestion to mention Speech Synthesis for symmetry with Speech Recognition in informative Motivation and Background
… I propose we update the informative text e.g. as follows:
"Speech Recognition and Speech Synthesis enable computers to recognize and translate spoken language into text and vice versa."
On-device training
anssik: no feedback that'd suggest we should bring on-device training in scope for the WG
… issue #27
<gb> Issue 27 On-device training (by anssiko) [deferred]
Charter development timeline
anssik: I'll prepare an updated charter with Dom early new year for your review
… then complete horizontal review, around mid-Feb we will initiate the AC review
… new charter start date would be 2025-05-01
Device selection abstractions update
Repository: webmachinelearning/webnn
anssik: PR #784
<gb> Pull Request 784 Add device selection explainer (by zolkis)
anssik: Zoltan has updated the explainer proposal, added new use cases, known implementation limitations, added MVP solution to remove explicit deviceType to make contexts device agnostic, allow multiple devices per context
Zoltan: folks will need some time to digest the space, we have documented intro, history, key use cases and requirements, considered alternatives
… recently added Minimum Viable Solution for your review
… one pain point is we were tied to device per context, while some platforms can execute on multiple devices
… also our device selection mechanism did not map well to platform APIs
… proposal is to:
… - Remove MLDeviceType as explicit context option.
… - Update MLContext so that it becomes device agnostic, or default/generic context. Allow supporting multiple devices with one context.
… - Add notes to implementations on how to map power preference to devices.
… - Improve the device selection hints in context options and define their implementation mappings.
… - Check if requesting a certain device type or combination of devices is still a use case.
… please review and chime in the PR if this direction has issues, otherwise I will proceed with a spec PR per this design
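The proposal above can be illustrated with a minimal TypeScript sketch. All names here (ContextOptions, Device, preferenceOrder, selectDevices) and the hint-to-device ordering are hypothetical assumptions for illustration only; they are not taken from the explainer or the WebNN spec. The sketch shows a context request that carries only a power preference hint, with no explicit device type, and an implementation-side mapping that may return more than one device for a single context.

```typescript
// Hypothetical sketch: none of these names are WebNN API definitions.
type PowerPreference = "default" | "high-performance" | "low-power";
type Device = "cpu" | "gpu" | "npu";

// Context options carry only a hint; there is no explicit deviceType.
interface ContextOptions {
  powerPreference?: PowerPreference;
}

// Assumed implementation note: preferred device order for each hint.
// A real implementation would derive this from the underlying platform.
const preferenceOrder: Record<PowerPreference, Device[]> = {
  "default": ["gpu", "npu", "cpu"],
  "high-performance": ["gpu", "npu", "cpu"],
  "low-power": ["npu", "cpu", "gpu"],
};

// Returns every available device, ordered by the hint, so that one
// device-agnostic context can be backed by multiple devices.
function selectDevices(options: ContextOptions, available: Device[]): Device[] {
  const order = preferenceOrder[options.powerPreference ?? "default"];
  return order.filter((d) => available.indexOf(d) !== -1);
}
```

For example, under these assumptions a "low-power" request on a machine with only a CPU and a GPU would yield the CPU first and the GPU second, rather than failing because no NPU is present.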
jsbell: explainer looks great, support getting the explainer merged
… where to capture feedback to support the explainer, is there an issue?
… Google team is on a vacation for the next few weeks
Zoltan: I can wait over the holiday for feedback
RafaelCintron: thanks for putting this together, couple of questions about the explainer
… at the end it says "allow supporting multiple devices in one context"
… does "other devices" mean a new type?
Zoltan: this would be abstracted away, defining a context as a combination of devices should also be possible
Rafael: low-latency, what kind of device would be low latency?
<jsbell> +1 that I had the same question as Rafael at first, so clarifying the text would be great.
Zoltan: low latency, if you want to optimize e.g. LLM throughput, tell that to the implementation and it will do its best to satisfy that, consider it a hint
… just a suggestion in which direction to extend the context options, I found low-latency hint in OpenVINO, we need to validate the use cases and craft hints based on that
<zkis> https://
Rafael: high-performance and low-power I know how to deal with, they exist on Windows, low-latency I'm not familiar with, CPU could be low-latency
Zoltan: I'd spec these as hints that if they cannot be fulfilled they're not errors
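The "hints are not errors" behavior can be sketched as follows. This is a hypothetical illustration, not spec text: the Hint values reuse the names mentioned in the discussion, while Resolution, resolveHint and the "default" fallback are assumptions introduced here.

```typescript
// Hypothetical sketch: names and fallback behavior are assumptions.
type Hint = "high-performance" | "low-power" | "low-latency";

interface Resolution {
  honored: boolean;          // whether the requested hint could be fulfilled
  applied: Hint | "default"; // what the implementation actually uses
}

// An unfulfillable (or missing) hint falls back to the default behavior
// instead of throwing, so callers never have to handle a hint error.
function resolveHint(requested: Hint | undefined, supported: Hint[]): Resolution {
  if (requested !== undefined && supported.indexOf(requested) !== -1) {
    return { honored: true, applied: requested };
  }
  return { honored: false, applied: "default" };
}
```

Reporting whether the hint was honored is a design choice of this sketch only; the discussion does not say whether that information would be exposed to developers.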
Zoltan: PR #784 would be the perfect place for feedback
<gb> Pull Request 784 Add device selection explainer (by zolkis)
<ningxin> +1 to provide feedback on PR
WebNN Operator Update Wave 3
anssik: Dwayne has a WIP PR for op set Wave 3, thanks Dwayne!
… OK to push WIP PR to upstream repo and mark it as a Draft
… this enables PR Preview and CI checkers that can be helpful for development
… and folks can help contribute
Dwayne: Austin has a few questions on uint4 on CoreML, does not block the spec PR
Core op set: MLIR Linalg findings revisited
anssik: we discussed Dwayne's extensive mapping table a month ago, Linalg specifically
Mapping table
… I wanted to bump this topic to get feedback on the 6 primitive ops proposed for inclusion into WebNN API informed by this research, they are:
1-D convolution with no channels
3-D convolution with no channels conv_3d
Fill output with random numbers fill_rng_2d
Dwayne: this set of 6 is implementable across backends
… no direct mapping for all of them, most directly implementable
… I can fill in the table with all the ops, take a subset of three backends Chromium uses and show how they map to these
jsbell: following the usual process, if implementable across backends, use cases, sounds good
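To make two of the primitives listed above concrete, here is a minimal plain TypeScript sketch. It assumes the 1-D convolution is the sliding-window form without kernel flipping, stride 1 and no padding, and it uses a common textbook linear congruential generator for the random fill. The function names conv1d and fillRng2d, the parameter lists, and the generator constants are assumptions for illustration; they are not claimed to match MLIR Linalg, WebNN, or any backend.

```typescript
// 1-D convolution with no channels (assumed: stride 1, no padding, no kernel flip).
function conv1d(input: number[], kernel: number[]): number[] {
  const outLen = input.length - kernel.length + 1;
  if (kernel.length === 0 || outLen <= 0) {
    throw new RangeError("kernel must be non-empty and no longer than input");
  }
  const output: number[] = [];
  for (let o = 0; o < outLen; o++) {
    let acc = 0;
    // Slide the kernel over the input window starting at position o.
    for (let k = 0; k < kernel.length; k++) {
      acc += input[o + k] * kernel[k];
    }
    output.push(acc);
  }
  return output;
}

// Fill a rows x cols output with pseudo-random numbers between min and max.
// The generator is an illustrative 32-bit LCG, not any backend's algorithm.
function fillRng2d(rows: number, cols: number, min: number, max: number, seed: number): number[][] {
  let state = seed >>> 0; // coerce the seed to an unsigned 32-bit integer
  const out: number[][] = [];
  for (let r = 0; r < rows; r++) {
    const row: number[] = [];
    for (let c = 0; c < cols; c++) {
      // The product stays below 2^53, so this arithmetic is exact in a double.
      state = (state * 1664525 + 1013904223) % 4294967296;
      const unit = state / 4294967296; // scaled into [0, 1)
      row.push(min + unit * (max - min));
    }
    out.push(row);
  }
  return out;
}
```

Calling fillRng2d twice with the same seed produces the same values, which is the property that makes a seeded fill testable across backends even when the generators themselves differ.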
Get to know Task-specific APIs and Prompt API
anssik: companion WebML Community Group now incubates selected task-specific APIs to enable reuse of the built-in models that are distributed as part of the browser or the underlying software platform
… as an observation, it seems Prompt API as a more general purpose and flexible API has received the most feedback
… at our last meeting we discussed how to version the models, and it was suggested as a topic to be discussed in this group, adapters and LoRAs, model management
Dom: open-ended question, one topic I'd expect to be brought up is questions on ethical considerations for ML
… would a model that is used by these APIs be documented in a way developers can learn how they're trained, bias, other qualities, a la Model Cards style info
… as we think about bringing specific models as part of API surface in browsers, thinking about ethical aspects is important
Christian: good questions, tough to answer, I have given dozens of presentations on these APIs, can confirm Prompt API has the most interest
… restricting it to extensions only is a problem for developers
… ethical part, how to make sure the answers are safe?
… is there filtering, can I query?
… overall I'm happy we have these APIs here, and look forward to seeing more
… for specific issues, output constraining is important, our customers use function calling, multi-modal, Prompt API represents what LLMs were a year ago
jsbell: +1 to what Christian said, we want to work through the Prompt API issues, also Dom thanks for ethical considerations
<dom> +1 that WebNN delegates that question rather than solving it :)
jsbell: in WebNN API there's transparency, but it pushes complexity to developer on how "responsible" the model is
jsbell: this is not the first time we have ML-backed APIs in the browser, we have Web Speech API
McCool: STT and TTS, use cases for Prompt API?
jsbell: we'll explore multimodal in 2025, current APIs are only text-to-text
Happy Holidays!
anssik: Thank You everyone for your significant contributions during 2024!
… our Working Group accomplished a lot this year
… a few highlights:
… WebNN API evolved driven by research into popular more advanced models, more diverse implementations
… WebNN API Candidate Rec Snapshot milestone was met in 2Q 2024
… the WG made in total ~100 spec publications and merged 180+ PRs this year
… many new active contributors joined, Dwayne as a co-editor, the group grew and diversified further
… we converged on new API abstractions for tensors, device selection, defined op set principles
… we improved the spec quality significantly with expert advice
… we organized our first F2F in Anaheim and it was a blast
… we made strong progress on the implementations across 3 backends, XPUs, multiple OSes
… we witnessed positive buzz in the tech industry around WebNN, made a few keynote appearances
… a lot of exciting demos and samples were published, wpt test coverage improved
… and much more!
… we're entering an exciting phase of development in 2025
… the WebNN API is expected to get in the hands of more developers for large-scale trials, and more
… feedback from developers and users will help guide our priorities
… Happy Holidays everyone -- please relax, disconnect, and recharge
… see you on our next call 16 Jan 2025!
<anssik> s/… Dom submitted/anssik: Dom submitted