Meeting minutes
Repository: webmachinelearning/webnn
anssik: I'm seeing many of our agenda topics receive your contributions over the past few weeks, thanks for moving things forward!
Last Call: TPAC 2024 registration and WebML WG F2F agenda
anssik: W3C TPAC 2024 takes place in Anaheim, CA, USA at the Hilton Anaheim on 23–27 September 2024. The WebML WG will meet F2F on Monday, 23 September 2024, 09:00–18:00 PDT.
… you should have received a TPAC meeting invite some hours ago, if not, please let me know
… this TPAC invite is titled "Web Machine Learning Working Group" to differentiate from our bi-weekly teleconferences
… we'll add the Zoom URL in an update, note it'll be different from the usual Zoom URL
… the registration is open until 13 September 2024
… the meeting hotel is now sold out on the 23rd, alternative hotels are still available:
Alternative hotels
… two things to do:
… 1) Please complete the registration form: https://
… if you participate remotely, please choose the "I will attend remotely" option
… 2) Review the in-development F2F agenda and provide your suggestions:
… WebML WG - TPAC 2024 agenda
… Our agenda is shaping up nicely and we'll have a number of guests joining and presenting on interesting topics
… now would be a good time to share any additional proposals as next we'll start to map the proposals to timeslots
… Please share your agenda suggestions in the GH issue by the end of your workday, or chime in now
<gb> Issue 25 WebML WG - TPAC 2024 agenda (by anssiko)
anssik: current F2F topics grouped:
… ethics:
… - in-browser explainability libraries and viz tools (Jay Wang, OpenAI / Georgia Tech)
… process-y and scope:
… - triage pass through open issues at F2F: breaking changes, priorities, next steps for the issue (All)
… - wide review status, close on TAG review feedback
… - W3C “living standards” topic (Dom?)
… - a refreshed analysis of popular models, operator & data type gaps (Dwayne et al.)
… new features:
… - Quantization and dequantization (QDQ), a bag of multiple issues (JoshB)
… - platform capability detection (Phillis et al.?)
… - future-proof device selection abstractions
… interop and cross-group coordination:
… - MLBuffer and WebGPU interop (MikeW)
MikeW: we are willing to contribute WebKit perspective to this discussion
anssik: customer feedback and trials:
… - Transformers.js WebNN backend (JoshuaL)
<Joshua_Lochner> xenova/
<gb> MERGED Pull Request 890 Initial WebNN Support (by ibelem)
anssik: - ONNX Runtime Web & WebNN EP
… - WebLLM + MLC discussion (TQ)
… - interop issues across different backends (Ningxin et al.)
… - next step for implementations, Origin Trial or equivalent and align with framework developer feedback
… - core operator set #573
<gb> Issue 573 Core operator set (by philloooo) [question] [opset]
anssik: incubations:
… - Built-in APIs for translation and prompting (Domenic)
… - custom ops (Ningxin)
<Joshua_Lochner> WebNN is now supported in Transformers.js V3 (see xenova/
anssik: - model management (McCool)
… now I'd like to understand if there is preference for any specific session(s) to happen AM or PM?
… my priority is to make sure any remote participants who are driving topics have as reasonable working hours as possible
Device selection abstractions
anssik: We agreed to evolve MLContextOptions and other API controls for device selection informed by further implementation experience and new use cases from the wider web community.
… thank you MikeW for submitting feedback on behalf of the WebKit project, we'll discuss it today
… issue #749
<gb> Issue 749 MLContextOptions.deviceType seems unnecessary outside of conformance testing (by mwyrzykowski) [device selection]
anssik: key points from the feedback:
… - MLContextOptions.deviceType is currently unimplementable via CoreML
… - MLContextOptions.deviceType, as currently specified, would lead to additional fragmentation, works on some devices, fails on others
… - The browser has better insight into workloads than the website author
anssik: MikeW, I'll let you expand on this WebKit feedback and then we can discuss better device selection abstractions you may have thought about
MikeW: if a model must run on the NPU and that fails, that's not optimal; a hints system would be better
… explicit device selection would also raise fingerprinting concerns
MikeW: power preference could perhaps be combined with the device type hints
… CoreML builds on the hints concept with MLComputeUnits
McCool: timing attacks are a concern, a particular GPU can be detected
MikeW: this was discussed in the WebGPU WG, a mitigation strategy for browser vendors is to add a tiny bit of noise to calculations
… timing can be addressed at the browser level to some extent, similar to canvas noise injection
… browser vendor can decide what level of privacy protection to add
… unless the browser leaves it to the lowest common denominator (LCD), it is harder to implement in a privacy-sensitive manner
RafaelCintron: fingerprinting is a concern, but I think we're somewhat limited in what we can do without compromising web developers' ability to do cool things
… there are a lot of capability bits that tell what extensions you can use with your GPU
… it cannot be that web developers know nothing about the hardware capabilities
<dwayner> This already exposes quite a bit of info, no need for timing comparisons - https://
RafaelCintron: wanted to ask MikeW about CoreML, we know CoreML has hints too and they were added for a reason
… if they're needed for writing native code on Apple platforms, it looks like we need something similar on the web platform, which is more diverse
… sometimes GPUs are better or worse, we should explore hinting mechanisms and see how people use them to inform our designs
… deferring hints to model creation time would complicate things
MikeW: the hints generally seem fine; the difference between a hint and a requirement is important. There is some overlap between power preference and device type, do you need both?
RafaelCintron: would it be OK to have hints and know what was picked?
MikeW: don't understand why the website should know what was picked (by the implementation)
… browser could say I support types CPU and NPU
anssik: how do you mitigate WebGPU extension fingerprinting?
MikeW: by bucketization, this is a problem that we try to mitigate by grouping similar devices
… we have the freedom to change the buckets later
… extension detection is more of a problem on Intel Macs, which support more extensions; even there many extensions are very similar, and across Apple silicon Macs you cannot tell which Apple silicon Mac is used
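The bucketization idea MikeW describes can be illustrated with a minimal sketch. Everything in it is hypothetical: the extension names, the bucket definitions and the `bucket_for` helper are made up for illustration and are not WebKit's actual logic. The point it shows is that a page only observes the coarse bucket, not the device's exact capability set.

```python
# Hypothetical sketch of capability bucketization: many concrete devices
# collapse into a few coarse buckets, and only the bucket is exposed.

# Illustrative buckets, ordered from most to least capable (names are made up).
BUCKETS = [
    ("tier-2", frozenset({"ext-a", "ext-b", "ext-c"})),
    ("tier-1", frozenset({"ext-a"})),
    ("tier-0", frozenset()),
]


def bucket_for(device_extensions):
    """Return (bucket name, exposed extensions) for a device's real extension set."""
    real = frozenset(device_extensions)
    for name, required in BUCKETS:
        # Pick the first (most capable) bucket the device fully satisfies.
        if required <= real:
            return name, sorted(required)
    raise AssertionError("unreachable: the empty bucket matches every device")


# Two different devices that land in the same bucket look identical to a page.
device_x = {"ext-a", "ext-b", "ext-c", "ext-d"}
device_y = {"ext-a", "ext-b", "ext-c", "ext-e"}
print(bucket_for(device_x))
print(bucket_for(device_y))
```

Because the bucket table lives in the implementation, it can be regrouped later without changing what pages are allowed to observe, which matches the "freedom to change the buckets later" remark.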
ningxin: thanks for the good discussion, want to provide another aspect I shared in #623
<gb> Issue 623 WebNN should support NPU and QDQ operations (by wchao1115) [v2] [opset] [feature request] [device selection]
ningxin: frameworks may want to make a WebGPU subgraph, e.g. ONNX Runtime runs a model on a Wasm EP
… part of this subgraph can be accelerated on WebNN while some ops run on the Wasm EP; in that case we want the WebNN graphs to be executed on the CPU device, so tensor data exchange stays on the same device without a perf cliff from copying data back and forth
… WebNN context can be created from GPU device
… want to mention this mixed execution device scenario, subgraph execution and efficient data exchange
MikeW: I think these use cases are great, and browser implementations have insights to shift compute to the most appropriate devices
… some indication can be provided ahead of time that you want to run on multiple devices; WebKit is open to that in the abstract, with some concerns about the design specifics
dwayner: currently there are only concrete device types, one option could be to add a "do whatever is the best" device type, that'd relax the device type to a device type hint
… I see some cases where you want to define GPU explicitly, maybe discrete vs integrated
… website author may have more insights into the workload than the browser, especially when classical apps are becoming web apps
… MLComputeUnits could expose more low level controls
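The hint-versus-requirement distinction raised in this discussion can be sketched as follows. This is a minimal illustration under stated assumptions: the device names mirror the discussion (cpu, gpu, npu), but `require_device`, `hint_device` and the fallback orders are hypothetical and do not describe WebNN, CoreML or any browser's behavior.

```python
# Hypothetical sketch contrasting a device *requirement* with a device *hint*.
# The fallback orders below are arbitrary, illustrative assumptions.

FALLBACK_ORDER = {
    "npu": ["npu", "gpu", "cpu"],
    "gpu": ["gpu", "cpu"],
    "cpu": ["cpu"],
}


def require_device(device, available):
    """Requirement semantics: fail when the requested device is unavailable."""
    if device not in available:
        raise RuntimeError(f"{device} is not available")
    return device


def hint_device(hint, available):
    """Hint semantics: the implementation picks the closest available device."""
    for candidate in FALLBACK_ORDER.get(hint, ["cpu"]):
        if candidate in available:
            return candidate
    # No preferred device matched; the implementation still chooses something.
    # Assumes at least one device is available.
    return sorted(available)[0]


available = {"cpu", "gpu"}  # a machine without an NPU
print(hint_device("npu", available))  # falls back instead of failing
try:
    require_device("npu", available)
except RuntimeError as err:
    print("requirement failed:", err)
```

With requirement semantics the same page works on some machines and fails on others, which is the fragmentation concern in the WebKit feedback; with hint semantics the call always resolves to some device.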
Open issues and PRs
anssik: on our last call we did not have time to acknowledge great work done during July. Belated thank you!
anssik: here's what landed since end of June while our calls were on a break:
anssik: normative/breaking changes:
… issues #442 #678 fixed by PR #647
anssik: issue #466 fixed by PR #649
<gb> MERGED Pull Request 649 Add axis argument to softmax() (by inexorabletash)
<gb> CLOSED Issue 466 Softmax axis absent (by fdwr) [operator specific]
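As a rough illustration of what an axis argument to softmax() means, here is a minimal pure-Python sketch for a 2-D input. The helpers `softmax_1d` and `softmax_2d` are hypothetical names for this example only; this is not the algorithm text from PR 649, just the usual definition applied along one axis.

```python
import math


def softmax_1d(values):
    """Numerically stable softmax of a flat list of numbers."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_2d(matrix, axis):
    """Softmax of a 2-D nested list along axis 0 (columns) or 1 (rows)."""
    if axis == 1:
        return [softmax_1d(row) for row in matrix]
    if axis == 0:
        # Normalize each column, then transpose back to row-major layout.
        columns = [softmax_1d(list(col)) for col in zip(*matrix)]
        return [list(row) for row in zip(*columns)]
    raise ValueError("axis must be 0 or 1 for a 2-D input")


data = [[1.0, 2.0], [3.0, 4.0]]
print(softmax_2d(data, axis=1))  # each row sums to 1
print(softmax_2d(data, axis=0))  # each column sums to 1
```

The same input gives different results depending on the axis, which is why an implicit axis is ambiguous for inputs of rank greater than one.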
anssik: as proposed in issue #689 fixed by PR #718
<gb> MERGED Pull Request 718 Replace MLActivation with MLRecurrentNetworkActivation (by a-sully)
<gb> Issue 689 Consider removing `lstm` and `gru` operators (by a-sully) [question] [operator specific]
anssik: issue #629 fixed by PR #724
… issue from code review fixed by PR #719 - rename
<gb> MERGED Pull Request 724 change argmin/argmax to take scalar axis (by philloooo)
anssik: issue #652 fixed by PR #722
anssik: issue #531 fixed by PR #723 - error handling
anssik: tools related:
<gb> MERGED Pull Request 722 Remove argmin/max selectLastIndex parameter (by philloooo)
anssik: issue n/a fixed by PR #720 - tools
anssik: issue n/a fixed by PR #727 - lint
anssik: issue n/a fixed by PR #728 - lint workflow
<gb> CLOSED Issue 531 Drop the support of synchronous execution (by huningxin)
anssik: issue n/a fixed by PR #731 - lint
<gb> MERGED Pull Request 727 Lint: Improve logic for validating algorithm steps. (by inexorabletash)
anssik: other issues, editorial, clarifications etc.
anssik: issue #713 fixed by PR #715
… issue #567 fixed by PR #717
<gb> MERGED Pull Request 715 Fix #713: DOMString to USVString (by zolkis)
anssik: issue #574 fixed by PR #721
<gb> CLOSED Issue 713 Use USVString for operand name (by philloooo) [bug]
anssik: issue #443 fixed by PR #725 - security considerations
<gb> MERGED Pull Request 717 Allow MLGraphBuilder.build() to be called only once (by a-sully)
anssik: issue #489 fixed by PR #726
<gb> CLOSED Issue 567 Can an MLGraphBuilder be reused? (by reillyeon) [question]
anssik: issue n/a fixed by PR #729
<gb> MERGED Pull Request 721 Style: Link method argument definitions (by inexorabletash)
anssik: issue #653 fixed by PR #730 - add dict member
anssik: issue #351 fixed by PR #732
anssik: issue n/a fixed by PR #733
anssik: issue n/a fixed by PR #735
anssik: issue #734 fixed by PR #738
<gb> MERGED Pull Request 730 Add outputDataType to argmin/argmax (by philloooo)
anssik: issue #740 fixed by PR #741
anssik: issue #585 fixed by PR #742
<gb> MERGED Pull Request 732 Reference newly landed WebIDL "transferable" definition (by inexorabletash)
anssik: issue #590 fixed by PR #743
anssik: issue #378 fixed by PR #745
… issue n/a fixed by PR #746
anssik: thanks Josh, Ningxin, Dwayne, Zoltan, Austin, Phillis, Shiyi, Bruce, others for PRs & reviews!
<gb> MERGED Pull Request 738 Update dimension valid range to signed integer (by philloooo)
anssik: a lot of progress in the spec, but also in tools, lint and workflow improvements that will make editing this spec even more fun :-)
<gb> CLOSED Issue 734 Restrict dimensions size to [0, INT32_MAX] (by philloooo) [interop]
<gb> CLOSED Issue 590 Consider adopting new broadcasting rules (by a-sully) [question] [interop]
<gb> CLOSED Issue 378 Add samples of shape broadcasting (by huningxin) [testing]
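For readers unfamiliar with shape broadcasting, a short sample may help. The sketch below assumes NumPy-style bidirectional broadcasting; the minutes do not say which rules the spec adopted, and `broadcast_shapes` is a hypothetical helper written for this example.

```python
def broadcast_shapes(shape_a, shape_b):
    """Bidirectional (NumPy-style) broadcast of two shapes; raises if incompatible."""
    rank = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = [1] * (rank - len(shape_a)) + list(shape_a)
    b = [1] * (rank - len(shape_b)) + list(shape_b)
    result = []
    for dim_a, dim_b in zip(a, b):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return result


print(broadcast_shapes([2, 1, 3], [4, 1]))  # [2, 4, 3]
print(broadcast_shapes([3], [2, 3]))        # [2, 3]
```

Shapes such as [2, 3] and [4, 3] are rejected, since neither of the mismatched leading dimensions is 1.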
<gb> MERGED Pull Request 746 Bug fix: softmax()'s axis argument should EnforceRange (by inexorabletash)
Zoltan: interested in PR #754
<gb> Pull Request 754 Add MLTensor explainer (by a-sully)
MLTensor explainer
ningxin: MLTensor is a rename of MLBuffer, captures all the discussion around MLBuffer
… captures the consensus on different MLBuffer issues
… my +1
… I need Bryan, Rafael, Dwayne and others to take a look
Zoltan: good to have consensus before TPAC
ningxin: I think the current MLBuffer issues are good, we can consider MLTensor a rename of MLBuffer
… Bryan has an issue to rename MLBuffer to MLTensor, I already gave my thumbs up and Austin too
… MLTensor is an evolved MLBuffer, all MLBuffer discussions prior are still relevant
… we can probably close some MLBuffer issues that have been addressed
Zoltan: explainer is a public summary of all the arguments
ningxin: we want ONNX RT feedback on MLTensor with Chromium, this explainer will help solicit that
… that's why Austin volunteered to put this explainer in place
… another thing, with MLTensor and dispatch we can probably think about deprecating compute(), which is a breaking change; we can discuss that with the MLTensor dispatch proposal
[operator specific] How to define the algorithm of L2_Pool2d?
anssik: issue #278
<gb> Issue 278 How to define the algorithm of L2_Pool2d? (by mingmingtasd) [question] [operator specific]
ningxin: update from Jiewei: a TFLite engineer has been assigned to fix the related bug
… this issue was blocking the Chromium TFLite backend development for l2Pool2d
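For context on what the operator computes, here is a minimal sketch of L2 pooling over a 2-D input. It assumes the Lp-norm definition with p = 2 (the square root of the sum of squares in each window) and no padding; the minutes do not settle which definition the spec adopts, and `l2_pool2d` is a hypothetical helper, not the spec algorithm or any backend's implementation.

```python
import math


def l2_pool2d(image, window, stride):
    """L2 pooling of a 2-D list: sqrt of the sum of squares in each window (no padding)."""
    win_h, win_w = window
    stride_h, stride_w = stride
    out_h = (len(image) - win_h) // stride_h + 1
    out_w = (len(image[0]) - win_w) // stride_w + 1
    output = []
    for oy in range(out_h):
        row = []
        for ox in range(out_w):
            top, left = oy * stride_h, ox * stride_w
            total = sum(
                image[top + dy][left + dx] ** 2
                for dy in range(win_h)
                for dx in range(win_w)
            )
            row.append(math.sqrt(total))
        output.append(row)
    return output


image = [
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 6.0, 8.0],
]
print(l2_pool2d(image, window=(2, 2), stride=(2, 2)))  # [[5.0, 10.0]]
```

A variant that divides the sum of squares by the window size before taking the square root would give different numbers for the same input, so the choice of definition matters for interop across backends.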