W3C

– DRAFT –
WebML WG Teleconference – 22 August 2024

22 August 2024

Attendees

Present
Anssi_Kostiainen, Dwayne_Robinson, Joshua_Lochner, Michael_McCool, Mike_Wyrzykowski, Ningxin_Hu, Rafael_Cintron, Zoltan_Kis
Regrets
-
Chair
Anssi
Scribe
Anssi, anssik

Meeting minutes

Repository: webmachinelearning/webnn

anssik: I'm seeing many of our agenda topics receive your contributions over the past few weeks, thanks for moving things forward!

Last Call: TPAC 2024 registration and WebML WG F2F agenda

anssik: W3C TPAC 2024 takes place in Anaheim, CA, USA at the Hilton Anaheim on 23–27 September 2024. The WebML WG will meet F2F on Monday, 23 September 2024, 09:00–18:00 PDT.
… you should have received a TPAC meeting invite some hours ago, if not, please let me know
… this TPAC invite is titled "Web Machine Learning Working Group" to differentiate from our bi-weekly teleconferences
… we'll add the Zoom URL in an update, note it'll be different from the usual Zoom URL
… the registration is open until 13 September 2024
… the meeting hotel is now sold out on the 23rd, alternative hotels are still available:

Alternative hotels
… two things to do:
… 1) Please complete the registration form: https://www.w3.org/2024/09/TPAC/#registration
… if you participate remotely, please choose the "I will attend remotely" option
… 2) Review the in-development F2F agenda and provide your suggestions:
WebML WG - TPAC 2024 agenda
… Our agenda is shaping up nicely and we'll have a number of guests joining and presenting on interesting topics
… now would be a good time to share any additional proposals as next we'll start to map the proposals to timeslots
… Please share your agenda suggestions in the GH issue by the end of your workday, or chime in now

<gb> Issue 25 WebML WG - TPAC 2024 agenda (by anssiko)

anssik: current F2F topics grouped:
… ethics:
… - in-browser explainability libraries and viz tools (Jay Wang, OpenAI / Georgia Tech)
… process-y and scope:
… - triage pass through open issues at F2F: breaking changes, priorities, next steps for the issue (All)
… - wide review status, close on TAG review feedback
… - W3C “living standards” topic (Dom?)
… - a refreshed analysis of popular models, operator & data type gaps (Dwayne et al.)
… new features:
… - Quantization and dequantization (QDQ), a bag of multiple issues (JoshB)
… - platform capability detection (Phillis et al.?)
… - future-proof device selection abstractions
… interop and cross-group coordination:
… - MLBuffer and WebGPU interop (MikeW)

MikeW: we are willing to contribute WebKit perspective to this discussion
anssik: customer feedback and trials:
… - Transformers.js WebNN backend (JoshuaL)

<Joshua_Lochner> xenova/transformers.js#890

<gb> MERGED Pull Request 890 Initial WebNN Support (by ibelem)

anssik: - ONNX Runtime Web & WebNN EP
… - WebLLM + MLC discussion (TQ)
… - interop issues across different backends (Ningxin et al.)
… - next step for implementations, Origin Trial or equivalent and align with framework developer feedback
… - core operator set #573

<gb> Issue 573 Core operator set (by philloooo) [question] [opset]

anssik: incubations:
… - Built-in APIs for translation and prompting (Domenic)
… - custom ops (Ningxin)

<Joshua_Lochner> WebNN is now supported in Transformers.js V3 (see xenova/transformers.js#890 (comment)). The last thing to do is to add this to the config.json so that the user doesn't need to specify freeDimensionOverrides

anssik: - model management (McCool)
… now I'd like to understand if there is preference for any specific session(s) to happen AM or PM?
… my priority is to make sure any remote participants who are driving topics have as reasonable working hours as possible

Event Times Around the World

Device selection abstractions

anssik: We agreed to evolve MLContextOptions and other API controls for device selection informed by further implementation experience and new use cases from the wider web community.
… thank you MikeW for submitting feedback on behalf of the WebKit project, we'll discuss it today
… issue #749

<gb> Issue 749 MLContextOptions.deviceType seems unnecessary outside of conformance testing (by mwyrzykowski) [device selection]

anssik: key points from the feedback:
… - MLContextOptions.deviceType is currently unimplementable via CoreML
… - MLContextOptions.deviceType, as currently specified, would lead to additional fragmentation, works on some devices, fails on others
… - The browser has better insight into workloads than the website author

anssik: MikeW, I'll let you expand on this WebKit feedback and then we can discuss better device selection abstractions you may have thought about

MikeW: if we require a model to run on the NPU and it fails, that is not optimal; a hints system would be better
… exposing explicit device selection would lead to fingerprinting concerns

MikeW: power preference could perhaps be combined with the device type hints
… CoreML builds on the hints concept with MLComputeUnits
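
The difference between a hint and a requirement discussed here can be sketched as follows. This is a minimal illustration only: `DeviceHints`, `resolveDevice`, and the fallback orderings are hypothetical names and choices, not part of the WebNN API or of any implementation.

```typescript
// Hypothetical names throughout; none of these are part of the WebNN API.
type DeviceType = "cpu" | "gpu" | "npu";
type PowerPreference = "default" | "high-performance" | "low-power";

interface DeviceHints {
  preferredDevice?: DeviceType;      // a hint, not a requirement
  powerPreference?: PowerPreference; // may overlap with the device hint
}

// Resolve hints against the devices the platform can actually use.
// An unsatisfiable hint degrades to a fallback instead of failing.
function resolveDevice(
  hints: DeviceHints,
  available: readonly DeviceType[],
): DeviceType {
  const preferred = hints.preferredDevice;
  if (preferred !== undefined && available.indexOf(preferred) !== -1) {
    return preferred;
  }
  // Fallback orderings are assumptions chosen for illustration only.
  const order: readonly DeviceType[] =
    hints.powerPreference === "low-power"
      ? ["npu", "cpu", "gpu"]
      : hints.powerPreference === "high-performance"
        ? ["gpu", "npu", "cpu"]
        : ["cpu", "gpu", "npu"];
  for (const device of order) {
    if (available.indexOf(device) !== -1) {
      return device;
    }
  }
  // Only reached when the platform exposes no device at all.
  throw new Error("no compute device available");
}
```

With this shape, asking for an NPU on a machine without one still yields a usable device, which is the behavior a requirement-style `deviceType` would not give.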

McCool: timing attack concerns, a particular GPU can be detected

<MikeWyrzykowski> ?+

MikeW: this was discussed in the WebGPU WG, a mitigation strategy for browser vendors is to add a tiny bit of noise to calculations
… timing attacks can be mitigated at the browser level to some extent, similar to canvas noise injection
… browser vendor can decide what level of privacy protection to add
… unless the browser falls back to the lowest common denominator (LCD), this is harder to implement in a privacy-sensitive manner
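
One possible reading of the noise-injection mitigation is sketched below: a measured duration is perturbed by bounded random noise and then coarsened before it is exposed. The helper `fuzzDuration`, its parameters, and the idea of combining noise with coarsening are assumptions for illustration, not a description of what any browser does.

```typescript
// Hypothetical helper, not a browser or WebNN API.
// rng returns a value in [0, 1), like Math.random; it is injectable so the
// behavior can be tested deterministically.
function fuzzDuration(
  measuredMs: number,
  resolutionMs: number,
  maxNoiseMs: number,
  rng: () => number = Math.random,
): number {
  // Symmetric noise in [-maxNoiseMs, +maxNoiseMs).
  const noise = (rng() * 2 - 1) * maxNoiseMs;
  // Coarsen to a multiple of resolutionMs so small differences disappear.
  const coarse = Math.round((measuredMs + noise) / resolutionMs) * resolutionMs;
  // A duration is never reported as negative.
  return Math.max(0, coarse);
}
```

How much noise and how coarse a resolution to use is the per-vendor privacy decision mentioned above.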

RafaelCintron: for fingerprinting, it is a concern but I think we're somewhat limited in what we can do without compromising web developer's ability to do cool things
… there are a lot of capability bits that tell you which extensions you can use with your GPU
… it cannot be that the web developer knows nothing about the hardware capabilities

<dwayner> This already exposes quite a bit of info, no need for timing comparisons - https://developer.mozilla.org/en-US/docs/Web/API/GPUSupportedLimits

RafaelCintron: wanted to ask MikeW about CoreML, we know CoreML has hints too and they were added for a reason
… if they're needed for writing native code on the Apple platform, it looks like we need something similar on the web platform, which is more diverse
… sometimes GPUs are better or worse, we should explore hinting mechanisms and see how people use them to inform our designs
… deferring hints to model creation time would complicate things

MikeW: the hints generally seem fine, the difference between a hint and a requirement is important; there is some overlap between power preference and device type, do you need both?

RafaelCintron: would it be OK to have hints and know what was picked?

MikeW: don't understand why the website should know what was picked (by the implementation)
… browser could say I support types CPU and NPU

anssik: how do you mitigate WebGPU extension fingerprinting?

MikeW: by bucketization, this is a problem that we try to mitigate by grouping similar devices
… we have the freedom to change the buckets later
… extension detection is more of a problem on Intel Macs, which support more extensions; even there, many extensions are very similar to those on Apple silicon Macs, so you cannot tell which Apple silicon Mac is used
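
Bucketization as described here can be sketched with a small helper that reports a coarse tier instead of the exact hardware value. The function name `bucketize` and the tier values in the usage note are hypothetical; real browsers choose, and may later revise, their own buckets.

```typescript
// Hypothetical helper; tiers are whatever values the browser decides to expose.
function bucketize(actual: number, tiers: readonly number[]): number {
  // Report the largest tier that does not exceed the real value, so the
  // reported capability is always safe for the page to rely on.
  let reported: number | undefined;
  for (const tier of tiers) {
    if (tier <= actual && (reported === undefined || tier > reported)) {
      reported = tier;
    }
  }
  if (reported === undefined) {
    throw new Error("value is below the lowest tier");
  }
  return reported;
}
```

For example, with tiers `[8192, 16384, 32768]`, devices whose real limits are 16384 and 20000 both report 16384 and become indistinguishable on that axis.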

ningxin: thanks for the good discussion, want to provide another aspect I shared in #623

<gb> Issue 623 WebNN should support NPU and QDQ operations (by wchao1115) [v2] [opset] [feature request] [device selection]

ningxin: frameworks may want to make WebGPU subgraph, e.g. ONNX Runtime runs a model on a Wasm EP
… some of these subgraphs can be accelerated on WebNN while other ops run on the Wasm EP; we want the WebNN graphs to be executed on the CPU device so tensor data exchange stays on the same device, without a perf cliff due to copying data back and forth
… a WebNN context can also be created from a GPU device
… want to mention this mixed execution device scenario, subgraph execution and efficient data exchange
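
The mixed-execution scenario can be illustrated with a simplified partitioner. It assumes a flat, ordered list of operator names rather than a real graph; `partitionOps`, `countHandoffs`, and the operator names in the usage note are hypothetical and are not taken from ONNX Runtime or WebNN.

```typescript
// Hypothetical types; a real framework partitions a graph, not a flat list.
type Backend = "webnn" | "wasm";

interface Partition {
  backend: Backend;
  ops: string[];
}

// Group consecutive ops by the backend that will run them.
function partitionOps(
  ops: readonly string[],
  webnnSupported: readonly string[],
): Partition[] {
  const partitions: Partition[] = [];
  for (const op of ops) {
    const backend: Backend =
      webnnSupported.indexOf(op) !== -1 ? "webnn" : "wasm";
    const last =
      partitions.length > 0 ? partitions[partitions.length - 1] : undefined;
    if (last !== undefined && last.backend === backend) {
      last.ops.push(op);
    } else {
      partitions.push({ backend, ops: [op] });
    }
  }
  return partitions;
}

// Each boundary between partitions is a point where tensor data changes hands.
function countHandoffs(partitions: readonly Partition[]): number {
  return Math.max(0, partitions.length - 1);
}
```

A model `["conv2d", "relu", "customOp", "matmul"]` with `customOp` unsupported by WebNN yields three partitions and two handoffs. If the WebNN partitions ran on a GPU or NPU while the Wasm partition ran on the CPU, each handoff would imply a cross-device copy, which is the perf cliff described above.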

MikeW: I think these use cases are great, and browser implementations have insights to shift compute to the most appropriate devices
… ahead of time some indication can be provided that you want to run it on multiple devices, WebKit is open to that in the abstract, with some concerns about the design specifics

dwayner: currently there are only concrete device types, one option could be to add a "do whatever is the best" device type, that'd relax the device type to a device type hint
… I see some cases where you want to define GPU explicitly, maybe discrete vs integrated
… website author may have more insights into the workload than the browser, especially when classical apps are becoming web apps
… MLComputeUnits could expose more low level controls

Open issues and PRs

anssik: on our last call we did not have time to acknowledge great work done during July. Belated thank you!

Recently merged PRs

anssik: here's what landed since end of June while our calls were on a break:
… normative/breaking changes
… issues #442 #678 fixed by PR #647

<gb> MERGED Pull Request 647 Introduce MLNumber for specifying numeric inputs of any type (by inexorabletash)

<gb> CLOSED Issue 442 Type of some parameters should match the input data type (by Honry) [feature request] [operator specific]

<gb> CLOSED Issue 678 Specifies scalar values casted to match input type. (by philloooo) [feature request]
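
The idea behind issues 442 and 678, that a scalar argument is converted to the data type of the operand it is used with, can be sketched as below. The sketch borrows JavaScript typed-array conversion as a stand-in; `castScalar` is a hypothetical helper and the specification's actual conversion rules may differ (for example in how out-of-range integers are handled).

```typescript
// Illustrative only: uses typed-array conversion semantics as a stand-in
// for whatever conversion the specification defines.
type ScalarType = "float32" | "int32" | "uint8";

function castScalar(value: number, dataType: ScalarType): number {
  switch (dataType) {
    case "float32":
      // Rounds to the nearest representable 32-bit float.
      return new Float32Array([value])[0] ?? NaN;
    case "int32":
      // Truncates toward zero and wraps modulo 2^32.
      return new Int32Array([value])[0] ?? 0;
    case "uint8":
      // Truncates toward zero and wraps modulo 256.
      return new Uint8Array([value])[0] ?? 0;
  }
}
```

The practical consequence is that a JavaScript number such as `0.1` is not the same value once it is interpreted as a float32 operand parameter.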

anssik: issue #466 fixed by PR #649

<gb> MERGED Pull Request 649 Add axis argument to softmax() (by inexorabletash)

<gb> CLOSED Issue 466 Softmax axis absent (by fdwr) [operator specific]
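
What an axis argument means for softmax can be shown with a plain 2-D example. This is a reference-style sketch over nested arrays, not the WebNN `softmax()` method; the function name `softmax2d` and the restriction to two dimensions are assumptions made for brevity.

```typescript
// Softmax over a 2-D tensor along the given axis.
// axis 1 normalizes each row, axis 0 normalizes each column.
function softmax2d(
  input: readonly (readonly number[])[],
  axis: 0 | 1,
): number[][] {
  const rows = input.length;
  const cols = rows > 0 ? (input[0]?.length ?? 0) : 0;
  const at = (r: number, c: number): number => input[r]?.[c] ?? 0;

  // Pre-allocate the output with the same shape as the input.
  const output: number[][] = [];
  for (let r = 0; r < rows; r++) {
    const row: number[] = [];
    for (let c = 0; c < cols; c++) row.push(0);
    output.push(row);
  }

  const outer = axis === 1 ? rows : cols; // number of independent slices
  const inner = axis === 1 ? cols : rows; // length of each slice
  for (let i = 0; i < outer; i++) {
    const slice: number[] = [];
    for (let j = 0; j < inner; j++) {
      slice.push(axis === 1 ? at(i, j) : at(j, i));
    }
    // Subtract the maximum before exponentiating for numerical stability.
    let max = -Infinity;
    for (const v of slice) max = Math.max(max, v);
    const exps = slice.map((v) => Math.exp(v - max));
    let sum = 0;
    for (const e of exps) sum += e;
    for (let j = 0; j < inner; j++) {
      const r = axis === 1 ? i : j;
      const c = axis === 1 ? j : i;
      const row = output[r];
      if (row !== undefined) row[c] = (exps[j] ?? 0) / sum;
    }
  }
  return output;
}
```

Without an axis argument only one of these two normalizations can be expressed directly, which is the gap issue 466 describes.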

anssik: as proposed in issue #689 fixed by PR #718

<gb> MERGED Pull Request 718 Replace MLActivation with MLRecurrentNetworkActivation (by a-sully)

<gb> Issue 689 Consider removing `lstm` and `gru` operators (by a-sully) [question] [operator specific]

anssik: issue #629 fixed by PR #724
… issue from code review fixed by PR #719 - rename

<gb> MERGED Pull Request 724 change argmin/argmax to take scalar axis (by philloooo)

anssik: issue #652 fixed by PR #722

<gb> CLOSED Issue 629 argMax/Min only support scalar axis in TFLite runtime (by fujunwei) [operator specific] [interop]

anssik: issue #531 fixed by PR #723 - error handling

<gb> MERGED Pull Request 719 Rename where()'s parameters "input" and "other" to "trueValue" and "falseValue" (by shiyi9801)

anssik: tools related:

<gb> MERGED Pull Request 722 Remove argmin/max selectLastIndex parameter (by philloooo)

anssik: issue n/a fixed by PR #720 - tools

<gb> CLOSED Issue 652 ArgMax/Min `selectLastIndex` is not supported on CoreML (by philloooo) [bug] [operator specific] [interop]

anssik: issue n/a fixed by PR #727 - lint

<gb> MERGED Pull Request 723 Define error handling of MLNamedArrayBufferViews transfer algorithm (by inexorabletash)

anssik: issue n/a fixed by PR #728 - lint workflow

<gb> CLOSED Issue 531 Drop the support of synchronous execution (by huningxin)

<gb> MERGED Pull Request 720 Ensure every method dfn is correctly associated with an interface (by a-sully)

anssik: issue n/a fixed by PR #731 - lint

<gb> MERGED Pull Request 727 Lint: Improve logic for validating algorithm steps. (by inexorabletash)

anssik: other issues, editorial, clarifications etc.

<gb> MERGED Pull Request 728 Workflow: Add steps to run tools/lint.mjs after spec is built (by inexorabletash)

<gb> MERGED Pull Request 731 Update documentation and fix some minor issues with lint tool. (by inexorabletash) [editorial]

anssik: issue #713 fixed by PR #715
… issue #567 fixed by PR #717

<gb> MERGED Pull Request 715 Fix #713: DOMString to USVString (by zolkis)

anssik: issue #574 fixed by PR #721

<gb> CLOSED Issue 713 Use USVString for operand name (by philloooo) [bug]

anssik: issue #443 fixed by PR #725 - security considerations

<gb> MERGED Pull Request 717 Allow MLGraphBuilder.build() to be called only once (by a-sully)

anssik: issue #489 fixed by PR #726

<gb> CLOSED Issue 567 Can an MLGraphBuilder be reused? (by reillyeon) [question]
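
The single-use constraint from PR 717 can be pictured with a toy builder. `SingleUseBuilder` is a hypothetical stand-in and not the `MLGraphBuilder` interface; how the real API reports the error (exception type, rejected promise) is defined by the specification and is not modeled here.

```typescript
// Hypothetical stand-in class, not the WebNN MLGraphBuilder interface.
class SingleUseBuilder {
  private built = false;
  private readonly ops: string[] = [];

  // Record an operation; refused once the builder has been consumed.
  add(op: string): this {
    if (this.built) {
      throw new Error("builder has already been used");
    }
    this.ops.push(op);
    return this;
  }

  // Produce the graph exactly once; later calls fail.
  build(): readonly string[] {
    if (this.built) {
      throw new Error("build() may only be called once");
    }
    this.built = true;
    return this.ops.slice();
  }
}
```

Callers that need a second graph create a second builder, which answers the reuse question raised in issue 567.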

anssik: issue n/a fixed by PR #729

<gb> MERGED Pull Request 721 Style: Link method argument definitions (by inexorabletash)

anssik: issue #653 fixed by PR #730 - add dict member

<gb> CLOSED Issue 574 Consider alternate styling/linking for method argument definitions (by inexorabletash) [question] [editorial] [conventions]

anssik: issue #351 fixed by PR #732

<gb> MERGED Pull Request 725 Add security consideration for computation control-flow attacks (by inexorabletash)

<gb> CLOSED Issue 443 Add security consideration for computation control-flow attack based on weights / constants change (by huningxin) [editorial]

anssik: issue n/a fixed by PR #733

<gb> MERGED Pull Request 726 Clarify the cast() op behavior between different data types (by inexorabletash)

anssik: issue n/a fixed by PR #735

<gb> CLOSED Issue 489 Clarify the casting behavior from floating-point / signed integers <-> unsigned integers (by huningxin) [operator specific] [interop]

<gb> MERGED Pull Request 729 Avoid "sequence" in prose, linkify use in types (by inexorabletash) [editorial] [conventions]

anssik: issue #734 fixed by PR #738

<gb> MERGED Pull Request 730 Add outputDataType to argmin/argmax (by philloooo)

anssik: issue #740 fixed by PR #741

<gb> CLOSED Issue 653 Consider changing output type of ArgMax/Argmin to int32, or allow passing output_type (by philloooo) [bug] [operator specific] [interop]

anssik: issue #585 fixed by PR #742

<gb> MERGED Pull Request 732 Reference newly landed WebIDL "transferable" definition (by inexorabletash)

anssik: issue #590 fixed by PR #743

<gb> CLOSED Issue 351 Need to define error handling of MLNamedArrayBufferViews transfer algorithm (by huningxin) [bug]

anssik: issue #378 fixed by PR #745
… issue n/a fixed by PR #746

<gb> MERGED Pull Request 733 Editorial: Fix an instance of "nchw" enum member styling/linking (by inexorabletash) [editorial]

<gb> MERGED Pull Request 735 Editorial: Don't reference IDL boolean for values in tensors (by inexorabletash) [editorial]

anssik: thanks Josh, Ningxin, Dwayne, Zoltan, Austin, Phillis, Shiyi, Bruce, others for PRs & reviews!

<gb> MERGED Pull Request 738 Update dimension valid range to signed integer (by philloooo)

anssik: a lot of progress both in spec, but also in tools, lint, workflow improvements that will make editing this spec even more fun :-)

<gb> CLOSED Issue 734 Restrict dimensions size to [0, INT32_MAX] (by philloooo) [interop]

<gb> CLOSED Issue 740 Question on reduction with empty axes and scalar input (by philloooo) [question] [editorial]

<gb> MERGED Pull Request 741 Clarify reduction with empty axes and scalar input (by inexorabletash) [editorial]

<gb> MERGED Pull Request 742 Add optional operator labels for more diagnosable error messages (by inexorabletash)

<gb> CLOSED Issue 585 Consider adding node `label`s for more diagnosable error messages for async errors. (by philloooo) [feature request]

<gb> MERGED Pull Request 743 Make prelu() bidirectionally broadcast, improve broadcast wording (by inexorabletash)

<gb> CLOSED Issue 590 Consider adopting new broadcasting rules (by a-sully) [question] [interop]

<gb> MERGED Pull Request 745 Add blurb explaining broadcasting in more detail (by inexorabletash) [editorial]

<gb> CLOSED Issue 378 Add samples of shape broadcasting (by huningxin) [testing]

<gb> MERGED Pull Request 746 Bug fix: softmax()'s axis argument should EnforceRange (by inexorabletash)
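
The broadcasting rules touched by PRs 743 and 745 follow the familiar trailing-dimension alignment. The sketch below shows the general technique; the function names are hypothetical and the specification's own algorithm text remains the authority.

```typescript
// Bidirectional broadcast: either operand may be expanded.
// Returns the resulting shape, or undefined if the shapes are incompatible.
function broadcastShapes(
  a: readonly number[],
  b: readonly number[],
): number[] | undefined {
  const rank = Math.max(a.length, b.length);
  const result: number[] = [];
  for (let i = 0; i < rank; i++) {
    // Align from the trailing dimension; missing leading dimensions act as 1.
    const dimA = a[a.length - 1 - i] ?? 1;
    const dimB = b[b.length - 1 - i] ?? 1;
    if (dimA !== dimB && dimA !== 1 && dimB !== 1) {
      return undefined;
    }
    result.unshift(dimA === 1 ? dimB : dimA);
  }
  return result;
}

// Unidirectional broadcast: only `from` may be expanded, to exactly `to`.
function isUnidirectionallyBroadcastable(
  from: readonly number[],
  to: readonly number[],
): boolean {
  const result = broadcastShapes(from, to);
  return (
    result !== undefined &&
    result.length === to.length &&
    result.every((dim, i) => dim === to[i])
  );
}
```

For instance, shapes `[8, 1, 6, 1]` and `[7, 1, 5]` broadcast bidirectionally to `[8, 7, 6, 5]`, but neither is unidirectionally broadcastable to the other.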

Zoltan: interested in issue #754

<gb> Pull Request 754 Add MLTensor explainer (by a-sully)

MLTensor explainer

ningxin: MLTensor is a rename of MLBuffer, captures all the discussion around MLBuffer
… captures the consensus on different MLBuffer issues
… my +1
… I need Bryan, Rafael, Dwayne and others to take a look

Zoltan: good to have consensus before TPAC

ningxin: I think the current MLBuffer issues are good, we can consider MLTensor a rename of MLBuffer
… Bryan has an issue to rename MLBuffer to MLTensor, I already gave my thumbs up and Austin too
… MLTensor is an evolved MLBuffer, all MLBuffer discussions prior are still relevant
… we can probably close some MLBuffer issues that have been addressed

Zoltan: explainer is a public summary of all the arguments

ningxin: we want ONNX RT feedback on MLTensor with Chromium, and this explainer will help solicit that
… that's why Austin volunteered to put this explainer in place
… another thing, with MLTensor and dispatch, we can probably think about deprecating compute(), which is a breaking change; we can discuss that with the MLTensor dispatch proposal

[operator specific] How to define the algorithm of L2_Pool2d?

anssik: issue #278

<gb> Issue 278 How to define the algorithm of L2_Pool2d? (by mingmingtasd) [question] [operator specific]

ningxin: update from Jiewei, TFLite engineer assigned to fix the related bug
… this issue was blocking the Chromium TFLite backend development for l2Pool2d
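
For orientation, one common formulation of L2 pooling takes the square root of the sum of squares in each window. The sketch below uses that formulation in one dimension; whether this matches the definition eventually adopted for l2Pool2d, or what a given backend computes, is exactly the open question in issue 278, so treat the formula and the name `l2Pool1d` as assumptions.

```typescript
// 1-D L2 pooling without padding: each output element is
// sqrt(sum of squares) over one window. Assumed formulation; some runtimes
// may normalize differently.
function l2Pool1d(
  input: readonly number[],
  windowSize: number,
  stride: number,
): number[] {
  if (windowSize <= 0 || stride <= 0) {
    throw new Error("windowSize and stride must be positive");
  }
  const output: number[] = [];
  for (let start = 0; start + windowSize <= input.length; start += stride) {
    let sumOfSquares = 0;
    for (let k = 0; k < windowSize; k++) {
      const v = input[start + k] ?? 0;
      sumOfSquares += v * v;
    }
    output.push(Math.sqrt(sumOfSquares));
  }
  return output;
}
```

With `input = [3, 4, 6, 8]`, a window of 2 and a stride of 2, the windows are `[3, 4]` and `[6, 8]`, giving `[5, 10]`.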

Minutes manually created (not a transcript), formatted by scribe.perl version 229 (Thu Jul 25 08:38:54 2024 UTC).

Maybe present: anssik, dwayner, McCool, MikeW, ningxin, RafaelCintron, Zoltan

All speakers: anssik, dwayner, McCool, MikeW, ningxin, RafaelCintron, Zoltan

Active on IRC: anssik, dwayner, Joshua_Lochner, McCool, MikeWyrzykowski, ningxin, RafaelCintron