W3C

– DRAFT –
WebML WG Teleconference – 16 May 2024

16 May 2024

Attendees

Present
Anssi_Kostiainen, Austin_Sullivan, Bryan_Bernhart, Dwayne_Robinson, Joshua_Bell, Joshua_Lochner, Michael_McCool, Mike_Wyrzykowski, Ningxin_Hu, Rafael_Cintron, Sungpil_Shin, Zoltan_Kis
Regrets
-
Chair
Anssi
Scribe
Anssi, anssik

Meeting minutes

anssik: Our group continues to grow, please welcome our most recent new participants: Enrico Galli, Muthaiah Venkatachalam and Rahul Unnikrishnan Nair from Intel

Announcements

Repository: webmachinelearning/meetings

TPAC 2024

anssik: W3C's annual conference TPAC 2024 brings all working groups together to transcend group borders and coordinate solutions to technical issues, with public breakouts
… This year TPAC 2024 takes place 23-27 September 2024 in Anaheim, CA, USA
… this is an opportunity for the WG to finally meet in the context of TPAC after many years of working together
… as a non-binding poll: would participants like to get together during the TPAC week?

#23
… and thanks Reilly for already signaling your interest by opening an issue for discussion
… please give a (non-binding) thumbs up in the meetings issue #23 if you'd like to see this meeting happen
… any other feedback welcome there as well, you can also reach out to me privately via email on this matter

<gb> Issue 23 WebML WG Hybrid Meeting at TPAC 2024 (by reillyeon)

<jsbell> Clicked the thumbs up but: definitely planning to attend

WebNN Implementation Status Update

Implementation Status of WebNN Operations

What's new

<gb> MERGED Pull Request 71 Update the DirectML and MLService implementation status (by ibelem)

anssik: Implementation Status updated; first of all, thanks Belem for this update, and thanks to the various folks working on the implementations!
… changes since the previous update:
… - 78/78 (100%) ops implemented on DirectML backend
… - also updated MLService status

anssik: work in progress is to add CoreML implementation status

jsbell: we're in the process of removing XNNPACK from Chromium; the MLService that ChromeOS exposes is basically TFLite, and this will be used as a backend for Linux, Android and ChromeOS; the plumbing may differ per OS but the backend is the same
… so we'll have DML, TFLite and CoreML backends

ningxin: want to comment that we discussed this development with Belem; with TFLite replacing XNNPACK, and ML Service currently being TFLite-based, the table reflects TFLite op coverage
… the latest update predates the most recent change, so we'll update the table to expand MLService beyond ChromeOS to align with the new architecture
… XNNPACK is a subset of TFLite; we plan to do more investigation with Austin and Phillis to also add CoreML backend data, and will send a new PR soonish

NPU support

Repository: webmachinelearning/webnn

anssik: issue #623

<gb> Issue 623 WebNN should support NPU and QDQ operations (by wchao1115) [v2] [opset] [feature request]

Chromium implementation

anssik: We agreed to start formalizing the NPU support with the simplest design (option 1: deviceType: "npu"), informed by implementation experience.
… Dwayne indicated in the issue he is planning to start a PR for option 1 soonish; this allows the WG to reserve the option to potentially expand from this base, informed by further implementation experience on more backends, for example

anssik: is there any new information or considerations that should be brought to the group's attention?
… looking at the issue I see just thumbs up to start with option 1 and this is also consistent with our resolution from the last meeting:

https://www.w3.org/2024/05/02-webmachinelearning-minutes.html#t02

Dwayne: I could start with PR for option 1 today

ningxin: update on the prototyping and testing: we recently merged a PR into the webnn-samples repo that adds image classification models in fp16 and a UI to select the "npu" device type, allowing developers to test the prototype implementation on Intel NPU and CoreML
… I hope that will help people test and inform the spec development

<ningxin> webmachinelearning/webnn-samples#226

<gb> MERGED Pull Request 226 Add NPU device type and three fp16 models for image classification (by mingmingtasd)
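For context, option 1 simply extends the context options' device type with "npu". A minimal sketch of how a web developer might request it with a fallback chain; the helper name and fallback order are illustrative, not part of the spec:

```typescript
// Illustrative helper (not part of the spec): try "npu" first, then fall
// back, mirroring option 1's simple deviceType design. Per the privacy
// considerations, an unsatisfiable deviceType throws an "OperationError".
type MLDeviceType = "cpu" | "gpu" | "npu";

interface MLContextLike {
  deviceType: MLDeviceType;
}

async function createContextWithFallback(
  ml: { createContext(options: { deviceType: MLDeviceType }): Promise<MLContextLike> },
  order: MLDeviceType[] = ["npu", "gpu", "cpu"]
): Promise<MLContextLike> {
  for (const deviceType of order) {
    try {
      return await ml.createContext({ deviceType });
    } catch (e) {
      // Only fall back when this device type cannot be satisfied.
      if ((e as Error).name !== "OperationError") throw e;
    }
  }
  throw new Error("No supported deviceType");
}
```

In a browser the `ml` argument would be `navigator.ml`; it is typed structurally here so the sketch stays self-contained.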

anssik: I'd like to bring up one related consideration for discussion:
… Privacy and fingerprinting considerations of the expanded three bits of entropy
… we currently have the following privacy considerations in place

https://www.w3.org/TR/webnn/#privacy

"An MLDeviceType normatively indicates the kind of device and is either "cpu" or "gpu". If this type cannot be satisfied, an "OperationError" DOMException is thrown, thus this type can in some cases add two bits of entropy to the fingerprint."

anssik: with the addition of "npu" we'd add one additional bit of entropy
… we already anticipated this in the privacy considerations when we put in place this text:

"If a future version of this specification introduces support for a new MLDeviceType that can only support a subset of MLOperandDataTypes, that may introduce a new fingerprint."
… we could still keep this "future version" text in place to future-proof
… we will consult the Privacy IG on our next wide review cycle on this design and will update the privacy considerations based on their suggestions

Open issues and PRs

anssik: issues addressed ahead of the meeting were removed from the agenda, thanks again for moving at a high velocity!

Agenda diff

Debrief on PRs merged recently

anssik: as usual, JoshuaB has been on a roll again, submitting a bunch of PRs, thanks! Also thanks Ningxin, Dwayne, Austin, Zoltan and others for your PRs and PR reviews.
… issue (n/a) fixed by PR #672

<gb> MERGED Pull Request 672 Handful of algorithm and convention fixes (by inexorabletash) [editorial]

anssik: issue #572 fixed by PR #674

<gb> MERGED Pull Request 674 Use consistent phrasing for operator creation (by inexorabletash) [editorial]

<gb> CLOSED Issue 572 Synchronously validate input operands/activations (by inexorabletash) [bug] [question]

anssik: issue (n/a) fixed by PR #679

<gb> MERGED Pull Request 679 Build fix: Correct link for "transferred" term (by inexorabletash) [editorial]

anssik: issue (n/a) fixed by PR #680

<gb> MERGED Pull Request 680 Add missing definitions of inputShape to conv2d algorithms (by inexorabletash) [editorial]

anssik: issue #673 fixed by PR #682

<gb> CLOSED Issue 673 Meta: Introduce "Interop" label? (by inexorabletash) [process]

<gb> MERGED Pull Request 682 Process: Add "interop" label (by anssiko) [process]

anssik: issue #681 fixed by PR #683

<gb> MERGED Pull Request 683 Validate no duplicate axes for reduction ops (by inexorabletash)

<gb> CLOSED Issue 681 Shall we update the spec to add constraints on the axes and the input rank for the reduction operator? (by mei1127) [question] [operator specific]

anssik: issue #396 fixed by PR #684

<gb> CLOSED Issue 396 Clarify the restriction for `minValue` and `maxValue` of `MLClampOptions` (by huningxin) [operator specific]

<gb> MERGED Pull Request 684 Remove note about interop issues with clamp()'s minValue == maxValue (by inexorabletash) [editorial]

anssik: issue #675 fixed by PR #685
… issue #686 fixed by PR #687

<gb> MERGED Pull Request 685 Support int32 data type for 'indices' operand of 'gather' operator (by huningxin)

<gb> CLOSED Issue 675 why `gather` indices only accept uint32 or int64? (by philloooo) [question] [interop]

<gb> MERGED Pull Request 687 Validate layerNormalization options.axes (by inexorabletash)

<gb> CLOSED Issue 686 `layerNormalization()` method steps should validate the values of `options.axes` (by huningxin)

anssik: the non-editorial PRs include #683, #685 and #687 from Josh and Ningxin; would these benefit from a debrief?

<jsbell> checking...

ningxin: PR #685 int32 data type support for gather, thanks Phillis for providing the platform support data to inform this PR

<jsbell> (yeah, my two were very straightforward validation additions)

ningxin: int32 is a widely supported data type; this unblocks the WPT tests for gather on CoreML, and we also see ONNX RT issues where int32 support unblocks models

jsbell: #683 #687 reflect validation additions informed by implementation experience

[process] Introduce "interop" label

anssik: we have a new "interop" workstream; it comes with a shiny new "interop" label that we put on issues arising from differences between backends

https://github.com/webmachinelearning/webnn/blob/main/docs/IssueTriage.md#workstream

current "interop" issue

anssik: thanks to my triage team pal Josh for identifying the first batch of "interop" issues
… Josh anything else to share with the group about this?

jsbell: I was interested in adding the "interop" label because these are the ones where there's no easy solution, while for others we can make a call between e.g. A or B

[process] TypeScript Types Declarations for WebNN

anssik: issue #677

<gb> Issue 677 Missing TypeScript Type Declaration (by egalli) [process]

anssik: a proposal for TypeScript type definitions for WebNN similar to respective definitions for WebGPU

PROPOSED TypeScript type definitions for WebNN (repo)

TypeScript type definitions for WebGPU (repo)

TypeScript type definitions for WebGPU (index)

anssik: this seems like a useful project and could fit into the WebML CG, which is chartered to develop, among other things, "Other Software"

WebML CG Charter Test Suites and Other Software

anssik: The CG already uses the standard W3C 3-clause BSD License for its Test Suites contributions. The same license is a good fit for this types proposal, considered "Other Software", because the WebGPU group appears to use the same BSD 3-Clause License for its corresponding types project

W3C 3-clause BSD License

anssik: Enrico just joined the WebML CG (welcome!) and is prepared to make the initial contribution and help keep this project maintained
… I propose we set up a repo for this effort unless there are concerns

jsbell: having TS definitions sounds great; want to make sure we're transparent that anything in the spec can change
… any breaking changes in the spec will have an effect on other projects, including this one
… bikeshedding the name: WebGPU uses gpuweb/types -- do we call this webmachinelearning/types or something less generic like "webnn-types"?

anssik: if no concerns raised in the issue in the coming week, we'll create a repo for this project

<ningxin> "webnn-types" SGTM
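To illustrate what such a types package might contain, here is a small hand-written fragment modeled on how @webgpu/types mirrors WebGPU; the actual definitions would be maintained against the spec's Web IDL, and only the enum/dictionary names below come from the current spec:

```typescript
// Illustrative fragment only: real definitions would track the WebNN spec's
// Web IDL, the way @webgpu/types tracks WebGPU.
type MLDeviceType = "cpu" | "gpu";
type MLPowerPreference = "default" | "high-performance" | "low-power";

interface MLContextOptions {
  deviceType?: MLDeviceType;
  powerPreference?: MLPowerPreference;
}

// Compile-time check that an options bag is well-typed:
const options: MLContextOptions = { deviceType: "gpu", powerPreference: "default" };
```

Because the types are hand-maintained against a moving spec, any breaking spec change (as jsbell notes above) would surface as a compile error in downstream projects.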

[process] Broad device coverage and maintainability

anssik: issue #453

<gb> Issue 453 Google Chrome Feedback on WebNN: aiming for broad device coverage and maintainability (by vsekhar) [process] [opset] [use case]

anssik: I want us to revisit this high-level issue discussing broad device coverage and maintainability
… would like to discuss what has been done, what remains to be done, disseminate any new information
… a subset of the recommendations in this issue have been or are being discussed in topic-specific issues; I see these open issues mentioned: #456 and #573

<gb> Issue 456 Define the maximum number of operand dimensions (maximum rank) (by huningxin) [interop]

<gb> Issue 573 Core operator set (by philloooo) [question] [opset]

anssik: recently the group has focused on the models and hardware targets mentioned in this high-level summary, in both specification and prototyping efforts
… so, checking how to make progress with this issue: would a revision to this high-level summary be appropriate?

jsbell: quick update, I'm a big fan of closing issues
… for the most part the issue should remain open; the recommendations are valid and progress is being made on a lot of them
… four things:
… - 1) public positions from browser vendors
… would love feedback from Apple in particular
… we at the Chrome team do consider standards positions from Apple and Mozilla
… - 2) streamlining the API surface, core op set
… Austin opened an issue to revisit complex GRU and LSTM, e.g. CISC-like ops that can be difficult to implement on some backends

<asully> webmachinelearning/webnn#689

<gb> Issue 689 Consider removing `lstm` and `gru` operators (by a-sully) [question] [operator specific]

jsbell: - 3) Performance for CPU and GPU across OSes and backends
… work has happened on Windows and macOS, and Intel has shared early PnP numbers on NPU
… work is happening, but we need fully interoperable prototypes across all the OSes
… - 4) Performance using ML accelerator hardware

[bug] ArgMax/Min selectLastIndex is not supported on CoreML

anssik: issue #652

<gb> Issue 652 ArgMax/Min `selectLastIndex` is not supported on CoreML (by philloooo) [bug] [operator specific] [interop]

anssik: to recap, this is a proposal from Phillis to consider removing the selectLastIndex parameter due to CoreML compatibility
… discussed on our previous call

https://www.w3.org/2024/05/02-webmachinelearning-minutes.html#t06

anssik: Mike provided helpful details on the previous call re BNNS and MPS, mentioning they don't run on an NPU, and he wanted to talk to some engineers and get back
… I wonder if there's new information to share?

Mike: the reason for the difference is the sorting algorithm; the result when values are equal depends on the underlying sorting algorithm

Dwayne: you can get different results depending on whether you go through BNNS or MPS

Mike: right

Dwayne: this matters for some models that are explicit in this regard; I cannot advocate without knowing any specific models, and I don't anticipate a lot of harm in removing it
… we have the option to add it back later
… we can deterministically compute answers on all the platforms

anssik: any concerns in removing selectLastIndex?
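To make the tie-breaking question above concrete, a plain reference sketch (illustration only, not WebNN API code) of what selectLastIndex changes:

```typescript
// With tied maxima, selectLastIndex flips which index is returned. If the
// option is removed, implementations would consistently return the first
// matching index.
function argMax(values: number[], selectLastIndex = false): number {
  let best = 0;
  for (let i = 1; i < values.length; i++) {
    const better = selectLastIndex ? values[i] >= values[best] : values[i] > values[best];
    if (better) best = i;
  }
  return best;
}

// argMax([3, 5, 5])       → 1 (first of the tied maxima)
// argMax([3, 5, 5], true) → 2 (last of the tied maxima)
```

Models that never produce exact ties see no difference either way, which is why the harm of removal is expected to be low.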

[bug] Consider changing output type of ArgMax/Argmin to int32, or allow passing output_type

anssik: issue #653

<gb> Issue 653 Consider changing output type of ArgMax/Argmin to int32, or allow passing output_type (by philloooo) [bug] [operator specific] [interop]

anssik: also discussed on our previous call; wanted to check in to see if we have further research or explorations to report

https://www.w3.org/2024/05/02-webmachinelearning-minutes.html#t07

anssik: I think both Dwayne and Phillis indicated interest to look at this, not sure if you've had time for that yet?

Dwayne: been too busy last two weeks, not looked at this yet, will follow up
anssik: it looks like Phillis is awaiting Dwayne's example models where int64 is useful
… there's also a separate question from Phillis to Dwayne in the issue, quoting: "I didn't fully understand your gather example, because I actually don't understand why indices allow both int64 and uint32. The indices should point to valid indices that's within MLOperand's dimensions right?"

<jsbell> Re: "Type casting all the things" + "MLNumber" - for #442, #678, #325 - please look at PR #647 which tries to tackle all of these - early feedback welcome. No live discussion needed...

<jsbell> #489 is about the cast() op so it's a different beast

<gb> Pull Request 647 Introduce MLNumber for specifying numeric inputs of any type (by inexorabletash)

<gb> Issue 325 Clarify the usage of 32 bit floating point type and consider using double (by huningxin) [feature request]

<gb> Issue 678 Specifies scalar values casted to match input type. (by philloooo) [feature request]

<gb> Issue 442 Type of some parameters should match the input data type (by Honry) [feature request] [operator specific]

<gb> Issue 489 Clarify the casting behavior from floating-point / signed integers <-> unsigned integers (by huningxin) [operator specific] [interop]

[feature request] Allow checking whether operators/types are supported for a backend before creating a graph

anssik: issue #463

<gb> Issue 463 Allow checking whether operators/types are supported for a backend before creating a graph (by huningxin) [feature request]

anssik: this is the original issue regarding checking for op/data type support

anssik: this feature would help with issues such as the uint64/int64 data type issue #654 and MLConstantOperand #668, and interop issues e.g. #653, #675 and #283

<gb> Issue 654 Consider dropping the support of uint64/int64 data type for some operators (by lisa0314) [bug] [operator specific] [interop]

<gb> Issue 653 Consider changing output type of ArgMax/Argmin to int32, or allow passing output_type (by philloooo) [bug] [operator specific] [interop]

<gb> CLOSED Issue 283 Specify the operand data type constraints of operation (by huningxin) [question]

<gb> Issue 668 Do we need an `MLConstantOperand`? (by a-sully) [question] [interop]

<gb> CLOSED Issue 675 why `gather` indices only accept uint32 or int64? (by philloooo) [question] [interop]

anssik: Phillis shared a context.opSupportLimits() proposal for exposing the data type support level:

webmachinelearning/webnn#463 (comment)

<gb> Issue 463 Allow checking whether operators/types are supported for a backend before creating a graph (by huningxin) [feature request]
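As a sketch of how a framework might consume such a proposal: the limits shape below is hypothetical (the actual structure is under discussion in issue #463), but it shows the intended per-op, per-operand data type check:

```typescript
// Hypothetical limits shape, illustrative only; see issue #463 for the
// actual proposal. Each op lists the supported data types per operand.
type OpSupportLimits = Record<string, Record<string, { dataTypes: string[] }>>;

function isSupported(
  limits: OpSupportLimits,
  op: string,
  operand: string,
  dataType: string
): boolean {
  return limits[op]?.[operand]?.dataTypes.includes(dataType) ?? false;
}

// A framework (e.g. an ONNX RT EP) could gate op registration on this and
// fall back to another backend otherwise:
const limits: OpSupportLimits = {
  gather: {
    input: { dataTypes: ["float32", "float16", "int32"] },
    indices: { dataTypes: ["int32", "uint32", "int64"] },
  },
};
```

This mirrors the allowlist approach ningxin describes below, but driven by data the backend reports rather than hard-coded tables.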

jsbell: we think this would be useful even just for maintaining op coverage reports
… we'd love feedback from people working on frameworks that sit on top, e.g. the ONNX RT EP, that could benefit from this

Dwayne: would be useful for registration of ops, using a fallback as appropriate

<ningxin> +1, this would be useful for frameworks

<Joshua_Lochner> +1 agreed

jsbell: as soon as we get confirmation the shape is right and prototyping is appropriate, Phillis is interested in advancing with a POC

Dwayne: the proposal reminds me of what we did in our team: list data types and rank ranges

RafaelCintron: haven't looked at the proposal in detail, but in general it comes down to: we need to know whether some op works or not, and which data types work
… even inputs support different data types; I wonder if it is possible to group support into sets of things, so that web developers can pick which model they run without needing to consider every op separately
… in 3D space we have a similar grouping concept

jsbell: totally agree with Rafael; we'll land on an interoperable set of backends eventually

jsbell: we'll take a look at this with Phillis and come back to this issue

ningxin: I can ensure Wanming, who is working on the ONNX RT EP, will investigate from that angle; the EP did check for data types at the op level, we had an allowlist in the code
… it's per op; Phillis's data structure is useful for frameworks to do this op-level examination, and I will let Wanming investigate and comment

Minutes manually created (not a transcript), formatted by scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).

Maybe present: anssik, Dwayne, jsbell, Mike, ningxin, RafaelCintron

All speakers: anssik, Dwayne, jsbell, Mike, ningxin, RafaelCintron

Active on IRC: anssik, asully, Joshua_Lochner, jsbell, ningxin, RafaelCintron