Meeting minutes
WebNN implementation status update
Ningxin: we landed a few patches for the WebNN IDL in Chromium - the operators for MobileNet
… they've been merged
… I'm now working on a patch to implement the WebNN CPU backend based on XNNPACK
… it's been reviewed by a few folks on the Chrome team and Rafael
<ningxin_hu> https://
Ningxin: it implements the graph building & compute functionalities on top of XNNPACK
… and executes them with optimized kernels on x86 and ARM
… XNNPACK supports other architectures, but this patch focuses on those two
… targets both Windows and Linux
… it implements the operations that landed in the previous WebIDL patch
… it allows running our image segmentation example
… with ~3x speedup compared to the WASM version
… with context-based graph execution, this enables background-thread processing for the async API
… with the sync API restricted to workers
… last time, we agreed to use Sync postfix for the sync (vs Async for async) - the patch uses that convention (even if it hasn't been merged in the spec yet)
… the security reviewer we got on the spec is also making good suggestions on the patch to avoid overflow / underflow, enforcing safe conversions
… based on an existing chromium utility
… also suggested a fuzzer to test the graph building & computing
… which sounds good and will be added to my todo
… Rafael mentioned that the CPU backend could run in the renderer process to avoid IPC
… which this patch is doing and provides good speedup (3x)
… works very well on Windows, but there are some issues on Linux
… XNNPACK requires CPU info for optimization
… which is restricted in the Linux sandbox
… there are workarounds I have proposed, but this will likely need a more sophisticated solution
… or modify XNNPACK to work better with the Linux sandbox
dom: thanks for the update, congrats!
<RafaelCintron> Great progress, Ningxin!
dom: I'm impressed by 3x speedup
… is it Wasm with SIMD?
ningxin_hu: yes
dom: did you get a sense where we get this acceleration from?
ningxin_hu: major thing is SIMD width
… Wasm SIMD is 128 bits wide, but XNNPACK can select the best width available on the platform - 256 bits on my test platform
… that gives good performance boost
… there may also be other instruction-level optimizations that helped
… this is very close to native performance, ~80-90%
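A back-of-envelope sketch of the SIMD-width point above (illustration only, not code from the patch; the lane arithmetic is the only claim it makes):

```javascript
// Wasm SIMD is fixed at 128-bit vectors; XNNPACK can pick the widest
// unit the CPU offers, e.g. 256-bit AVX2 registers on the test platform.
const lanes = (vectorBits, elementBits) => vectorBits / elementBits;

const wasmLanes = lanes(128, 32); // 4 float32 values per instruction
const avx2Lanes = lanes(256, 32); // 8 float32 values per instruction
const widthSpeedup = avx2Lanes / wasmLanes;

console.log(wasmLanes, avx2Lanes, widthSpeedup); // 4 8 2
```

Vector width alone accounts for roughly 2x; the observed ~3x is consistent with the additional instruction-level optimizations mentioned above.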
RafaelCintron: is the patch a complete implementation of WebNN or just a start?
Ningxin: it's a start but provides complete support for mobilenet v2
… (it needs 8 operators, all of which are supported in this patch)
Rafael: no dependency on WebNN native - will that be necessary in the future?
Ningxin: that will depend on feedback from the chrome reviewers
… WebNN native is a tool we use to evaluate multiple backend implementations
… and OS API capabilities
… it supports a wide range of APIs, but not all APIs are suitable for Chrome integration given its deployment model or other restrictions
… we select the backends that are suitable for chrome integration - here XNNPACK
… which reduces the amount of WebNN native code that is needed
… so most likely WebNN native won't be used as a whole
… XNNPACK is for CPU only
… for GPU, we would look at upstreaming the DirectML backend
… GPU will require Mojo-based IPC, so we're looking into this
… which is part of our next steps
RafaelCintron: do you anticipate that for other things, like the GPU backend, you would take the same approach of copy & pasting from WebNN Native?
ningxin: right, although the code may diverge based on Chrome reviewers
<Zakim> dom, you wanted to suggest a recorded demo/update for TPAC
dom: wanted to suggest a recorded demo/update for TPAC
… ~parity with native would be a nice story
… re TPAC, wrapping what has been done in real-time processing integration into a demo might be another concrete demonstration of the impact of the work
ningxin: good idea
… this would show progress from last year's demo based on Electron + Node
… for the GPU, we have a good demo
… based on integration as a third party, different from what we're doing through this patch
… incl a different IPC mechanism
dom: to be clear, if we are doing a demo based on unreleased code, that's perfectly OK
… if it'd be repackaging the demos produced earlier that's OK too, the goal is to show what we're doing, prototype code is fine and valuable
Privacy review request for PING
#276
[PROPOSED] Privacy review request
#271
<ghurlbot> Pull Request 271 Privacy Considerations refresh (anssiko)
anssik: any concern with merging #271?
… will merge; we'll continue refining privacy considerations even after the review request has been submitted
PROPOSED RESOLUTION: merge #271 and submit updated privacy review request
RESOLUTION: merge #271 and submit updated privacy review request
Ethical Principles for Web Machine Learning Draft Note
anssi: the first draft note of our ethical principles for Web Machine Learning has been published
https://
anssi: this has been a cross-group & cross-disciplinary effort, based on our charter commitment
… happy to have reached that initial milestone
WebNN API Candidate Recommendation issue scope
Current CR issues
anssik: new "cr" issue #272, discussed later today
<ghurlbot> Issue 272 Support asynchronous context creation (huningxin) cr
anssik: remove "cr" from issue #226 per our resolution:
<ghurlbot> Issue 226 Integration with real-time video processing (dontcallmedom)
anssik: should we label #264 as "cr"?
<ghurlbot> Issue 264 CommandBuffer usage clarification: internal, external, both? (bbernhar)
ningxin_hu: this issue is related to WebGPU interop, and as discussed before, WebGPU interop is post-V1 for the WebGPU WG
… adding it to our "cr" scope would create risk for our 2022 CR schedule
dom: making sure we understand the impact, we're going to do CR without GPU?
ningxin_hu: GPUCommandEncoder would be out of CR scope, but standalone GPU usage via MLDeviceType would be in scope
dom: real-time processing use case would be out of scope for CR?
ningxin_hu: right, because it'd need WebGPU interop
RafaelCintron: question, if we remove WebGPU interop from the CR scope, the only thing a web developer can do with the GPU is send and get back ArrayBuffers, with WebNN mediating the exchange with the GPU?
ningxin_hu: correct
RafaelCintron: for CR there would be no IDL types for GPUBuffer or GPUTexture
ningxin_hu: right, those are supported by CommandEncoder interface, so we try to make a clear separation between ArrayBuffer input/output and WebGPU primitives
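A minimal sketch of what that ArrayBuffer-only path could look like to a web developer, assuming the compute(graph, inputs, outputs) shape under discussion at the time; the operand names, the runGraph helper, and the stand-in context below are all hypothetical:

```javascript
// Hypothetical helper: run a built MLGraph purely through ArrayBufferViews.
// No GPUBuffer or GPUTexture types cross the API boundary in this mode.
async function runGraph(context, graph, inputData) {
  const outputData = new Float32Array(inputData.length);
  // Named buffer views in, named buffer views out.
  await context.compute(graph, { input: inputData }, { output: outputData });
  return outputData;
}

// Stand-in context so the sketch runs outside a browser; a real context
// would come from navigator.ml and could be backed by the GPU internally.
const fakeContext = {
  compute: async (graph, inputs, outputs) => outputs.output.set(inputs.input),
};

runGraph(fakeContext, /* graph */ {}, Float32Array.from([1, 2, 3]))
  .then((out) => console.log(Array.from(out))); // [ 1, 2, 3 ] with the fake
```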
RafaelCintron: I'm trying to understand if CommandEncoder is in scope
anssi: the CR scope kind of says "feature complete"
… to move from CR to the next stage, it's useful if the scope hasn't changed
dom: whether this means moving things out of the spec for CR, or splitting the spec into CPU and GPU parts that would advance at different paces
… when we go from CR to the next stage (PR), we are expected to have separate, independent implementations
… having the real-time processing use case in scope for WebNN would help drive interest among a broader group of implementers
Rafael: I personally feel that the WebGPU interop needs a bit more discussion & implementation experience
dom: revising a CR is easier today than it was before
… we'd need to do a wide review update, but I don't expect the change delta to have many implications for these reviews
anssik: indeed, with the new process, updating CR is much cheaper
… might require new reviews in some cases
rafael: would we be getting the same reviews as the ones we're conducting now?
anssi: reviews of the delta, and only if the delta impacts the earlier review from our perspective
… which for WebGPU interop might be very limited
rafael: let's pretend we get updated TAG & PING reviews, we go to CR and make impacting changes, the same people would be involved in the updated reviews?
anssi: right
… with different expectations (probably not as big changes expected from that point on)
Support asynchronous context creation
anssik: A new "cr" issue for asynchronous context creation: #272
<ghurlbot> Issue 272 Support asynchronous context creation (huningxin) cr
anssik: and an accompanying PR #274 by Ningxin in review, awaits Chai
<ghurlbot> Pull Request 274 Support async context creation and use sync postfix (huningxin)
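The shape of the change can be sketched as follows; createContext() returning a promise follows the PR's direction, while the option names and the stand-in ml object are assumptions for illustration (a createContextSync() variant would be the worker-only counterpart per the Sync-postfix convention):

```javascript
// Hypothetical sketch: async context creation off the main thread.
async function makeCpuContext(ml) {
  // In a browser, ml would be navigator.ml; awaiting keeps potentially
  // expensive backend initialization from blocking the main thread.
  return ml.createContext({ deviceType: 'cpu' });
}

// Stand-in "ml" object so the sketch runs outside a browser:
const fakeMl = {
  createContext: async (options) => ({ ...options, ready: true }),
};

makeCpuContext(fakeMl).then((ctx) => console.log(ctx.deviceType)); // cpu
```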
Forward-looking use cases
anssik: Use cases #207 and #253 are proposed to be redirected to the proposals repo
<ghurlbot> Pull Request 253 Add "Ethical Content Filtering" use case to WebNN specs (humeranoor)
<ghurlbot> Pull Request 207 Update "Performance Adaptation" use case (spshin3)
anssik: Humera confirmed this approach sounds good, Sungpil Shin indicated he's been busy and will look into this once back in the office
… the implication for the WebNN API spec would be these use cases are considered out of scope for CR and as such do not impose new requirements for the API
ningxin_hu: just checked the issue; we missed web-platform-tests (WPT)
#240
<ghurlbot> Issue 240 Candidate Recommendation readiness tracker (anssiko)
#265
Bruce: I have a proposal to define ULP tolerances
… I have these in a WPT PR for 8 operations
anssik: Chai would have an expert opinion on this one
… WPT is otherwise an expectation for CR, but we don't have to have 100% coverage either
<ningxin_hu> propose to label wpt related issues with "cr"
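Bruce's ULP-tolerance proposal can be sketched with a small, self-contained helper (illustration only, not code from the WPT PR): it measures how many representable float32 values apart two results are, and a test passes if that distance is within the per-operation tolerance.

```javascript
// Distance between two float32 values in ULPs (units in the last place).
// Reinterpret each value's bit pattern as an integer on a scale that is
// monotonic across zero, then take the absolute difference.
function ulpDistance32(a, b) {
  const buf = new ArrayBuffer(4);
  const f32 = new Float32Array(buf);
  const i32 = new Int32Array(buf);
  const toOrdered = (x) => {
    f32[0] = x;
    const bits = i32[0];
    return bits < 0 ? -2147483648 - bits : bits; // fold negatives below zero
  };
  return Math.abs(toOrdered(a) - toOrdered(b));
}

console.log(ulpDistance32(1.0, 1.0)); // 0
console.log(ulpDistance32(1.0, Math.fround(1 + 2 ** -23))); // 1 (adjacent values)
```

A WPT check would then assert something like ulpDistance32(actual, expected) <= tolerance, with the tolerance chosen per operation.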
CommandBuffer usage clarification
anssik: Discuss how to address the WebGPU resource usage and sharing issues in preparation for the WebGPU WG review.
… discussed this on the 2022-06-02 call; we missed Chai's input last time due to vacation, so revisiting
… Chai is still on vacation, so I propose we skip this again
… next steps would be to create a PR and ask Bryan to review
Proposed new features
#275
<ghurlbot> Issue 275 Should MLBufferView + MLOperandDescriptor be strongly typed (as a MLTensor)? (wacky6)
#270
<ghurlbot> Issue 270 Support coordinate transformation modes for Resample2d (Honry)
#269
TPAC 2022
WebML WG Hybrid Meeting at TPAC 2022
anssik: TPAC 2022 takes place 12–16 September 2022
anssik: "This event brings together W3C technical groups, the Advisory Board, the Technical Architecture Group and the Advisory Committee for exciting, coordinated work. The benefit of assembling the community for thought-provoking discussions is invaluable."
… WebML WG meets 13 Sep 2022, 15:00-18:00 UTC
… Tentative ML & Ethics workshop sessions: Mon 12 & Thu 15 Sep 2022 afternoon(?) Vancouver time
… In-person hub:
… Sheraton Vancouver Wall Centre
… Vancouver British Columbia, Canada
… I'm hearing the remote experience will be dramatically improved from earlier years
… If you're planning to join in-person and willing to share your plans, it might help others decide their participation mode
Rafael: I'm not sure what our plans are; I will need to talk with Travis, who's our W3C AC rep
… questions about TPAC?
AOB
Next teleconference 11 August 2022
anssik: We'll pause the meetings for July due to vacation period in the Northern hemisphere and will resume 11 August 2022.
… Have a relaxing vacation to whom it concerns!
… and as usual, the GH repos remain open for your contributions.
… we've made substantial progress during the first half of 2022 and have very exciting milestones ahead of us in the second half of 2022!