Meeting: WebML WG Teleconference – 7 April 2022
Chair: Anssi
Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2022-04-07-wg-agenda.md
Scribe: Anssi
scribeNick: anssik
scribe+ dom
Present+ Anssi_Kostiainen
Present+ Dominique_Hazael-Massieux
Present+ James_Fletcher
Topic: Ethical Web Machine Learning
present+
james: we've published a 2nd draft of a WG note on ethical considerations for ML https://webmachinelearning.github.io/ethical-webmachinelearning/
... we ran 2 brainstorm sessions for feedback & input on risks & mitigations this week
... both to generate content and set up a process for continuous concrete input on these topics
... we had 8 people on Monday, 10-12 on Tuesday
... they went really well and generated lots of useful input and positive feedback
... the next step will be to integrate the feedback into the document
... and bring content into the operationalization section that would include a white label version of that workshop as a tool to continue thinking on these topics
... we plan to bring the doc for approval by WG for publication as a note - I'll be pausing my involvement for a few months, but hope the work can continue forward
q?
Topic: Proposed new use cases
Subtopic: WebNN API in gaming scenarios
Anssi: introducing Tracy, senior engineer from Unity, interested in bringing WebNN into their Web backend
Tracy: I was at Microsoft, most recently working on DirectML & +
... moved to Unity in January to work on Barracuda, a lightweight runtime for inferences
... running on phones, handheld consoles, games consoles, PCs
... it also supports a WebGL-based Web player
... It is used for a variety of use cases internally: reinforcement learning on objects to bring more intelligent behaviors
Present+ Rafael_Cintron
... lots of R&D with neural rendering
... public API for unity customers to use for the ML use cases
Present+ Tracy_Sharpe
Present+ Ganesan_Ramalingam
... Some of the use cases we also have are where Unity is both the editor and the runtime player
... the content tools are done with web pages - this allows distributing them more quickly to end users
... some of the neural rendering techniques are using WebGL, which can't keep up with what would be expected from a desktop GPU
... WebGPU would already help with improvements, but Chai mentioned WebNN could provide another target to achieve best performance while operating in Web pages
... we're exploring the topic, may be doing prototyping and benchmarking performance
q?
Ningxin: very exciting use case!
... do you have any special requirements for ML functionality to work together with the rendering component?
... for example, real-time super-resolution in games
Present+ Jonathan_Bingham
Tracy: operating with a game engine is a constrained environment, even in non Web cases
... the frame budget is restricted
... many of the models in the game cases (e.g. smart agent) end up running on CPU to avoid interfering with the GPU time budget
... this needs careful management
... the editor experience (where we're using the Web today) doesn't have the same constraints
q+
ningxin: one of the aims of WebNN is to serve as a backend for frameworks such as tf.js or ONNX
... it would be interesting to collaborate on experiments
dom: any requirements for storage or memory management in the editor context on desktop? is that a constraint you're hitting
dom: any particular constraint on memory & storage in your use case
q+
q?
ack dom
q+
ack ningxin_hu
tracy: not that I know of atm
ningxin: we are discussing having an async API vs a synchronous API only within a web worker
... based on our prototypes with other frameworks transpiled from C++ into WASM, we see the requirement to call the API synchronously
... but the sync API can have issues, e.g. causing jank if it runs on the main thread
... in your framework, is there any requirement for sync / async calls? main thread vs worker?
Tracy: everything is asynchronous in our API
... nothing should block the rendering of the game thread
ningxin: with the work done in the worker - is the code running the inference sync or async?
tracy: this would be flexible, depending on the backend
ningxin: ok, so no restriction on sync vs async
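The two call shapes discussed above can be sketched as follows. This is a minimal illustration only: `computeSync`, `computeAsync` and the ReLU "model" are hypothetical stand-ins invented for this sketch, not the WebNN API or Unity's API.

```typescript
// Hypothetical stand-ins for graph execution; NOT the WebNN API surface.
type Tensor = Float32Array;

// Synchronous shape: the caller blocks until the result is ready.
// Convenient for code transpiled from C++ to WASM, but it can cause jank
// when called on a main thread.
function computeSync(input: Tensor): Tensor {
  const output = new Float32Array(input.length);
  for (let i = 0; i < input.length; i++) {
    output[i] = Math.max(0, input[i]); // placeholder "model": element-wise ReLU
  }
  return output;
}

// Asynchronous shape: the caller gets a promise and continues immediately.
// Deferring to a microtask only illustrates the call shape; a real
// implementation would hand the work to a worker or a device queue.
function computeAsync(input: Tensor): Promise<Tensor> {
  return Promise.resolve().then(() => computeSync(input));
}

const input = new Float32Array([-1, 0.5, 2]);
const syncResult = computeSync(input);   // result available right away
const asyncResult = computeAsync(input); // result delivered later
asyncResult.then((out) => console.log("async:", Array.from(out)));
console.log("sync:", Array.from(syncResult));
```

A framework that is "flexible, depending on the backend" could expose only the promise-returning shape to callers and pick either path internally.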
anssi: the WebNN API started with a bunch of use cases; are we getting new requirements from this scenario? or does the API satisfy its requirements?
tracy: I'll review the spec to give a more informed answer on potential gaps
q?
ack RafaelCintron
RafaelCintron: in your usage of ML in Unity - is the expectation that models will be pre-set, or will the graph be updated during runtime?
Tracy: more of the former; some of the weights might change, but mostly baked
q?
Topic: Context-based graph execution methods for different threading models
Rafael: I won't be on the call in two weeks - the crux of the question is how much dependency we want to have on WebGPU
ningxin_hu: we need a clear boundary between WebNN & WebGPU
-> Context-based graph execution methods for different threading models PR #257 https://github.com/webmachinelearning/webnn/pull/257
... we should target WebNN at typical usage; if WebNN is expected to be used mostly in standalone mode, WebGPU integration should be hidden
... MLContext.compute & MLContext.asyncCompute would be the primary usage
... for WebGPU integration, I would support the typical WebGPU usage
... with a common buffer and queue that a WebGPU developer would be familiar with, to make WebNN more friendly to them
... MLCommandBuffer or MLCommandEncoder would be closer to that approach
... we haven't had much investigation on the integration with the WebGL pipeline - more work may be needed there
... we should target typical usages of the API
q?
... this will help orient the developers toward the right path
RafaelCintron: this is a useful classification; remains to be seen how much this needs to be reflected into different API shapes
... there is a discussion in the WebGPU WG on similar topics
... on async to avoid jank
... similarly, people porting from native code want to have a sync version
... e.g. a Snap funny hat implementation
... the group is leaning towards sync only in worker; but struggling with very similar discussions indeed
q?
Topic: Candidate Recommendation readiness
anssi: in terms of use cases - do we feel that we have captured the use cases for this API? any major gap?
... e.g. anything emerging from the background blur discussion that should be reflected in the use cases?
-> WebGPU sync vs async GitHub issue: mapSync on Workers - and possibly on the main thread https://github.com/gpuweb/gpuweb/issues/2217
-> Integration with real-time video processing https://www.w3.org/TR/webnn/#usecase-real-time-video-processing
... please review use cases for potential gaps
q+
ack ningxin_hu
dom: the best demonstration will be having frameworks that run with a WebNN backend, with associated demonstrated performance benefits
ningxin_hu: the video processing prototype has a dependency on WebGPU integration - right now it relies on the WebGPU API to import a video frame and pre-process it with a WebGPU shader into a WebGPU buffer, on which the inference runs
... but there is also an alternative where VideoFrame becomes a directly usable primitive for WebNN, as Chai suggested in the discussion
... There may be value in supporting other models beyond the one we're using for segmentation
-> The first-wave models https://github.com/webmachinelearning/webnn/blob/main/op_compatibility/first_wave_models.md
... we may want to revisit the model list to integrate such support
-> WebNN should support int8 quantized models #128 https://github.com/webmachinelearning/webnn/issues/128
... For Data Types, there are open issues about quantized and int8 models - we need to investigate this in more depth
... Typical AI accelerators in GPU hardware optimize for quantized data types
... Performance benefits are key to the work
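For readers unfamiliar with quantized data types, here is a minimal sketch of what int8 quantization does to tensor data. It assumes a simple per-tensor affine scheme, which the minutes and issue #128 do not specify. `QuantizedTensor`, `quantizeInt8` and `dequantizeInt8` are hypothetical names invented for this sketch, not part of WebNN.

```typescript
// Hypothetical helper types and functions; not part of the WebNN API.
interface QuantizedTensor {
  data: Int8Array;
  scale: number;     // real-valued step between adjacent int8 codes
  zeroPoint: number; // int8 code that represents the real value 0
}

function quantizeInt8(values: Float32Array): QuantizedTensor {
  // Start the range at 0 so that zero is always exactly representable.
  let min = 0;
  let max = 0;
  for (let i = 0; i < values.length; i++) {
    min = Math.min(min, values[i]);
    max = Math.max(max, values[i]);
  }
  const scale = max > min ? (max - min) / 255 : 1; // 256 codes span the range
  const zeroPoint = Math.round(-128 - min / scale);
  const data = new Int8Array(values.length);
  for (let i = 0; i < values.length; i++) {
    const q = Math.round(values[i] / scale) + zeroPoint;
    data[i] = Math.min(127, Math.max(-128, q)); // clamp to the int8 range
  }
  return { data, scale, zeroPoint };
}

function dequantizeInt8(t: QuantizedTensor): Float32Array {
  const out = new Float32Array(t.data.length);
  for (let i = 0; i < t.data.length; i++) {
    out[i] = (t.data[i] - t.zeroPoint) * t.scale;
  }
  return out;
}

const weights = new Float32Array([-64, 0, 1, 63.5]);
const quantized = quantizeInt8(weights);
const restored = dequantizeInt8(quantized);
console.log(Array.from(quantized.data), quantized.scale, quantized.zeroPoint);
console.log(Array.from(restored));
```

Each int8 value takes one byte instead of four for float32, so the stored tensor is a quarter of the size. The cost is a rounding error bounded by half of `scale` for values inside the observed range.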
q+
ack dom
+1
dom: maybe useful to derive semi-formal requirements from the use cases, incl performance
anssi: I've assumed documenting them as issues was enough, but would not oppose documenting them formally if people find this useful
... maybe we can label issues as requirements
q?
-> Ethical Considerations https://www.w3.org/TR/webnn/#ethics
anssi: we'll need ethical considerations
... and document implementation experience
... supported by Web Platform Tests, the webnn-baseline and a browser implementation (targeting Chromium atm)
dom: note that implementation experience is not a requirement to get into CR (although double implementation experience is definitely needed to get out of CR)
... in terms of wide review, we had TAG review, first security review (now tagged for horizontal review)
... we'll do another privacy review, and will have to launch accessibility and internationalization reviews
... and then WebGPU integration review
q?
... we're on a good track
Present+ Wonsuk_Lee
i|Anssi: introducing Tracy|Slideset: https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0000/WebML_WG_Intro.pdf
i|Tracy: I was at Microsoft,|[slide 1]
i|... moved to Unity in January|[slide 2]
i| It is used for|[slide 3]
i|... the content tools|[slide 4]