Meeting minutes
Ethical Web Machine Learning
james: we've published a 2nd version of the WG Note on ethical considerations for ML https://
… we ran 2 brainstorm sessions for feedback & input on risks & mitigations this week
… both to generate content and set up a process for continuous concrete input on these topics
… we had 8 people on Monday, 10-12 on Tuesday
… they went really well and generated lots of useful input and positive feedback
… the next step will be to integrate the feedback into the document
… and add content to the operationalization section, including a white-label version of that workshop as a tool to continue thinking on these topics
… we plan to bring the doc to the WG for approval for publication as a Note - I'll be pausing my involvement for a few months, but hope the work can continue moving forward
Proposed new use cases
WebNN API in gaming scenarios
Slideset: https://
Anssi: introducing Tracy, senior engineer from Unity, interested in bringing WebNN into their Web backend
Tracy: I was at Microsoft, most recently working on DirectML
Tracy: moved to Unity in January to work on Barracuda, a lightweight runtime for inference
… running on phones, handheld consoles, games consoles, PCs
… it also supports a WebGL-based Web player
Tracy: It is used for a variety of use cases internally: reinforcement learning on objects to bring more intelligent behaviors
… lots of R&D with neural rendering
… public API for Unity customers to use for ML use cases
… in some of our use cases, Unity is both the editor and the runtime player
Tracy: the content tools are done with web pages - this allows distributing them more quickly to end users
… some of the neural rendering techniques are using WebGL, which can't keep up with what would be expected from a desktop GPU
… WebGPU would already help with improvements, but Chai mentioned WebNN could provide another target to achieve best performance while operating in Web pages
… we're exploring the topic, may be doing prototyping and benchmarking performance
Ningxin: very exciting use case!
… do you have any special requirements for the ML functionality and the rendering component to work together?
… for example, real-time super-resolution in games
Tracy: operating within a game engine is a constrained environment, even in non-Web cases
… the frame budget is restricted
… many of the models in the game cases (e.g. smart agent) end up running on CPU to avoid interfering with the GPU time budget
… this needs careful management
… in terms of the editor experience (where we're using the Web today), we don't have the same constraints
ningxin: one of the targets for WebNN is to serve as a backend for frameworks such as tf.js or ONNX
… it would be interesting to collaborate on experiments
dom: any particular constraints on memory & storage in the editor context on desktop? is that a constraint you're hitting?
tracy: not that I know of atm
ningxin: we are discussing having an async API vs a synchronous API available only within a Web Worker
… based on our prototypes with frameworks transpiled from C++ into WASM, we see a requirement to call the API synchronously
… but a sync API can have issues, e.g. causing jank if run on the main thread
… in your framework, is there any requirement for sync / async calls? main thread vs worker?
Tracy: everything is asynchronous in our API
… nothing should block the rendering of the game thread
ningxin: with the work done in the worker - is the code running the inference sync or async?
tracy: this would be flexible, depending on the backend
ningxin: ok, so no restriction on sync vs async
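The sync-vs-async trade-off discussed above can be sketched in plain JavaScript; `computeSync` and `computeAsync` are hypothetical stand-ins for a framework's inference call, not WebNN API names:

```javascript
// Stand-in for a synchronous inference call, as a C++-to-WASM framework
// would expect to invoke it (hypothetical, for illustration only).
function computeSync(inputs) {
  return inputs.map(x => x * 2); // pretend this is the model
}

// Async wrapper: yields to the event loop before running the sync kernel,
// so a long inference does not block rendering work queued ahead of it.
// Running computeSync inside a Web Worker avoids main-thread jank entirely.
async function computeAsync(inputs) {
  await Promise.resolve(); // yield once before doing the heavy work
  return computeSync(inputs);
}

computeAsync([1, 2, 3]).then(out => console.log(out)); // [2, 4, 6]
```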
anssi: the WebNN API started with a bunch of use cases; are we getting new requirements from this scenario? or does the API satisfy its requirements?
tracy: I'll review the spec to give a more informed answer on potential gaps
RafaelCintron: in your usage of ML in Unity - is the expectation that models will be pre-set, or will the graph be updated at runtime?
Tracy: more of the former; some of the weights might change, but mostly baked
Context-based graph execution methods for different threading models
Rafael: I won't be on the call in two weeks - the crux of the question is how much dependency we want to have on WebGPU
ningxin_hu: we need a clear boundary between WebNN & WebGPU
Context-based graph execution methods for different threading models PR #257
ningxin_hu: we should design WebNN around typical use; if WebNN is expected to be used mostly in standalone mode, the WebGPU integration should be hidden
… MLContext.compute & MLContext.asyncCompute would be the primary usage
… for WebGPU integration, I would support the typical WebGPU usage
… with a common buffer and queue that a WebGPU developer would be familiar with to make WebNN more friendly to them
… MLCommandBuffer or MLCommandEncoder would be closer to that approach
… we haven't investigated integration with the WebGL pipeline much - more work may be needed there
… we should target typical usages of the API
… this will help orient the developers toward the right path
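The two usage modes discussed above can be mocked in a few lines of JavaScript. This is an illustrative sketch only - `compute`, `asyncCompute`, and the command-encoder shape echo names from the PR #257 discussion, but none of the bodies reflect the real WebNN API:

```javascript
// Mock of the two usage modes: standalone compute vs WebGPU-style
// command encoding. Purely illustrative; a "graph" here is just a function.
class MockMLContext {
  // standalone mode, sync: intended for workers only
  compute(graph, inputs) { return graph(inputs); }
  // standalone mode, async: safe to call from the main thread
  async asyncCompute(graph, inputs) { return graph(inputs); }
  // interop mode: record work, then submit it in one batch, the way a
  // WebGPU developer would use a command encoder and queue
  createCommandEncoder() {
    const ops = [];
    return {
      dispatch(graph, inputs) { ops.push(() => graph(inputs)); },
      finish() { return ops.map(run => run()); }, // stands in for queue submit
    };
  }
}

const ctx = new MockMLContext();
const double = xs => xs.map(x => x * 2);
const enc = ctx.createCommandEncoder();
enc.dispatch(double, [1, 2]);
console.log(enc.finish()); // [[2, 4]]
```

The design question in the minutes is which of these shapes the spec should foreground: the standalone `compute`/`asyncCompute` pair for typical use, or the encoder/queue shape for WebGPU developers.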
RafaelCintron: this is a useful classification; it remains to be seen how much this needs to be reflected in different API shapes
… there is a discussion in the WebGPU WG on similar topics
… on async to avoid jank
… with, similarly, people porting from native wanting a sync version
… e.g. a Snap funny hat implementation
… the group is leaning towards sync only in worker; but struggling with very similar discussions indeed
Candidate Recommendation readiness
anssi: in terms of use cases - do we feel that we have captured the use cases for this API? any major gap?
… e.g. anything emerging from the background blur discussion that should be reflected in the use cases?
<RafaelCintron> WebGPU Sync vs Async Github Issue: mapSync on Workers - and possibly on the main thread (https://
Integration with real-time video processing
anssi: please review use cases for potential gaps
dom: the best demonstration will be having frameworks that run with a WebNN backend, with associated demonstrated performance benefits
ningxin_hu: the video processing prototype has a dependency on WebGPU integration - right now it relies on the WebGPU API to import the video frame and pre-process it with a WebGPU shader into a WebGPU buffer on which the inference runs
… but there is also an alternative where VideoFrame becomes a directly usable primitive for WebNN, as Chai suggested in the discussion
… There may be value in supporting other models beyond the one we're using for segmentation
ningxin_hu: we may want to revisit the model list to integrate such support
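The pre-processing step of the pipeline described above can be sketched in plain JavaScript; `normalizeFrame` is a hypothetical helper (the real prototype does this on the GPU with a WebGPU shader), shown only to illustrate what the pre-processing computes:

```javascript
// Hypothetical CPU version of the pre-processing step: convert 8-bit RGBA
// pixel data from a video frame into float32 values in [0, 1], the kind of
// input tensor a segmentation model typically expects (assumption).
function normalizeFrame(pixels) {
  const out = new Float32Array(pixels.length);
  for (let i = 0; i < pixels.length; i++) {
    out[i] = pixels[i] / 255; // 0..255 -> 0.0..1.0
  }
  return out;
}

const frame = Uint8ClampedArray.from([0, 127, 255, 255]); // one RGBA pixel
console.log(normalizeFrame(frame)); // Float32Array [0, ~0.498, 1, 1]
```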
WebNN should support int8 quantized models #128
ningxin_hu: For Data Types, there are open issues about quantized and int8 models - we need to investigate this in more depth
… Typical AI accelerators in GPU hardware optimize for quantized data types
… Performance benefits are key to the work
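As a concrete illustration of what int8 support involves, here is a minimal sketch of linear quantization in JavaScript; the scale/zero-point convention is a common one used by several ML frameworks, assumed here rather than taken from the WebNN spec or issue #128:

```javascript
// Linear (affine) int8 quantization sketch: map float values onto the
// int8 range [-128, 127] with a scale and zero point (common convention,
// not the spec's definition).
function quantizeInt8(values) {
  const min = Math.min(...values, 0); // include 0 so it stays representable
  const max = Math.max(...values, 0);
  const scale = (max - min) / 255 || 1; // avoid zero scale for all-zero input
  const zeroPoint = Math.round(-128 - min / scale);
  const q = Int8Array.from(values, v =>
    Math.max(-128, Math.min(127, Math.round(v / scale + zeroPoint))));
  return { q, scale, zeroPoint };
}

// Inverse mapping, used to check reconstruction error.
function dequantizeInt8({ q, scale, zeroPoint }) {
  return Array.from(q, v => (v - zeroPoint) * scale);
}
```

Accelerators gain from int8 because each value is 4x smaller than float32 and integer math is cheaper; the cost is the small reconstruction error that `dequantizeInt8` exposes.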
<ningxin_hu> +1
dom: maybe useful to derive semi-formal requirements from the use cases, incl performance
anssi: I've assumed documenting them as issues was enough, but would not oppose documenting them formally if people find it useful
… maybe we can label issues as requirements
anssi: we'll need to cover ethical considerations
… and document implementation experience
… supported by Web Platform Tests, the webnn-baseline and a browser implementation (targeting Chromium atm)
dom: note that implementation experience is not a requirement to get into CR
… (although double implementation experience is definitely needed to get out of CR)
… in terms of wide review, we had TAG review, first security review (now tagged for horizontal review)
… we'll do another privacy review, and will have to launch accessibility and internationalization reviews
… and then WebGPU integration review
… we're on good track