IRC log of webmachinelearning on 2024-02-22

Timestamps are in UTC.

15:00:29 [RRSAgent]
RRSAgent has joined #webmachinelearning
15:00:33 [RRSAgent]
logging to https://www.w3.org/2024/02/22-webmachinelearning-irc
15:00:33 [Zakim]
RRSAgent, make logs Public
15:00:34 [Zakim]
please title this meeting ("meeting: ..."), anssik
15:00:34 [anssik]
Meeting: WebML WG Teleconference – 22 February 2024
15:00:39 [anssik]
Chair: Anssi
15:00:50 [Joshua_Lochner]
Joshua_Lochner has joined #webmachinelearning
15:00:55 [anssik]
Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2024-02-22-wg-agenda.md
15:00:59 [anssik]
Scribe: Anssi
15:01:02 [anssik]
scribeNick: anssik
15:01:11 [anssik]
gb, this is webmachinelearning/webnn
15:01:11 [gb]
anssik, OK.
15:01:16 [asully]
asully has joined #webmachinelearning
15:01:19 [anssik]
Present+ Anssi_Kostiainen
15:01:31 [anssik]
Present+ Austin_Sullivan
15:01:31 [anssik]
Present+ Dwayne_Robinson
15:01:35 [phillis]
phillis has joined #webmachinelearning
15:01:42 [anssik]
Present+ Bryan_Bernhart
15:01:48 [anssik]
Present+ Zoltan_Kis
15:01:51 [dom]
Present+ Dominique_Hazael-Massieux
15:01:53 [anssik]
Present+ Phillis_Tang
15:02:03 [anssik]
Present+ Joshua_Bell
15:02:11 [anssik]
Present+ Joshua_Lochner
15:02:14 [dwayner]
dwayner has joined #webmachinelearning
15:02:16 [anssik]
Present+ Ningxin_Hu
15:02:25 [anssik]
Present+ Rafael_Cintron
15:02:35 [anssik]
RRSAgent, draft minutes
15:02:36 [RRSAgent]
I have made the request to generate https://www.w3.org/2024/02/22-webmachinelearning-minutes.html anssik
15:02:52 [anssik]
Present+ Mike_Wyrzykowski
15:03:07 [RafaelCintron]
RafaelCintron has joined #webmachinelearning
15:03:38 [MikeW]
MikeW has joined #webmachinelearning
15:05:19 [anssik]
MikeW: I represent Apple; I have a WebGPU implementation background on Apple platforms and am leading the WebNN/ML effort on the Apple side as well
15:06:53 [MikeW]
mwyrzykowski
15:07:03 [anssik]
Topic: Announcements
15:07:11 [anssik]
-> W3C Breakouts Day 2024 https://github.com/w3c/breakouts-day-2024/
15:07:18 [anssik]
anssik: W3C Breakouts Day 2024 welcomes proposals by 29 February
15:07:28 [anssik]
... breakout concept is familiar from TPAC, but now there's a separate day
15:07:37 [anssik]
... in fact, an early WebNN API proposal was introduced in a breakout session at TPAC some years ago
15:07:47 [anssik]
... breakouts are a good opportunity to share any new explorations with the broader community to gather feedback
15:07:53 [anssik]
... if anyone has a breakout proposal in mind, simply open a GH issue in the breakouts-day-2024 repo https://github.com/w3c/breakouts-day-2024/issues
15:08:08 [zkis]
zkis has joined #webmachinelearning
15:08:11 [anssik]
q?
15:08:32 [anssik]
dom: in general breakouts are an opportunity to raise topics that go beyond one group's scope
15:08:44 [anssik]
... typically prepare for 1 hour session with a small presentation and discussion
15:08:55 [anssik]
q?
15:09:04 [anssik]
Topic: WebNN API Candidate Recommendation Snapshot transition
15:09:35 [anssik]
anssik: Proposal is to initiate the WebNN API CR Snapshot transition request on 7 March, then publish by the end of March
15:09:45 [anssik]
... let's review our CR readiness #240
15:09:45 [gb]
https://github.com/webmachinelearning/webnn/issues/240 -> Issue 240 Candidate Recommendation readiness tracker (by anssiko) [process]
15:10:00 [anssik]
... we're soon ready to turn all green, I still have a few areas I want to confirm with the group
15:10:03 [anssik]
Subtopic: Test coverage
15:10:33 [anssik]
anssik: my expectation is the current test coverage is considered adequate for the CR transition, 100% coverage is not expected at CR time
15:10:40 [anssik]
-> wpt/webnn https://github.com/web-platform-tests/wpt/tree/master/webnn
15:11:02 [anssik]
-> wpt.fyi https://wpt.fyi/results/webnn
15:11:58 [anssik]
dom: we are way beyond expectations in terms of test coverage
15:12:04 [anssik]
anssik: happy to hear that
15:13:14 [anssik]
anssik: I suggest we note in the CR transition request that wpt.fyi results reflect XNNPACK backend implementation status
15:13:30 [anssik]
dom: good idea to clarify that in transition request
15:14:56 [anssik]
Ningxin_Hu: for the DirectML implementation, we have a software adapter for WPT; there are some gaps in getting real GPUs on the bots to run WPT
15:15:11 [gdeepti]
gdeepti has joined #webmachinelearning
15:15:14 [anssik]
dom: is this something that is being worked on and is there a timeline?
15:15:24 [anssik]
Present+ Deepti_Gandluri
15:15:54 [anssik]
Ningxin_Hu: Austin informed us that we want to enable GPU tests in the Chromium infrastructure; not sure if anyone is working on wpt.fyi currently
15:16:46 [anssik]
dom: wpt.fyi in some circles is used to gauge momentum so it is useful to figure out what needs to be done to improve CI setup for WPT for DirectML
15:17:20 [anssik]
Ningxin_Hu: currently wpt.fyi runs Edge on Windows 10, while the DML backends would require Windows 11; please point us to a contact for wpt.fyi to work with
15:18:00 [anssik]
dom: the wpt owners group would be the responsible people; being clear on what is needed would be a good first step
15:19:02 [anssik]
anssik: can have a separate call about this
15:19:12 [anssik]
Subtopic: Delta wide review
15:19:18 [anssik]
anssik: Delta wide review tracked in #239
15:19:19 [gb]
https://github.com/webmachinelearning/webnn/issues/239 -> Issue 239 Wide review tracker (by anssiko) [process]
15:19:23 [anssik]
... no concerns raised
15:19:50 [anssik]
... not expecting any major concerns given this is a delta review, the earlier full review passed, and changes since have been to address review feedback, adjust opset scope, or improve overall spec quality
15:20:04 [anssik]
Subtopic: High-level status
15:20:08 [anssik]
-> High-level status (aka Status of this document) https://www.w3.org/TR/webnn/#sotd
15:20:15 [anssik]
anssik: I think we're good with this status text, merged to main, any proposals welcome
15:20:20 [anssik]
... to recap this is the section busy people read, it is not inclusive of everything
15:20:31 [anssik]
Subtopic: Implementation status
15:20:35 [anssik]
-> Implementation Status https://webmachinelearning.github.io/webnn-status/
15:20:45 [anssik]
anssik: Belem & co have maintained the implementation status page, it is fit for the purpose of the CR
15:21:18 [MikeW]
thank you
15:21:52 [anssik]
anssik: all good to initiate transition req on March 7?
15:22:49 [anssik]
dom: only dangling bit is the TAG review
15:22:58 [anssik]
anssik: can you help bring this to their attention?
15:23:36 [anssik]
dom: I can try
15:24:17 [anssik]
Topic: Triage Guidance and Milestones
15:25:08 [anssik]
anssik: Next, I'd like to introduce the newly minted triage guidance and review initial triage results. Thanks Josh for working with me on this. I hope the group sees this effort as net positive
15:25:27 [anssik]
... for this call, I'd like to hear if any of the issues identified as "bug", "testing", or "untriaged" (later "big issues") should be addressed by imminent CR Snapshot
15:25:40 [anssik]
... for CR Snapshot purposes, we obviously are not expected to reach zero issues
15:25:46 [anssik]
-> Triage Guidance https://github.com/webmachinelearning/webnn/blob/main/docs/IssueTriage.md
15:26:22 [anssik]
jsbell: a few weeks ago we published triage guidance
15:26:36 [anssik]
... since then have tried to follow the guidance
15:26:52 [anssik]
... a big new label was "operator specific" with 41 issues
15:27:09 [anssik]
... even if there are a lot of issues, the problems are scoped and do not affect the shape of the API overall
15:27:18 [jsbell]
https://github.com/webmachinelearning/webnn/issues?page=1&q=is%3Aissue+is%3Aopen+-label%3A%22operator+specific%22+-label%3A%22opset%22+-label%3A%22use+case%22+-label%3A%22webgpu+interop%22+-label%3A%22conventions%22+-label%3A%22editorial%22+-label%3A%22process%22++-label%3A%22testing%22
15:27:23 [anssik]
... the important ones are issues that do not fit into the workstreams
15:27:32 [anssik]
... aka "unknown unknowns"
15:27:57 [anssik]
jsbell: some additional issue clusters include:
15:28:06 [anssik]
... - Graph construction and build steps - covers about 5 issues; we've got some active discussion from several participants narrowing in on what, where, and how to make things more precise.
15:28:23 [anssik]
... - Data types and number handling, including casting, small and big ints, input validation, and so on
15:28:36 [anssik]
... Dwayne has kicked off discussions with WebIDL maintainers about the path to supporting both float64 (double) and int64 (bigint) as inputs to the same method
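[Editorial illustration] The float64/bigint question above can be sketched as follows; this is a hypothetical illustration of the JS-side branching such a WebIDL union would require (the function name and shape are assumptions, not from the spec or the WebIDL discussion):

```javascript
// Hypothetical sketch of the interop issue: a method accepting both
// double (JS number) and int64 (JS bigint) for the same argument must
// branch explicitly, since JS forbids mixing the two types in arithmetic.
function normalizeScalar(value) {
  if (typeof value === "bigint") {
    // int64 path: bigint preserves full 64-bit integer precision.
    return { kind: "int64", value };
  }
  if (typeof value === "number") {
    // float64 path: plain JS numbers are IEEE-754 doubles.
    return { kind: "float64", value };
  }
  throw new TypeError("expected number or bigint");
}
```

The branch mirrors how a WebIDL `(bigint or double)` union distinguishes its members at the binding layer.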
15:29:00 [anssik]
... closed 15-20 issues as part of this initial triage
15:29:21 [zkis]
q?
15:29:48 [anssik]
q?
15:29:51 [anssik]
ack zkis
15:30:09 [anssik]
zkis: thanks Josh!
15:31:00 [anssik]
dom: I think some groups could borrow best practices from this group for triage guidance
15:31:40 [anssik]
anssik: triage guidance is welcoming PRs
15:32:18 [anssik]
... also new contributors to the Triage team welcome
15:32:40 [anssik]
Subtopic: Milestones
15:32:50 [anssik]
anssik: how do we want to concretely make the best use of the GH milestones feature?
15:32:54 [anssik]
... there was support on our last call to adopt milestones
15:33:31 [anssik]
... is a CR Snapshot a good spec milestone, with a scope that is feasible for a ~quarter's worth of work?
15:33:37 [anssik]
q?
15:34:07 [anssik]
dom: CR Snapshot every 3 months would raise a question how we do wide reviews for that cadence
15:34:47 [anssik]
... we want another CR Snapshot beyond the next planned one; one milestone might be the next CR Snapshot, another obvious one would be "Proposed Rec", not anticipating any timelines
15:35:12 [anssik]
... discussion needed on how to integrate backends into Proposed Rec implementation experience
15:35:30 [anssik]
... what should not be part of the first Rec
15:35:46 [anssik]
... declaring the first victory is beneficial
15:36:49 [anssik]
q?
15:37:27 [anssik]
RafaelCintron: in the WebGPU group there's a concept of a milestone, Mike can confirm
15:37:43 [MikeW]
That's right; the milestones for WebGPU are quite fluid, however
15:37:50 [anssik]
... criteria there is different from ours
15:39:07 [anssik]
MikeW: WebGPU group basically just categorizes issues to milestones based on complexity, flexibly changing from milestone to another
15:39:21 [anssik]
q?
15:39:55 [anssik]
Topic: New features
15:39:58 [anssik]
Subtopic: MLBuffer
15:40:39 [anssik]
anssik: Let's continue discussion on the proposal for a backend-agnostic storage type for WebNN operations informed by implementation experience.
15:40:39 [anssik]
... I'd ask the group to pay attention to the open questions in the sub-issues and the exploration doc
15:40:42 [anssik]
-> MLBuffer proposal #482
15:40:42 [gb]
https://github.com/webmachinelearning/webnn/issues/482 -> Issue 482 Support for device-based tensor storage objects (by bbernhar) [webgpu interop]
15:40:45 [anssik]
-> Creation and representing MLBuffer on XPU devices #542
15:40:46 [gb]
https://github.com/webmachinelearning/webnn/issues/542 -> Issue 542 [MLBuffer] Creation and representing MLBuffer on a XPU devices (by bbernhar) [webgpu interop]
15:40:49 [anssik]
-> Uploading/downloading tensor data #543
15:40:49 [gb]
https://github.com/webmachinelearning/webnn/issues/543 -> Issue 543 [MLBuffer] Uploading/downloading tensor data (by bbernhar) [webgpu interop]
15:40:52 [anssik]
-> Support for MLBuffer in graph execution #544
15:40:53 [gb]
https://github.com/webmachinelearning/webnn/issues/544 -> Issue 544 [MLBuffer] Support for MLBuffer in graph execution (by bbernhar) [webgpu interop]
15:40:58 [anssik]
-> MLBuffer exploration doc #541
15:40:59 [gb]
https://github.com/webmachinelearning/webnn/pull/541 -> Pull Request 541 Add MLBuffer exploration doc (by a-sully) [webgpu interop]
15:41:03 [anssik]
-> Chromium implementation https://chromium-review.googlesource.com/c/chromium/src/+/5173676
15:41:27 [anssik]
anssik: I'm seeing good discussion in the exploration doc
15:41:35 [anssik]
... I'd like to bring for discussion Austin's proposal for refocusing MLBuffer on the following goals:
15:41:51 [anssik]
... - Prove out that the MLBuffer concept is feasible to implement on all platforms,
15:42:21 [anssik]
... - Prove out that MLBuffer provides meaningful performance wins for the two use cases we've identified, and
15:42:42 [anssik]
... - Avoid baking in any assumptions which would preclude adopting further optimizations in the future
15:42:55 [anssik]
... tentative suggestions:
15:43:12 [anssik]
... - Start with the initially-proposed readBuffer() and writeBuffer() APIs as the only way to read/write data to an MLBuffer from script
15:43:13 [asully]
q+
15:43:45 [anssik]
... - Take a phased approach to supporting WebGPU <-> WebNN interop
15:43:49 [anssik]
... - Punt on the following features: Buffer mapping to JS, Minimizing buffer copies for UMA systems
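[Editorial illustration] As a rough sketch of the readBuffer()/writeBuffer() shape suggested above: the method names come from the MLBuffer proposal, but this mock class and its semantics are assumptions for illustration, not an implementation. Script-visible reads and writes go through explicit copy calls rather than JS-mapped memory, matching the "punt on buffer mapping" suggestion.

```javascript
// Hypothetical mock of the proposed MLBuffer read/write surface.
class MockMLBuffer {
  constructor(byteLength) {
    // ArrayBuffer stands in for opaque device-side tensor storage.
    this.storage = new ArrayBuffer(byteLength);
  }
  writeBuffer(srcData) {
    // Explicit script -> "device" copy; srcData is a Uint8Array here.
    new Uint8Array(this.storage).set(srcData);
  }
  async readBuffer() {
    // Reads are async since device -> script copies may stall;
    // returns a detached copy, never a live mapping.
    return this.storage.slice(0);
  }
}
```

Keeping reads and writes as explicit copies leaves the door open for the punted optimizations (JS buffer mapping, UMA copy elision) later without baking their semantics into the initial API.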
15:43:51 [anssik]
q?
15:43:58 [anssik]
ack asully
15:44:09 [anssik]
asully: thanks for taking a look at this!
15:44:34 [anssik]
... I think the purpose is to make sure we get performance wins; JS buffer mapping is not so helpful because it may in fact introduce overhead
15:45:03 [anssik]
... most of the discussion can happen async, but would love to get Apple's feedback on this call
15:45:05 [anssik]
q?
15:46:15 [anssik]
MikeW: I'm reading the issue now, I need to do a little bit of research first
15:46:16 [anssik]
q?
15:47:02 [anssik]
Bryan: more or less we're back to where we started
15:47:19 [phillis]
q+
15:47:23 [anssik]
ack phillis
15:47:49 [anssik]
Topic: Pull Requests and open issues
15:48:00 [anssik]
anssik: we've worked through our PR queue, so we can focus on discussing open issues based on your feedback
15:48:06 [anssik]
-> Open PRs https://github.com/webmachinelearning/webnn/pulls
15:48:06 [anssik]
-> Open issues https://github.com/webmachinelearning/webnn/issues
15:48:12 [jsbell]
https://github.com/webmachinelearning/webnn/issues/573
15:48:13 [gb]
https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset]
15:48:39 [anssik]
Subtopic: Core operator set
15:49:24 [anssik]
phillis: feedback from our platform teams: want to ensure we have good coverage and that the op set is decomposable
15:49:35 [anssik]
... works consistently so frameworks on top can rely on it
15:49:51 [anssik]
q?
15:49:52 [jsbell]
q+
15:49:54 [anssik]
ack jsbell
15:50:45 [anssik]
jsbell: this has come up with StableHLO and PyTorch, which have tried to move to very well-defined baseline ops
15:51:36 [RafaelCintron]
q+
15:51:38 [anssik]
jsbell: if the higher-level op is missing, they want to be able to lower it to core ops
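[Editorial illustration] The lowering idea above can be shown with a minimal sketch: a higher-level softmax op decomposed into primitive ops exp, reduceSum, and div. The op names and decomposition are illustrative assumptions, not taken from the WebNN spec.

```javascript
// Hypothetical "core" primitives operating on plain arrays.
const exp = (xs) => xs.map(Math.exp);
const reduceSum = (xs) => xs.reduce((a, b) => a + b, 0);
const div = (xs, s) => xs.map((x) => x / s);

// A higher-level op lowered to the primitives above.
function softmax(xs) {
  const max = Math.max(...xs); // subtract max for numerical stability
  const e = exp(xs.map((x) => x - max));
  return div(e, reduceSum(e));
}
```

If the core ops behave identically across platforms, a framework can fall back to this decomposition wherever a fused softmax is unavailable and still get consistent results.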
15:51:38 [anssik]
q?
15:51:41 [anssik]
ack RafaelCintron
15:51:59 [anssik]
RafaelCintron: I'm willing to explore what a core op set would mean
15:52:15 [anssik]
... need web developers to be able to use expanded ops on platforms that support them
15:53:05 [anssik]
... being able to do higher-level things easily seems very useful, everything should be in the spec, core and "expanded" op set
15:53:21 [jsbell]
q+
15:53:28 [anssik]
phillis: the expanded op set should be in the spec; the question is whether we section the ops into core and "extended"
15:53:29 [anssik]
q?
15:53:44 [anssik]
q?
15:53:48 [anssik]
ack jsbell
15:53:57 [jsbell]
https://pytorch.org/docs/main/torch.compiler_ir.html
15:54:25 [asully]
q+
15:54:27 [anssik]
jsbell: agree we don't want to go down to a minimal set; PyTorch has settled on a core op set
15:54:28 [anssik]
q?
15:54:30 [anssik]
ack asully
15:54:58 [anssik]
asully: one of the key things when we say "core op set" is that these ops are defined precisely with constraints and would behave the same across platforms
15:55:29 [anssik]
... the higher-level the op, the more variation in implementations, e.g. LSTM
15:55:57 [anssik]
q?
15:56:25 [jsbell]
Dwayne has hand up in Zoom?
15:56:38 [RafaelCintron]
q+
15:56:43 [anssik]
q?
15:56:45 [anssik]
ack RafaelCintron
15:57:31 [anssik]
Dwayne: this is not a new concept, we haven't gone deep into this; there are primitive, aggregate, and optional ops -- what does it mean to be compliant with this spec then?
15:57:49 [anssik]
... I feel every complex op should be decomposable and the core set should behave the same across platforms
15:58:05 [anssik]
... there's wiggle room around the edges: casting, truncating toward zero, etc.
15:58:14 [anssik]
... fuzzier areas to iron out
15:58:28 [anssik]
... required and optional, logically organized with a label next to them
15:58:36 [asully]
q+
15:58:42 [anssik]
ack asully
15:59:27 [anssik]
asully: to respond to Dwayne, agree ideally every op behaves the same on all platforms and we have no distinction of core and others
15:59:51 [anssik]
... in reality we have different backends (TF, PT, etc.) with differences
16:00:28 [anssik]
... for many web platform APIs you expect them to run everywhere; if we require every op to be supported everywhere, ops like LSTM that are not implemented everywhere would require a CPU fallback
16:00:39 [anssik]
... there's room to establish clarity around this
16:00:40 [anssik]
q?
16:01:41 [anssik]
q?
16:02:33 [anssik]
RRSAgent, draft minutes
16:02:35 [RRSAgent]
I have made the request to generate https://www.w3.org/2024/02/22-webmachinelearning-minutes.html anssik
16:14:26 [anssik]
RRSAgent, draft minutes
16:14:27 [RRSAgent]
I have made the request to generate https://www.w3.org/2024/02/22-webmachinelearning-minutes.html anssik
18:05:00 [Zakim]
Zakim has left #webmachinelearning