13:59:45 RRSAgent has joined #webmachinelearning
13:59:49 logging to https://www.w3.org/2023/06/29-webmachinelearning-irc
13:59:49 RRSAgent, make logs Public
13:59:50 please title this meeting ("meeting: ..."), anssik
13:59:52 Meeting: WebML WG Teleconference – 29 June 2023
13:59:56 Chair: Anssi
14:00:02 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2023-06-29-wg-agenda.md
14:00:20 Scribe: Anssi
14:00:24 scribeNick: anssik
14:00:51 ghurlbot, this is webmachinelearning/webnn
14:00:51 anssik, OK.
14:01:02 ningxin_hu has joined #webmachinelearning
14:01:02 Present+ Anssi_Kostiainen
14:01:54 Present+ Joshua_Lochner
14:02:09 Present+ Chai_Chaoweeraprasit
14:02:17 Present+ Ningxin_Hu
14:02:39 RRSAgent, draft minutes
14:02:40 I have made the request to generate https://www.w3.org/2023/06/29-webmachinelearning-minutes.html anssik
14:03:22 Present+ Zoltan_Kis
14:03:51 Present+ Vivek_Sekhar
14:04:11 Present+ Dwayne_Robinson
14:04:24 RRSAgent, draft minutes
14:04:25 I have made the request to generate https://www.w3.org/2023/06/29-webmachinelearning-minutes.html anssik
14:05:01 Regrets+ Dominique_Hazael-Massieux
14:05:01 chai has joined #webmachinelearning
14:05:01 dwayner has joined #webmachinelearning
14:05:01 Joshua_Lochner has joined #webmachinelearning
14:05:05 Topic: Announcements
14:05:16 -> Implementation Status of WebNN Operations https://webmachinelearning.github.io/webnn-status/
14:05:31 Vivek has joined #webmachinelearning
14:05:34 anssik: we've launched a web developer-focused implementation status page to provide better visibility into implementations of WebNN API
14:05:47 ... The total number of WebNN v1 ops is 60.
14:05:56 ... this table currently lists ops that are fully implemented or WIP by multiple backends.
14:06:02 ... that's why you see percentages lower than 100% below the table.
14:06:06 ...
the table is maintained on GitHub, PRs welcome
14:06:10 -> webnn-status.md source https://github.com/webmachinelearning/webmachinelearning.github.io/blob/main/webnn-status.md?plain=1
14:06:16 ... questions, comments?
14:06:37 q?
14:06:45 +q
14:06:51 ack Vivek
14:07:52 Topic: WebNN v2: text-to-image and text-to-text use cases and requirements
14:08:20 anssik: today we will discuss prototyping findings from a transformer-based generative AI use case exploration
14:08:39 ... the goal is to solicit input on new ops and data types required to support important models informed by our prototyping efforts
14:08:51 ... at our last meeting Joshua shared his experiences with Transformers.js, please check out the great presentation if you missed it
14:09:27 -> Transformers.js presentation https://lists.w3.org/Archives/Public/www-archive/2023Jun/att-0000/Transformers_js.pdf
14:09:27 ... today I'm happy to welcome Dwayne from Microsoft to share his explorations and findings in this space
14:09:27 ... Dwayne has experimented with a WebNN DirectML prototype https://github.com/fdwr/chromium-src-webnn-dml/pull/1 to inform us with running code
14:09:42 ... as a reminder, we have a meta issue #375 for transformer-based models and use cases discussion
14:09:43 https://github.com/webmachinelearning/webnn/issues/375 -> Issue 375 Support for transformers (dontcallmedom) v2
14:09:54 anssik: without further ado, let me welcome Dwayne to present what he has been up to lately
14:10:05 Slideset: url-to-slides
14:10:46 [slide 1]
14:11:12 [slide 2]
14:11:49 dwayner: Chromium fork with DML backend that is fairly mature
14:11:56 [slide 3]
14:12:14 dwayner: ONNXRuntime for Web, in master branch
14:12:44 ...
v1 ops half implemented, "v2" 14/20 implemented
14:12:48 [slide 4]
14:13:03 dwayner: SegmentAnything, Stable Diffusion model targets
14:13:08 [slide 5]
14:13:40 dwayner: Segment Anything is a pretty small model, 39 ops, works in ORT for Web via WebNN EP, currently in the fork but not in Chromium master
14:13:47 ... caveat, static shapes only
14:13:51 [slide 6]
14:14:12 dwayner: Stable Diffusion is much bigger, f16 is 2.3 GB
14:14:55 ... 38 ops, 5 models coordinated
14:15:00 [slide 7]
14:15:39 dwayner: native implementations of ORT+DML, WebNN proto needs a bit more work, 2 ops in Chromium and 5 in ORTW
14:16:05 ... unique challenges in fitting in the 4 GB limit in Wasm32
14:16:26 ... browsers will also evict cache so need to download again sometimes
14:16:29 [slide 8]
14:16:38 (small text warning)
14:16:46 [slide 9]
14:17:14 dwayner: v2 ops enumerated on this slide
14:17:45 ... notably, we need conversion between datatypes, every model has at least one cast op
14:17:56 ... squeeze and unsqueeze needed
14:18:00 [slide 10]
14:18:24 dwayner: missing MLOperandDataTypes: bool8, int64
14:19:17 ... ORT Web could map these, can be emulated when backends don't support these types
14:20:00 [slide 11]
14:20:36 dwayner: missing concepts, 0D scalars that everyone supports
14:20:50 ... another gap, retrieving the built shape
14:21:13 [slide 12]
14:21:50 dwayner: reshaping ops, WebNN has squeeze but no unsqueeze, flattenTo2d needed
14:21:54 [slide 13]
14:22:44 dwayner: MVN, many ways to arrange axes, spatial normalization, spatial+batch, spatial+channel, spatial+grouped channel normalization
14:22:53 [slide 14]
14:23:07 dwayner: dedicated ops for efficiency and semantics
14:23:40 ... Pow(x, 0.5) and Div(1/x) as examples, implementations have dedicated instructions
14:23:45 [slide 15]
14:24:27 dwayner: optimized operators, Olive optimizations rearrange the model, fuse ops into new ops, e.g. MultiHeadAttention
14:25:03 ...
many variations, prefer the community to settle on a blessed approach
14:25:09 [slide 16]
14:25:20 dwayner: a bunch of links for references
14:25:58 q?
14:25:58 +q
14:26:26 anssik: when do you plan to share running demos?
14:26:49 dwayner: anyone can now build and experiment with the SegmentAnything demo
14:26:55 ... I could package a file with components
14:27:21 q?
14:27:23 ack Joshua_Lochner
14:27:32 Joshua_Lochner: nice to meet you!
14:27:52 ... amazing presentation, it is very interesting to see how low-level things are implemented
14:28:30 https://github.com/mlc-ai/web-stable-diffusion
14:28:41 ... Stable Diffusion, have you seen the recent demo https://github.com/mlc-ai/web-stable-diffusion
14:29:10 ... WebGPU backend to run SD in the browser, quite large LLMs
14:29:22 ... 4 GB Wasm32 limit I've also run into myself
14:29:54 ... they're doing some swapping to avoid hitting the 4 GB limit, interested in what optimizations are done there and whether we can learn from them
14:30:17 https://github.com/huggingface/optimum/issues/1078
14:31:05 ... SegmentAnything, how are you splitting encoder and decoder?
14:32:22 s/encoder and decoder/encoder and decoder, precomputing embeddings, use mask decoder ...
14:32:33 dwayner: compute embeddings offline
14:33:06 Joshua_Lochner: about separation, have you experimented with it to reuse embedding in the decode pass?
14:33:12 dwayner: no
14:33:18 Joshua_Lochner: too heavy operation?
14:33:27 dwayner: it could be possible, not tried that yet
14:33:44 q?
14:33:59 q?
14:34:00 q+
14:34:03 ack ningxin_hu
14:34:14 ningxin_hu: thanks for sharing this exploration!
14:35:09 ... v2 ops, re square root: I can share an update that we are implementing this for the XNNPACK backend; a dedicated op is a good proposal, as it lets us handle this special case
14:35:32 q+
14:35:35 ... helps align with the implementation in Chromium
14:36:04 ...
MVN is a good one too, want to look into this more and see how it can be handled by other backends
14:36:39 ... in general good list, we can open GH issues for all these to investigate
14:37:06 anssik: tracking issue and specific issues for specific v2 ops sounds good
14:37:08 q?
14:37:09 ack chai
14:37:30 chai: thanks Dwayne! the v2 list we need to look over and see what makes sense to add to WebNN API
14:38:04 ... when we first defined this WebNN spec one assumption has been that ops that are more facilitator-style, not requiring major compute, should stay in the framework
14:38:11 ... the WebNN purpose is to accelerate compute
14:38:27 ... has been the case for ORT when we consider different backends that implement it
14:38:39 ... some ops mentioned here need to be looked into more closely
14:38:57 ... the number of ops we may want to add to WebNN eventually may not be as big as this list
14:39:29 ... if it is something that facilitates graph execution we should think if those can be made in the framework instead, maintaining a huge number of ops is a burden
14:39:37 ... what we add are the ones that are really needed
14:40:09 ... because we talk about v2 we should have an API that supports versioning so the framework can detect what the browser supports
14:40:32 ... we need to discuss an approach for versioning / feature detection so we don't break content
14:40:39 ... a topic for v2 op discussion
14:40:59 q?
14:41:20 chai: implementation specific feedback we need for the spec
14:41:39 q+
14:41:41 ... what are the columns in https://webmachinelearning.github.io/webnn-status/ and how are they implemented in TFLite
14:41:46 ack ningxin_hu
14:42:31 ningxin_hu: earlier we sent RFC to TF.js community to propose WebNN delegate for TFLite, it was approved by TF team in their SIG
14:42:54 ... this column in the table represents the prototype implementation of that TFLite delegate
14:43:19 ... this is working code and works with Chromium Canary
14:43:45 ...
we are discussing with TF community to integrate this TF delegate
14:44:02 ... WebNN EP in ORT, this is a counterpart to that for TF
14:45:11 ... TFLite column represents WebNN delegate implementation status
14:46:43 chai: second column is the browser implementation of CPU backend?
14:46:46 ningxin_hu: yes, in master
14:48:14 chai: I'd adjust the table to include all WebNN ops
14:48:37 ... that'd be a more understandable way to present this data
14:49:16 ... caveat, XNNPACK column should highlight this is the impl status in the browser, and the last two columns are framework statuses
14:49:20 +1
14:50:09 chai: this is great information to have
14:50:28 ... would also add DML implementation status, not in Chromium master, so can add a footnote for that
14:51:01 ... people coming to this page want to know what is in flight and coming up
14:51:23 q?
14:51:48 q?
14:52:37 Topic: WebIDL and Infra standard conventions
14:52:50 anssik: The zk-conventions-integration branch is now ready for review.
14:53:24 ... These changes align the entire WebNN API with modern spec conventions and add stylistic improvements on top that make navigating this specification a more delightful experience.
14:53:31 ... The following resources are made available to the group to assist in this review task:
14:53:37 -> Preview of zk-conventions-integration https://zolkis.github.io/webnn/
14:53:46 -> Diff between main and zk-conventions-integration https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fwebmachinelearning.github.io%2Fwebnn%2F&doc2=https%3A%2F%2Fzolkis.github.io%2Fwebnn%2F
14:54:08 -> Issues addressed and PRs merged to zk-conventions-integration https://github.com/webmachinelearning/webnn/issues/210#issuecomment-1326361748
14:54:27 -> Commits ahead of main https://github.com/webmachinelearning/webnn/compare/main...zk-conventions-integration
14:54:39 anssik: Thanks Zoltan for this significant effort!
14:54:59 anssik: Next step is to review the entire thing and apply it as an atomic change to main.
14:55:08 ... it is up to the editors whether to squash all of it to one commit or land this with its history.
14:55:18 ... Zoltan, anything you'd like to say about this work?
14:55:32 q+
14:55:35 Zoltan: plan to use the integration branch for comments, can comment on PRs even if they are merged and closed
14:55:45 ... will create another PR for the atomic change
14:56:51 ... one more issue to discuss with Chai re sync/async PR, whether to land it in main or the integration branch
14:56:51 ... my staging area contains those changes so prefer to clear that
14:56:51 q?
14:56:51 ack chai
14:56:51 chai: thanks Zoltan!
14:57:11 ... once approved we have everything merged, then no need for this branch?
14:57:44 Zoltan: once all changes are merged we have a single commit on main; if we want to keep the history we can merge from the branch
14:58:23 chai: there are multiple PRs over time that caused the issue in the queue, because reviewers did not know in which order to review, so atomicity preferred
14:59:04 ... the second issue is the ongoing updates to align with WebIDL conventions, important for the overall pipeline, also important for other upcoming changes
14:59:26 ... want to have these changes settled in the doc once and for all, because subsequent changes need to align with these conventions
15:00:09 ... once this is merged, is this all settled for conventions and for style?
15:01:35 ... once WebIDL conventions updates are done, any subsequent change will be more specific, correct?
15:01:38 Zoltan: correct
15:02:18 ... that is why we stay a bit longer on the integration branch to ensure we meet the editorial quality expectations wrt conventions
15:02:28 chai: once done we abandon the integration branch?
15:02:37 Zoltan: correct
15:02:43 chai: sounds good
15:03:22 q?
15:04:15 q+
15:04:17 q?
15:04:42 ack zkis
15:05:00 Zoltan: this does not stop main branch from advancing, I will track main
15:05:14 chai: we should try to apply the new conventions asap
15:05:57 Zoltan: the only open item is external review, otherwise we're good
15:07:18 chai: would like to review the content in the integration branch and merge it into main with everything squashed into one commit
15:07:21 +1
15:07:24 q?
15:07:25 ack ningxin_hu
15:07:46 ningxin_hu: +1 to what chai said
15:08:16 https://github.com/webmachinelearning/webnn/issues/210#issuecomment-1612754997
15:10:52 Topic: Summer break
15:11:02 anssik: we'll take a break in July and will resume our WG calls 10 August.
15:11:12 ... GH remains open 24/7 and welcomes your contributions while the meetings are on pause.
15:11:41 ... Thanks for the amazing first half of 2023! We've achieved a lot, published a CR, grew our group, advanced implementations. And the second half looks even more exciting!
15:11:46 RRSAgent, draft minutes
15:11:47 I have made the request to generate https://www.w3.org/2023/06/29-webmachinelearning-minutes.html anssik
15:14:54 zkis has joined #webmachinelearning
15:18:22 zkis has joined #webmachinelearning
17:10:20 Zakim has left #webmachinelearning
17:30:27 s/url-to-slides/https://lists.w3.org/Archives/Public/www-archive/2023Jun/att-0005/2023-06-29_WebNN_and_Transformers_Progress_W3C.pdf
17:30:31 RRSAgent, draft minutes
17:30:32 I have made the request to generate https://www.w3.org/2023/06/29-webmachinelearning-minutes.html anssik
18:06:16 zkis has joined #webmachinelearning
19:52:41 zkis has joined #webmachinelearning