13:57:10 RRSAgent has joined #webmachinelearning
13:57:14 logging to https://www.w3.org/2023/08/24-webmachinelearning-irc
13:57:14 RRSAgent, make logs Public
13:57:45 please title this meeting ("meeting: ..."), anssik
13:57:45 Meeting: WebML WG Teleconference – 24 August 2023
13:57:45 Chair: Anssi
13:57:45 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2023-08-24-wg-agenda.md
13:57:53 Scribe: Anssi
13:57:53 scribeNick: anssik
13:57:56 gb, this is webmachinelearning/webnn
13:57:57 anssik, OK.
13:57:57 jsbell has joined #webmachinelearning
13:57:58 Vivek has joined #webmachinelearning
13:58:02 Present+ Anssi_Kostiainen
13:58:13 Present+ Rachel_Yager
13:58:22 Present+ Vivek_Sekhar
13:58:27 Present+ Zoltan_Kis
13:58:38 RRSAgent, draft minutes
13:58:40 I have made the request to generate https://www.w3.org/2023/08/24-webmachinelearning-minutes.html anssik
13:58:53 Present+ Joshua_Bell
14:00:03 RafaelCintron has joined #webmachinelearning
14:00:18 Present+ Rafael_Cintron
14:00:23 Ningxin_Hu has joined #webmachinelearning
14:00:41 Present+ Joshua_Lochner
14:00:54 Present+ Ningxin_Hu
14:01:12 Rachel has joined #webmachinelearning
14:02:33 Regrets+ Chai_Chaoweeraprasit
14:03:02 Joshua_Lochner has joined #webmachinelearning
14:03:31 RRSAgent, draft minutes
14:03:32 I have made the request to generate https://www.w3.org/2023/08/24-webmachinelearning-minutes.html anssik
14:04:18 anssik: Welcome to our 24 Aug call, we have a busy and exciting agenda!
14:04:32 Topic: Google Chrome team's feedback on WebNN API
14:04:56 anssik: I asked Vivek and Joshua to share a high-level summary of the Chrome team's feedback with the WG and document that feedback in a GH issue. Thank you Vivek and Joshua for collecting all this feedback from Google Chrome teams and sharing it with the WG, this is much appreciated.
14:05:00 ... you'll find Google Chrome's feedback in GH issue #453
14:05:01 https://github.com/webmachinelearning/webnn/issues/453 -> Issue 453 Google Chrome Feedback on WebNN: aiming for broad device coverage and maintainability (by vsekhar)
14:05:25 ... Vivek and Joshua, please feel free to use 10-20 minutes incl. discussion to share a high-level summary of your feedback. I expect the WG to continue discussing specific topics in the GH issue and spin off new issues as appropriate.
14:05:56 Vivek: thank you Anssi and the WG
14:06:06 Vivek: Google strongly supports the work of the WebML WG
14:06:20 ... we got together at Google to gather feedback
14:06:46 ... feedback was solicited from ML research and infrastructure teams at Google
14:07:14 ... reach and maintainability are important lenses for us
14:07:32 ... Chrome's key observation:
14:07:40 ... - for new OS APIs or hardware accelerators, we must assume that most Web users don't have them
14:08:11 ... - we have an obligation to ensure a workable experience for other users as well
14:08:36 ... Chrome's goal:
14:08:40 ... - achieve 80% of a device's theoretical hardware-accelerated ML runtime performance across 80% of devices on the Web, and do so while imposing a manageable long-term support burden on browser vendors
14:08:47 ... Ecosystem issue:
14:08:52 ... - the ML ecosystem is still rapidly evolving, making it difficult for any API to keep up
14:09:36 ... Proposed steps for the WG (from the issue):
14:09:39 ... 1. Request public positions from major browser implementers
14:09:43 ... 2. Reduce the long-term support burden of WebNN by streamlining the API surface
14:09:48 ... 3. Demonstrate WebNN performance for CPU and GPU execution across multiple OS platforms
14:09:54 ... 4. Demonstrate WebNN performance gains utilizing OS- and hardware-specific optimizations
14:11:39 ... Proposed steps for OS- and hardware-specific optimizations:
14:11:42 ... 1. Select 2-5 demonstrative ML models
14:11:45 ... 2. Run on a demonstrative set of platforms with accelerator hardware
14:11:49 ... 3. Evaluate latency, throughput and power efficiency between lowering to CPU/GPU vs. hardware accelerators
14:11:55 q+
14:11:57 anssik: thanks Vivek and Joshua and the Google Chrome team for this feedback!
14:12:01 ack jsbell
14:13:12 jsbell: thanks for the summary! Just wanted to share that this derives from thinking about what we as Google Chrome have as requirements for shipping an API, the same criteria as we have for an Intent to Ship for any feature
14:13:26 ... we stand behind what we ship for decades, and these considerations are based on that expectation
14:13:58 q?
14:16:02 q+
14:16:07 ack Vivek
14:16:53 Vivek: want to note the group has thought about the low-level vs. high-level ops question, appreciate that
14:17:54 ... re reducing the long-term support burden: if there's consensus emerging in the broader ML space, we propose to align with that on op set abstraction level and scope
14:18:45 q?
14:19:00 Ningxin_Hu: thanks for this concrete feedback, a lot of good observations
14:19:13 ... I like that you've shared concrete guidance and recommendations
14:19:38 ... re "Demonstrate WebNN performance for CPU and GPU execution across multiple OS platforms"
14:19:59 ... the suggestion is to implement WebNN as a polyfill on top of the Wasm and WebGPU APIs
14:20:19 ... we have a JS implementation of WebNN using TF.js kernels with Wasm and WebGPU backends
14:20:38 https://github.com/webmachinelearning/webnn-polyfill
14:20:55 q+
14:20:55 Ningxin_Hu: can you elaborate on this proposal?
14:20:56 q+
14:20:57 ack jsbell
14:21:03 q-
14:21:37 jsbell: clarifying that we are not proposing that browsers ship WebNN as a polyfill, but that the CG-created polyfill would be excellent
14:22:16 ... re the launch process there's an adoption question, i.e. is it adopted by developers; we want to avoid a situation where there is no polyfill and web developers code directly to a CPU or GPU backend
14:22:41 q+
14:22:45 ... if there's a quality polyfill, framework authors can move to the polyfill, and when browser implementations roll out we'd see an immediate performance boost
14:22:59 ... this is called ecosystem activation
14:23:01 ack Vivek
14:23:26 Vivek: a polyfill helps clarify what is needed from the platform; running workloads on a polyfill will help trace where the performance bottlenecks are
14:23:41 ... e.g. if the polyfill is too large, we can see where the web platform can help
14:23:50 ... some features may remain in user space
14:24:17 ... because they change so fast, so a polyfill helps clarify those aspects and shows what developers should be able to customize
14:24:18 q?
14:24:45 anssik: are we maintaining the WebNN polyfill?
14:25:03 Ningxin_Hu: probably not up to date with the very latest spec version
14:25:14 ... an opportunity to improve the WebNN polyfill
14:26:26 ... the polyfill builds on TF.js, so thanks to the TF.js team
14:26:49 q?
14:27:03 q+
14:27:09 ack Joshua_Lochner
14:28:18 q+
14:28:21 Joshua_Lochner: wanted to ask, re the caching side of things: when you save models you don't need to redownload them; as the Transformers.js author this is an issue close to my heart
14:28:30 ... is this a consideration here?
14:28:40 AramZS has joined #webmachinelearning
14:28:42 ack jsbell
14:29:20 jsbell: I definitely acknowledge your concern, that is part of the broader ecosystem adoption consideration
14:29:35 ... we have folks in the Chrome team working on improvements to storage
14:29:48 ... will be discussed outside this call, in another issue or a future call
14:30:05 RRSAgent, draft minutes
14:30:06 I have made the request to generate https://www.w3.org/2023/08/24-webmachinelearning-minutes.html anssik
14:30:12 q?
14:30:26 Topic: WebNN v2: review proposed new ops and data types
14:30:47 Present+ Dwayne_Robinson
14:31:13 anssik: I'd like us to review and discuss the proposed new ops and data types, informed by v2 model targets and recent prototyping efforts.
14:31:25 ... Dwayne posted a well-formulated list of proposals in GH issue #375 -- thank you!
14:31:28 https://github.com/webmachinelearning/webnn/issues/375 -> Issue 375 Support for transformers (by dontcallmedom) [v2] [operation set]
14:31:30 ... Details in https://github.com/webmachinelearning/webnn/issues/375#issuecomment-1674224992
14:31:40 ... let me first summarize what was proposed and then let Dwayne fill me in
14:31:48 ... Proposed models:
14:31:58 ... - Text-to-image: Stable Diffusion UNet/VAE/text encoder
14:32:09 ... - Image segmentation: Segment Anything decoder
14:32:20 ... - Speech-to-text: Whisper Tiny
14:32:33 anssik: We don't have text-to-text models proposed, so I'd like the WG to discuss whether some would be applicable, examples:
14:32:53 ... Text-to-text: Summarization, translation, code completion as demonstrated by Transformers.js?
14:32:57 q+
14:33:00 dwayner has joined #webmachinelearning
14:33:02 ack Joshua_Lochner
14:33:27 Joshua_Lochner: re text-to-text, I'll add some additional material
14:34:06 Joshua shared:
14:34:06 Text-to-text:
14:34:06 - Summarization: https://xenova.github.io/transformers.js/?demo=summarization
14:34:06 - Code completion: https://huggingface.co/spaces/Xenova/ai-code-playground
14:34:06 - Translation: https://huggingface.co/spaces/Xenova/react-translator
14:34:23 Joshua_Lochner: application-level tasks that are helpful in web apps
14:34:37 ... code completion would be helpful e.g. in GH or Codespaces
14:36:28 ... this playground uses a 300M-parameter model, no GPU, Wasm backend, reasonable performance already as is
14:36:32 ... a privacy-focused browser extension could make use of this, for example
14:36:32 ... StarCoder, in collaboration with Hugging Face
14:36:38 ... I think translation is a great idea, but the concern is the model is huge, a 600M-parameter model, 1.3GB in size
14:37:01 will do!
14:37:10 q?
14:37:23 ... Proposed new ops:
14:37:37 - Logical elementwise comparison/selection operations: equal, greater, lesser, logicalNot, elementwiseIf/ternary, greaterOrEqual/lesserOrEqual
14:37:38 - More elementwise unary operations: identity, sqrt, erf (Gauss error function), reciprocal
14:37:42 - Reshaping operations: squeeze, unsqueeze, flattenTo2d
14:37:46 - Data rearrangement operations: expand, gather
14:37:52 - Normalization operations: meanVarianceNormalization
14:37:56 - Index seeking operations: argMin/argMax
14:38:00 - Misc: cast, fillSequence, triangularMatrix, shape
14:38:04 - Others?
14:38:08 ... Proposed new data types:
14:38:13 ... - int64
14:38:17 ... - uint64
14:38:27 ... Relevant background material:
14:38:27 -> Transformers.js presentation by Joshua Lochner https://lists.w3.org/Archives/Public/www-archive/2023Jun/att-0000/Transformers_js.pdf
14:38:27 -> Transformer models presentation by Dwayne Robinson https://lists.w3.org/Archives/Public/www-archive/2023Jun/att-0005/2023-06-29_WebNN_and_Transformers_Progress_W3C.pdf
14:38:43 anssik: Dwayne, thanks for this proposal! Please feel free to share your questions and areas of focus with the WG.
14:38:50 RRSAgent, draft minutes
14:39:21 I have made the request to generate https://www.w3.org/2023/08/24-webmachinelearning-minutes.html anssik
14:39:59 Dwayne: I'd appreciate feedback from anyone else who has goals for models that are missing ops, or feedback on whether we should drop some of these ops because they can be satisfied by low-level ops
14:39:59 ... or feedback on naming
14:39:59 ... these ops enable the models we focus on, but happy to expand to other valuable models, e.g. the text-to-text models Joshua proposed
14:40:25 q+
14:40:33 q?
14:41:33 ack Vivek
14:42:04 Vivek: I wanted to understand the motivations for the data types; in our work with WebGPU we have been using fewer data types and plumbing them through
14:42:04 ... floating point is usually used in training use cases
14:42:24 dwayner: larger data types let you go past the 4GB barrier; used by e.g. ONNX
14:43:20 q+
14:43:26 ack Rachel
14:45:00 Rachel: Joshua_Lochner shared a translation task, is it using the WebNN API?
14:45:12 Joshua_Lochner: it uses the ONNX Runtime Wasm backend currently
14:45:54 ... I convert pretrained models to ONNX and use the ORT backend to do inference; tokenization and other steps are done in JS, everything is running in the browser
14:46:28 Source code! https://github.com/xenova/transformers.js
14:47:21 anssik: you can validate that it is on-device inference by disconnecting from the internet, it still works
14:47:49 Yes, it uses this model: https://huggingface.co/Xenova/nllb-200-distilled-600M
14:48:30 which itself is an ONNX export of https://huggingface.co/facebook/nllb-200-distilled-600M
14:48:52 yes exactly, the performance is limited by the model itself (and this model is quite old!)
14:48:57 RRSAgent, draft minutes
14:48:58 I have made the request to generate https://www.w3.org/2023/08/24-webmachinelearning-minutes.html anssik
14:49:04 Topic: WebIDL and Infra standard conventions
14:49:12 anssik: first, thank you Zoltan for keeping this big PR updated with comments, and everyone for your reviews, this has been a great team effort across orgs!
14:49:16 ... these changes align the entire specification with modern specification conventions and add stylistic improvements on top that make navigating this specification a more delightful experience.
14:49:25 ... today I'd like us to make a decision on whether we're ready to merge the zk-conventions-integration integration branch to main.
14:49:30 ... the big PR is #446
14:49:30 https://github.com/webmachinelearning/webnn/issues/446 -> Pull Request 446 Add missing algorithms, add stylistic improvements, update with spec conventions (by zolkis)
14:49:52 ... first I'd ask Zoltan to summarize the latest status of the big PR #446, and then Joshua Bell would like to spend a couple of minutes discussing some of the motivations behind modern spec style, the "processing model" (i.e. how JS types pass through Web IDL to become Infra types), etc. Part of Joshua's feedback is captured in issue #450
14:49:54 https://github.com/webmachinelearning/webnn/issues/450 -> Issue 450 Use web spec best practices (by zolkis) [conventions-integration]
14:49:57 ... thanks to Joshua's contributions we now have a Bikeshed build with no warnings! :)
14:50:01 RRSAgent, draft minutes
14:50:02 I have made the request to generate https://www.w3.org/2023/08/24-webmachinelearning-minutes.html anssik
14:50:18 zkis: 150 commits in that PR, squashed from many more commits
14:50:37 ... adding algorithms that were missing, following modern specification best practices
14:50:54 ... jsbell helped a lot here, thank you
14:50:58 ... a lot of work over the past 2 weeks
14:51:25 ... other changes are waiting for this to land; happy to report we did more than what we promised to do
14:51:45 ... thanks to the extended team for reviews, I included names in the ack section of the spec
14:51:55 ... next step is for the editors to approve the big PR and merge it
14:52:18 ... I can do quick fixes, but I'm planning to start my holiday next week
14:53:20 anssik: can you work with Chai to merge this PR?
14:53:32 ... any concerns from anyone about merging this PR to main?
14:54:19 Ningxin_Hu: LGTM, some remaining open issues we can keep open in GH
14:54:50 ... we'll do a final check and let Zoltan know if last-minute changes are needed
14:55:23 jsbell: I want to acknowledge I joined the process very late, appreciate your support for my contributions
14:56:50 anssik: I'm hearing we are ready to merge after a final check by Ningxin; Ningxin to work with Chai to get his GH approval and then merge
14:57:16 [no concerns with the proposed plan]
14:57:31 anssik: we'll proceed with the merge as noted
14:57:33 anssik: thank you everyone!
14:58:12 zkis: I will handle issues, though mostly I will be away
14:58:16 Ningxin_Hu: we can do this tomorrow at the latest
14:59:00 RRSAgent, draft minutes
14:59:01 I have made the request to generate https://www.w3.org/2023/08/24-webmachinelearning-minutes.html anssik
17:00:19 Zakim has left #webmachinelearning