13:52:16 RRSAgent has joined #webmachinelearning
13:52:20 logging to https://www.w3.org/2024/04/04-webmachinelearning-irc
13:52:20 RRSAgent, make logs Public
13:52:21 please title this meeting ("meeting: ..."), anssik
13:52:23 Meeting: WebML WG Teleconference – 4 April 2024
13:52:28 Chair: Anssi
13:52:34 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2024-04-04-wg-agenda.md
13:52:39 Scribe: Anssi
13:52:49 scribeNick: anssik
13:53:06 gb, this is webmachinelearning/webnn
13:53:06 anssik, OK.
13:53:14 Present+ Anssi_Kostiainen
13:53:21 Regrets+ Reilly_Grant
13:53:27 RRSAgent, draft minutes
13:53:28 I have made the request to generate https://www.w3.org/2024/04/04-webmachinelearning-minutes.html anssik
13:57:11 AramZS has joined #webmachinelearning
13:57:31 Present+ Michael_McCool
13:58:13 jsbell has joined #webmachinelearning
13:59:09 Present+ Joshua_Bell
13:59:25 McCool has joined #webmachinelearning
14:00:45 Ningxin_Hu has joined #webmachinelearning
14:01:01 Present+ Dwayne_Robinson
14:01:06 Present+ Ningxin_Hu
14:01:11 Present+ Bryan_Bernhart
14:01:19 Present+ Joshua_Lochner
14:01:28 phillis has joined #webmachinelearning
14:01:31 Present+ Phillis_Tang
14:01:39 Present+ Rafael_Cintron
14:01:40 geoff has joined #webmachinelearning
14:01:52 Present+ Geoff_Gustafson
14:02:10 Joshua_Lochner has joined #webmachinelearning
14:02:13 Present+ Ilya_Rezvov
14:02:22 RRSAgent, draft minutes
14:02:24 I have made the request to generate https://www.w3.org/2024/04/04-webmachinelearning-minutes.html anssik
14:02:31 RafaelCintron has joined #webmachinelearning
14:02:39 Present+ Zoltan_Kis
14:02:40 cbacharakis has joined #webmachinelearning
14:02:40 zkis has joined #webmachinelearning
14:02:40 Present+ Austin_Sullivan
14:02:57 Present+ Christo_Bacharakis
14:03:38 asully has joined #webmachinelearning
14:03:54 RRSAgent, draft minutes
14:03:55 I have made the request to generate https://www.w3.org/2024/04/04-webmachinelearning-minutes.html anssik
14:04:03 anssik: please join me in welcoming our new WG participants:
14:04:14 ... - Vasil Dedejski from Netcetera
14:04:21 ... - Juan Deng from Alibaba Group
14:04:35 ... - also Michael McCool from Intel joins in an official WG participant capacity
14:05:01 Topic: NPU support discussion
14:05:07 anssik: issue #623
14:05:08 https://github.com/webmachinelearning/webnn/issues/623 -> Issue 623 WebNN should support NPU and QDQ operations (by wchao1115) [v2] [opset] [feature request]
14:05:46 anssik: The goal of this discussion is to revisit the problem space for NPU support in WebNN, review the high-level proposal for the key elements of the design, and gather feedback and signals from the group regarding interest, scope, and priority.
14:06:05 ... an NPU device type and support for quantized models have been explored in the group before and have been awaiting implementation experience.
14:06:14 ... earlier discussion in issues #128 and #302
14:06:15 https://github.com/webmachinelearning/webnn/issues/128 -> Issue 128 WebNN should support int8 quantized models (by wchao1115) [v2] [opset] [feature request]
14:06:15 https://github.com/webmachinelearning/webnn/issues/302 -> Issue 302 API simplification: context types, context options, createContext() (by zolkis) [v2]
14:06:29 ... thanks to Chai for the proposal and Yajing for comments
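For orientation, a minimal sketch of context creation: the 'cpu' and 'gpu' device types and powerPreference are in the current spec, while the 'npu' value is the hypothetical addition proposed in issue #623, shown for illustration only.

    // In the spec today: deviceType is 'cpu' (default) or 'gpu'.
    const context = await navigator.ml.createContext({
      deviceType: 'gpu',
      powerPreference: 'low-power',
    });

    // Proposed in issue #623 (not in the spec): an NPU device type,
    // possibly paired with a fallback or priority mechanism.
    const npuContext = await navigator.ml.createContext({
      deviceType: 'npu', // hypothetical value
    });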
14:07:04 Dwayne: NPUs are probably familiar to most; they are not necessarily faster than GPUs, but they are more efficient
14:07:19 ... in the spec we have GPU and CPU device types and power preferences; data types are also in the spec
14:07:32 q+
14:07:42 ... we're missing an NPU device type and the bare minimum operations, quantization and dequantization
14:07:50 (when Dwayne is done)
14:08:07 ... linear 8-bit is typical; even more aggressive schemes exist, down to 1-bit, but we think we should start with the most common ones
14:08:33 ... the device type and the quantization ops are the key elements; moving forward we need to enumerate what the options would look like
14:09:07 ... bikeshedding areas: how to express this in the context options, CPU, GPU, NPU? a primary device type and a fallback device type? a priority order?
14:09:07 q?
14:09:29 q?
14:09:35 ack McCool
14:09:58 Michael: I wanted to point out that quantization is also important for CPU and GPU, due to memory bandwidth issues
14:10:13 ... the quantization representation should perhaps be separated from the target device
14:10:25 ... we're looking at impacts on caching, whether a model is quantized or not
14:10:26 q?
14:11:27 Phillis: I provided feedback in the comments; the current proposal is to add NPU and fallbacks, and it assumes NPU has a smaller set of ops while CPU and GPU have full coverage
14:11:48 ... based on implementation experience from the TFLite and CoreML backends, GPU op set coverage is also smaller compared to CPU
14:12:01 ... so if we want to signal fallback, we need to signal it for other device types too
14:12:19 ... CoreML is opaque in terms of which compute unit the workload executes on
14:12:52 ... sometimes the workload is executed on the CPU if the tensor is small enough, a blackbox-style design
14:12:54 q?
14:13:04 q?
14:13:35 q+
14:13:41 Dwayne: I know Mingming submitted a patch to Chromium for the NPU device type to experiment with that
14:13:48 ... we don't have a concept of a fallback yet
14:13:49 q?
14:13:52 ack zkis
14:14:18 Zoltan: Phillis, can we say CPU is always the fallback device? Should we separate fallback from the hints?
14:14:20 jsbell has joined #webmachinelearning
14:14:39 Phillis: I think hints are more accurate and match the underlying behavior better, a soft signal to the underlying backend that NPU is preferred
14:14:50 jsbell has joined #webmachinelearning
14:15:21 Zoltan: CPU as a fallback device? can we spec it such that it always works?
14:15:44 Dwayne: we could spec this as a preference, and give a signal back to the developer if it doesn't work
14:15:50 q?
14:16:31 Dwayne: CPU as a fallback has a challenge: it needs to do graph partitioning
14:16:47 Phillis: CoreML itself does graph partitioning
14:17:00 Dwayne: similarly for DML too in the future, it happens transparently
14:17:10 Phillis: no need to worry about subgraph partitioning then
14:17:11 q?
14:17:49 q+
14:17:56 ack Ningxin_Hu
14:18:08 Ningxin_Hu: I can speak for Mingming and the implementation
14:18:12 -> Chromium implementation https://chromium-review.googlesource.com/c/chromium/src/+/5330647
14:18:27 Ningxin_Hu: we've added the NPU device type to the DML path for testing
14:18:52 ... there is no fallback in the current implementation; we are collaborating with the DML team and the Intel driver team on a fallback experiment
14:19:27 ... by introducing the NPU device type we can run a small set of models, using that as a starting point
14:19:47 ... later with fallback to CPU and GPU
14:19:48 q?
14:19:58 q?
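A sketch of the quantize/dequantize ("QDQ") building blocks discussed above, given an MLContext `context`. The ops quantizeLinear and dequantizeLinear are hypothetical, with names borrowed from ONNX; only the input/constant plumbing is in the spec today.

    const builder = new MLGraphBuilder(context);
    const x = builder.input('x', { dataType: 'float32', dimensions: [1, 1024] });
    const scale = builder.constant(
      { dataType: 'float32', dimensions: [1] }, new Float32Array([0.02]));
    const zeroPoint = builder.constant(
      { dataType: 'int8', dimensions: [1] }, new Int8Array([0]));
    // Linear 8-bit: q = clamp(round(x / scale) + zeroPoint, -128, 127)
    const q = builder.quantizeLinear(x, scale, zeroPoint);    // hypothetical op
    // ... int8 compute would go here ...
    const y = builder.dequantizeLinear(q, scale, zeroPoint);  // hypothetical op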
14:20:15 Topic: Hybrid AI exploration update
14:20:32 anssik: I've asked the project team to give a brief update on the proof of concept informed by the group's feedback. Thanks everyone for your insightful feedback shared to date!
14:20:37 -> https://github.com/webmachinelearning/proposals/issues/5
14:20:38 https://github.com/webmachinelearning/proposals/issues/5 -> Issue 5 Hybrid AI Exploration (by grgustaf)
14:20:43 Michael: presenting a slide with a summary of feedback received from the group
14:21:09 ... Models/Use Cases:
14:21:17 ... 1) MMS – ASR
14:21:31 ... 2) SeamlessM4T - general speech tasks
14:21:42 ... 3) Others? LLMs? e.g. mistral-7b
14:21:49 ... Comments
14:22:06 ... Two forms of hybrid AI
14:22:13 ... - Server or Client
14:22:17 ... - Split models
14:22:42 ... Meaning of "hybrid"? Can also mean "heterogeneous hardware"
14:22:53 ... Pain points, priority
14:23:02 ... Generally saving space/download latency
14:23:15 ... 1. Sharing/reusing large models across sites
14:23:24 ... 2. Same resources at different URLs
14:23:33 ... 3. Same resources in different formats
14:23:54 ... 4. Want to expose "built-in" models
14:24:29 ... 5. Generalizing models
14:24:39 ... 6. Need a solution that can handle adapters
14:24:41 q?
14:25:22 q+
14:25:26 ack jsbell
14:25:39 jsbell: wanted to ask if you're engaged with privacy groups?
14:26:16 Michael: have thought about shared caches and privacy considerations, and ways to mitigate that
14:27:35 jsbell: wanted to make sure privacy in implementations is considered; if a small number of sites use a specific model, then that one bit of information is significant
14:28:34 anssik: privacy considerations are important; proposing those are documented along with the proposal
14:28:34 q?
14:29:06 Topic: Open issues and PRs
14:29:15 Subtopic: Debrief on PRs merged recently
14:29:27 anssik: This topic is for the editors and PR authors to debrief the group on substantive PRs that got merged in the last few weeks and answer questions from the group.
14:29:32 -> Recently merged PRs https://github.com/webmachinelearning/webnn/pulls?q=is%3Apr+is%3Amerged
14:29:36 ... PRs merged by issue type since last meeting:
14:29:42 ... conventions: #618 #621 #627
14:29:42 https://github.com/webmachinelearning/webnn/pull/618 -> MERGED Pull Request 618 Conventions: Add and apply a few more spec coding conventions (by inexorabletash) [conventions]
14:29:43 https://github.com/webmachinelearning/webnn/pull/627 -> MERGED Pull Request 627 Conventions: Use "rank" for variable names, when appropriate (by inexorabletash) [conventions]
14:29:43 https://github.com/webmachinelearning/webnn/pull/621 -> MERGED Pull Request 621 Conventions: Ensure all dict members have definitions (by inexorabletash)
14:29:48 ... bug fix: #616 #619 #620
14:29:48 https://github.com/webmachinelearning/webnn/pull/619 -> MERGED Pull Request 619 Bugfix: Drop "re-throw" in MLActivation creation steps (by inexorabletash)
14:29:48 https://github.com/webmachinelearning/webnn/pull/616 -> MERGED Pull Request 616 Bugfix: Unbalanced autolink brackets (by inexorabletash)
14:29:50 https://github.com/webmachinelearning/webnn/pull/620 -> MERGED Pull Request 620 Bug fix: Drop "... have been checked..." notes (by inexorabletash)
14:29:53 ... question: #617 #632
14:29:54 https://github.com/webmachinelearning/webnn/pull/632 -> MERGED Pull Request 632 add a note for empty input (by philloooo)
14:29:57 https://github.com/webmachinelearning/webnn/pull/617 -> MERGED Pull Request 617 Update NSNet2 reference (by anssiko)
14:30:22 anssik: anything the editors or PR authors feel is important to highlight for the broader group who may not follow day-to-day GH activity?
14:30:34 q+
14:30:39 ack jsbell
14:31:21 ningxin has joined #webmachinelearning
14:31:24 jsbell: not specific to any of these PRs, but FYI, I'm working on a local tool written in Node.js to enforce conventions; I will share it with the group when it's better baked; it runs querySelectorAll checks against the DOM and greps the source
14:31:41 Dwayne: in CI or local?
14:31:46 jsbell: to be determined
14:32:10 anssik: many thanks Josh for your work on this tool
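As a rough illustration of such a tool (hypothetical; not Josh's actual script), a Node.js checker could combine DOM queries over the generated spec with greps over the Bikeshed source; the file names index.html and index.bs are assumptions.

    // Hypothetical conventions-checker sketch; file names are assumptions.
    const { JSDOM } = require('jsdom');
    const fs = require('node:fs');

    const { document } = new JSDOM(fs.readFileSync('index.html', 'utf8')).window;
    // DOM check: every definition should carry an id so it can be linked.
    for (const dfn of document.querySelectorAll('dfn:not([id])')) {
      console.warn('dfn without id:', dfn.textContent);
    }

    // Source grep: flag tab characters, as an example convention.
    fs.readFileSync('index.bs', 'utf8').split('\n').forEach((line, i) => {
      if (line.includes('\t')) console.warn(`index.bs:${i + 1}: tab character`);
    });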
14:34:07 Subtopic: [bug] Need clarify scale factor for resample2d
14:34:23 anssik: issue #610 is about overflow handling convention preferences
14:34:25 https://github.com/webmachinelearning/webnn/issues/610 -> Issue 610 Need clarify scale factor for resample2d (by BruceDai) [bug]
14:34:28 ... Josh reports part 1 of this issue was addressed by https://github.com/webmachinelearning/webnn/commit/24edf7f775ccb5105b3503449c714e2b7af563e6
14:34:34 ... part 2 is not yet addressed in the spec IIUC:
14:34:38 ... "clarify the limitations for the product of scale factor and spatial dimension's size"
14:34:59 ... it looks like the validation steps of resample2d were fixed in the implementation
14:35:11 ... are the remaining spec changes clear? anything to discuss?
14:35:20 q?
14:36:06 jsbell: part 2 is about validation and the clamping that may be required
14:37:00 Dwayne: it would be a lot of noise if we added that to all ops; can we spec it centrally?
14:37:02 q?
14:37:22 +1 to handle overflow centrally
14:37:45 q?
14:37:47 jsbell has joined #webmachinelearning
14:38:06 q?
14:38:20 Subtopic: [bug] Synchronously validate input operands/activations
14:38:23 anssik: issue #572
14:38:24 https://github.com/webmachinelearning/webnn/issues/572 -> Issue 572 Synchronously validate input operands/activations (by inexorabletash) [bug] [question]
14:38:32 anssik: merged PRs #591 and #605 addressed parts of this issue
14:38:32 https://github.com/webmachinelearning/webnn/pull/591 -> MERGED Pull Request 591 Content: Define operand concept, simplify graph connection steps (by inexorabletash)
14:38:34 https://github.com/webmachinelearning/webnn/pull/605 -> MERGED Pull Request 605 Synchronously validate input operands/validations (by inexorabletash)
14:38:39 ... Josh identified the following as remaining work:
14:38:46 - Standard phrasing for "be an operator for ..."
14:38:46 - Introducing a "Validate arguments" section for each method
14:38:46 - Introducing a "Calculate output shape" section for each method (maybe "output descriptor" is better?)
14:39:03 jsbell: any feedback on those is welcome
14:39:21 ... the first is asking how we should phrase that; it's an easy PR for someone who wants to say "let's do it this way!"
14:39:47 ... others have come up in this issue, feedback wanted on those too; I don't want to add too many PRs that just add text
14:39:48 q?
14:40:15 Subtopic: [question] Allow no-op graphs?
14:40:20 anssik: issue #614
14:40:21 https://github.com/webmachinelearning/webnn/issues/614 -> Issue 614 Allow no-op graphs? (by inexorabletash) [question]
14:40:49 ... Josh reports that in PR #603 a step was added to build() to match the Chromium implementation, which errors out if an input operand or a constant operand is specified as an output.
14:40:50 https://github.com/webmachinelearning/webnn/pull/603 -> MERGED Pull Request 603 Content: Define build() steps more rigorously (by inexorabletash)
14:41:11 ... Dwayne notes this would rule out a no-op graph, but also notes that a caller can always insert a dummy identity node to satisfy this constraint
14:41:21 ... Ningxin notes a constant-only graph seems useful, especially for GPU or NPU devices
14:41:27 ... also a good discussion for "constant MLBuffer"
14:41:41 ... proposed by Bryan as "after builder.constant(mlBuffer), it becomes read-only (ex. no writeBuffer() or dispatch() allowed)."
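A minimal sketch of the dummy-identity workaround noted above, given an MLContext `context`: with the PR #603 step, build() rejects an output that is directly an input or constant operand, but routing it through identity() makes the graph valid.

    const builder = new MLGraphBuilder(context);
    const x = builder.input('x', { dataType: 'float32', dimensions: [2, 2] });
    // await builder.build({ y: x });     // rejected: output is an input operand
    const y = builder.identity(x);        // dummy identity node
    const graph = await builder.build({ y }); // an effectively no-op graph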
14:42:13 q+
14:42:52 Dwayne: I don't think this is a big deal
14:42:54 ack ningxin
14:43:26 ningxin: I'd like to make it clear that my comment is about the constant-only graph as the use case, a model with two decoder graphs
14:43:55 ... later, in discussion with Austin and Bryan, my use case was satisfied by "constant MLBuffer", so I wanted to clarify this
14:44:20 ... another comment: I want to understand if there's a use case for a no-op graph?
14:44:21 q?
14:44:58 Dwayne: chaining models together and getting flexibility along that chain, if there's a concept that just takes input and passes it along; but a dummy identity would still satisfy that
14:45:09 ningxin: WebNN is supposed to be a backend API, so I'm not sure if that is in scope
14:45:10 q?
14:45:27 Dwayne: no reservations about resolving this with an empty graph
14:45:29 q?
14:46:27 ningxin: because Bryan mentioned this was added to a TODO, we can track this as part of the MLBuffer proposal
14:46:30 q?
14:46:49 Subtopic: [question] Graph with no input
14:46:54 anssik: issue #615 and PR #632
14:46:54 https://github.com/webmachinelearning/webnn/pull/632 -> MERGED Pull Request 632 add a note for empty input (by philloooo)
14:46:55 https://github.com/webmachinelearning/webnn/issues/615 -> CLOSED Issue 615 Graph with no input (by philloooo) [question]
14:46:59 ... this one was fixed, thanks!
14:47:58 Phillis: I added a note that says WebNN does support that; if the backend does not support this, implementations can work around it.
14:48:03 q?
14:48:19 Subtopic: [question] Can an MLGraphBuilder be reused?
14:48:27 anssik: issue #567
14:48:28 https://github.com/webmachinelearning/webnn/issues/567 -> Issue 567 Can an MLGraphBuilder be reused? (by reillyeon) [question]
14:48:38 ... Reilly asks "Are there any known use cases for MLGraphBuilder reuse?"
14:48:43 ... Ningxin mentions one use case:
14:48:49 ... "Whisper, because WebNN doesn't support the If operator, there will be two WebNN sub-graphs being built. One with past Key Value (KV) cache ("with_past") and another one without past KV cache ("no_past"). The inference code will run the "no_past" sub-graph for the first iteration and run the "with_past" sub-graph for the following iterations. The two sub-graphs actually share some common weights. It would be useful if the same weights built by builder.constant() could be taken by the operators of the two sub-graphs."
14:48:59 jsbell has joined #webmachinelearning
14:49:03 q+
14:50:05 ningxin: this is related to the previous topic: a merged model in ONNX terms has two subgraphs with an If operator; because WebNN does not support the If op, in the backend the If op will fall back to Wasm, one branch with past KV cache and another without past KV cache
14:50:53 ... after the first iteration a value can be reused in the KV cache, and later on the cache is reused until the sequence length is reached
14:51:03 ... used in transformer models, including Whisper
14:51:46 ... for ONNX RT Web, weights are shared between subgraphs: we create a constant for the weights in the same GraphBuilder and reuse it, so in this use case we can reuse the MLGraphBuilder
14:52:26 ... later, if we talk about MLBuffer, if we can make MLBuffer hold the weights and persist them in device memory, that'd be even better; two graphs could use the constants with MLBuffer without any duplication of data upload or memory copy
14:52:37 ... it also avoids duplicate copies in device memory
14:52:50 ... I think the two things, constant and MLGraphBuilder reuse, are related
14:53:06 q?
14:53:11 ack jsbell
14:53:22 From Reilly Grant: I'm satisfied that there are good use cases for constructing multiple MLGraphs from an MLGraphBuilder, but we need example code and implementation experience before we can decide specifically how it will work. In particular, I'm still concerned by the constraints it puts on implementations if build() can be called an arbitrary number of times, and so I'd like the group to consider a multibuild() variant which allows compiling multiple graphs simultaneously, which is likely to be more efficient for implementations while hopefully giving the same power to developers.
14:53:53 q?
14:54:12 q?
14:54:15 +1 to explore more
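A sketch of the reuse pattern described above, given an MLContext `context`: one builder, a weight constant uploaded once, and two graphs that share it. Whether build() may be called repeatedly like this is exactly what issue #567 is deciding; shapes and names are made up for illustration.

    const builder = new MLGraphBuilder(context);
    const weights = builder.constant(
      { dataType: 'float32', dimensions: [512, 512] },
      new Float32Array(512 * 512)); // shared weights, uploaded once

    const x1 = builder.input('x_no_past', { dataType: 'float32', dimensions: [1, 512] });
    const noPast = await builder.build({ out: builder.matmul(x1, weights) });

    const x2 = builder.input('x_with_past', { dataType: 'float32', dimensions: [1, 512] });
    const withPast = await builder.build({ out: builder.matmul(x2, weights) });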
14:54:47 Subtopic: [question] Consider adopting new broadcasting rules
14:54:52 anssik: issue #590
14:54:52 https://github.com/webmachinelearning/webnn/issues/590 -> Issue 590 Consider adopting new broadcasting rules (by a-sully) [question]
14:54:57 ... discussed on our 21 March telcon
14:55:02 ... to recap, Austin sees three options:
14:55:14 ... Option 1: Adopt NumPy's broadcasting rules
14:55:07 ... Option 2: Adopt XLA's broadcasting rules
14:55:18 ... Option 3: Keep the status quo
14:55:23 ... Dwayne offered one more option:
14:55:30 ... Option 4: Dwayne's proposal
14:55:37 ... - keep unidirectional broadcasting for the rare cases (expand and GEMM)
14:55:41 ... - research more backends to potentially add the restriction that inputs must have the same rank
14:56:07 q?
14:56:17 anssik: I'm hearing we let this issue sit
14:56:30 q+
14:56:30 Subtopic: [question] Is "validate graph resources" backwards?
14:56:35 anssik: issue #602 and PR #622 (thanks Josh!)
14:56:35 https://github.com/webmachinelearning/webnn/pull/622 -> Pull Request 622 Bug fix: Make "validate graph resources" test reflexive (by inexorabletash)
14:56:35 https://github.com/webmachinelearning/webnn/issues/602 -> Issue 602 Is "validate graph resources" backwards? (by inexorabletash) [question]
14:57:03 jsbell: put up the PR for the reflexive test
14:57:26 ... came up with a new behaviour after the telcon, reflected in the PR
14:57:40 ... validation of input and output is asymmetric, for better developer ergonomics
14:58:11 Subtopic: [question] Need clarify the usage of axes=[0,1] for resample2d
14:58:17 anssik: issue #624
14:58:17 https://github.com/webmachinelearning/webnn/issues/624 -> Issue 624 Need clarify the usage of axes=[0,1] for resample2d (by BruceDai) [question] [operator specific]
14:58:22 ... this question, informed by implementation experience, has details in the issue
14:58:58 Dwayne: this came up in code review; resample2d supports a number of different axes
14:59:33 ... does anyone know the history of this design?
14:59:38 q?
14:59:44 q-
15:00:02 q?
15:00:21 Subtopic: [feature-request] Gaussian error linear unit (GELU) activation
15:00:25 anssik: issue #626
15:00:26 https://github.com/webmachinelearning/webnn/issues/626 -> Issue 626 Need Gelu operation (by mingmingtasd) [feature request] [operator specific]
15:00:31 ... a proposed new op passes our initial tests:
15:00:36 ... sample models: Whisper base, Stable Diffusion U-Net, Segment Anything decoder
15:00:43 ... cross-framework support: ONNX, TF, PyTorch
15:00:50 ... cross-platform implementability: CoreML, DirectML, OpenVINO
15:00:55 ... this seems like a reasonable addition, any comments?
15:00:59 RRSAgent, draft minutes
15:01:00 I have made the request to generate https://www.w3.org/2024/04/04-webmachinelearning-minutes.html anssik
15:01:07 ningxin has joined #webmachinelearning
15:01:07 q?
15:01:18 q+
15:01:24 RafaelCintron has joined #webmachinelearning
15:01:24 Seems reasonable
15:01:24 ack ningxin
15:01:40 ningxin: Mingming mentioned this gives a perf gain if supported
15:01:59 ... in the experimental implementation on NPU we observed a perf benefit from a native Gelu over the emulation path
15:02:06 McCool has joined #webmachinelearning
15:02:21 ... with this activation we can fuse with matmul for even better performance
15:02:22 q?
15:02:33 q?
15:02:53 ... PR #628 is out already
15:02:56 https://github.com/webmachinelearning/webnn/pull/628 -> Pull Request 628 Define Gelu operation (by mingmingtasd)
15:03:23 q?
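For reference, gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2))). A sketch of the proposed op next to the kind of emulation path mentioned above, built from ops already in the spec; gelu() itself is only proposed in PR #628, and `context` is an assumed MLContext.

    const builder = new MLGraphBuilder(context);
    const x = builder.input('x', { dataType: 'float32', dimensions: [1, 1024] });

    // Proposed (PR #628): const y = builder.gelu(x);

    // Emulation with existing ops, relying on broadcasting of scalar-like constants:
    const c = (v) => builder.constant(
      { dataType: 'float32', dimensions: [1] }, new Float32Array([v]));
    const yEmulated = builder.mul(
      builder.mul(x, c(0.5)),
      builder.add(c(1), builder.erf(builder.mul(x, c(Math.SQRT1_2)))));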
15:04:09 RRSAgent, draft minutes
15:04:10 I have made the request to generate https://www.w3.org/2024/04/04-webmachinelearning-minutes.html anssik
15:17:06 RRSAgent, draft minutes
15:17:07 I have made the request to generate https://www.w3.org/2024/04/04-webmachinelearning-minutes.html anssik
15:17:26 Regrets+ Dominique_Hazael-Massieux
15:17:57 RRSAgent, draft minutes
15:17:59 I have made the request to generate https://www.w3.org/2024/04/04-webmachinelearning-minutes.html anssik
15:28:25 RRSAgent, draft minutes
15:28:26 I have made the request to generate https://www.w3.org/2024/04/04-webmachinelearning-minutes.html anssik
17:25:23 Zakim has left #webmachinelearning