14:58:00 RRSAgent has joined #webmachinelearning
14:58:04 logging to https://www.w3.org/2023/02/16-webmachinelearning-irc
14:58:04 Meeting: WebML WG Teleconference – 16 February 2023
14:58:04 RRSAgent, make logs Public
14:58:05 please title this meeting ("meeting: ..."), anssik
14:58:33 Meeting: WebML WG Teleconference – 16 February 2023
14:58:33 Chair: Anssi
14:58:36 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2023-02-16-wg-agenda.md
14:58:40 Scribe: Anssi
14:58:44 scribeNick: anssik
14:58:50 scribe+ dom
14:59:06 Present+ Anssi_Kostiainen
14:59:37 ghurlbot, this is webmachinelearning/webnn
14:59:37 anssik, OK.
14:59:37 RRSAgent, draft minutes
14:59:38 I have made the request to generate https://www.w3.org/2023/02/16-webmachinelearning-minutes.html anssik
15:02:10 ningxin_hu has joined #webmachinelearning
15:02:45 zkis has joined #webmachinelearning
15:05:24 Present+ Ningxin_Hu
15:05:29 Present+ Zoltan_Kis
15:11:15 Present+ Chai_Chaoweeraprasit
15:12:04 chai has joined #webmachinelearning
15:13:15 Topic: WebNN API open PRs and issues
15:13:27 Subtopic: Simplify MLContext creation
15:13:40 anssik: PR #322 was discussed extensively on our 2 February 2023 call
15:13:40 https://github.com/webmachinelearning/webnn/issues/322 -> Pull Request 322 Simplify MLContext creation (wchao1115)
15:13:46 -> [minutes] WebML WG Teleconference – 2 February 2023 https://www.w3.org/2023/02/02-webmachinelearning-minutes.html
15:13:54 anssik: following the call we reached an agreement that this PR should be highlighted
15:13:58 ... in the upcoming CR "status of this document" section
15:14:14 ... IOW, we agreed not to block the initial CR with this PR
15:14:21 ... this SOTD update is in PR #340, approved and ready to merge when we commence with the CR publication
15:14:21 https://github.com/webmachinelearning/webnn/issues/340 -> Pull Request 340 Add Status of this document note for Candidate Recommendation (anssiko)
15:14:34 ... I'd like us to use this call to discuss the other topics we deferred from our last call
15:14:53 Subtopic: Rework the sync and async algorithms
15:15:05 anssik: issue #316 and PR #329
15:15:05 https://github.com/webmachinelearning/webnn/issues/316 -> Issue 316 Review sync vs async compute differences (zolkis)
15:15:05 https://github.com/webmachinelearning/webnn/issues/329 -> Pull Request 329 Rework the sync async algorithms based on #323 (zolkis)
15:15:16 anssik: issue description:
15:15:37 ... - Inputs and their validation are the same in both, so they can be factored out
15:15:46 ... - The async steps currently are not really asynchronous.
15:16:31 ... - There is only one difference between the sync and async steps (excluding promise-related): the async version compares the byte length of |outputTensor| with the length of the output descriptor corresponding to |key|, whereas the sync version compares the same with the byte length of |value|.
15:16:35 anssik: the proposed PR changes:
15:16:42 ... - Factor out graph input/output validation
15:16:49 ... - Factor out graph execution algorithm
15:16:58 ... - Use them in the sync and async algorithm steps
15:17:46 zkis: good summary, adapted per Ningxin's changes, would like to get re-review from Ningxin
15:18:11 ... after the changes it was more complicated to specify; even if the diff looks complex, it is not so different from the previous version
15:18:22 ... we cannot review it on this call, it needs to be read slowly
15:19:00 I'll take a look at this PR, thanks Zoltan!
15:19:14 ... no hurry with this PR actually, it can live in parallel with other higher-priority PRs
15:20:02 ... this is pretty similar to what Ningxin did, validation steps are just not repeated; the minimal specification principle is applied in this PR
15:20:41 ... asserts in the text have no implication on implementations; consider them notes in the spec per Infra spec conventions
15:21:04 anssik: do we want to get this into the initial CR?
15:21:10 zkis: yes, I'd prefer that
15:21:48 https://infra.spec.whatwg.org/#assertions
15:22:00 Subtopic: Add internal slots to MLOperand and MLActivation
15:22:09 anssik: issue #336 and PR #337
15:22:09 https://github.com/webmachinelearning/webnn/issues/336 -> Issue 336 Add internal slots to MLOperand, MLActivation and basic algorithms (zolkis)
15:22:09 https://github.com/webmachinelearning/webnn/issues/337 -> Pull Request 337 Add internal slots to MLOperand and MLActivation (zolkis)
15:22:29 zkis: this is WIP, we need to decide a few things
15:23:05 ... in algorithms that are polymorphic we need to internally construct the operand, so we need internal or explicit constructor steps in all algorithms; e.g. clamp() is an example, it has its own issue and PR
15:23:40 q+
15:23:40 ... missing references mentioned
15:23:40 ... all PRs depend on the MLOperand and MLActivation PR
15:23:40 q+
15:23:40 ... do we need a constructor for MLOperand, or is it only created by a builder?
15:24:07 ... MLActivation: there is no explanation of how to use this construct, or what it is exactly
15:24:12 q?
15:24:15 ack ningxin_hu
15:24:25 ack chai
15:25:00 chai: MLActivation and MLOperand are trivial; these interfaces can be trivially constructed; MLActivation is a placeholder, we don't want to define enums for all of them
15:25:11 ... we decided to have a separate interface for that
15:25:33 ... no separate interface wanted, placeholder, implementation can keep a name, it is to be able to defer construction
15:25:46 ... the caller will do new activations; similar feedback for MLOperand
15:26:25 ningxin_hu: for MLOperand and MLActivation we need an internal concept, not a public interface, for a node inside the computational graph
15:26:42 ... the API returns an operand; in our programming model this is data flowing through the graph
15:27:13 ... nodes consume operands; in my implementation I repurposed MLOperand for a node inside a graph, and an activation can be fused into it
15:27:28 ... MLActivation is connected through MLOperand
15:27:49 ... with Chai's change we introduce MLActivation, used for fused activation from the user's POV
15:28:16 ... for the spec, we need a node or operator concept to describe the algorithm steps; e.g. in clamp() there is an input operand and an output operand, how are these connected together?
15:29:01 zkis: if we can define those meanings at the spec level it'd be nice
15:29:01 ... now there is only one internal slot, which is its own name
15:29:01 ... should we have input and output internal slots?
15:29:01 ningxin_hu: no, we don't need that I think
15:29:29 ... the underlying implementation connects MLOperands; let's see how you write the algorithm steps e.g. in clamp(), and from those we can start to see what internal slots are required to satisfy your needs for the algorithm steps
15:29:38 zkis: we'll see in that PR how to do this properly
15:30:30 Subtopic: The clamp() algorithm
15:30:49 anssik: issue #347 and PR #348
15:30:50 https://github.com/webmachinelearning/webnn/issues/348 -> Pull Request 348 [WiP] Add the clamp() algorithm (zolkis)
15:30:50 https://github.com/webmachinelearning/webnn/issues/347 -> Issue 347 Add the clamp() algorithm (zolkis)
15:31:15 https://github.com/webmachinelearning/webnn/pull/348/files
15:31:25 zkis: please take a look at this PR and the steps there
15:31:43 q+
15:31:46 ... I tried two different polymorphic versions depending on the first argument
15:31:48 ack ningxin_hu
15:32:21 ningxin_hu: my question: you mention it is a polymorphic version, and you try to handle both in this one set of algorithm steps
15:33:40 ... my understanding is that we need to have two algorithms: 1) clamp takes an input operand without an activation, 2) clamp returns an activation
15:34:18 ... probably you need an internal slot, or say "underlying implementation", that realizes the min and max values
15:34:29 ... there are two different algorithms to write for clamp
15:34:33 q?
15:34:47 ... in the Chromium implementation we have two implementations for this
15:35:35 +1
15:35:38 q+
15:36:14 zkis: in JS we need to have a single set of algorithmic steps
15:36:42 ... I can change these steps, will do the same thing as with batchNorm
15:36:54 ... will clarify that the implementation owns the operation
15:37:08 ... I'm not aware how I can make two algorithms here; you can do that in C++ but not in JS
15:37:29 ... if you have examples how this is done in other specs
15:37:35 s/JS/JS bindings
15:37:51 ... usually we switch on an object type passed as an argument
15:37:59 ack chai
15:38:24 chai: process question, do we need to add all these implementation notes on all the ops?
15:40:00 ... I hear we've discussed clamp(), which is a trivial op; we have more complex ops, and if we need to explain implementation notes for all of these it takes forever
15:40:17 zkis: only steps that you can see in this PR: param handling, return values, what we request the platform to do
15:40:30 ... conv is more complex
15:40:47 chai: imagine convolution and gemm and friends, very complex to define at this level
15:41:06 zkis: input and output handling need to be clarified, how do we want the implementation to be called on this
15:41:12 q+
15:41:14 ... a lot of libraries could do the underlying work
15:41:22 chai: I understand validation is important
15:41:35 ... the convolution input validation layer will be much more complex
15:41:46 ... need to consider alignments and all that
15:41:57 ... I'd need to show you the code we have in DML to explain this
15:42:23 ... it is very tricky; in practice implementations of these ops are not going to do all this, they defer to the platform APIs
15:42:49 ... e.g. CoreML may fail with improper arguments; it is unlikely the browser implementation will do all this itself
15:43:22 zkis: we'll still need to add a lot of formal text
15:44:36 ... Domenic
15:44:54 ... mentioned we should delineate normative from informative text clearly
15:45:41 ... MVP: exceptions, success path, handling input/output
15:47:29 ... we could also clarify what we mean by axis in this spec, do we use the TF definition?
15:47:46 chai: for convolution etc. we try to define the overlap between them, the popular ones
15:47:57 ... these rules for how the inputs need to line up are complicated
15:48:20 ... we don't just go with TF; a lot of how different frameworks do it is copy-pasted from others that came before them
15:50:01 zkis: I try to lift the boilerplate work from you editors and try to do it in a minimal way
15:50:12 ... I think the dependency PRs should be merged soon
15:50:49 referring to the add internal slots to MLOperand and MLActivation PR #337
15:50:49 https://github.com/webmachinelearning/webnn/issues/337 -> Pull Request 337 Add internal slots to MLOperand and MLActivation (zolkis)
15:51:29 q+
15:51:33 ack ningxin_hu
15:51:51 ningxin_hu: question: whether we should put shape inference steps in the spec
15:52:29 ... for clamp(), when you make an MLOperand and return it to the user code, you define the MLOperand with an internal slot; how to set the dimensions for the output operand would require shape inference
15:52:56 ... there is some output shape calculation formula written by Chai
15:53:04 ... do we need to translate that to algorithmic steps?
15:53:21 zkis: for clamp() I use a trivial constructor for the operand
15:53:29 ... we can factor out the shape formula
15:53:43 ningxin_hu: in the MLOperand PR there is no internal slot for that
15:53:58 zkis: I came upon the need in the clamp() PR; will add that, thanks!
15:54:32 q?
15:54:52 Subtopic: Improve batchNorm
15:55:00 anssik: issue #334 and PR #339
15:55:00 https://github.com/webmachinelearning/webnn/issues/339 -> Pull Request 339 [WiP] Fix #334: Improve the batch norm algorithm (zolkis)
15:55:00 https://github.com/webmachinelearning/webnn/issues/334 -> Issue 334 Improve batchNorm (zolkis)
15:55:06 anssik: issue description:
15:55:30 ... clarify the batchNorm description, in particular clarify how "axis" is used
15:55:34 ... validate the inputs (dimensions etc.)
15:55:38 ... add an algorithm which "outsources" running the actual op, but informatively describes what's expected
15:55:47 anssik: the proposed PR is WIP but welcomes review for the parts defined
15:55:51 ... this depends on PR #337
15:55:52 https://github.com/webmachinelearning/webnn/issues/337 -> Pull Request 337 Add internal slots to MLOperand and MLActivation (zolkis)
15:55:52 https://github.com/webmachinelearning/webnn/pull/339/files
15:55:55 q+
15:56:01 ack ningxin_hu
15:56:17 ningxin_hu: I recall we discussed this before, want to hear from Chai
15:56:40 ... in this batchNorm PR we say "issue ... for the underlying platform"
15:56:53 ... I'd like to get a clarification: should we do that, or skip it, in the builder methods for any ops
15:57:12 Chai: the builder should be cheap; what goes into the builder should be only input validation
15:57:32 ... I had some conversation over at Apple to see how they feel about implementing some of this
15:57:52 ... aligned with our beliefs that when implementers look at the spec they see the compilation step as the big step
15:58:01 ... compilation has to have all the information
15:58:12 ... the build method is the time when they process everything
15:58:28 zkis: do we need to split what we have in the PR right now, to just record the structure?
15:58:37 ... no exec-time steps for batchNorm and others?
15:58:58 ... exec steps just name ops as text such as an enum, no description of how to use axes and such
15:59:28 Chai: no different from other ops, just construct the graph; I don't think you can provide impl guidance for the build method, it is different
15:59:37 ... how e.g. Apple might implement the method could be different
15:59:48 zkis: what are the interop guarantees we can make here for the API users?
16:00:02 ... if everything is deferred to the impl, interop is challenging
16:00:25 Chai: a Web API is an interface that provides guidance to implementations, but cannot guarantee that is actually how it is implemented in detail
16:00:37 ... documenting input validation at graph validation time is something we can do
16:00:42 ... even that is not going to be thorough
16:01:11 zkis: please help me get the steps right for this op and I'll make the others follow the blueprint we formulate for this op
16:01:24 https://github.com/webmachinelearning/webnn/pull/339/files
16:02:03 zkis: I got the feedback; the essence of it is that we need to be very light
16:02:13 Chai: a blueprint for these ops with some input validation, that'd be helpful
16:02:29 ... that can be used when we explain what happens when one builds a graph
16:02:42 ... you cannot be fully confident until you compile the graph
16:02:54 ... I have no idea how to write these steps for the build method
16:03:20 zkis: I'll try to figure it out, I got your guidance
16:03:46 Chai: I don't think you can accurately describe the implementation steps
16:04:17 ... for build you have to pick one way to do it, but that one way may not be correct for other ways of doing it; it is an implementation detail
16:04:59 ... if you can explain compile and compute in a very simple way
16:04:59 zkis: we can explain where impl-specific optimization can take place
16:05:22 ningxin_hu: another piece of feedback: for input validation, e.g. batchNorm, why not translate the declarative text we currently have in place?
16:05:56 ... e.g. the second input param is mean; you can check if it is 1D, otherwise throw an exception
16:06:35 ... if we cannot validate like this, we push validation to the build method for graph-build-time validation by the implementation
16:06:55 ... the batchNorm method could do input validation translating the current declarative text
16:08:09 zkis: we need to be clear on what happens at the build phase and at the compute phase
16:08:49 ningxin_hu: at the build phase, we need to make sure that the graph constructed with the build methods, its architecture/topology and the attributes of its nodes, is validated before feeding it into the native implementation
16:09:00 ... other things are done by the native implementations
16:09:16 zkis: feel free to comment anything on these PRs
16:09:27 ... if I can make you both happy we're probably on the right track
16:15:21 s/I had some conversation over at Apple/I had some conversation with another browser vendor
16:18:14 RRSAgent, draft minutes
16:18:15 I have made the request to generate https://www.w3.org/2023/02/16-webmachinelearning-minutes.html anssik
17:59:35 Zakim has left #webmachinelearning