13:58:11 RRSAgent has joined #webmachinelearning
13:58:11 logging to https://www.w3.org/2020/09/17-webmachinelearning-irc
13:58:16 Zakim has joined #webmachinelearning
13:58:24 RRSAgent, make logs public
13:58:27 zkis has joined #webmachinelearning
13:58:34 Meeting: WebML CG Teleconference – 17 September 2020
13:58:41 Chair: Anssi
13:58:46 Agenda: https://github.com/webmachinelearning/meetings/blob/master/telcons/2020-09-17-agenda.md
13:58:51 Scribe: Anssi
13:58:56 scribeNick: anssik
13:59:04 Present+ Anssi_Kostiainen
13:59:08 RRSAgent, draft minutes v2
13:59:08 I have made the request to generate https://www.w3.org/2020/09/17-webmachinelearning-minutes.html anssik
13:59:38 ningxin_hu has joined #webmachinelearning
13:59:54 Present+ Ningxin_Hu
14:00:43 RRSAgent, draft minutes v2
14:00:43 I have made the request to generate https://www.w3.org/2020/09/17-webmachinelearning-minutes.html anssik
14:00:45 baul_eun has joined #webmachinelearning
14:01:42 Present+ Ganesan_Ramalingam
14:01:50 Present+ Baul_Eun
14:02:00 rama has joined #webmachinelearning
14:02:22 Present+ Paul_McDaniel
14:02:55 Present+ Rafael_Cintron
14:05:02 Present+ Chai_Chaoweeraprasit
14:05:30 TOPIC: Model Execution API
14:05:58 anssik: Discuss proposed improvements to the existing execution API
14:06:04 -> https://github.com/webmachinelearning/webnn/issues/87 Model Execution API #87
14:06:19 Chai has joined #webmachinelearning
14:06:30 present+ Zoltan_Kis
14:06:59 ningxin_hu: the current discussion is around Ping's first comment
14:07:25 ... today's execution API requires the user to provide an output buffer at execution time, so they need to know its shape and allocate a buffer of the right size
14:07:38 ... Ping identified that as an ergonomic issue
14:07:58 ... also, in some models the shape is dynamic and not known beforehand, as identified by Rama
14:08:12 ... we need to address these two issues in the current execution API
14:08:36 ...
also Chai proposed how to simplify the execution interface, under discussion currently
14:08:41 ... Chai, other perspectives?
14:09:10 Chai: this is a long thread, but I'm glad we had this discussion
14:09:35 ... to add to what Ningxin described, the original ask is very specific, but later in the discussion we touch on other related topics
14:09:46 ... this is becoming an API discussion, all interlinked
14:10:05 ... supporting dynamic input shapes and output buffers is reasonable, we should support that
14:10:16 ... we're almost in agreement with respect to these points
14:10:40 ... if we look at the last replies in this issue, we can conclude we're on the same page
14:11:07 ... related to the topic of how to simplify, I think Kenneth has raised many good points around why we need the compilation step
14:11:26 ... raises good discussion points; it is good for perf to have a compilation step, the app has more control
14:11:41 ... but we should be able to collapse compilation and execution into one
14:11:59 ... another related topic is whether we want to support eager execution in the future
14:12:14 ... Ningxin told me we had that discussion in this group at one point; I wasn't in the group at that time
14:12:21 ... if we want to do that, it should be natural
14:12:34 ... to support eager we shouldn't need to change everything
14:13:00 ... simplifying compilation and execution will make the API more amenable to eager
14:14:14 anssik: can we split other issues out of this one? e.g. for eager we have a past issue
14:14:32 Chai: good to have all the discussion in context in this issue
14:14:48 ningxin_hu: not sure about quantized support?
14:15:24 Chai: we have one issue for that, let's lean on it; there isn't specifically an issue for adding quantization support to the API
14:15:35 ... can we open a new issue or piggyback on the existing one?
14:15:57 ningxin_hu: today's spec has some kinds of quantization support, I suggest we comment on that issue
14:16:09 ...
see if we remove quant from Operand
14:16:13 Chai: fine either way
14:16:46 ningxin_hu: float32 and scalar are represented by Operand, which might be confusing
14:16:53 Chai: that'd be a separate issue
14:17:40 Present+ Ping_Yu
14:18:52 Ping: I'll get back to issue #87 and review the feedback
14:19:09 ping_yu has joined #webmachinelearning
14:19:18 Rama: I just looked at #87; it seems execution of a subgraph has not been discussed yet?
14:19:36 q+
14:20:08 anssik: has subgraph execution been discussed yet?
14:20:35 Chai: I need more information to understand this; in the new API compilation is immutable
14:20:51 ... if you want to compile a subgraph, it creates a separate compilation
14:21:14 ... compiling only part of the graph should already be solved, as a byproduct of where we arrive now
14:21:16 Here are my comments regarding subgraph execution in the polyfill PR review: https://github.com/webmachinelearning/webnn-polyfill/pull/1#issuecomment-689939624
14:21:33 q?
14:21:37 ack Chai
14:21:45 https://github.com/webmachinelearning/webnn/issues/87
14:21:55 RafaelCintron has joined #webmachinelearning
14:22:48 Ping: the current compilation is static, it cannot be changed; my question is what if people want to execute a subgraph, extract the feature vector, execute somewhere in the middle before your softmax
14:23:08 ... either the user has to execute the whole graph, or you allow people to execute the subgraph
14:23:15 ... I think it is normal to have this situation
14:23:28 ... or sometimes, you repeatedly feed a layer
14:23:44 q+
14:23:48 ... how can the API handle this type of use case?
14:23:55 Chai: thanks, this is clearer now
14:24:04 ... can you explain the use case a bit more?
14:24:12 ... are you thinking of transfer learning?
14:24:45 Ping: let's say MobileNet is not good for classification as is, but people use transfer learning
14:24:59 ... you need to be able to execute toward a node that's not the output of the model
14:25:03 ... is that clear or not?
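[To make the subgraph use case concrete: a minimal illustrative sketch in JavaScript of compiling only the part of a graph needed to reach an intermediate node. This is a toy graph representation, not the WebNN API; all names such as `compileSubgraph` are hypothetical.]

```javascript
// Toy graph: nodes are { name, op, inputs }. compileSubgraph()
// keeps only the nodes reachable from the requested output,
// mirroring "execute toward a node that's not the output".
function compileSubgraph(nodes, outputName) {
  const byName = new Map(nodes.map((n) => [n.name, n]));
  const needed = new Set();
  (function visit(name) {
    if (needed.has(name)) return;
    needed.add(name);
    const node = byName.get(name);
    if (node) node.inputs.forEach(visit);
  })(outputName);
  // Unneeded nodes (e.g. the softmax head) are pruned away.
  return nodes.filter((n) => needed.has(n.name));
}

// A MobileNet-like chain: a feature extractor feeding a classifier.
const model = [
  { name: 'input',    op: 'input',   inputs: [] },
  { name: 'features', op: 'conv',    inputs: ['input'] },
  { name: 'logits',   op: 'matmul',  inputs: ['features'] },
  { name: 'probs',    op: 'softmax', inputs: ['logits'] },
];

// Executing only up to the feature vector:
const sub = compileSubgraph(model, 'features');
console.log(sub.map((n) => n.name)); // [ 'input', 'features' ]
```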
14:25:08 Chai: I understand this now
14:25:25 ningxin_hu: actually, you raised this issue in the polyfill PR review, I had some comments there
14:25:25 https://github.com/webmachinelearning/webnn-polyfill/pull/1#issuecomment-689939624
14:25:44 ... to my understanding you can create a subgraph directly
14:26:07 ... e.g. in the LeNet example, you can create the graph before the matmul layers
14:26:19 ... that's the feature extractor you can create, and reuse the weights there
14:26:29 ... perhaps that satisfies the requirement?
14:26:36 Ping: that could be the resolution?
14:27:05 ... many users of the models do not have the ability to create another model; they just take a pre-trained model and use it as is, fine-tuning its feature vector
14:27:24 ... this scenario should be solved by the execution API itself, not by creating a new model
14:27:47 ... if we can support subgraph execution, there are other cases where people want to execute just one layer
14:28:02 ... fine-tuning is a widely used scenario
14:28:16 Chai: I think this is the same request as eager execution
14:28:42 ... the requirement is very similar to making the API support eager mode
14:28:53 Ping: to me they are different
14:29:20 ... in eager I'm executing much faster, I'm not dynamically creating a graph; I have a pre-trained model, I want to compile it to make it faster
14:29:32 ... that's part of the original model
14:30:01 ningxin_hu: my understanding, as illustrated in the example, is that the developer has the flexibility to create the graph as they want
14:30:30 ... the whole topology is available; it is up to the developer to create any model and compile and execute it
14:30:49 ... do you want to pick some intermediate nodes to get their outputs?
14:31:10 Ping: like you described, but a more flexible way to define input and output nodes
14:31:35 ... WebNN does not necessarily need that, since we are use case-driven in the API design
14:32:19 ...
as a JS dev, I'd want to execute a part of a pre-trained model
14:32:55 Chai: you want the ability to execute only part of the pre-trained model? If so, I think the latest changes discussed in this issue address this
14:33:56 Rama: the interface is sufficient, but can you specify intermediate values as inputs and outputs?
14:34:14 Chai: at some point we discussed an optional output argument on the execute method
14:34:35 Rama: the API signature does not change, but can we assume the outputs can be intermediate? that impacts implementations
14:35:02 Rama: the underlying implementation must do something smart if it needs to stop somewhere in the middle
14:35:20 ... e.g. remove unnecessary nodes before execution
14:35:36 Ping: one question regarding output shape
14:36:31 ... now compilation is done prior to execution; without knowing the subgraph, how do we satisfy the new execution plan(?)
14:36:57 ningxin_hu: the compute method can return the result with an output dictionary that has dimensions and buffer; you can specify the input dimensions and shape
14:37:18 afk
14:37:21 Ping: is it true, every time I ask you to compile
14:37:22 q?
14:37:38 ack ningxin_hu
14:38:17 ningxin_hu: in today's spec, an Operand can have a negative value in one dim to indicate it is not specified
14:38:44 ... when you compute, you specify the dimensions of the input in a concrete shape, and compute will infer the output shape and return it to you
14:38:59 Ping: how we usually handle that is that compilation is part of the execution
14:39:04 ... the compilation is cached
14:39:46 ... also the shape does not need to be set beforehand
14:40:21 ... the main concern was that there are a lot of pre-steps needed prior to execution that may not be known; you also need to find out what the output shape is
14:40:35 ... because the shape is dynamic and it's tedious for the user to do that
14:41:06 ningxin_hu: exactly, you're describing today's execution API, and we came up with a solution to that issue
14:41:15 ...
no need to know the shape of the output beforehand
14:42:03 TOPIC: Packing operations for gemm / matmul
14:42:16 anssik: Discuss optional packing ops and related optimization opportunities
14:42:22 -> https://github.com/webmachinelearning/webnn/issues/86 Packing operations for gemm / matmul #86
14:43:31 Rama: the use of constant operands to operations like GEMM and matmul presents an opportunity to transform the layout as an optimization step
14:43:48 ... the question was, should this be explicitly exposed through the API?
14:43:59 back now
14:43:59 ... my position is that it's better left as an implementation detail
14:44:16 +1
14:44:42 Chai: I agree with Rama
14:44:58 ... most of the issues re packing have been addressed
14:45:08 ... the process of packing is very hardware specific
14:46:22 Chai: I'll respond on the issue to comment on the group's position
14:46:41 TOPIC: Fingerprinting
14:46:54 anssik: Discuss possible fingerprinting vectors and mitigations
14:46:59 -> https://github.com/webmachinelearning/webnn/issues/85 Fingerprinting via matmul #85
14:47:29 "an efficient matmul implementation can be fingerprinted to determine hardware capabilities."
14:48:35 q+
14:49:02 ack ningxin_hu
14:49:29 ningxin_hu: some Intel hardware was mentioned, so I can follow up from that perspective
14:49:53 ... another comment: Kenneth mentions 8-bit multiplication, which is related to our quant design
14:50:52 ... related to our quantization operator design, as discussed for packing, we can hide this in the implementation
14:51:02 ... let me follow up from the Intel hardware perspective
14:51:23 anssik: any other comments on the fingerprinting issue?
14:51:27 [none heard]
14:51:32 TOPIC: WebNN polyfill and samples
14:51:46 anssik: Continue discussing review feedback and suggestions for the foundational implementation and LeNet sample
14:51:51 -> https://github.com/webmachinelearning/webnn-polyfill/pull/1 Add the foundation implementation #1
14:51:56 -> https://github.com/webmachinelearning/webnn-samples/pull/1 Add LeNet example #1
14:52:25 ningxin_hu: addressed the comments raised by Ping for the WebNN polyfill
14:52:32 ... also created separate issues for a couple of them
14:53:52 ... Node.js support was a topic in the workshop; last week I enabled the polyfill for Node.js, running Mocha tests
14:55:15 q?
14:55:38 ... updated the LeNet example, added a table of the LeNet topology
14:56:42 TOPIC: TAG review
14:56:55 -> https://github.com/webmachinelearning/webnn/issues/89 TAG review #89
14:57:59 https://github.com/webmachinelearning/webnn/blob/master/explainer.md
14:59:05 anssik: TAG review would depend on a more complete explainer https://github.com/webmachinelearning/webnn/blob/master/explainer.md
15:00:28 Chai: the explainer would likely need some code snippets to explain the API, but given we're changing the API a little, it's better to land those API changes first and then update the explainer
15:00:32 q+
15:00:52 ack ningxin_hu
15:01:05 anssik: Sangwhan is a good person to review our explainer PRs
15:01:22 ningxin_hu: proposal to not block the polyfill and example PRs on spec changes
15:01:36 ... then revise them based on the new API design
15:02:33 that works
15:02:40 anssik: any concerns with that proposal from Ningxin?
15:02:46 i can help with the explainer
15:02:48 [no concerns]
15:03:29 PROPOSED RESOLUTION: Land WebNN polyfill and samples when existing review comments have been addressed, do not block on in-flight spec PRs and API design discussion
15:03:43 PROPOSED RESOLUTION: Land WebNN polyfill and samples PRs when existing review comments have been addressed, do not block on in-flight spec PRs and API design discussion
15:04:10 RESOLUTION: Land WebNN polyfill and samples PRs when existing review comments have been addressed, do not block on in-flight spec PRs and API design discussion
15:04:59 TOPIC: Adjourn
15:05:04 RRSAgent, draft minutes v2
15:05:04 I have made the request to generate https://www.w3.org/2020/09/17-webmachinelearning-minutes.html anssik
17:27:36 Zakim has left #webmachinelearning
17:38:38 zkis has joined #webmachinelearning
19:01:11 Alan has joined #webmachinelearning
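[A minimal sketch of the output-shape inference discussed under the Model Execution API topic: a graph declares a dynamic dimension (negative value, per the current draft spec), and a compute call infers the concrete output shape and returns it with the result, so the caller need not preallocate a buffer. Illustrative JavaScript only, not the WebNN API; `inferMatmulShape`, `resolveShape`, and `compute` are hypothetical names.]

```javascript
// matmul output shape: [m, k] x [k, n] -> [m, n]
function inferMatmulShape(aShape, bShape) {
  if (aShape[1] !== bShape[0]) {
    throw new Error(`incompatible shapes: ${aShape} x ${bShape}`);
  }
  return [aShape[0], bShape[1]];
}

// A declared shape may use -1 for a dimension resolved at compute time.
function resolveShape(declaredShape, actualShape) {
  return declaredShape.map((d, i) => (d === -1 ? actualShape[i] : d));
}

function compute(declaredInputShape, actualInputShape, weightShape) {
  const inputShape = resolveShape(declaredInputShape, actualInputShape);
  const dimensions = inferMatmulShape(inputShape, weightShape);
  // The result carries its own inferred dimensions: no preallocation.
  const buffer = new Float32Array(dimensions[0] * dimensions[1]);
  return { dimensions, buffer };
}

// Batch size unknown (-1) at graph-build time, resolved per call:
const out = compute([-1, 4], [3, 4], [4, 10]);
console.log(out.dimensions); // [ 3, 10 ]
```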