14:53:49 RRSAgent has joined #webmachinelearning
14:53:49 logging to https://www.w3.org/2022/01/27-webmachinelearning-irc
14:53:52 RRSAgent, make logs Public
14:53:52 please title this meeting ("meeting: ..."), anssik
14:54:08 Meeting: WebML WG Teleconference – 27 January 2022
14:54:13 Chair: Anssi
14:54:17 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2022-01-27-wg-agenda.md
14:54:22 Scribe: Anssi
14:54:27 scribeNick: anssik
14:54:34 Present+ Anssi_Kostiainen
14:54:39 RRSAgent, draft minutes
14:54:39 I have made the request to generate https://www.w3.org/2022/01/27-webmachinelearning-minutes.html anssik
15:00:02 Present+ Ganesan_Ramalingam
15:00:38 ningxin_hu has joined #webmachinelearning
15:00:40 Geun-Hyung has joined #webmachinelearning
15:00:43 Present+ Bruce
15:00:48 zkis has joined #webmachinelearning
15:00:57 Present+ Ningxin_Hu
15:01:04 rama has joined #webmachinelearning
15:01:11 Present+ Geunhyung_Kim
15:01:37 Present+
15:01:39 Present+ Dominique_Hazael-Massieux
15:01:47 Bruce has joined #webmachinelearning
15:02:28 Present+ James_Fletcher
15:02:35 Present+
15:03:11 Present+ Zoltan_Kis
15:03:33 Present+ Rafael_Cintron
15:03:55 RafaelCintron has joined #webmachinelearning
15:03:59 Topic: Security review
15:04:13 bbcjamesfletcher has joined #webmachinelearning
15:04:27 anssi: received feedback from the Chrome security team
15:04:43 ... they help us build our wide review of the WebNN spec on our path to CR
15:04:49 Subtopic: General Security Questions - 1. new scripting language
15:04:52 chai has joined #webmachinelearning
15:05:04 -> https://github.com/webmachinelearning/webnn/issues/241 General Security Questions #241
15:05:29 -> https://github.com/webmachinelearning/webnn/blob/main/security-privacy.md WebNN Security and Privacy Self-Review Questionnaire responses
15:05:37 -> https://www.w3.org/TR/security-privacy-questionnaire/#string-to-script 2.9. Do features in this specification enable new script execution/loading mechanisms?
15:05:50 Anssi: this touches on question 2.9 of the security questionnaire wrt new script execution
15:05:54 anssik: questionnaire section 2.9 reads: "New mechanisms for executing or loading scripts have a risk of enabling novel attack surfaces. Generally, if a new feature needs this you should consult with a wider audience, and think about whether or not an existing mechanism can be used or the feature is really necessary."
15:06:41 Anssi: the Google security reviewer suggests WebNN introduces a new kind of script execution that runs in different contexts (CPU, GPU, etc.)
15:06:53 ... this creates new attack surface for malicious sites
15:07:26 q+
15:07:26 q?
15:07:27 ... any concerns with updating our response to 2.9 to acknowledge that report?
15:07:30 ack RafaelCintron
15:07:52 RafaelCintron: I strongly disagree with the characterization of this as a scripting language
15:08:06 ... I agree with the risks around out-of-bounds access
15:08:22 ... that will need goo tests and validation to prevent that from API
15:08:29 q?
15:08:36 ... as would be needed with any graph API
15:09:00 Anssi: so you're suggesting we respond by focusing on the out-of-bounds aspects in the spec / security explainer
15:09:08 q?
15:09:19 ... while disagreeing with the characterization of a new scripting language
15:09:20 Subtopic: General Security Questions - 2. ops that change shape mid-calculation
15:09:58 > Operations such as split/slice/squeeze that change the shape of tensors mid-calculation may lead to incorrect assumptions in later operations - for instance if eliding bounds checks this could lead to out of bounds accesses. It would be good for their to be operation level metadata that might be consumed by implementors to help prevent such problems.
15:10:05 q?
15:10:18 q+
15:10:18 q?
15:10:21 ack RafaelCintron
15:10:23 anssi: are there effective mitigations against this already?
15:10:40 RafaelCintron: similar to previous question - I'm not sure what they really mean by operation metadata
15:10:51 ... we need to clearly spec what each operator ingests
15:10:58 q?
15:11:06 Present+ Chai_Chaoweeraprasit
15:11:06 ... and mark graphs as invalid when they generate out-of-bounds access
15:11:08 q+
15:11:13 ... we should definitely prevent these problems
15:11:25 q?
15:11:33 ack chai
15:11:33 ack chai
15:12:03 chai: the question is a bit unclear, but it is true that in some cases the exact shape of the tensor is unknown until runtime
15:12:21 ... it's not insecure per se, but implementations need to mitigate against risks of out-of-bounds access
15:12:29 ... we should spend a bit more time looking at this
15:12:46 q?
15:13:09 Subtopic: General Security Questions - 3. op availability and deprecation
15:13:17 > The universe of operations is likely to vary in future - how will consumers discover which operations are available (short of enumerating them through failures to instantiate)? How will operations be deprecated (for instance if they turn out to be badly implemented?)
15:13:58 anssi: I interpret this as a question on feature detection and deprecation
15:14:07 ... can we shape the API to make deprecation easier?
15:14:27 ... the spec does a great job of explaining how to polyfill higher level ops in terms of lower level ops
15:14:39 q?
15:14:43 ... this would help in case higher level ops need to be deprecated; they could still be polyfilled
15:15:12 rama_ has joined #webmachinelearning
15:15:18 ... I'm thinking we should note that concern in our security considerations and develop a clear answer
15:15:30 Subtopic: General Security Questions - 4. async APIs
15:15:41 > It feels like .build and .compute should be asynchronous in all cases?
15:15:47 -> https://github.com/webmachinelearning/webnn/issues/229 Should restrict the sync APIs to only exist in Workers? #229
15:15:52 -> https://github.com/webmachinelearning/webnn/issues/230 Should WebNN support async APIs? #230
15:15:52 Anssi: we're discussing this in issues #229 and #230
15:16:15 q?
15:16:26 anssi: not sure if there is a security-specific rationale behind that one
15:16:26 Subtopic: General Security Questions - 5. Side channels from shared resources
15:16:31 > New side channels will be made available from shared resources (cpu/gpu). Timeable things should be out of process so incur at least some ipc to achieve anything. Probably not a massive worry when compared with already sharing a cpu between processes running renderers.
15:16:57 a?
15:17:00 q+
15:17:01 s/a?//
15:17:04 ack RafaelCintron
15:17:27 RafaelCintron: I don't understand why we would need to run timeable things out of process
15:18:49 dom: worth a clarification given timing attacks have been of interest in the past
15:19:07 RafaelCintron: we could make the timing less precise as a mitigation
15:19:14 ... as has been done e.g. in WebGL
15:19:35 ... doesn't necessarily need out of proc, but worth getting more information
15:20:09 q+
15:20:13 ack anssik
15:20:19 Subtopic: General Security Questions - 6. Permission delegation
15:20:30 > Verify: Sites must delegate permission to host/run models.
15:20:54 anssi: if I understand this correctly, we're covered with the permission policy integration we have in place for the spec
15:21:09 ... the top-level Web site needs to delegate permission to an iframe to use this feature
15:21:18 q?
15:21:21 q+
15:21:25 q+
15:21:39 ack RafaelCintron
15:21:54 RafaelCintron: +1 on permission delegation as important
15:22:06 ... permission policy gets us this indeed
15:22:09 ack dom
15:22:38 dom: my reading is similar to Anssi's: we satisfy the requirement; given the complexity of the security model, let's loop back and confirm
15:22:40 q?
15:22:50 Subtopic: General Security Questions - 7. serialization and caching
15:22:52 > Verify: No serialization or caching yet - although this is likely in future.
15:23:21 q?
15:23:26 q+
15:23:30 +1 to Anssi
15:23:30 ack ningxin_hu
15:23:50 Ningxin: that's also an implementation question
15:23:57 anssik: serialization or caching is out of scope for the WebNN API spec
15:24:04 ... does DirectML imply caching e.g. in the driver for comparison?
15:24:18 q+
15:24:24 q+
15:24:45 q?
15:24:49 ack dom
15:25:13 dom: I think in the context of security reviews, caching creates timing attack surface
15:25:45 ... likely implementation considerations, what is asked is to call out this risk to implementations
15:26:45 dom: questionnaire is a tool for us, the responses should be reflected in the spec either as normative language or in security considerations
15:26:53 q?
15:26:55 ack chai
15:27:54 chai: the OS security deals with this, probably doesn't depend specifically on WebNN
15:28:04 q+
15:28:21 ack dom
15:29:14 dom: I fully trust Chai on that, I think there could be mitigations on the WebNN side to complement that, e.g. timing attacks would need to be considered and protected against at the browser level, since browsers execute code from anywhere and from anyone
15:29:18 q+
15:30:10 ack RafaelCintron
15:30:38 RafaelCintron: following up - why specifically timing attacks on caching? e.g. shaders have the same property
15:32:33 dom: you go to your bank site that runs a shader, then an iframe runs the same shader and it loads faster, so it can detect browsing history
15:33:16 ... of top of my head attack, general principle to understand how much information you can get x-origin from existing caches using timing attack vectors
15:33:27 s/of top/on top/
15:33:33 q?
15:33:57 dom: we should think about this, not sure if there's a mitigation, worth investigating
15:34:05 q?
15:34:14 Subtopic: General Security Questions - 8. Control over how a model is run
15:34:50 > Control over how a model is run - (selecting cpu/gpu/tpu say) - is this too much power for the consuming site - it will for instance make it possible to more directly target a flawed implementation. It's not clear why this is required.
15:34:57 q?
15:35:08 q+
15:35:13 ack RafaelCintron
15:35:38 RafaelCintron: goes back to how much choice we should give the developer on where to run
15:35:53 ... this is a decision that really matters in terms of performance
15:36:03 ... some models really don't run well on GPU, some don't run on CPUs
15:36:30 q?
15:36:31 anssi: one aspect is that this is a hint, it's left to the implementation
15:36:32 q+
15:36:36 q+
15:36:42 ack dom
15:37:01 dom: I think a hint is a partial answer
15:37:19 ... if the UA runs on a platform with a specific narrow vulnerability on some processing unit, there is a higher risk of exploitation
15:37:26 ack zkis
15:37:49 zkis: I was wondering, hinting could be one thing, but how do you handle errors when something is not available?
15:38:32 ... is the following a fingerprinting concern: you have CPU and GPU on most devices, whereas a dedicated accelerator is less common
15:38:41 q+
15:39:05 ... should not allow enumerating devices, let implementations decide whether to respect the hint or not, but how to handle errors if that causes the model to fail?
15:39:16 ... an issue in OpenVINO, you can run hybrid CPU-GPU models
15:39:26 q?
15:39:41 ack chai
15:39:55 anssi: I don't think we should allow device enumeration, not sure about how WebGPU deals with this
15:40:06 chai: I'm confused about how this would be a fingerprinting issue
15:40:20 q?
15:40:34 ... I'm more worried about not being clear about what device is going to be used to run this
15:40:40 ... it has big implications on sync/async
15:40:51 +1 to chai's point re sync/async impact
15:40:57 q+
15:41:02 ack dom
15:41:19 dom: I don't think the current API shape is so fingerprintable
15:41:54 ... agree with Chai's point that if CPU vs GPU is a hint
15:42:05 q?
15:42:21 Subtopic: Guidelines/philosophy for new operations, including security principles
15:42:27 s/is a hint/is a hint it impacts greatly our discussion on sync vs async
15:42:28 -> https://github.com/webmachinelearning/webnn/issues/242 Guidelines/philosophy for new operations, including security principles #242
15:43:12 -> https://www.w3.org/TR/webnn/#api-mloperand WebNN API MLOperand
15:43:23 -> https://lists.w3.org/Archives/Public/www-archive/2021Nov/att-0000/W3C_Adding_new_Operators.pdf Adding new operators, view from ONNX by Michal Karzynski
15:43:27 anssi: this reinforces the value of creating guidance for creating new ops
15:43:27 -> https://www.w3.org/2021/10/26-webmachinelearning-minutes.html#t02 Rationale/criteria for adding new ops to the WebNN API (TPAC 2021 minutes)
15:43:35 ... could be part of the Operand section
15:43:39 +1
15:43:41 q?
15:44:02 q?
15:44:22 Subtopic: Op metadata that helps avoid implementation mistakes
15:44:28 -> https://github.com/webmachinelearning/webnn/issues/243 Op metadata that helps avoid implementation mistakes #243
15:45:47 q?
15:46:14 Subtopic: A conformance suite with disallowed intra-op examples would be helpful for hardening
15:46:25 -> https://github.com/webmachinelearning/webnn/issues/244 A conformance suite with disallowed intra-op examples would be helpful for hardening #244
15:46:57 q?
15:47:04 q+
15:47:07 ack dom
15:47:29 dom: I heard earlier that out-of-bounds access is something we need to look into; formalizing that into test cases for w-p-t makes a lot of sense to me
15:47:42 q?
15:47:55 anssik: on behalf of the group, I want to thank Alex Gough and the Chrome Security team for this security review!
15:48:45 dom: really valuable indeed
15:48:49 ... incl for wide review
15:48:55 dom: this is well beyond the expectation of a security review, very good feedback for CR horizontal review purposes
15:49:18 Topic: Integration with real-time video processing
15:49:25 -> https://github.com/webmachinelearning/webnn/issues/226 Integration with real-time video processing
15:49:29 Subtopic: Review proposed prototype next steps
15:49:34 -> https://github.com/webmachinelearning/webnn/issues/226#issuecomment-1016104142 Proposed prototype next steps
15:49:38 -> https://github.com/webmachinelearning/webnn-samples/issues/124 Proposed detailed GPU pipeline processing steps for semantic segmentation prototype
15:49:55 Anssi: Ningxin has proposed a plan to make progress here
15:50:03 q+
15:50:14 ack dom
15:50:29 dom: Ningxin thanks for formalizing this into concrete next steps!
15:50:42 ... is this blocked until WebCodecs proposal is implemented in Chromium?
15:50:54 ningxin_hu: GPU Import of VideoFrame?
15:51:04 ... is a dependency
15:51:14 dom: can we proceed without this?
15:51:39 ningxin_hu: another is WebGPU-WebNN interop, per request by Corentin, opened an issue with WebGPU WG to investigate this
15:51:45 ... that is another dependency in this proposal
15:51:55 ... we can look into that in parallel
15:52:11 ... so this proposal has these two dependencies and we can work on these in parallel
15:52:33 dom: I'm hearing we need improvements in both specs and implementation to make meaningful progress on the prototype
15:53:00 ningxin_hu: I need to confirm with people working on the WebCodecs import to WebGPU, there's some prototype code for that in Chromium
15:53:27 q+
15:53:28 q?
15:53:31 ack dom
15:54:00 dom: I followed the great discussion on WebNN-WebGPU integration, Ningxin, is this going in a good direction? Need for a joint meeting?
15:54:05 q+
15:54:31 ningxin_hu: I'm fine with GH discussion on this, also checked with WebGPU people and they have a monthly meeting and welcome us to have an agenda item there to discuss
15:54:43 ... this issue is also marked as "post v1" in WebGPU
15:54:48 q?
15:55:11 anssi: I understand WebGPU people are focusing on shipping their v1
15:55:12 ack chai
15:55:32 chai: the outstanding topic remains the support for async
15:55:55 ... the control over the GPU timeline matters when integrating with WebGPU
15:56:03 ... WebNN isn't clear on this timeline intersection
15:56:25 ... since we're also dealing with CPU, this could have a really big impact on the API shape
15:56:37 ... we need to resolve that issue sooner rather than later
15:56:44 https://github.com/webmachinelearning/webnn/issues/230
15:57:01 q?
15:57:09 Subtopic: Review proposed use case for spec inclusion
15:57:15 -> https://github.com/webmachinelearning/webnn/pull/249 Add real-time video processing use case #249
15:57:28 i/Subtopic:/... #230 mentions integration with WebGPU as a consideration in this discussion/
15:57:46 anssi: do we refer to WebGPU/WebCodecs, or remain abstract?
15:57:46 q+
15:57:59 ack dom
15:58:43 dom: the use case combines use cases and requirements
15:59:02 s/the use case/this/
15:59:39 dom: I guess if we want to highlight technical aspects, then that would be more of requirements derived from the use case
15:59:54 q?
15:59:54 sgtm
16:00:35 q?
16:00:44 Topic: Double-precision baseline implementation of WebNN operations for testing
16:00:59 anssik: Review the double-precision baseline implementation of WebNN operations for web-platform-tests purposes
16:01:04 -> https://github.com/webmachinelearning/webnn/issues/245 The baseline implementation of WebNN ops #245
16:01:08 -> https://github.com/huningxin/webnn-baseline webnn-baseline (staging repo)
16:01:15 anssik: any concerns with proceeding with this implementation work?
16:02:00 ningxin_hu: would like to get confirmation it is fine to move this repo to WebML GH
16:02:11 would like to set up the repo, initial PR, people can review then
16:02:22 s/would/... would
16:03:49 RRSAgent, draft minutes
16:03:49 I have made the request to generate https://www.w3.org/2022/01/27-webmachinelearning-minutes.html anssik
16:16:54 zkis has joined #webmachinelearning
16:31:25 zkis has joined #webmachinelearning
16:42:42 scribe+ dom
16:42:48 RRSAgent, draft minutes
16:42:48 I have made the request to generate https://www.w3.org/2022/01/27-webmachinelearning-minutes.html dom
16:43:19 i/Security review/scribe+ dom
16:43:21 RRSAgent, draft minutes
16:43:21 I have made the request to generate https://www.w3.org/2022/01/27-webmachinelearning-minutes.html dom
16:44:58 s/goo tests/good tests
16:47:55 zkis_ has joined #webmachinelearning
18:01:42 Zakim has left #webmachinelearning
19:26:58 zkis_ has joined #webmachinelearning
22:28:30 zkis_ has joined #webmachinelearning
22:38:53 zkis__ has joined #webmachinelearning