<Nikhil> :)
Define the set of operations and their specification #17
anssik: we had a review of the proposed resolution and received good feedback we need to resolve, let's discuss that now.
… the objective of this call is to resolve the objections raised against the proposed resolution, and to clarify the proposed resolution based on feedback where appropriate
… to start, I captured the following questions from issue #17 we need to resolve:
nsthorat: "An important part of this specification will be ensuring this set of ops are compatible with the major ML JavaScript frameworks [...] it's not possible for us to move forward with this resolution without understanding compatibility."
jbingham: "what's the plan for dealing with versioning?"
jbingham: "How are custom ops defined and included in the graph?"
walrusmcd: "How many ops?"
jbingham: "Decide if a graph API is the right thing to standardize on"
anssik: To summarize, we need to choose a set of operations to be included in the API that enables adequate compatibility with the major ML frameworks
<Zakim> Nikhil, you wanted to talk about the onnx & tf lite compatibility doc: https://docs.google.com/document/d/1RXCkZ9mliWbqSakYvNlWhsRH4yFtnpe1YQQNFAIRZo8/edit
Nikhil: shared doc on the chat, please take a look
… we spent time looking at compatibility, started with 2 basic ops, and tried to understand the differences
… our preference is to start with a low number of ops and grow that over time, so we understand the compatibility issues of each op
danielsmilkov: this is about diffing the libraries and looking into possible compatibility issues; 1) when comparing with NN API, e.g., some ops allow fusing; we propose separate ops with no fused kernel, but under the hood an implementer could fuse the ops so that it runs great on particular hardware
… ONNX is opinionated regarding the layout; TF Lite wants channels to come last, while different hardware prefers either channels first or channels last
… which layout is better changes over time
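[Editor's note: a minimal sketch of the two points above, in TypeScript with hypothetical API names (`conv2d`, `add`, `relu`, and the `layout` option are illustrative, not agreed spec surface). The graph is expressed as separate, unfused ops that an implementation may fuse under the hood, and the layout question appears as an explicit option:]

```ts
// Hypothetical builder-style API; all names are illustrative only.
interface Tensor { shape: number[]; }
interface GraphBuilder {
  input(name: string, shape: number[]): Tensor;
  conv2d(x: Tensor, filter: Tensor, opts?: { layout?: "nchw" | "nhwc" }): Tensor;
  add(a: Tensor, b: Tensor): Tensor;
  relu(x: Tensor): Tensor;
}

function buildConvBlock(b: GraphBuilder): Tensor {
  // NHWC ("channels last", the TF Lite preference): [batch, height, width, channels].
  // NCHW ("channels first", the ONNX default): [batch, channels, height, width].
  const x = b.input("x", [1, 224, 224, 3]);  // NHWC input
  const w = b.input("w", [3, 3, 3, 16]);     // 3x3 filter, 3 in / 16 out channels
  const bias = b.input("bias", [16]);

  // Three separate ops, no fused kernel in the API surface...
  const conv = b.conv2d(x, w, { layout: "nhwc" });
  const sum = b.add(conv, bias);
  return b.relu(sum);
  // ...but an implementation may pattern-match conv2d -> add -> relu
  // and dispatch a single fused kernel on hardware that supports it.
}
```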
Nikhil: I would prefer to start very small with a POC that works, and have a plan for how to grow that set of ops
… we probably need a way to deal with custom ops: a way in app space to describe custom ops that share memory with matmul
Rafael: I agree with the plan to keep this hardware agnostic
<Nikhil> awesome! that would be great.
<Nikhil> regarding the script to convert onnx / tensorflow
Paul: basically yes to everything Rafael said; the goal is to be hardware agnostic
… ONNX has done work on channel formats, hit these same issues, and proposed solutions
<Zakim> Ningxin_Hu, you wanted to talk about op set & use cases
Ningxin_Hu: thanks to Nikhil and Daniel for their efforts, great work! I agree with the approach of starting with a small set of ops and validating compatibility with the JS libraries
… my proposal for how to grow the op set: add ops that are needed to implement the identified use cases
https://webmachinelearning.github.io/webnn/#usecases
<Ningxin_Hu> op set and use cases: https://github.com/webmachinelearning/webnn/issues/17#issuecomment-508426036
[silence, agreement]
Paul: we took an approach where we selected the ops that benefit from hardware acceleration
… an approach a bit similar to CUDA's
Ningxin_Hu: if we only select the expensive ops that benefit from hardware acceleration, that may impose a performance penalty when doing context switching
Paul: I agree, it might be worth prototyping that now; the assumption we're making is that this hybrid approach (with WebGL) is viable
<jonathan> What other ML frameworks should review each op, like Daniel did for TensorFlow, and confirm compatibility before we finalize the definition?
Ningxin_Hu: agree with Paul's comments; when interleaving with Wasm in our POC, the overhead was significant
Rafael: CPU readback is slow, staying with GPU compute shaders should work pretty well
jdarpinian: I'm on the Chrome team and think custom ops based on WebGL can work, but they will be very complex to implement
<Nikhil> We think it's important to be able to have custom operations share memory with conv2d / matmul without doing a readback. for cpu-accelerators, share the buffer with WASM, for gpu-accelerators share the buffer with WebGL
jdarpinian: portability of custom ops between different systems, CPU and GPU, is not very good
<Nikhil> this allows us to grow the spec slowly and not have tail-end ops be bottlenecks and the webnn accelerated ops can get quick wins by accelerating the bottleneck ops (conv2d, matmul, etc)
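[Editor's note: a sketch of the shared-memory idea from Nikhil's comments, with entirely hypothetical names (`DeviceTensor`, `asWebGLTexture`, `asWasmView`, `builtinMatMul`): the built-in op leaves its result in a device buffer, and a custom op consumes that buffer directly via WebGL or Wasm instead of reading it back into a JS array:]

```ts
// Entirely hypothetical API, sketching only the "no readback" idea.
interface DeviceTensor {
  // Hand the underlying storage to a WebGL custom kernel on GPU backends...
  asWebGLTexture(gl: WebGL2RenderingContext): WebGLTexture;
  // ...or to a Wasm custom kernel on CPU backends.
  asWasmView(memory: WebAssembly.Memory): { byteOffset: number; byteLength: number };
}

declare function builtinMatMul(a: DeviceTensor, b: DeviceTensor): DeviceTensor;

// A custom op written against the same device buffer: the matmul result
// never round-trips through a JS-side Float32Array.
function customSoftmax(gl: WebGL2RenderingContext, a: DeviceTensor, b: DeviceTensor): WebGLTexture {
  const logits = builtinMatMul(a, b);
  const tex = logits.asWebGLTexture(gl);
  // ...bind `tex` to a shader program implementing softmax...
  return tex;
}
```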
Paul: I think Ningxin_Hu posted an architecture diagram
… frameworks will do the heavy lifting, the web developer won't see the complexity
Nikhil: we think the same, but not all devices have a WebGL backend, so we fall back to Wasm, for example
Ningxin_Hu: about custom ops, folks talked about memory transfer overhead
… even long SIMD instructions on the CPU can require tensor memory re-layout, which is an expensive operation
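[Editor's note: to make the re-layout cost concrete, a layout change is a full transpose copy of the tensor; this naive NHWC to NCHW conversion (illustration only, not proposed API) touches every element once with a cache-unfriendly access pattern:]

```ts
// Naive NHWC -> NCHW re-layout: one read and one write per element.
// For large activation tensors this copy can rival the cost of the
// op it feeds, which is the overhead being discussed.
function nhwcToNchw(src: Float32Array, n: number, h: number, w: number, c: number): Float32Array {
  const dst = new Float32Array(src.length);
  for (let i = 0; i < n; i++)
    for (let y = 0; y < h; y++)
      for (let x = 0; x < w; x++)
        for (let k = 0; k < c; k++)
          dst[((i * c + k) * h + y) * w + x] = src[((i * h + y) * w + x) * c + k];
  return dst;
}
```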
anssik: it was asked on the issue whether a graph is the right abstraction
jonathan: what are the other JS frameworks we need to include in the compatibility study?
Paul: in ONNX we considered all the frameworks that matter; they have a voice in the ONNX project
… in ONNX we have considered PyTorch, Caffe, Intel's frameworks, Microsoft's frameworks, TensorFlow (we have an ONNX to TF converter), and Apple's CoreML
… CoreML was part of the opset 1 compatibility work
Nikhil: specifically interested in JS ML frameworks
… for compatibility
… for example, Brain.js
Paul: we don't want to have two bodies managing op schema, right?
Nikhil: we want to grow slowly, right?
… focus on web stuff to figure out an intersection of the JS ML libraries, does that sound reasonable?
Paul: ONNX does have namespace and versioning concepts, so we could create our own ONNX namespace for the ops referenced by the Web NN API
Rafael: it is up to us to decide how many ops to adopt, the op definitions themselves would come from ONNX standards body
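[Editor's note: one way the namespace-and-versioning idea could surface, with invented structure (the `ReferencedOp` shape is not from the discussion): each op the spec references pins an ONNX domain and opset version, mirroring ONNX's operator-set mechanism, so "how many ops" and "which version" become explicit data:]

```ts
// Hypothetical manifest of the ops the spec would reference; the domain
// and version fields mirror ONNX operator sets (every ONNX model declares
// opset imports per domain). Version numbers here are illustrative.
interface ReferencedOp {
  name: string;          // ONNX op name
  domain: string;        // e.g. "ai.onnx", or a dedicated WebNN namespace
  sinceVersion: number;  // the ONNX opset version whose semantics apply
}

const webnnOpSubset: ReferencedOp[] = [
  { name: "MatMul", domain: "ai.onnx", sinceVersion: 9 },
  { name: "Conv",   domain: "ai.onnx", sinceVersion: 1 },
  // ...grown op by op as compatibility with the JS frameworks is validated.
];
```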
danielsmilkov: that makes sense; to be clear, because of the portability issues and because JS libs are the users, some changes to ONNX may be needed, e.g. around memory layout
Paul: that's fairly reasonable, ONNX community would certainly welcome that
danielsmilkov: relaxing, not breaking, existing ONNX behaviour
… moving on to custom ops
… we deal with real models every day and need to add ops to TF; interoperability is important, e.g. for pre- and post-processing of media and video
jdarpinian: we also need to look into the hardware we want to support; there's a lot of hardware out there and new hardware coming up, e.g. neural engines in upcoming ARM chips
Nikhil: that's a good point; e.g. for matmul it would be good to do the homework of checking how it works across all hardware
anssik: Daniel and Nikhil could you move your doc https://docs.google.com/document/d/1RXCkZ9mliWbqSakYvNlWhsRH4yFtnpe1YQQNFAIRZo8/edit#heading=h.n1gbg8k8lggq into a GH issue
Nikhil: yes, we'll do that
danielsmilkov: in GH issue #17 there's a comment where Ningxin_Hu proposed 14 ops; we could do the work to split these 14 ops into 3-4 GH issues with some logical bundling
PROPOSED RESOLUTION: The specification will reference the ONNX operations, and if there are any improvements desired for ONNX, the work should happen there.
<Ningxin_Hu> 14 ops proposal: https://github.com/webmachinelearning/webnn/issues/17#issuecomment-512651711
PROPOSED RESOLUTION: The specification will reference a subset of the ONNX operations, starting small, adding more ops when compatibility with major ML JavaScript frameworks has been validated
<Zakim> kainino, you wanted to respond to jdarpinian
kainino: we want to point out that it's important not only to understand the current and upcoming hardware, but since the browser runs in userspace
… we also need to run on top of the userspace apis (NNAPI, CoreML, DirectML) so we are constrained by how they expose things
Nikhil: sharing memory with custom ops needs to be better understood
… Ningxin_Hu, can you do that investigation?
Ningxin_Hu: with help from James or Kai we could make progress on the custom ops issue
Rafael: I have bandwidth to help, but not time to drive
jdarpinian: same here, I can help but not drive
Ningxin_Hu: I can take the lead, with help from others
PROPOSED RESOLUTION: The specification will reference a subset of the ONNX operations, starting small, adding more ops when compatibility with major ML JavaScript frameworks has been validated
<kainino> Ningxin_Hu: Please reach out to us as needed
<kainino> oops that's supposed to be @Ningxin_Hu
<Ningxin_Hu> thanks @kainino
https://www.w3.org/2019/09/TPAC/
anssik: any concerns with the amended proposed resolution?
[hearing no concerns]
Resolved: The specification will reference a subset of the ONNX operations, starting small, adding more ops when compatibility with major ML JavaScript frameworks has been validated
Maybe present: anssik, danielsmilkov, jbingham, jdarpinian, jonathan, kainino, Nikhil, nsthorat, Paul, walrusmcd