ONNX vs TF Lite op comparison: Conv2D, Matmul / Fully Connected
nsthorat: spent time with Googlers discussing this topic; the Google consensus seems to be that standardizing on ops is too early at this point
… TensorFlow in general is not in a position to endorse ONNX in a web spec; we'd prefer to create a new spec for ops
… we think ops are not the right level of abstraction to stand the test of time
… MLIR might be it, but we're not ready yet
… there's a lot of valuable exploration with e.g. custom ops
<Zakim> anssik, you wanted to speak up
Rafael: nikhil, I'm curious about the rationale re ONNX, is it political or do you think we cannot find a middle ground?
nsthorat: TensorFlow has not publicly endorsed ONNX and does not want to do that for the purpose of the web spec
daniel: we feel ONNX is too big of a spec as of now; there's a question of neutrality as well
Rafael: I understand this would be something we start small; many ISVs are part of ONNX, including Amazon, so it is not meant to be a one-company-driven effort
nsthorat: I think the (ONNX) issue is more organizational than technical
paul: we started looking at things that could be hardware accelerated
nsthorat: feedback we got internally was that with a spec at the ops level there would be a lot of issues for hw vendors(?)
… TF thinks it is too early to standardize an ops set, but we do not want to remove momentum from this CG; we could do explorations that could evolve, maybe with custom ops and sharing memory
paul: thinking about next steps in light of this new information
… we're also working with vendors, working with IRs, we're on a similar journey
anssik: does it make sense to phase the work, e.g. phase 1: explore ops + custom ops, phase 2: look into MLIR or whatever comes in the future
nsthorat: we should continue the explorations we have ongoing
<Jonathan_> James is on the queue with an idea
nsthorat: looking at ops and custom ops with shared memory in parallel would be a reasonable exploration
jdarpinian: James from Chrome, thinking about what we can do that's minimal, the simplest thing that could possibly work
… looking at doing a WebGL extension; benefits: WebGL extensions are optional, so if we ship it we can always unship it later
… almost all NN frameworks already make use of WebGL
… could be simple to add a couple of API calls to access vendor-specific kernels
… seems like the simplest way for the CG to achieve the goal, and it would not need to be supported forever
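The "optional op-level WebGL extension" idea above might look roughly like the sketch below. The extension name and method shapes are assumptions for illustration only, not a real or proposed API; the point is that, because extensions are optional, pages feature-detect and fall back to shader-based kernels when the extension is absent or later unshipped.

```typescript
// Hypothetical sketch only: "HYPOTHETICAL_nn_ops" and its methods are invented
// names illustrating the op-level WebGL extension idea from the discussion.
type TextureHandle = number; // stand-in for WebGLTexture in this sketch

interface NNOpsExtension {
  // An op-level entry point that a browser could back with a
  // vendor-specific kernel.
  conv2d(input: TextureHandle, filter: TextureHandle,
         strides: [number, number]): TextureHandle;
}

interface GLLike {
  getExtension(name: string): unknown;
}

// Feature detection: returns null where the extension is not supported,
// in which case the framework keeps using its ordinary WebGL shaders.
function getNNOps(gl: GLLike): NNOpsExtension | null {
  return gl.getExtension("HYPOTHETICAL_nn_ops") as NNOpsExtension | null;
}
```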
Rafael: doing a WebGL extension sounds good
… custom ops could use compute shaders of WebGPU
anssik: WebGL compute extension status?
jdarpinian: not shipping on Mac
Rama: about ops abstraction, does that mean ops are not sufficient as part of this standard?
daniel: because NN/ML is evolving so quickly, new ops come into place all the time
… we want all hw vendors to implement them efficiently; otherwise we'll fall back to common low-level abstractions such as Wasm
… op sets keep on growing, ONNX and TF Lite keep on growing, and the web would be unable to catch up with their op sets
Rama: this could also be modeled with higher-level ops
… we could identify a collection of higher-level abstractions; would something like that address this with easy extensibility?
nsthorat: I hear you, those are good ideas; these explorations are being done under the umbrella of compilers and MLIR
… these explorations are happening also outside this group and will evolve significantly over the next 6 months
<Rama> rama: can we address the extensibility question using a small collection of higher-order ops, like element-wise-op?
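Rama's element-wise-op suggestion can be sketched as a small set of higher-order ops, each parameterized by a scalar function, so that many concrete ops fall out as one-liners without growing the standardized op set. The names below are illustrative, not a proposal:

```typescript
// Sketch of the extensibility idea: one generic element-wise op covers many
// concrete ops (relu, sigmoid, add, mul, ...). Illustrative names only.
function elementwiseUnary(f: (x: number) => number,
                          input: Float32Array): Float32Array {
  const out = new Float32Array(input.length);
  for (let i = 0; i < input.length; i++) out[i] = f(input[i]);
  return out;
}

function elementwiseBinary(f: (a: number, b: number) => number,
                           x: Float32Array, y: Float32Array): Float32Array {
  const out = new Float32Array(x.length);
  for (let i = 0; i < x.length; i++) out[i] = f(x[i], y[i]);
  return out;
}

// Concrete ops become thin wrappers over the higher-order primitives:
const relu = (t: Float32Array) => elementwiseUnary((x) => Math.max(0, x), t);
const add = (a: Float32Array, b: Float32Array) =>
  elementwiseBinary((x, y) => x + y, a, b);
```

The trade-off the discussion points at: this keeps the spec surface small and extensible, but an implementation can no longer pattern-match a named op to a vendor kernel as directly.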
"Multi-Level Intermediate Representation" Compiler Infrastructure
anssik: is the ops compat study exploration still valid?
nsthorat: yes, but I would prioritize the custom ops exploration
ningxinhu: question to james and rafael re the WebGL/WebGPU extension, will it be an op-level abstraction or a lower-level abstraction?
jdarpinian: I think it would be op-level, since there's nothing really concrete to propose otherwise at this time
ningxinhu: so the idea is to add an op-level extension to WebGL/WebGPU?
jdarpinian: yes, we could implement the ops that would give the biggest speedup
ningxinhu: we still need the ops compat study to look into MPS and DirectML compatibility
ningxinhu: maybe we need to look (more) into compat not at the framework level, but at the native API level, MPS etc.
ningxinhu: do you expect this group could do the ops study, and how do you see collaboration with the WebGPU and WebGL groups?
jdarpinian: WebGL not sure, but WebGPU probably easier since also W3C group
ningxinhu: another question: james and rafael propose the WebGL and WebGPU extension route; how to support other types of accelerators, including CPU-based ones?
… another device class is standalone accelerators; how do we expose those capabilities to the web?
jdarpinian: I'm very interested in standalone accelerators; it's unclear what type of API the native side will use to interface with them long term
… would be great to have a mechanism to unship
kainino: there has been W3C-Khronos collaboration on the canvas and HTML specs that has worked via shared membership and people; it has been easy in practice
… WebGPU does not formally meet at TPAC 2019, but e.g. Myles and Dean from Apple will be there
Rafael: what is the roadmap for TF.js over the next few months?
nsthorat: good question, we're working on a Wasm backend for TF.js and on a WebGPU backend, and trying to ship higher-level models, e.g. PoseNet
… MLIR will evolve and we'll watch that space
anssik: would nikhil want to give a briefing on MLIR?
nsthorat: can do that
[no objection]
ONNX vs TF Lite op comparison: Conv2D, Matmul / Fully Connected
nsthorat: I think we should still do the compat study
anssik: can you share more on the DirectML POC?
ningxinhu: that POC is a contribution to help the ops compat study for DirectML and MPS
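One flavor of difference such a Conv2D compat study surfaces: TF Lite's Conv2D expresses padding as a SAME/VALID mode, while ONNX's Conv takes explicit per-side pads. A minimal sketch of the output-size arithmetic for one spatial dimension (dilation omitted to keep it small):

```typescript
// TF Lite convention: padding is a mode.
// SAME  -> output = ceil(in / stride)
// VALID -> output = floor((in - filter) / stride) + 1
function tfliteOutSize(inSize: number, filter: number, stride: number,
                       padding: "SAME" | "VALID"): number {
  return padding === "SAME"
    ? Math.ceil(inSize / stride)
    : Math.floor((inSize - filter) / stride) + 1;
}

// ONNX convention: explicit per-side pads.
// output = floor((in + padBegin + padEnd - filter) / stride) + 1
function onnxOutSize(inSize: number, filter: number, stride: number,
                     padBegin: number, padEnd: number): number {
  return Math.floor((inSize + padBegin + padEnd - filter) / stride) + 1;
}
```

For stride 1 and a 3-wide filter, explicit pads of (1, 1) reproduce SAME; for larger strides the equivalent pads depend on the input size, which is one reason a direct op-to-op mapping needs care.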
nsthorat: TensorFlow Dev Summit 2020 dates not yet decided
Maybe present: anssik, daniel, jdarpinian, kainino, ningxinhu, nsthorat, paul, Rafael, Rama