Meeting minutes
New ops for WebNN API
anssik: as you know, we have added conv2d and matmul to the API
anssik: now is the time to discuss which ops we think should be added next
anssik: the proposal is to prioritize ops that address identified use cases
<anssik> WebNN API use cases
anssik: the space of use cases is expanding rapidly; we should also maintain these use cases that inform the API priorities, to ensure we target the future and not the past
anssik: currently the use cases informatively reference papers from publicly available locations
anssik: we should continue to bring to the group's attention the latest developments in this space
anssik: with that background, let's hear proposals for new ops to add to the WebNN API
anssik: the floor is open
<Zakim> anssik, you wanted to ask
ningxin_hu: I like the approach of identifying the models and use cases, and then adding the operators that are used in those models
<ningxin_hu> https://github.com/huningxin/webnn/wiki/WebNN-first-wave-models-and-ops
ningxin_hu: with that, I recommend we do some homework and make a table of the operators used by those models
ningxin_hu: each row lists an operator and the models that use it
ningxin_hu: this does not fully cover all use cases, but it does cover a lot of them
ningxin_hu: the table includes add, multiply, and some activations, as well as pooling ops (e.g. average pooling)
ningxin_hu: there are some that we already support, like convolution
ningxin_hu: I left some questions open in the table as well. That's my initial attempt to identify the ops
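[Illustrative sketch: how the first-wave ops in ningxin_hu's table might compose in a WebNN-style graph builder. The builder shape and op names (add, relu, averagePool2d) are assumptions for illustration, not settled API.]

    // Hypothetical WebNN-style graph builder; names are illustrative only.
    interface MLOperand {}
    interface MLGraphBuilder {
      conv2d(input: MLOperand, filter: MLOperand): MLOperand; // already in the spec
      add(a: MLOperand, b: MLOperand): MLOperand;             // proposed element-wise op
      relu(x: MLOperand): MLOperand;                          // proposed activation
      averagePool2d(x: MLOperand): MLOperand;                 // proposed pooling op
    }

    // A conv + bias + activation + pooling block built from first-wave ops:
    function block(b: MLGraphBuilder, input: MLOperand,
                   filter: MLOperand, bias: MLOperand): MLOperand {
      return b.averagePool2d(b.relu(b.add(b.conv2d(input, filter), bias)));
    }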
anssik: thank you ningxin_hu, this is very helpful
anssik: I think we could add this content into the spec as informative content
Action: ningxin_hu to submit a PR for the model-ops table into the spec or wiki
Nikhil: this is a bigger question; we've been spending quite a bit of time thinking about what's going on with TF ops
Nikhil: is there any other standard that is going to make its way into an open body?
Nikhil: I'm slowly coming to think we should target a lower level of abstraction
Nikhil: we'll be chasing the ops forever and they'll be iterating at a high rate. I know this is kind of a left turn, but this is at about the level of XLA
Nikhil: the 80 or so ops within XLA cover the majority of neural nets they've seen
Nikhil: I'd challenge us to consider a lower level of abstraction
Nikhil: there are only about 80, whereas in TF there are about 1,000
Nikhil: there are Tensor Compute Primitives and XLA, another Google one...
Nikhil: something at the same level as XLA
Nikhil: that's my current thinking. To answer your question regarding ops: I have intuitions, but I think we should look at real models and consider the next steps from there
Nikhil: interested in hearing people's thoughts
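[Illustrative sketch of the level-of-abstraction argument: a high-level op such as softmax needs no primitive of its own, because it decomposes into a handful of XLA-HLO-style primitives. Primitive names below are loosely modeled on XLA HLO; the implementation is an assumption for illustration.]

    // Why ~80 stable primitives can cover many high-level ops.
    type Tensor = number[];

    const exp = (x: Tensor): Tensor => x.map(Math.exp);
    const sub = (x: Tensor, s: number): Tensor => x.map((v) => v - s);
    const div = (x: Tensor, s: number): Tensor => x.map((v) => v / s);
    const reduceMax = (x: Tensor): number => Math.max(...x);
    const reduceSum = (x: Tensor): number => x.reduce((a, b) => a + b, 0);

    // softmax is exp/subtract/reduce/divide composed; no new primitive needed.
    function softmax(x: Tensor): Tensor {
      const e = exp(sub(x, reduceMax(x))); // subtract max for numerical stability
      return div(e, reduceSum(e));
    }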
RafaelCintron: I acknowledge what Nikhil is saying; my worry is that if we get too low level, we in the browser can't make the ops performant. But we'd have to see what they are. You say matmul and conv2d are included; what would the other ones be?
Nikhil: that's a great point. I'm not saying this is 100% what we should do, but I think it will be easier if we have a smaller opset
Nikhil: we could ride the wave of another standard that is looking at ops
Nikhil: the other approach is that you would never go upwards: you'd ship the lower-level ops and they would map directly to x86, ARM, etc. Whether we'd have to call something higher level than that is an interesting question
RafaelCintron: looking at the ops that ningxin_hu proposed, are those too high level?
paul_mcdaniel_msft: Nikhil, one thing that may help with low level vs. high level: in some scenarios we've hit in the past, if you go too low you hit real issues with optimizations
paul_mcdaniel_msft: there is an LSTM sitting in there, but there was no way of knowing this or inferring the structure. Keras added the information to the TF model to allow for optimization
Nikhil: yeah, there are a lot of problems here and that's a valid point. There is no silver bullet. That said, it is valid to have a lower-level, smaller opset that is pretty solid and not moving
paul_mcdaniel_msft: what the ONNX group has been trying to do is functions, which allow you to go down to these lower-level ops while retaining the higher-level concept of what the op is
paul_mcdaniel_msft: ningxin_hu is talking about activations, but you don't want them reduced to a mul and an add, because the GPU can do things when it knows it's an activation
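[Illustrative sketch of the fusion point: the same activation expressed as generic arithmetic vs. as a named op. Names are illustrative, not WebNN API.]

    // Decomposed: the backend sees only generic mul/add/clamp and cannot
    // recognize (and fuse) the pattern as an activation.
    function hardSigmoidDecomposed(x: number[]): number[] {
      return x
        .map((v) => v * 0.2)                      // mul
        .map((v) => v + 0.5)                      // add
        .map((v) => Math.min(1, Math.max(0, v))); // clamp
    }

    // Named op: a backend that knows "hardSigmoid" can dispatch one fused
    // kernel, or fuse it into the preceding convolution. Same semantics:
    function hardSigmoid(x: number[]): number[] {
      return x.map((v) => Math.min(1, Math.max(0, v * 0.2 + 0.5)));
    }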
daniel_smilkov: just wanted to add something to Nikhil's suggestion
daniel_smilkov: I don't think we want to go extremely low level; we want to find an intermediate level so that we don't have to add ops every 3 months
daniel_smilkov: XLA, for example, has LSTM information to allow for optimization on GPU/TPU, etc.
<paul_mcdaniel_msft> FYI: how the Linux Foundation is starting to craft "how to propose new ops"
<paul_mcdaniel_msft> https://github.com/onnx/onnx/blob/master/docs/AddNewOp.md
daniel_smilkov: the issue with high level is that we'll need to keep adding ops. Whatever we standardize today, a year from now there will be a new architecture with more ops that we'll need to add. It's just how the field grows
daniel_smilkov: my worry is about us keeping up
daniel_smilkov: if we don't keep up we'll have to add custom ops, at which point we've lost the hardware benefit
paul_mcdaniel_msft: that's a good point
paul_mcdaniel_msft: in the ONNX foundation we were worried that we'd have unchecked op growth, but it's slowed down and we've started to deprecate ops
paul_mcdaniel_msft: I don't know if we're at steady state, but we were worried about that as well and I don't think we've seen it
<paul_mcdaniel_msft> onnx model architectures
<paul_mcdaniel_msft> https://github.com/onnx/models
daniel_smilkov: the issue is we're trying to guess it
paul_mcdaniel_msft: I'd put it one step stronger: we're not guessing in ONNX
paul_mcdaniel_msft: we have over two dozen model architectures that have fed the opsets in ONNX
paul_mcdaniel_msft: it's worth looking at as a case study
daniel_smilkov: it would be a good exercise to go back a year and a half to understand what models wouldn't run due to missing ops in ONNX
anssik: is that something you all can look at? Or folks in ONNX
daniel_smilkov: yep we can look at that
Action: daniel_smilkov to look at the opsets from a year ago to see what couldn't be run on ONNX opsets
Nikhil: the goal wasn't to derail this, just to propose this thought, because others at Google are considering it
<Zakim> anssik, you wanted to re-ask this question from RafaelCintron: looking at the ops that ningxin_hu proposed are those too high level?
<anssik> https://github.com/huningxin/webnn/wiki/WebNN-first-wave-models-and-ops
Action: Nikhil and daniel_smilkov to compare ningxin_hu's ops and XLA to see which, if any, are too high level
<Nikhil> XLA HLO: https://www.tensorflow.org/xla/operation_semantics
ningxin_hu: I want to mention that, for the operator list, I did an initial investigation into XLA. I reviewed it and there are some overlaps, for example convolution and matmul
ningxin_hu: there are some other things such as batch normalization, concat, element-wise ops, etc.
<paul_mcdaniel_msft> Nikhil, do you have a link to that other lower-level op set you mentioned? was it "TCP"? thx!
ningxin_hu: we have something in common, so my initial take is to grow that set slowly, doing our due diligence by studying the models
ningxin_hu: so we can bring in other element-wise ops
ningxin_hu: I will make a PR that Nikhil and daniel_smilkov can comment on
RafaelCintron: back to the question from earlier: ONNX is not in the thousands but in the hundreds, and XLA seems to be similar. Maybe we can look at the intersection of those
RafaelCintron: an opset we can agree on that allows for hardware optimizations, but is not so high level that we're adding numerous ops every year
Rama: I feel it is actually helpful to look at these as tensor and scalar ops
Rama: you can identify key categories
Rama: you end up covering a large fraction of the ops
Rama: one way to limit it is to identify the op categories
Rama: I think if we do that exercise we'll find the commonalities across XLA and ONNX
Rama: the question that arises is: if you express ops this way, can you still reference them at a higher level?
Rama: there is value in doing that because of the existing implementations
Rama: while there is innovation in MLIR, hand-crafted implementations are normally better
Rama: so there is value in defining higher level ops
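[Illustrative sketch of Rama's categorization idea; the categories and member ops are assumptions, not an agreed taxonomy.]

    // Grouping candidate ops by category makes the commonalities across
    // XLA and ONNX easier to see; membership here is illustrative only.
    const opCategories: Record<string, string[]> = {
      elementWise: ["add", "mul", "exp", "max"],
      reductions: ["reduceSum", "reduceMax", "reduceMean"],
      contractions: ["matmul", "conv2d"],
      dataMovement: ["reshape", "concat", "slice", "broadcast"],
      normalization: ["batchNormalization"],
    };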
Action: paul_mcdaniel_msft to do an intersection between XLA & ONNX
Inference API to load and run a model
anssik: to quickly summarize where we are
<anssik> https://github.com/webmachinelearning/model-loader
anssik: we reached consensus on incubating this idea in this group
anssik: the next step is to start drafting the actual spec text
anssik: the use cases are already covered in the existing explainer
anssik: can anyone speak to prototyping? Possibly folks on the call
RafaelCintron: we can help with the mission of the load model API
RafaelCintron: I think the group needs to define a canonical format; we don't want fragmentation, which is not good for web developers
RafaelCintron: that obviously will bring us full circle to the ops
anssik: yeah, that's a good long-term goal, but the API shape doesn't require us to have that
anssik: our current charter blocks us from declaring a model format
RafaelCintron: really? we can't define a format, but we can define an opset?
<anssik> https://webmachinelearning.github.io/charter/#out-of-scope
<anssik> https://github.com/webmachinelearning/charter/issues
gregwhitworth: I recommend we do start the charter changes, but yes it shouldn't block the API
Action: anssik to create a strawman spec for others to iterate on
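[Illustrative sketch of the load-and-run API shape under discussion; the entry point and method names (loadModel, compute) are assumptions for illustration, not the explainer's settled design.]

    // Hypothetical load-and-run inference API.
    interface MLModel {
      compute(inputs: Record<string, Float32Array>): Promise<Record<string, Float32Array>>;
    }
    interface ML {
      loadModel(url: string): Promise<MLModel>;
    }

    async function classify(ml: ML, image: Float32Array) {
      // The format of the model behind this URL is exactly the open
      // question raised above: a canonical format/opset is undefined.
      const model = await ml.loadModel("/models/mobilenet");
      return model.compute({ input: image });
    }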
Virtual workshop agenda building
<Jonathan> Sorry, my internet is down. Catching up on meeting notes. @Anssi
<anssik> Virtual workshop agenda
anssik: I'd like this group to help figure out what to discuss on the virtual workshop
Jonathan: no worries :)
<Jonathan> Hard to wfh with no internet and poor cell phone reception
anssik: Nikhil, it would be good to have a talk about the different abstraction levels, such as XLA, MLIR, etc.
anssik: we would have short offline talks that allow for async consumption
anssik: Nikhil & daniel_smilkov do you all think that's a good topic?
Action: Nikhil and daniel_smilkov to put together abstraction talk proposal
anssik: I'm trying to figure out which of the topics would make good material for a virtual workshop
anssik: I'd expect us to try to cover 30-50% of the content. We would have four 90-minute sessions to discuss these topics
anssik: for example, have folks from Google and XLA answer questions and have a discussion around these topics
anssik: if you can, just vote for which talks you think should be created for offline consumption
anssik: for now just add a comment of support, if necessary I'll create a more formal voting process
anssik: we have settled on Zoom + Slack for infra
<anssik> Zoom @ W3C
<anssik> anssik: On that page there are tips on how to mitigate some of the known security/privacy issues raised recently
<anssik> anssik: Slack has been used in past workshops with success; it is more accessible to people than IRC
<anssik> ... I will set up our next CG call with Zoom so we can dogfood the system and iron out any kinks
anssik: we're trending for a time in June
Adjourn
<anssik> anssik: thanks Greg for scribing!!