14:57:19 Meeting: WebML WG Teleconference – 17 November 2022
14:57:24 Chair: Anssi
14:57:28 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2022-11-17-wg-agenda.md
14:59:36 Present+ Anssi_Kostiainen
14:59:42 Regrets+ Dominique_Hazael-Massieux
15:01:07 Present+ Jonathan_Bingham
15:01:53 Present+ Dwayne_Robinson
15:01:59 Present+ Zoltan_Kis
15:02:12 Present+ Ningxin_Hu
15:02:25 Present+ Rafael_Cintron
15:02:25 special guest: Eugene Burmako, who will co-present with Stella
15:02:43 Present+ Eugene_Burmako
15:07:05 Topic: XLA, OpenXLA Project, StableHLO
15:07:24 Slideset: https://lists.w3.org/Archives/Public/www-archive/2022Nov/att-0001/XLA-Stella-Laurenzo.pdf
15:08:11 [slide 1]
15:08:48 [slide 2]
15:09:13 Present+ Chai_Chaoweeraprasit
15:10:29 [slide 3]
15:17:43 [slide 4]
15:21:11 [slide 5]
15:23:48 [slide 6]
15:24:41 [slide 7]
15:28:05 [slide 8]
15:29:41 [slide 9]
15:30:11 [slide 10]
15:31:46 q+ to ask about StableHLO adoption by compiler projects
15:32:07 ack anssik
15:32:07 anssik, you wanted to ask about StableHLO adoption by compiler projects
15:32:59 anssik: StableHLO adoption in XLA compiler and IREE, which compiler is the leading adopter?
15:33:32 Eugene: re main target for StableHLO, we provide equal support for these two compilers
15:34:22 ... both take StableHLO input and can reflect all the functionality in XLA, cannot speak for IREE personally, but we're closely working with that team on new interesting applications
15:34:39 ... Stella could confirm we can cover ~95% of use cases
15:35:22 zkis has joined #webmachinelearning
15:36:04 anssik: do we still have an experimental compiler that interacts with StableHLO?
15:36:43 Eugene: I wish Stella could speak for this topic, IREE is addressing some of the most important emerging ML use cases
15:38:19 ... as for consumers of StableHLO, there's also TFLite
15:39:08 ... our aspirations go beyond support in Google-initiated compiler projects
15:40:22 Chai: we are aware of XLA, we did some work in the TF codebase
15:40:47 ... on Msft side we haven't leveraged XLA HLO yet, but are familiar with it, moving to open governance is a great thing
15:41:09 ... in the process of defining the WebNN op set we also spent time looking at HLO so we can translate to HLO
15:41:21 ... when we start with a set of ops it must make sense elsewhere
15:41:41 ... looking at HLO op set is very helpful to understand how a reduced op set maps to actual models
15:41:49 ... thanks for your work on this
15:42:15 Chai: one question re interop with GPU
15:42:41 ... you mentioned XLA is used a lot inside Google, what is the model to interop with the GPU stack? E.g. WebGPU with Project Dawn, how is that supported?
15:43:00 Eugene: currently HLO is focused on datacenter usages
15:43:25 ... this is the most tested path, we have aspirations to expand beyond servers but cannot speak for details there
15:44:03 Chai: Computer Vision running on the server usage, this will eventually touch the graphics stack, how does that work with HLO?
15:45:04 Eugene: within Alphabet we support multiple frameworks, TF, JAX, PyTorch etc. if involved with XLA compiler, JAX operated on it, the framework has to produce HLO to feed into compiler
15:45:50 zkis_ has joined #webmachinelearning
15:45:57 ... TF has graph IR, stored, loaded, then we have TFXLA bridge that translates the graphs and compiles to XLA
15:46:14 ... a bunch of ops cannot be compiled to XLA
15:46:35 ... PyTorch uses a lazy tensor tracing based mechanism to transform Python programs into HLO
15:47:39 ... within the compiler we have target independent optimizations, starting with simple things to advanced optimizations
15:47:48 ... GPU compiler is available publicly, now in TF will be a separate project soon
15:48:17 ... XLA GPU architecture on the high level, we also rewrite this arch to use MLIR more and more
15:48:22 Github has joined #webmachinelearning
15:48:34 ... MLIR is very influential tech we love here at Google
15:49:11 ... to recap, we support many frameworks with the same interface
15:49:14 q?
15:49:27 q+
15:50:44 q+
15:51:10 anssik: StableHLO spec stability, timeline?
15:51:20 Eugene: targeting the first version of the spec EOY
15:51:32 ... calling it v1.0 or v0.9
15:51:43 ... feature complete EOY
15:52:02 ... active work in progress, we have half of it specced and the rest in open issues
15:52:33 ... not a perfect time to look at the spec right now, maybe beginning of 2023 is a good date for WebNN op set compatibility effort
15:52:41 -> StableHLO Spec draft https://github.com/openxla/stablehlo/blob/main/docs/spec_draft.md
15:52:46 -> StableHLO Spec open issues https://github.com/openxla/stablehlo/labels/Spec
15:52:52 -> StableHLO Spec index of ops https://github.com/openxla/stablehlo/blob/main/docs/spec_draft.md#index-of-ops
15:52:57 stlukey97 has joined #webmachinelearning
15:53:30 Eugene: areas of major development after 1.0: sparsity, uniform quantization
15:53:52 ... aligning ops between WebNN and StableHLO is a great idea, happy to support there
15:54:45 ningxin_hu: thanks Eugene, great presentation!
15:55:01 ... two groups looking to coordinate on the op sets makes a lot of sense
15:55:19 ... question re op sets, you mentioned op set categories, also usage on server and mobile
15:55:56 ... my impression is some ops are not so useful on mobile, distribution on multiple nodes, do you plan to have profiles in StableHLO e.g. for server and other classes?
15:56:25 Eugene: we are interested in having those profiles, haven't done a full study yet on this topic
15:56:40 ... we have this as a 2023 goal
15:56:57 ... I agree distribution ops have no usefulness on mobile
15:57:33 ningxin_hu: another question re compilers, you mentioned XLA and IREE compilers, for web usage we want to understand if these compilers support JIT?
15:57:51 ... on-device compilation would be an interesting feature for us
15:58:08 Eugene: in general non-server story currently is where we have less clarity, an area of active work
15:58:30 ... starting with server, XLA compiler is a JIT compiler predominantly
15:58:43 ... OTOH, IREE is ahead-of-time
15:59:22 ... on HLO side, a limitation is no dynamic shapes, no unknown tensor sizes, in StableHLO we address this, if we want to do AOT compilation with dynamic tensor sizes it becomes feasible, that is what IREE does
15:59:38 ... many practical programs are dynamically sized
15:59:49 ... on mobile we look at both AOT and JIT use cases
16:00:52 ningxin_hu: question re TFLite, WebNN works with TFLite Wasm version, in your slides you mentioned StableHLO is consumed as a flatbuffer schema, also TFLite delegate could leverage StableHLO
16:01:09 ... is TFLite a consumer of StableHLO?
16:01:33 Eugene: what I shared on the slide is as much as I could share on behalf of the TFLite team, WIP, a bit too early to share the details
16:01:34 q?
16:01:44 ack ningxin_hu
16:01:51 ack chai
16:02:08 chai: Eugene you said you want to push this for mobile too
16:02:29 ... WebNN is the op set for the web, so I think there's some synergy around mapping StableHLO on top of WebNN for mobile use case
16:02:40 ... this is an area of collaboration where we can connect the dots
16:04:16 anssik: we may want to add OpenXLA in the coordination for this WG's charter
16:04:23 Eugene: sounds great
16:04:30 +1
16:13:20 Thank you everyone for the invitation to present! It was great to meet you all, and I'm looking forward to collaborating.
16:13:47 For the meeting minutes, is there a way to propose some edits?
16:13:55 And I'll also share the slides shortly.
16:14:04 anssik: thank you Eugene for the presentation!
17:06:58 Hi Anssi! Not sure if my previous message reached the channel (got what seems to be an error message), but I emailed you the slides and proposed edits. 