13:20:54 RRSAgent has joined #webmachinelearning
13:20:54 logging to https://www.w3.org/2018/10/26-webmachinelearning-irc
13:21:02 Zakim has joined #webmachinelearning
13:21:16 RRSAgent, make logs public
13:21:37 Meeting: Machine Learning for the Web CG F2F at TPAC 2018
13:21:44 Chair: Anssi
13:21:57 Agenda: https://github.com/webmachinelearning/meetings/blob/master/2018-10-26-lyon/README.md
13:23:14 Bryan has joined #webmachinelearning
13:23:58 Present+ Anssi_Kostiainen, Ningxin_Hu, Greg_Whitworth, Eric_Siow, David_Singer, Tomoyuki_Shimizu, Myles_Maxfield, Bryan_Bernhart
13:24:16 ningxinhu has joined #webmachinelearning
13:24:21 RRSAgent, draft minutes v2
13:24:21 I have made the request to generate https://www.w3.org/2018/10/26-webmachinelearning-minutes.html anssik
13:24:43 the webml examples URL: https://huningxin.github.io/webml-examples/
13:26:35 myles_ has joined #webmachinelearning
13:26:42 Greg Whitworth, Microsoft
13:26:46 scribenick: myles_
13:27:07 Ningxin Hu, Intel
13:27:08 Myles C. Maxfield, Apple
13:27:13 takio has joined #webmachinelearning
13:27:53 present+ myles_
13:28:00 Present+ Tatsuya_Igarashi
13:28:02 Present+ Takio_Yamaoka
13:28:24 Present+ gregwhitworth
13:28:45 anssik: it has been a long week. This is the last meeting possible at TPAC. It's an exciting topic!
13:28:47 Present+ Barbara_Hochgesang
13:28:54 anssik: let's start in a couple of minutes.
13:29:05 Present+ Yang_Gu
13:29:09 yang_gu has joined #webmachinelearning
13:29:31 Eric has joined #webmachinelearning
13:29:50 igarashi_ has joined #webmachinelearning
13:30:55 anssik: who was at the breakout session on Wednesday?
13:31:12 anssik: who was not there?
2 people
13:31:25 Topic: Welcome, Introductions
13:31:47 dsinger has joined #webmachinelearning
13:32:35 BarbaraH_ has joined #webmachinelearning
13:32:36 anssik: on Wednesday we had a breakout session. Most of you saw the demo. We saw a couple of slides about the findings based on the Chromium implementation. To recap: the performance is pretty close to native, so we can implement the use cases on the web that require low latency. So you saw object recognition, human pose estimation, and image classification as examples of use cases that require low latency.
13:32:55 anssik: we had discussion around the scope of the work briefly. The idea was to use more of this meeting to go into more detail.
13:33:08 anssik: Ningxin has prepared an API proof of concept proposal based on his implementation work
13:33:41 anssik: Ningxin has reviewed the existing platforms that provide native APIs for doing inference: Android, macOS, Windows, and we were going to review the API mapping table.
13:34:12 anssik: I would like to connect you guys with each other so we can continue work outside of this meeting. It's unreasonable to expect us to be at our best on Friday afternoon and make progress during these two hours.
13:34:16 anssik: Let's get started.
13:34:31 anssik: let's start with the review of the charter.
13:34:40 -> https://webmachinelearning.github.io/charter/ Charter
13:35:06 anssik: this is the charter. With a bunch of you and with the public, we reviewed it over the course of the last month.
13:35:20 anssik: the goals section is there. We want to keep it simple. The goal is to define a low-level API for machine learning, specifically for inference.
13:35:52 anssik: the constraint: we are not going to define an API that doesn't work across major platforms. So we are fairly tightly scoped. We had some questions during the breakout session about that. We don't want to discriminate against any platforms.
13:36:15 anssik: Details: The Web API allows constructing a neural network computational graph
13:36:25 anssik: Can compile a neural network to native form
13:36:32 anssik: and can accept input data from somewhere
13:36:43 anssik: from various sources (array buffers, media streams, etc.)
13:37:02 anssik: We list the platform constraints that we have.
13:37:26 anssik: Android Neural Networks API, Windows DirectML, macOS/iOS Metal Performance Shaders and Basic Neural Network Subroutines
13:37:34 anssik: there are privacy implications.
13:37:43 anssik: we take that seriously, and we document them here.
13:38:37 anssik: in scope is inference, out of scope is training. This is because of practicalities: platforms don't expose training facilities. Also we don't expose any hardware facilities. We are not interested in doing overlapping work, so we don't re-invent the wheel.
13:38:48 anssik: we coordinate with WebGPU, WebGL, and WebAssembly
13:39:04 anssik: Some people here are from those communities, that's great
13:39:18 anssik: out of scope: We don't attempt to mandate a model schema or format.
13:39:24 anssik: There are other groups that will do that.
13:39:32 wseltzer has joined #webmachinelearning
13:39:33 anssik: any questions?
13:39:35 present+
13:39:51 anssik: Deliverable: Web Neural Network API
13:40:17 present+
13:40:23 anssik: We also work with WebRTC for MediaStream for providing input for inference. Also audio, and devices and sensors.
13:40:34 anssik: Google proposed coordination with the immersive web working group and community group.
13:40:41 anssik: they want to use our work, e.g. object recognition
13:40:52 -> https://webmachinelearning.github.io/charter/ Charter
13:41:02 Present+ Michael_McCool
13:41:19 Michael_McCool: Please add the Web of Things group. We are looking at virtual services. There might be a service that accepts an image and produces JSON.
13:41:36 anssik: Thanks. This charter is on GitHub, please open an issue.
We'd need review to add a normative change, but this doesn't sound normative.
13:41:44 anssik: we will integrate your proposal
13:42:15 barbarah: Is this only working groups or are you also working with the video interest groups? Video is important for machine learning. They may want collaboration
13:42:23 anssik: definitely.
13:42:23 anssik: this is informative.
13:42:36 anssik: the rest of the charter is less exciting.
13:42:48 anssik: We work on GitHub.
13:43:10 anssik: Consensus-based decisions. I'm the initial chair, but I'm happy to share the workload. Please get in touch if you want to help me chair
13:43:28 anssik: We start with the tight scope, but we have a plan to expand the scope; it requires a 30-day vote with 2/3 support.
13:43:45 anssik: If someone proposes expanding the scope, we want to make sure the community agrees with that. Any questions?
13:44:01 Michael_McCool: Are you deciding on deliverables?
13:44:03 anssik: yes.
13:44:11 Michael_McCool: It's a browser-based API?
13:44:12 anssik: yes.
13:44:41 Michael_McCool: W3C doesn't usually do server-side things. I was thinking about edge computing, where machine learning is applicable. It's a reasonable thing to do there.
13:44:58 Aritamk has joined #webmachinelearning
13:44:59 dsr: If you could reach out to the node.js community, that would be good.
13:45:05 dsr: Please coordinate with them
13:45:06 -> https://www.w3.org/community/webmachinelearning/ CG home
13:45:09 Michael_McCool: Also test cases too
13:45:28 Michael_McCool: If we can move this in the server-side stuff, that's good
13:45:40 gregwhitworth: When you say server-side, you mean JS right? If so, then it's in-scope
13:45:48 dsr: It duplicates browser APIs
13:45:54 gregwhitworth: it doesn't actually say browser.
13:45:57 Michael_McCool: It shouldn't
13:46:39 ningxinhu: Regarding the current POC API, it's JS, so it's pretty straightforward in node.js. We also put in the inputs and outputs.
Today, array buffers are standard, but in the future, we will do media streams
13:46:46 Michael_McCool: There are security concerns
13:47:05 anssik: It's good if this was implementable in node. But the browser is the primary target, and node is secondary.
13:47:38 ningxinhu: For node, because it's a different security model, it can access native code; some vendors already provide a solution to expose their native API to node.
13:47:46 Michael_McCool: We're already seeing it for native APIs
13:49:00 Present+ Jungkee_Song
13:49:22 helena has joined #webmachinelearning
13:49:34 present+ helena
13:50:09 Present+ Wendy_Seltzer
13:51:17 Present+ Helena_Rodriguez
13:51:39 Present+ Mark_Arita
13:51:52 Present+ Dave_Raggett
13:52:51 -> https://www.w3.org/Data/events/data-ws-2019/ W3C Graph Data Workshop
13:53:53 anssik: thank you.
13:53:58 anssik: we know who we are
13:54:01 TOPIC: Target demographic, use cases and requirements
13:54:23 anssik: In a discussion with gregwhitworth: What is the target audience?
13:54:41 anssik: We want to have a good design, so we need to understand the user. Use cases and requirements. Let's discuss it.
13:55:00 anssik: Expectation: It's a low-level API; users would be machine learning framework library authors. We want to understand their needs.
13:55:20 -> https://github.com/webmachinelearning/meetings/issues/1 WebML use cases by Tomoyuki/KDDI
13:55:21 anssik: Tomoyuki contributed a document as a starting point for the discussion, which describes other APIs and use cases for the framework.
13:56:36 Tomoyuki: Previous cases for high level use cases. That could clarify what kind of applications can be built on top of WebML. These are application examples. The first and second are strongly related to the demo. First: person detection.
Recent image recognition can recognize what kind of objects are in a picture frame: human or otherwise. We can detect where the person is in the image.
13:56:46 Tomoyuki: Second: skeleton detection
13:56:55 Agenda: https://github.com/webmachinelearning/meetings/tree/master/2018-10-26-lyon
13:57:16 anssik: One observation: On the client, you can do this depersonalized, without sending the data. Consider being at home with expensive objects; you don't want to tell the world you have these expensive objects
13:57:34 ningxinhu: You also need per-pixel segmentation to know which pixels are okay and which aren't
13:57:46 anssik: this is a good example
13:58:03 ningxinhu: this is like photo uploading, but this doesn't require real time processing.
13:58:23 Tomoyuki: it requires depersonalization. This depersonalization needs to be done before uploading.
13:58:32 ningxinhu: So, you can get a preview of what will be uploaded.
13:58:41 ningxinhu: Removing the background is useful for live streaming
13:58:54 Michael_McCool: If we want to do it for live streaming, the performance requirements become higher
13:59:02 dsr: Also blurring background
13:59:25 Barbara: High level comment: These are application capabilities or features. I don't see them as use models. Is there any industry or type of application that would utilize that?
13:59:47 Barbara: Would commerce applications use this?
14:00:12 anssik: It would cut across everything. It's like a JS library. There are libraries like TensorFlow.js that provide higher-level abstractions for web developers to enable implementing this use case easily.
14:00:23 ningxinhu: One is a social network.
14:00:23 gregwhitworth: ???
14:00:33 ningxinhu: Social network website.
14:00:39 gregwhitworth: is there a specific social network?
14:00:49 anssik: it's a real problem that people upload photos to social networks without the consent of others
14:01:30 gregwhitworth: I don't want to tangent too much, but I would love to know who would actually use this. Who is ready to consume this API? Which businesses are waiting for it?
14:02:12 aritamk: Google Photos used to do image recognition, so you upload them to Google Drive and it does face detection across all your photos. Also, Amazon sold image recognition capabilities for monitoring people
14:02:26 Michael_McCool: Doing it on the client has advantages.
14:02:42 Which companies would this actually support their business models?
14:02:48 Barbara: Who would utilize this?
14:02:56 Barbara: Otherwise it's a science project
14:03:14 gregwhitworth: I'm working with our teams on this. They aren't chomping at the bit for this. I want to hear Facebook saying that they really want this.
14:03:30 Michael_McCool: If the government says you can't share photos of people without their consent, suddenly everyone will want this
14:03:40 gregwhitworth: I don't want to spend my engineering resources if that isn't a reality
14:03:52 anssik: We need to have outreach to these companies so they can help influence the design
14:04:31 Michael_McCool: Instead of person detection, let's do video conferencing. Gesture detection: raising your hand in a meeting. Do you need to run multiple workloads? Do you need to have a queue, a single accelerator, or multiple workloads?
14:04:57 Aritamk has joined #webmachinelearning
14:04:57 Michael_McCool: If you have a single accelerator, this becomes an issue, or you may need a queue manager. But here you need social media and video conferencing, rather than technical things.
14:05:09 anssik: This is Tomoyuki's initial contribution, which is great.
14:05:15 anssik: let's move this to GitHub.
14:05:25 anssik: maybe we can move this into the spec for "these are the problems we want to solve"
14:05:36 do we have any existing client-side applications that are using native?
14:05:52 Antialiasing is now shipping with ML-trained models. Hardware vendors would love to see the deployment.
14:06:06 Michael_McCool: Texture generation - running it backwards
14:06:13 gregwhitworth: who wants antialiasing? gaming?
14:06:13 yes
14:06:35 anssik: Re-emerging interest to investigate web-based gaming. There's a workshop.
14:07:07 anssik: We discussed skeleton detection. Video conferencing is an easy to understand, valuable use case. Background removal, raise your hand, etc.
14:07:14 Michael_McCool: There are multiple workloads per application
14:07:24 ningxinhu: Communication with AI to do sign language recognition.
14:07:43 ningxinhu: Currently their solution is based on [inaudible] so we would like to see if they could use this.
14:07:46 dsinger has joined #webmachinelearning
14:08:23 Skeleton detection is also useful to detect the form of surfaces, e.g. in the fashion industry. When you have a scarf, it takes a special position that helps them to put ads around the fashion piece
14:08:52 Michael_McCool: Image generation is a technical area, not a use case. We need to say something like "gaming" instead.
14:08:59 anssik: Maybe we can put this in the wiki and massage it.
14:09:14 anssik: low-level use cases. These are by definition more like capabilities.
14:09:54 Additional use case options are retail/commerce, healthcare, marketing/advertising, fraud detection, online search, natural language processing, and audio.
14:10:27 Tomoyuki: Three examples: Sometimes neural network developers use extensions on their frameworks like TensorFlow. One example: a custom layer. We sometimes face a kind of layer that is not permitted in the framework, so we often want to know how to extend beyond the layers already permitted in the existing framework.
14:10:31 anssik: comments?
14:10:32 Review the use cases on which ones have a web value.
14:11:44 ningxinhu: This is important. The machine learning community develops fast; new operators come year by year. The idea here is we propose some operators that are in scope, so those might be well-optimized to existing hardware or platforms. That's for the existing support of operators. For new operators, we can coordinate with other web APIs like WebAssembly or the WebGPU compute shader to allow developers to implement a custom layer in a programmable way and connect the graph built by WebML to WebGPU.
14:11:52 q+
14:12:23 ningxinhu: Cross hardware boundary issues exist. For our implementation we saw this kind of combination in native code, e.g. Apple's Metal Performance Shaders combined with Metal on the same GPU.
14:12:36 Present+ Sangwhan_Moon
14:12:44 ningxinhu: We saw some of it there. But on the web we know we need to expose this capability as well as expose performance cliffs.
14:12:53 t_homma has joined #webmachinelearning
14:13:31 dsinger has left #webmachinelearning
14:13:33 Michael_McCool: This drives toward an architecture decision. Is there a compilation tool? Is there a machine-independent compiled model? Or do you specify it on the fly in the API? If you want performance, you probably want a precompiled model. There probably should be a statement saying you want to maintain performance
14:13:42 Precompiled will be tricky to maintain performance
14:13:54 Michael_McCool: You want custom code to not blow up performance.
14:14:04 Michael_McCool: If you're going to lose performance a lot on some platforms, we need to know early
14:14:34 ningxinhu: simd.js is an example. We want to let the developer test whether or not there is native support.
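[scribe note: ningxinhu's simd.js-style feature detection could look something like the sketch below. The navigator.ml / getNeuralNetworkContext() names come from the POC sketch; whether and how detection would be exposed is an open question, not anything specified.]

```javascript
// Hypothetical feature-detection check, in the spirit of the simd.js
// precedent: probe for the POC's entry point before relying on it.
// Takes a navigator-like object so it can run outside a browser.
function detectWebML(nav) {
  if (!nav.ml || typeof nav.ml.getNeuralNetworkContext !== "function") {
    return { supported: false, context: null };
  }
  // A real page would also probe capabilities (supported ops, data types).
  return { supported: true, context: nav.ml.getNeuralNetworkContext() };
}
```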
14:14:39 Bryan has joined #webmachinelearning
14:14:46 Feature detection? Should we do it?
14:14:56 Michael_McCool: Compiling a language is a good strategy.
14:15:08 Which use cases and API are the target for short term versus long term based on developers' needs? That will help drive the MVP with an enhancement roadmap.
14:15:14 Sean: Out of the devices, which are programmable?
14:15:20 ningxinhu: API should be hardware agnostic.
14:15:33 VPU, FPGA?
14:16:04 Michael_McCool: There are 2 ways: 1) a full compile from scratch, or 2) use an FPGA
14:16:13 Michael_McCool: ASICs
14:16:29 If you have discovery in a low-level API, how do you push that down to the device?
14:16:56 Sean: You could throw in a capability "is this programmable or not"
14:17:12 ningxinhu: for the GPU case, WebGPU has a shader language.
14:17:25 anssik: Network concatenation
14:18:09 Tomoyuki: Some recent deep neural network models use module concatenation. For example, many network modules like MobileNet, ResNet, etc. Most neural network developers have insufficient time for training. We can reduce the training time by importing pre-trained models.
14:18:16 Tomoyuki: That gives us image feature extraction.
14:18:26 Tomoyuki: The following layer can be trained according to our use cases.
14:18:40 ningxinhu: is it really for training or for inference?
14:18:59 Tomoyuki: The developing process is just training, but the result is the trained model. Trained modules and custom modules [inaudible]
14:19:16 Media use case potential - how do you find the right content to serve your audience quickly? The Media & Entertainment IG would have more insights.
14:19:17 ningxinhu: After training, when you want to deploy, is there a concatenation layer? Or is it already solved when training?
14:19:27 ningxinhu: Network concatenation is during training or inference?
14:19:30 Tomoyuki: Both.
14:19:53 Tomoyuki: In the inference phase, we can either prepare the concatenated model or two pretrained models separately.
So the developer can select either of them
14:20:01 ningxinhu: So it's possible in inference
14:20:15 Tomoyuki: We can also change the models. A large pretrained model, or a small pretrained model for low-bitrate networks.
14:20:34 myles_: will there be built-in models?
14:21:03 anssik: The Shape Detection API can use built-in models. They're not in scope. But it's an interesting idea. We haven't explored it.
14:22:12 gregwhitworth: Usually use cases revolve around 7 or 8. I've never heard of anyone asking for antialiasing. This is one of the major hurdles in Edge. We talked to Office; they are hundreds of megs big. What is the benefit to the end user? I'd like to narrow down to specific models and specific use cases for those models. Faces, bar codes, etc. Let's have a small set of pre-canned models.
14:22:23 gregwhitworth: I don't want long load times.
14:22:23 anssik: There's no conflict there.
14:22:34 RESOLVED: We should look into pre-canned models.
14:22:52 sangwhan: Google proposed "layered APIs" with known names. So that could work
14:23:00 sangwhan: "std::facedetection"
14:23:21 Who owns the models? Would loading your own model be secure?
14:23:37 sangwhan: W3C, presumably
14:23:54 ReinaldoFerraz has joined #webmachinelearning
14:24:23 Michael_McCool: Pretrained models allow hardware accelerators. Caching is useful. These names won't change; that helps caching. We will have cross-site problems.
14:24:32 gregwhitworth: Caching is by-origin anyway, even if they have the same name.
14:24:46 What's the type of the file?
14:24:49 sangwhan: undefined.
14:24:59 gregwhitworth: We are purposely putting that off until after this.
14:25:53 ningxinhu: Experience in the POC focuses on mobile, small models. We try to cache the model with a service worker. TensorFlow.js introduces a "web-friendly" model format, which is a JSON-based topology description and small-sized files.
to try to fit into the cache
14:26:09 gregwhitworth: The ones I'm worried about are things like spell checking
14:26:57 gregwhitworth: Because of blind training, are you referring to not doing full training, but slight pivots? Because not having access to [inaudible] makes it difficult to do it. Am I not able to say "I want the pre-canned one but I will provide 400k with different weights"?
14:27:09 ningxinhu: Like loading a delta on top of a model?
14:27:14 gregwhitworth: yes
14:27:37 ningxinhu: This is transfer learning. You might need the on-device model to do that.
14:27:40 gregwhitworth: We want to do it.
14:28:06 sangwhan: For fine-tuning, you will need to expose gradient propagation, because that's also a propagation.
14:28:32 Michael_McCool: In some cases you might want floating point, other places you might want 8-bit; it's a tradeoff between size and quality. So each model should have multiple versions.
14:28:53 ningxinhu: Regarding these capabilities: in the POC, some native APIs support it, but not all.
14:29:25 gregwhitworth: I don't want to over-index on what native APIs do and don't support. I'd rather say "here are the use cases, and V0 is just some demos" and then native implementations can go further.
14:29:45 Michael_McCool: Some devices can only support certain quantizations, 8-bit or 32-bit. There may be a constraint on what is supported on each device.
14:29:54 sangwhan: This is a discussion about formats, not about weights
14:30:04 Michael_McCool: It directly affects the size of the file.
14:30:14 gregwhitworth: It would be valuable to have that even if it's a V2 thing.
14:30:32 yang_gu has joined #webmachinelearning
14:31:13 Tomoyuki: Some devices can do GPU acceleration. Others cannot support GPU acceleration and only run on the CPU. Some web developers care about battery consumption, so they offer image recognition only on accelerated devices.
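[scribe note: the model caching ningxinhu described above (service worker plus a web-friendly JSON-topology-and-small-files format) can be sketched as a memoized loader. This is a plain in-memory stand-in, not the Cache API; the fetchFn parameter and URL are hypothetical.]

```javascript
// Memoized model loader: fetch a model file once, reuse it afterwards.
// Caching the promise (not the resolved value) means concurrent calls
// for the same URL share a single fetch. In a real page this role is
// played by a service worker and the Cache API.
function createModelLoader(fetchFn) {
  const cache = new Map();
  return function loadModel(url) {
    if (!cache.has(url)) {
      cache.set(url, fetchFn(url)); // e.g. topology JSON + weight shards
    }
    return cache.get(url);
  };
}
```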
14:31:58 Michael_McCool: There's a whole other category of what applications that use multiple models do. On a computer, you can use the CPU and GPU at the same time. You can support two use cases at the same time. There's another use case of multi-model.
14:32:08 gregwhitworth: Performance is super important
14:32:26 The CPU and GPU may already be super busy. Developers need control over low-level device usage
14:32:38 Michael_McCool: Support for multiple workflows is important.
14:32:48 sangwhan: Most applications use CUDA, which locks - so multi-task is probably hard
14:33:24 Michael_McCool: There's a CUDA vs graphics issue. We already have APIs that have been extended recently to support multiple applications, when more than one device can handle the workloads. We can argue whether that is necessary or not.
14:34:03 ningxinhu: There are multiple hardware use cases. We have to select the hardware: CPU, GPU, MPU. We allow the developer to specify the preference of where they want it to run.
14:34:33 ningxinhu: GPU and CPU both have pros and cons, and the site can serve different models depending on where the model will be run.
14:34:52 anssik: Does the current scope capture the needs of the target audience?
14:35:10 anssik: Is this the tightest possible scope that we can start with?
14:35:35 Michael_McCool: Is it possible to precompile the model?
14:36:02 gregwhitworth: The POC is a POC, and the answer is not yet defined. It's desirable for web developers, but you want both.
14:36:08 gregwhitworth: There will be divergent opinions.
14:36:19 t_homma has joined #webmachinelearning
14:36:35 Barbara: If you look at the use cases, which ones fit into the MVP vs an extension? Don't try to boil the ocean. MVP please.
14:37:08 Michael_McCool: We could decide this now. We could choose either way
14:37:14 gregwhitworth: It shouldn't be in the charter.
14:37:21 anssik: The group decides what is MVP out of this scope.
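[scribe note: ningxinhu's point about serving different models depending on where they will run can be sketched as a selection step before compilation. The preference names loosely echo NNAPI-style low-power / fast-single-answer hints; the catalog entries and URLs are made up for illustration.]

```javascript
// Pick a model variant to download based on a device/power preference.
// All names here are illustrative, not from any spec.
function pickModel(preference) {
  const catalog = {
    // small quantized model: cheaper, lower fidelity
    "low-power":   { url: "mobilenet-8bit.bin", device: "cpu" },
    // larger float model: faster answer on an accelerator
    "fast-answer": { url: "mobilenet-f32.bin",  device: "gpu" },
  };
  // Fall back to the conservative choice for unknown preferences.
  return catalog[preference] || catalog["low-power"];
}
```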
14:37:27 anssik: We can add more bits
14:37:36 anssik: So it sounds like we can start with this scope.
14:37:53 TOPIC: Review mapping to platform APIs
14:38:05 anssik: Ningxin has published a mapping table.
14:38:09 -> https://github.com/intel/webml-polyfill/blob/master/docs/native_mapping.md Mapping to platform APIs
14:39:27 ningxinhu: Before we go through the table, this is an overview. Convolution, depthwise convolution, concatenation. ADD and MUL, RESHAPE, SOFTMAX. We will support them by fusing into convolution or others. Of the 9 operators, we have some support for models in the POC.
14:40:46 ningxinhu: Examples: TFLite models exist. Our example grabs these models, loads and parses them, and uses the web API to construct the graph and run inference. SqueezeNet because it's small-sized. Also we tried a large model like Inception V3; its size is big, but we can get a good speedup for the computation. For object detection, we have SSD MobileNet. Also we have TensorFlow.js models like MobileNet and PoseNet. Also ONNX models.
14:40:59 ningxinhu: MobileNet V2 and SqueezeNet.
14:41:38 ningxinhu: We've only implemented 9 operators so far, so we tried these models; underneath, the implementation maps these operations to native APIs. We have implementations supporting the ops with MPS and BNNS on macOS, NNAPI on Android, and clDNN on Linux and Windows.
14:41:45 ningxinhu: clDNN is from Intel.
14:41:53 ningxinhu: We would like a DirectML POC very soon.
14:42:00 ningxinhu: that's the overview.
14:42:41 ningxinhu: It's driven from the models. There are different ecosystems' models. We start with some small models optimized for mobile, then add the necessary ops to support these models. And implement them across different APIs to get performance data.
14:43:33 ningxinhu: We have data! To map the data types: Float32 and Float16 is what we have looked at so far. NNAPI doesn't support Float16; everybody else supports both.
14:44:31 ningxinhu: For convolution, we have how the operator can be mapped to the different APIs. Input, filter (aka "weights"), and bias. There are some differences between the APIs; the notes are in this chart in red.
14:44:48 ningxinhu: Also for stride and fused activation, dilation rate, and output
14:45:01 ningxinhu: This is just one case for convolution. For the other ops, we have the same kind of data in the table.
14:45:37 anssik: We made this to satisfy the requirement that the API is implementable on top of platform APIs. We did the work so you don't have to.
14:45:44 anssik: if you spot issues, please let us know.
14:46:24 anssik: We don't have a formal starting point spec. Instead, we have POCs and sketch APIs.
14:46:28 TOPIC: Review & discuss Web Neural Network API spec proposals
14:46:36 anssik: ningxinhu has an API sketch proposal.
14:47:03 anssik: We have some ergonomic issues in this sketch proposal that we already know about
14:47:28 -> https://github.com/intel/webml-polyfill/blob/master/docs/api.md WebML API proof-of-concept
14:47:30 ningxinhu: Here's the example: Construct (tensor0 + tensor1) * (tensor2 + tensor3). 0 and 2 are constants; 1 and 3 are inputs.
14:48:38 ningxinhu: ::describes the chart in the doc::
14:49:02 ningxinhu: So first you need a neural network context. It's navigator.ml.getNeuralNetworkContext()
14:49:30 ningxinhu: to build the model, you can see the details on the site. Promises are involved.
14:49:42 ningxinhu: Then you specify the tensor type
14:50:23 ningxinhu: The API is typed, so you have to annotate types along with the data
14:50:42 ningxinhu: You can set operands like the upper value in the graph to describe the shape of the tensor. Later you can upload the data to the graph.
14:50:50 ningxinhu: We use an array buffer view to do the upload
14:50:58 ningxinhu: So it's just a scalar in this example.
14:51:24 ningxinhu: Tensor 1 is an input, so there's no data to upload. Then same for tensors 2 and 3
14:51:36 ningxinhu: Then you need to connect the operators together to define the graph.
14:52:07 ningxinhu: Then you add the operations, you specify the flow of the computation graph. ::works through the example::
14:53:41 myles_: Why isn't this a programming language?
14:54:22 tidoust has joined #webmachinelearning
14:54:38 ningxinhu: Someone has invented a language to do it, in Python. This is supposed to define a common model for hardware dispatch. In this model we just define the computation graph (a DAG), then you compile it to different hardware. The API allows you to define the workload
14:54:47 Present+ Francois_Daoust
14:54:56 Michael_McCool: Another option is to use a string instead of the graph.
14:56:26 Michael_McCool: Dynamic programming allows you to glue graphs together. Having a whole language is an option, but it has its own problems. OpenCL uses a string for the program, and therefore views the program as data so you can send it around. The graphs can be serialized to a blob
14:56:41 Michael_McCool: The fact that it's close to TensorFlow is valuable.
14:56:47 ningxinhu: It's close to native APIs
14:57:23 Michael_McCool: If you have an object that represents the graph, you can export and import operations
14:57:48 sangwhan: Why do we want an export API if we don't have a filesystem API?
14:57:54 sangwhan: each model will have a URL anyway
14:58:12 Michael_McCool: The serialized blob would be opaque.
14:58:54 Michael_McCool: You wouldn't have to define what's in the blob as long as there's interoperability
14:59:38 Michael_McCool: Okay, maybe that's not true
14:59:53 ningxinhu: You can also specify a device selection preference
15:01:08 ningxinhu: Execution model: it's like a tight loop. After you compile your code, you can upload your input data using an array buffer view. Lines like "execution.setInput(0, thingy); execution.setInput(1, thingy)"
15:01:19 ningxinhu: Your output can be an input for post-processing for WebAssembly code
15:01:40 Michael_McCool: If you wanted to support multiple devices, wouldn't the compilation need to know which device it will be run on?
15:01:42 ningxinhu: yes
15:02:03 ningxinhu: The last step: Do the work, returns a promise
15:02:11 ningxinhu: "execution.startCompute()"
15:02:35 ningxinhu: so that's it
15:02:47 Michael_McCool: You could do multiple computations in parallel using Promise.all()
15:02:51 ningxinhu: yes
15:03:57 [not willing to bikeshed on the method names for now, but I would expect a Promise from a method called "startCompute" to resolve when the computation has started, not when it's over. In other words, I'd simply call the method "compute" if the results of the computation are available when the promise resolves]
15:04:04 ningxinhu: Here, the output is an array buffer view, so you have to be careful with validation. You have to map your input data to the output data. Ideally the promise resolution would incorporate this, but that would create a new object, which is an issue. For the POC, we followed this simple design, but we're flexible.
15:04:51 1. Doesn't the current model suggest double memory allocation?
15:04:55 2. What is "fast single answer"? Is this going to be defined in the specification later on?
15:04:58 3. Choice of integer constants over string enums. Sad panda face.
15:05:01 4.
Choice of procedural programming practices for model definition seems inconvenient, especially given that this is pretty much for inference only.
15:05:04 5. Output layer being re-used isn't nice. Neither does it follow idiomatic JavaScript practices.
15:05:07 6. Plans to define/make consistent poorly defined ops? e.g. "EMBEDDING_LOOKUP"
15:05:11 7. Constructors for cases where it makes sense? (for initializations that probably won't reject)
15:06:14 sangwhan: When you allocate float32 arrays, they get copied to the GPU, right? Double allocation! Shouldn't those be other types?
15:06:52 ningxinhu: it's not an allocation here. Here you're just describing the graph. When you compile the model, it will then allocate device memory.
15:07:59 sangwhan: Embedding lookup is poorly defined. Will we define this?
15:09:12 ningxinhu: Our POC just calls into Android calls. We didn't define anything.
15:10:49 myles_: so it isn't interoperable?
15:10:56 gregwhitworth: we will define it in an interoperable way.
15:11:24 sangwhan: It's a lot of code. If we're just loading from a URL, we shouldn't have to write that much code.
15:11:35 ningxinhu: We want to follow the extensible web manifesto
15:11:47 to extend on what I meant - this is a rough API shape, it's not a spec - it's a POC so they want to show what they did - but in no way is this actually defined.
15:12:56 myles_: Is this the basis for a future API, or is this just something to show that it's possible on the web?
15:13:05 anssik: just something to show that it's possible on the web
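[scribe note: the POC flow ningxinhu walked through — build a graph for (tensor0 + tensor1) * (tensor2 + tensor3) with tensors 0 and 2 as constants, compile, then execute with input data — can be mocked in plain JavaScript. The method names (addOperand, setOperandValue, addOperation, identifyInputsAndOutputs) follow the POC doc; everything else here is a toy in-memory interpreter showing the dataflow, not an implementation, and string op names are used instead of the POC's integer constants.]

```javascript
// Toy graph builder/executor mirroring the POC's NNAPI-style shape.
const FLOAT32 = "tensor-float32"; // stand-in for a typed operand descriptor
const ADD = "add";
const MUL = "mul";

function createModel() {
  const operands = [];      // operand descriptors, indexed by position
  const values = new Map(); // constant data, keyed by operand index
  const operations = [];    // { op, inputs, outputs } records
  let graphInputs = [];
  let graphOutputs = [];
  return {
    addOperand(desc) { operands.push(desc); return operands.length - 1; },
    setOperandValue(index, data) { values.set(index, data); },
    addOperation(op, inputs, outputs) { operations.push({ op, inputs, outputs }); },
    identifyInputsAndOutputs(inputs, outputs) { graphInputs = inputs; graphOutputs = outputs; },
    // "Compilation" here just returns an interpreter over the recorded ops.
    compile() {
      return function execute(inputData) {
        const buf = new Map(values);
        graphInputs.forEach((idx, i) => buf.set(idx, inputData[i]));
        for (const { op, inputs, outputs } of operations) {
          const [a, b] = inputs.map((i) => buf.get(i));
          const result = a.map((v, i) => (op === ADD ? v + b[i] : v * b[i]));
          buf.set(outputs[0], result);
        }
        return graphOutputs.map((idx) => buf.get(idx));
      };
    },
  };
}

// (t0 + t1) * (t2 + t3), with t0 and t2 constant as in the example.
function buildExample() {
  const model = createModel();
  const desc = { type: FLOAT32, dimensions: [2] };
  const t0 = model.addOperand(desc);   // constant
  const t1 = model.addOperand(desc);   // input
  const t2 = model.addOperand(desc);   // constant
  const t3 = model.addOperand(desc);   // input
  const sum0 = model.addOperand(desc); // t0 + t1
  const sum1 = model.addOperand(desc); // t2 + t3
  const out = model.addOperand(desc);
  model.setOperandValue(t0, [1, 1]);
  model.setOperandValue(t2, [2, 2]);
  model.addOperation(ADD, [t0, t1], [sum0]);
  model.addOperation(ADD, [t2, t3], [sum1]);
  model.addOperation(MUL, [sum0, sum1], [out]);
  model.identifyInputsAndOutputs([t1, t3], [out]);
  return model.compile();
}
```

In the real POC the setInput/setOutput/startCompute steps are promise-based and write into caller-provided array buffer views; the synchronous return here is only to keep the dataflow visible.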