12:27:11 RRSAgent has joined #webrtc
12:27:15 logging to https://www.w3.org/2023/09/15-webrtc-irc
12:27:18 Zakim has joined #webrtc
12:27:36 Meeting: WebRTC WG / Media WG Joint Meeting
12:28:51 tidoust has joined #webrtc
12:28:52 ryo has joined #webrtc
12:29:07 present+ Chris_Needham, Xiaohan_Wang, Francois_Daoust, Elad_Alon, Harald_Alvestrand, Eugene_Zemtsov, Jean-Yves_Avenard
12:29:09 RRSAgent, draft minutes
12:29:10 I have made the request to generate https://www.w3.org/2023/09/15-webrtc-minutes.html tidoust
12:29:57 RRSAgent, make logs public
12:30:14 present+ Hisayuki_Ohmata, Ryo_Yasuoka, Paul_Adenot, Rijubrata_Bhaumik
12:30:41 ohmata has joined #webrtc
12:32:15 Riju has joined #webrtc
12:32:52 Present+
12:32:54 Xiaohan_Wang has joined #webrtc
12:33:00 hta has joined #webrtc
12:33:02 Present+
12:33:05 present+ Cullen_Jennings
12:33:08 present+
12:33:12 eugene has joined #webrtc
12:33:22 present+ Eugene Zemtsov
12:33:31 present+ Hisayuki_Ohmata, NHK
12:34:01 q?
12:34:02 scribe+ tidoust
12:34:12 sprang has joined #webrtc
12:34:20 marcosc has joined #webrtc
12:34:27 ken has joined #webrtc
12:35:24 Present+
12:35:24 Slideset: @@
12:35:38 [Slide 2]
12:35:42 cpn: [goes through reminders]
12:35:54 [Slide 5]
12:36:13 s|@@|https://lists.w3.org/Archives/Public/www-archive/2023Sep/att-0014/WEBRTC-MEDIA-2023-09-15.pdf
12:36:50 [slide 6]
12:37:14 jean-yves: If I have audio related issues, may I raise them today?
12:37:26 cpn: We'll see how we manage the schedule, if time allows
12:37:33 [slide 7]
12:37:41 cpn: [reviewing tips]
12:37:44 Topic: Agenda
12:37:49 [slide 8]
12:38:06 topic: Introduction
12:38:17 [slide 10]
12:38:24 tetter has joined #webrtc
12:38:32 Bernard: Some background. Streaming and RTC converging in general.
12:38:50 ... Game streaming, broadcast with fan-out, perhaps to be called low-latency.
12:39:04 ... Point is to combine things at large scale.
12:39:17 ... WebCodecs combined with WebRTC Data channel.
12:39:25 ... We see this solved differently.
12:39:34 ... Raises concerns about duplication of efforts.
12:39:57 ... WebRTC encoded transform often used as a poor man's WebCodecs
12:40:10 ... Some things built into WebCodecs but not in WebRTC.
12:40:20 ... Also two distinct code paths in the browsers. That creates issues
12:40:22 [slide 11]
12:40:36 Bernard: Here are some examples of similar issues in both worlds.
12:41:06 ... Example of QP-based rate control issue in Chromium. We have it in WebCodecs, not in WebRTC.
12:41:48 ... [goes through other examples, including HDR support, encoding/decoding times]
12:42:21 ... These encoder/decoder APIs need to run across a huge range of hardware and platforms.
12:42:25 ... That's difficult to test.
12:43:46 ... Also differences in codec support, e.g., HEVC and AV1 with subtle differences.
12:43:46 ... And then support for SVC and simulcast.
12:43:46 ... Issues opened in WebCodecs.
12:43:51 [slide 12]
12:44:26 Bernard: Another question has come up in WebRTC: whether the goal is to support every desirable feature or to enable apps to build their own support?
12:45:07 ... Examples: In WebRTC streaming, interest in HEVC which is not in WebRTC (work in progress in WebCodecs). In music contexts, AAC.
12:45:25 ... Some of the use cases may be addressable with a combination of WebCodecs and WebRTC transport.
12:46:04 ... Unified encoder API, which Erik proposes. Under the covers, not a JavaScript API, but it illustrates some of the issues we're seeing that might benefit from being addressed in a more uniform way.
12:46:14 [slide 13]
12:46:28 Bernard: This is an example of an issue I discovered yesterday.
12:46:31 [slide 14]
12:46:52 Bernard: Look at the frame RTT.
12:46:55 [slide 15]
12:47:06 Bernard: The glass-to-glass latency is slightly larger.
12:47:27 ... Somewhere in the system, we're adding 200ms of delay and it's not due to network. That's in the browser.
12:47:30 [slide 16]
12:47:41 Bernard: Encoding latency is pretty low, that looks good.
12:47:44 [slide 17]
12:47:58 Bernard: But the decoding latency is excessive]
12:48:05 s/excessive]/excessive
12:48:11 ... That seems pretty weird.
12:48:30 ... Example of something that does not happen in WebRTC but happens in WebCodecs, and that needs testing.
12:49:20 Randell: Have you validated whether the bug is a decoder stack issue or due to the codec, AV1?
12:49:40 Bernard: It's not the API, something to do with the decode pipeline.
12:50:20 q?
12:50:59 Jan-Ivar: In Firefox, we now support VideoDecoder, feel free to give it a try.
12:51:22 Topic: QP-based rate control in WebCodecs
12:51:26 (this only works in Fx Nightly right now fwiw, so don't use a release build)
12:51:29 [slide 20]
12:51:57 eugene: Recent change in WebCodecs to allow apps to ask about bitrate mode and quantizer use.
12:52:30 ... Some AV1 specific option, which is why it appears in that specific part.
12:52:39 [slide 21]
12:52:56 eugene: I was able to create a demo.
12:53:19 ... which shows how to achieve desired bitrates.
12:53:48 ... Feel free to give it a try
12:53:50 [slide 22]
12:54:18 eugene: My point is that, even with the most basic algorithm, I was able to achieve pretty good results for bitrate control.
12:54:27 ... I think that makes it valuable.
12:54:43 ... Also, very quick response, frame-level response to changing conditions.
12:55:53 ... It allows working around bugs in GPU drivers. We see in Chrome that, sometimes, their rate control algorithms contain bugs.
12:55:58 ... It gives the ability to set lower bounds on image quality: "never give images lower than something", as no one likes pixelated images.
12:56:12 ... I encourage people to try.
12:56:33 q?
12:57:05 hta: Very interesting. I tried your demo. You don't touch resolution at all, is that correct?
12:57:08 eugene: Yes.
12:57:31 hta: I was impressed by the result. For that codec, that seems like a very useful mechanism.
12:57:58 ... May be room to harmonize between codecs.
12:58:57 Bernard: Very interesting exercise.
That's an example of how you can write a PR in WebCodecs that would require a complex process in WebRTC. Not everyone might want this in WebRTC. Lots of use cases to validate.
12:59:39 Randell: Issue is not only that QP values vary from one codec to another, but also between implementations of a given codec itself.
13:00:00 cpn: That variability, should we test it?
13:00:18 Randell: Yes. I would imagine hardware implementations could vary in their response as well.
13:00:55 eugene: Correlation between the bitrate becoming smaller and the resolution is the same regardless of the implementation.
13:01:26 Erik: In Chrome on Windows, we use this type of externally controlled per-frame QP.
13:02:07 eugene: Yes, this allows us to work around bugs in rate control, as I mentioned earlier.
13:02:27 ... I wanted to encourage other browser vendors to implement this as well.
13:03:55 Topic: Hardware Encode/Decode Error Handling
13:04:15 [slide 25]
13:04:30 Bernard: The related issues are listed here
13:04:37 [slide 26]
13:05:00 Bernard: Little bit of background for issue 146. You can get encode/decode error outside of SDP negotiation.
13:05:13 ... Slide lists examples of when that can happen.
13:05:37 ... You can switch from hardware to software and vice versa.
13:05:55 ... Also, we're seeing increasingly that some profiles are hardware-only.
13:06:04 [slide 27]
13:06:04 I don't understand the case when we get a parsing error for encoder. Someone point me the right way?
13:06:39 fippo: Some things that we can do.
13:06:50 ... [goes through the list]
13:07:07 ... More telemetry is always a good thing.
13:07:13 [slide 28]
13:07:30 fippo: For WebRTC, how are we going to expose the decode errors?
13:07:46 ... [goes through list of options in the slide]
13:08:12 ... We need to be precise about where we want to expose the event
13:08:15 [slide 29]
13:08:28 Bernard: Two main directions to go. This is proposal A.
13:08:35 ... Reuse RTCError event.
13:08:55 ...
You can see in the dictionary and enum that we can list a number of reasons.
13:09:26 ... Might be a good idea to add in the timestamp so that you know when something happens.
13:09:30 [slide 30]
13:09:39 Bernard: Proposal B would be to create a custom event.
13:09:50 ... Some sketching in the slide on how that might work.
13:10:39 ... Just wanna get some feedback on which one of these proposals makes sense to people.
13:11:18 fluffy: Supportive of this either way. When we talk about parsing error for encoder, I wonder what that can be.
13:11:26 Bernard: More a decoder thing indeed.
13:11:37 ... Parse error.
13:11:45 fluffy: OK, regardless, much needed.
13:12:00 ... No preference from me.
13:12:19 Henrik: I prefer proposal B because I think that the error is different enough from other errors.
13:12:37 q+
13:12:37 ... Rather than an unsigned short error number, we should have an enum.
13:12:43 Bernard: Yes, we can do that.
13:13:18 q?
13:13:22 florent: [missed]
13:13:25 ack hta
13:13:37 hta: I also prefer proposal B, to avoid coupling.
13:14:14 cpn: The naming here is all RTC specific. If we were to introduce that in WebCodecs, we might need a more general name.
13:14:52 Bernard: Instead of RTCRtpSender or RTCRtpReceiver, we might want to use Encoder/Decoder.
13:15:25 [maybe inherit from ErrorEvent rather than Event? and thus move the error-specific info into the error attribute?]
13:15:50 Henrik: One event handler per decoding could be used.
13:16:03 [slide 31]
13:16:48 Bernard: WebCodecs does have errors. EncodingError for errors about data and OperationError for resource issues.
13:17:08 eugene: Done spec-wise. In Chromium, nothing done.
13:17:35 ... It would be nice to make this recommendation more explicit in the spec so that people know what to expect.
13:18:00 Topic: New Video Encoder API
13:18:08 [slide 34]
13:18:37 Erik: This is the view of how it works today in Chrome for WebRTC. Huge entangled ways of doing things.
13:18:53 ...
The most important thing is scalability.
13:19:09 ... In the end, that is implemented in the WebCodecs wrapper.
13:19:15 [slide 35]
13:19:38 Erik: Plan to do an overhaul of the internal WebRTC video encoder API.
13:19:56 ... Anything realted to RTP/Transport, we want that to be external.
13:20:00 s/realted/related
13:20:17 ... And we want everything to be asynchronous.
13:20:22 [slide 36]
13:20:45 Erik: We think that's a good opportunity to align WebCodecs and WebRTC to avoid duplicate code.
13:20:54 [slide 37]
13:21:05 Erik: What I would like to see is in this slide.
13:21:23 ... One scalability controller in WebRTC.
13:21:47 ... If you want to do that yourself with WebCodecs, you can.
13:22:21 [slide 38]
13:23:23 Erik: Things we'd like to solve include codec selection, flexible reference structures, minimizing codec-specifics as much as possible, and rate control.
13:23:23 [slide 39]
13:23:23 Erik: The browser can be smart but cannot always make the best choice automatically.
13:23:32 ... Maybe one choice is optimal for the sender, but suboptimal for the receiver.
13:23:39 [slide 40]
13:24:19 Erik: To solve this, we want the app to be in full control. If you know the context, you know how to select and prioritize.
13:24:26 q?
13:24:32 [slide 41]
13:24:50 Erik: We had all of these scalability mode systems.
13:24:57 ... and yet they are not enough
13:25:01 [slide 42]
13:25:11 Erik: So many other things you could do.
13:25:44 ... E.g. you might want to do B-frames, or whatever magic your scenario might need.
13:25:54 ... Not feasible to support everything in the browser.
13:26:05 [slide 43]
13:26:17 Erik: Again, solution is to let the app be in charge.
13:26:29 ... [goes through slide]
13:26:33 [slide 44]
13:26:59 Erik: With these hooks, you can implement all of the scalability modes yourself, and do more, in a codec-agnostic way.
13:27:21 ... As a side effect, if you do this, you need minimal feedback from the decoder.
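Erik's point that an app could build scalability modes itself from per-frame reference hooks can be illustrated with a minimal sketch. It assumes a hypothetical model in which the app tells the encoder, for each frame, which buffer slot to predict from and which slot to store the result in. The names FramePlan, planL1T3 and staysDecodable are invented for illustration and are not part of the proposal in the slides or of any shipped API. The pattern shown is the usual three-temporal-layer structure (L1T3).

```typescript
// Hypothetical per-frame plan; none of these names come from a shipped API.
interface FramePlan {
  temporalLayer: number;          // 0 is the base layer
  referenceBuffer: number | null; // slot to predict from; null means keyframe
  updateBuffer: number | null;    // slot this frame is stored into; null means not stored
}

// L1T3 pattern with a period of four frames and two buffer slots:
// slot 0 holds the latest T0 frame, slot 1 holds the latest T1 frame.
function planL1T3(frameIndex: number): FramePlan {
  if (frameIndex === 0) {
    return { temporalLayer: 0, referenceBuffer: null, updateBuffer: 0 };
  }
  switch (frameIndex % 4) {
    case 0: return { temporalLayer: 0, referenceBuffer: 0, updateBuffer: 0 };
    case 1: return { temporalLayer: 2, referenceBuffer: 0, updateBuffer: null };
    case 2: return { temporalLayer: 1, referenceBuffer: 0, updateBuffer: 1 };
    default: return { temporalLayer: 2, referenceBuffer: 1, updateBuffer: null };
  }
}

// Checks that a receiver keeping only layers <= maxLayer never needs a frame it dropped.
function staysDecodable(maxLayer: number, frameCount: number): boolean {
  const slotLayer: (number | undefined)[] = []; // slot -> layer of the frame stored there
  for (let i = 0; i < frameCount; i++) {
    const plan = planL1T3(i);
    if (plan.temporalLayer <= maxLayer && plan.referenceBuffer !== null) {
      const writer = slotLayer[plan.referenceBuffer];
      if (writer === undefined || writer > maxLayer) return false;
    }
    if (plan.updateBuffer !== null) slotLayer[plan.updateBuffer] = plan.temporalLayer;
  }
  return true;
}
```

Because only T0 and T1 frames are ever stored, a receiver can drop every T2 frame (or every T1 and T2 frame) and the remaining frames still decode, which is the property a temporal scalability mode has to provide.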
13:27:36 [slide 45]
13:27:51 Erik: Not going to talk about rate control, Eugene covered it already.
13:27:55 [slide 46]
13:28:25 Erik: Illustration of the concepts that were discussed. Take it as an abstraction for now.
13:28:31 [slide 47]
13:28:40 [slide 48]
13:29:13 Erik: Some mechanism to query bitrate control capabilities. CQP or CBR.
13:29:18 [slide 49]
13:30:25 Erik: Total number of buffers you have available. Max number of references, max temporal and spatial layers (output frames per input frame).
13:30:34 [slide 50]
13:30:48 Erik: Which input format is accepted.
13:30:55 ... What pixel formats.
13:30:58 [slide 51]
13:31:06 Erik: Same thing for the output.
13:31:12 [slide 52]
13:31:27 Erik: A bunch of other discussions about what else we could have.
13:31:47 [slide 53]
13:31:55 q?
13:32:08 Erik: How do we actually select and create an encoder
13:32:13 [slide 54]
13:33:06 Erik: Enumeration. Gives you capabilities, implementation name, codec name, codec specifics.
13:33:12 [slide 55]
13:33:29 Erik: encoder settings that will apply to the lifetime of the encoder.
13:33:38 [slide 56]
13:33:51 Erik: The main method is encode()
13:34:20 [slide 57]
13:34:54 Erik: The input frame is just a frame. The content hint, the speed setting and how you should do frame drop.
13:35:00 [slide 58]
13:35:14 Erik: Params you can give to the encoder
13:35:18 [slide 59]
13:35:27 q+
13:35:34 Erik: Apart from control, you have these layers parameters.
13:35:55 [slide 60]
13:36:05 s/[slide 60]//
13:36:40 fluffy: Ignoring all of the details of the API, arbitrary buffers referenced cannot be passed around in the underlying codecs.
13:37:09 ... I think that you'll have a hard time guessing what pieces might work for a given type of hardware.
13:37:56 Erik: For hardware, it depends on drivers. In Chrome OS, we already do that behind the scenes.
13:38:26 fluffy: So, works for VP8?
13:38:30 Erik: Yes.
13:38:34 fluffy: HEVC?
13:39:00 Erik: We talked with a few vendors. Some can do it.
Some API limits.
13:39:10 q-
13:39:30 [slide 60]
13:39:38 [slide 61]
13:40:02 Erik: This is a complete example of how that would work.
13:40:10 ... Can skip these frames, look at them offline!
13:40:21 [slide 62]
13:40:24 [slide 63]
13:40:30 [slide 64]
13:41:17 ryo_ has joined #webrtc
13:41:18 Erik: The API is not as bad as it looks regarding fingerprinting. All of it can be derived somehow.
13:42:09 Bernard: One of my questions would be: what would be the effects on the JavaScript that we have?
13:42:13 q?
13:42:32 Erik: My understanding is that we have some sort of software fallback that could clash with this.
13:42:45 ... I don't really have an opinion on what the best route forward is here.
13:43:27 jan-ivar: In WebRTC, we had a problem getting powerEfficient. I'm having a hard time seeing how we can expose so much stuff to JavaScript.
13:44:12 ... Double-edged sword is that there's a lot of copy-and-paste on the Web. Good defaults are needed.
13:44:31 ... Tying browser vendors to do the right thing.
13:44:33 q+ to wonder about enumerateDevices
13:45:14 ... Puts a lot of pressure on the client to implement things correctly.
13:45:44 Erik: Agree. I think we could have something separate for WebCodecs that gives some help. Not sure I like that.
13:45:58 ... Would one of us write that? Or would we hope that the community does?
13:46:46 Jan-Ivar: I worry about how people may approach these expert APIs.
13:47:06 q+ elad
13:47:11 q+
13:47:15 Erik: Yes, I'm thinking about 3D cases where WebGL is not your go-to target but rather your game engine.
13:47:20 q- elad
13:47:21 q+
13:48:10 Elad: With fingerprinting, would giving some capabilities through permissions on microphones, cameras help?
13:48:28 ... Regarding libraries, people are good at creating them.
13:48:50 jan-ivar: Asking for cameras, microphones could be seen as permission escalation.
13:48:57 ... Better direction.
13:49:21 q?
13:49:25 ack hta
13:49:47 hta: Is this an API that you would expose in workers, main thread?
13:50:18 Erik: Not an expert in that.
13:50:52 hta: The current position in WebRTC encoded transform is that it's worker-only.
13:50:52 eugene: Everywhere where WebCodecs is available seems like a good approach.
13:50:52 scribe+ cpn
13:51:24 correction to minutes: I said that the current position in webrtc encoded transform is that we're still quarreling.
13:51:24 Francois: The API allows enumerating decoders, why do you need the exact list?
13:51:57 q+
13:51:57 ack tidoust
13:51:57 tidoust, you wanted to wonder about enumerateDevices
13:51:57 erik: if you just ask for a particular codec it's hard to reason what you do with it
13:52:30 paul: most of this is doable in WebCodecs, so prefer you reframe it in terms of WebCodecs
13:52:42 ... do a gap analysis between web exposed capabilities and what's needed
13:53:30 ... e.g., automatic fallback is not a thing. we have capabilities, a registry with per-codec settings
13:53:30 ... we're duplicating a lot here, which we should avoid
13:53:30 +1 paul
13:54:05 erik: is this feasible? should we move towards doing that gap analysis?
13:54:11 paul: file issues. professional creator users and rtc users both have needs, lots of communities engaging
13:54:27 ... reaching a uniform API is good, but a lot of what you describe is doable
13:54:42 ... avoid duplicating lots of work
13:54:43 q?
13:54:47 ack pa
13:55:56 hta: you're emphasising precise user control of features, we want to have these base features and leave the higher level modes like SVC and rate control and simulcast documented as implemented in terms of these primitives
13:56:05 +1 hta
13:56:30 ... we should do that style of spec more.
I want WebCodecs core to be a primitive used by WebRTC and a more user friendly WebCodecs interface, with the core as clear and simple as possible
13:56:44 Erik: My thoughts as well
13:57:14 Florent: On complexity of the API, shouldn't be a problem, there are a few expert APIs on the web, WebCrypto and WebGPU
13:57:24 ... If this were introduced, libraries would make it easier to use
13:58:06 ... similar happened with WebCrypto
13:58:46 Bernard: A cautionary note - it looks like we're on the verge of a major hardware change
13:59:12 marcosc has joined #webrtc
13:59:22 ... ML based codecs for audio. The nature of the hardware is likely to change in the coming years, how would these fit the framework?
13:59:39 ... Per-macroblock QP for segmentation - this would require API changes to WebCodecs
13:59:57 ... The API meets demands over the last few years, but need to look to future demands
14:00:16 Erik: On inter-picture references, I haven't seen them breaking the mold drastically, something to watch for
14:00:26 q?
14:00:30 q-
14:00:30 ack hta
14:01:36 ryo_ has left #webrtc
14:04:33 Xiaohan: How many of these can be optional, and have good defaults? So it remains a simple higher level API?
14:05:18 Chris: Agree on the need for a gap analysis. Previous approach has been to add to WebCodecs incrementally, e.g., the per-frame QP. Do we want to move to exposing everything? We've heard concerns about enumeration and potential fingerprinting
14:05:30 ... Next step is to meet again when we have a gap analysis
14:06:54 Topic: Web Audio
14:07:18 jya: Currently, use canPlayType with opus, etc, not sufficient
14:07:40 ... have something on top of MediaRecorder, do you support recording with multiple channels
14:07:56 ... the hope is if you can play it you can decode it, not always true
14:08:05 Xiaohan: Is that the MSE or WebRTC case?
14:08:17 jya: It's MSE, file playback also
14:10:22 ... It's a Media Capabilities decoding query.
The requirements to play aren't always the same as those for decoding
14:10:22 Xiaohan: Not so familiar with WebAudio, but the next step could be to raise an issue in Media Capabilities API, and we can follow up
14:15:47 Topic: Media Capabilities issue 185
14:15:47 Chris: [recaps the issue]
14:16:27 hta: When you pass a mime type, question is whether it contains parameters or not
14:17:38 jan-ivar: it still references the webrtc spec, would we want it to move to MC API?
14:17:56 hta: that was on purpose, so you can pass it to setCodecPreferences
14:20:03 ... should we make the capabilities convertible, ideal if they both were the same, but they're both deployed
14:20:54 chris: Discussion needs youenn, so let's follow up in a future meeting
14:21:29 Topic: Wrap up
14:22:21 chris: Nothing else to discuss, so let's close here. We'll follow up in future calls
14:22:26 [adjourned]
14:22:31 rrsagent, draft minutes
14:22:32 I have made the request to generate https://www.w3.org/2023/09/15-webrtc-minutes.html cpn
14:22:43 rrsagent, make log public
16:03:05 Zakim has left #webrtc
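The frame-level QP control that eugene describes under "QP-based rate control in WebCodecs" can be illustrated with a minimal sketch of a very basic controller. This is not the algorithm from his demo; the QpControlConfig fields, the nextQp function and the step/tolerance scheme are assumptions made for illustration. The idea is only that the app compares the size of the last encoded frame with a per-frame byte budget and nudges the quantizer up or down, clamped so that quality never falls below a chosen bound.

```typescript
// Hypothetical controller settings; names are illustrative only.
interface QpControlConfig {
  targetBitsPerSecond: number;
  framesPerSecond: number;
  minQp: number;     // best quality the controller may request
  maxQp: number;     // worst quality allowed: the "never lower than this" bound
  step: number;      // QP change per adjustment
  tolerance: number; // fraction around the per-frame budget treated as on target
}

// Returns the quantizer to use for the next frame, given the size of the last one.
function nextQp(currentQp: number, lastFrameBytes: number, cfg: QpControlConfig): number {
  const budgetBytes = cfg.targetBitsPerSecond / 8 / cfg.framesPerSecond;
  let qp = currentQp;
  if (lastFrameBytes > budgetBytes * (1 + cfg.tolerance)) {
    qp += cfg.step; // frame too large: quantize more coarsely
  } else if (lastFrameBytes < budgetBytes * (1 - cfg.tolerance)) {
    qp -= cfg.step; // frame too small: spend the spare bits on quality
  }
  return Math.min(cfg.maxQp, Math.max(cfg.minQp, qp));
}
```

In a browser, the returned value would presumably be handed to a VideoEncoder configured with bitrateMode set to "quantizer", through the codec-specific encode options (for AV1, something like encoder.encode(frame, { av1: { quantizer: qp } })). The valid quantizer range depends on the codec, so minQp and maxQp would need to be chosen per codec.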