W3C

Media WG call

11 April 2023

Attendees

Present
Bernard Aboba, Chris Needham, Dale Curtis, Eugene Zemtsov, Francois Daoust, Jer Noble, Peter Thatcher
Chair
Chris, Jer
Scribe
cpn, tidoust

Meeting minutes

Repository: w3c/webcodecs

Allow decoder to ignore corrupted frames

Slideset: https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf

[Slide 2]

[Slide 3]

Bernard: WebCodecs encoder and decoder errors are fatal. A fatal error queues a task to close the encoder/decoder
… The issue is: Is there some way to not close?

<tidoust> #656

<ghurlbot> Issue 656 Allow decoder to ignore corrupted frames (matanui159) agenda

[Slide 4]

Bernard: Dale suggested that the text could clarify fatal vs non-fatal
… The safest thing to do could be to close it

Dale: All errors are fatal, not sure why I wrote just software there

Bernard: It doesn't interfere with resilience, FEC, redundant error coding, etc
… Discussed if we could have more tests for sending errors
… Chromium reports the wrong error type

Dale: We have a bug open to fix that

Bernard: Is it a bug? Is more info needed in the error?
… Question: should all errors be fatal? Dale makes a good point why they should be
… Potential issues with security team review if we don't make it fatal

Dale: If the author wants to handle the error and resume the decoding, could be for them to decide

Bernard: But they can't if it's closed

Dale: They can create a new decoder

Bernard: Yes. That would require a keyframe
… Second question: There are various reasons why you could get an error. Hardware decoders may error where a software decoder wouldn't
… Hardware resources could be acquired by another app. Reconfigure with prefer-hardware could then fail, then you'd have to fall back to software
… Paul asked what's the difference between reset and close, and impact on performance?

Bernard: Any opinions? I've had developers ask whether there's truly been a decoder error, or something else, such as a GPU crash
… Does only having EncodingError provide enough info?

Dale: We're limited on the information available to us. Where a software decoder is more permissive, it's in a way non-compliant with the spec
… I thought we decided among editors that it should be fatal

Bernard: Is there any objection in the WG to that?

(none)
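As discussed, once the error callback fires the decoder is closed, so recovery in application code means constructing a new VideoDecoder and resuming from the next keyframe. A minimal sketch: the pure helper below is illustrative, and the commented wiring assumes a browser context with `config`, `onFrame`, and `pendingChunks` defined elsewhere.

```typescript
// Sketch of recovering from a fatal decode error, per the discussion:
// the error callback means the decoder is closed; to resume, create a
// new decoder and restart from the next keyframe. Names other than the
// WebCodecs API itself are illustrative.

type ChunkLike = { type: "key" | "delta" };

// Pure helper: keep only the chunks from the next keyframe onward -
// a fresh decoder cannot start on a delta chunk.
function resumePoint<T extends ChunkLike>(chunks: T[]): T[] {
  const keyIndex = chunks.findIndex((c) => c.type === "key");
  return keyIndex < 0 ? [] : chunks.slice(keyIndex);
}

// Illustrative wiring (browser-only):
//
// let decoder = new VideoDecoder({ output: onFrame, error: (e) => {
//   // Decoder is closed after a fatal error; build a replacement and
//   // feed it starting at the next keyframe.
//   decoder = new VideoDecoder({ output: onFrame, error: console.error });
//   decoder.configure(config);
//   for (const chunk of resumePoint(pendingChunks)) decoder.decode(chunk);
// } });
```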

[Slide 5]

Bernard: So what to do next? Reconfiguring with prefer-hardware could fail.

Dale: Developers would have to handle it either way. We could provide more docs advising use of a more professional analysis tool
… or ffmpeg

Bernard: Is EncodingError right?
… Any other changes to the spec?

Dale: No spec change, just MDN documentation improvements. My team have been working on that

Eugene: Could add a new optional exception such as corrupted stream or corrupted chunk, to say it's something to do with the stream
… and not some kind of infrastructure issue underneath

Bernard: That would be helpful though
… Developers would appreciate that. Would it be an error message inside EncodingError?

Eugene: That error type doesn't sound right

Dale: I'm not sure why we didn't add that. Technically it's an error in the encoding...

Bernard: Not sure it's a requirement to change the type, but the extra info would be useful

Eugene: If we can see the data is noncompliant, then maybe an OperationError when it's a GPU crash or out of resources would be a useful distinction
… Hardware decoders don't always give the reason for error though

Bernard: Next step would be to see if that's feasible and prepare a PR if so

Eugene: I can do that
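From the application side, Eugene's proposed split could look roughly like the sketch below. The EncodingError/OperationError distinction is the suggestion from the discussion, not shipped behavior, and the handler names are illustrative.

```typescript
// Sketch of handling the proposed error split: EncodingError for
// noncompliant stream data vs OperationError for infrastructure
// failures (GPU crash, out of resources). Implementations today may
// report only EncodingError, as noted in the minutes.

function classifyDecodeError(err: { name: string }): "bad-data" | "infrastructure" | "unknown" {
  switch (err.name) {
    case "EncodingError":
      return "bad-data"; // corrupted chunk/stream: skip to next keyframe
    case "OperationError":
      return "infrastructure"; // e.g. GPU crash: retry, or fall back to software
    default:
      return "unknown";
  }
}
```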

Allow configuration of AV1 screen content coding tools

[Slide 6]

<tidoust> #646

<ghurlbot> Issue 646 Support for Screen Content Coding (aboba) PR exists

Bernard: We've added a PR to initialise the AV1 quantizer

<tidoust> PR #662

<ghurlbot> Pull Request 662 Enable configuration of AV1 screen content coding tools (aboba)

Bernard: This PR adds a boolean for forceScreenContentTools. Default is false
… when true, the AV1 seq_force_screen_content_tools flag is set, and then it uses the palette and intra block copy tools

[Slide 7]

Bernard: The PR adds this boolean attribute, default false, with an explanation

Eugene: Difference with the other PR?

Bernard: I rebased it due to a merge conflict

Eugene: Another one proposed adding the flag to VideoEncoderConfig. The distinction is you configure it once, but now you'd set it per-frame

Bernard: Quantizer is per-frame as well

Eugene: So need to decide where it goes: per frame or not

Bernard: Should be per frame. Content could change during screen capture, e.g., go from slides to a video presentation where you'd disable screen content tools
… I closed the other PR

Eugene: It belongs per-frame. I checked libaom, it does allow setting per-frame

Bernard: It can change, but how you know when to change it is a different story...

Dale: Quantizer makes sense as an encoder option, but the screen tools feel like a per-frame metadata thing
… Then under the hood we'd automatically do the right thing

Bernard: WebRTC does it that way, by checking whether it comes from a screen - but could be a game or sports event which are not amenable to screen content tools

Eugene: IIRC, we'll have the same for VP9

Bernard: I guess so, the AV1 tools are more sophisticated. Would that use the same kind of parameter?

Eugene: As far as I know there's a global setting, not per frame

Bernard: HEVC has screen content tools, but it's hardware only, so it doesn't make sense to add it, as it wouldn't be used
… Looking at the WebRTC code, it mostly changed the quantizer

Chris: So proposed resolution is to add this per-frame for AV1, then consider VP9 separately. It's very much codec-specific
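As a sketch, the per-frame option could surface to authors roughly as below. The `av1.forceScreenContentTools` shape follows the PR as described in the minutes, but the interfaces and the helper are illustrative assumptions, not final spec text.

```typescript
// Hypothetical shape of the proposed per-frame encode options
// (names are assumptions based on the minutes, not final spec text).
interface Av1EncodeOptionsSketch {
  forceScreenContentTools?: boolean; // default false
}
interface EncodeOptionsSketch {
  keyFrame?: boolean;
  av1?: Av1EncodeOptionsSketch;
}

// Illustrative helper: enable the screen content tools for slide-like
// capture, disable them when the capture switches to natural video
// (the "slides to a video presentation" case from the discussion).
function av1OptionsFor(contentKind: "slides" | "video"): EncodeOptionsSketch {
  return { av1: { forceScreenContentTools: contentKind === "slides" } };
}

// Browser-only usage (assumes `encoder` and `frame` exist):
// encoder.encode(frame, av1OptionsFor(currentContentKind));
```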

Extend EncodedVideoChunk metadata for SVC

[Slide 8]

<tidoust> #619

<ghurlbot> Issue 619 Consistent SVC metadata between WebCodecs and Encoded Transform API (aboba) agenda, PR exists

<tidoust> PR #654

<ghurlbot> Pull Request 654 Extend EncodedVideoChunkMetadata for Spatial SVC (aboba)

Bernard: For background: In WebCodecs we support temporal scalability, and WebRTC supports temporal and spatial scalability
… In the WebRTC encoded transform, we provide metadata for these encoded frames
… WebCodecs has encoded chunk metadata, but there's a mismatch - if you want spatial scalability you need to use the WebRTC API
… Since it's in WebRTC, why not bring it also to WebCodecs?

[Slide 9]

Bernard: First question is to compare the WebCodecs and WebRTC APIs. The WebCodecs API is more structured, not just one big blob of stuff as in WebRTC

[Slide 10]

Bernard: The SVC sub-dictionary design has a frame number (unsigned short), not the same as frame id
… frameId is a globally unique id for the frame
… It's something you want to serialize on the wire, so the sender and receiver could want the information. Frame number is a modulo 2^16 of the frame id, to use less space in the wire serialization
… When you describe dependencies you are referencing a series of frame numbers
… In real life you typically don't have 2^16 or 2^32 frame-ago dependencies
… Different names compared to the WebRTC version
… Decode targets and chain links. A forwarder keeps state, including the frame rate and spatial resolution for a particular client. This determines what layers I forward to that client
… It's useful to have this from the encoder, as the forwarder has state about the target and can compare it against the frame itself
… It makes it possible for the forwarder to quickly decide whether to forward or not
… Protect the WebCodecs decoder against things that would cause a decoder error
… Chain links. If you get a frame as a receiver, with dependencies, is it true that if I submit it to the WebCodecs decoder I should get an error?
… The dependencies might have dependencies? So you'd still get a decoder error
… Chain links look at the whole chain of dependencies, and see if you'd get an error
… Easier for the encoder to send the data than for the receiver to calculate the dependency graph, and avoid duplicate work across each client
… We're thinking this should go in the SVC dictionary
… One thing is there's no unsigned short, just unsigned long. Is this headed in the right direction?
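The frame-number scheme described above (frame number as frame id modulo 2^16) can be sketched as below. The wraparound-aware dependency resolution is an illustrative assumption about how a receiver might use the numbers, not spec text.

```typescript
// Sketch of the frame-number scheme from the slides: frameId is a
// globally unique id; frameNumber is frameId modulo 2^16 to use less
// space in the wire serialization. Dependencies reference a series of
// recent frame numbers.

function toFrameNumber(frameId: number): number {
  return frameId % 65536; // modulo 2^16
}

// Illustrative: resolve a dependency's frame number back to a full id,
// assuming dependencies are always fewer than 2^16 frames in the past
// ("in real life you typically don't have 2^16 frame-ago dependencies").
function resolveDependency(currentFrameId: number, depFrameNumber: number): number {
  const delta = (toFrameNumber(currentFrameId) - depFrameNumber + 65536) % 65536;
  return currentFrameId - delta;
}
```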

Eugene: As a superficial comment, we should make everything unsigned long. It wouldn't cost anything

Bernard: Agree

Eugene: I'm not sure when this would be implemented

Bernard: It's in the Chromium code base. The SVC modes and dependency descriptor are there

Dale: I think we'd need to check what our encoders produce. We should get one software encoder working before landing the spec change

[Slide 11]

[Slide 12]

[Slide 13]

Bernard: I think it's possible to enable it. Next steps?

Dale: Can you share a link to the Chromium code?

Bernard: Will do

Chris: Prototyping as Dale suggested?

Bernard: It's in the WebRTC spec; the concern is about the alternative version showing up in WebRTC

Dale: I feel like having it working end to end, either WebRTC or WebCodecs, would be good

Bernard: If the WG approves the approach, could follow it up in WebRTC

Dale: Would prefer to see it working first though

Eugene: We only have the temporal layer ID, as that's the only thing implemented, and we can add more dictionary entries when we're happy

Bernard: We could submit a PR to remove from WebRTC, then when it's implemented add it in both places
… It's dangerous for both to be out of sync

Chris: Has this shipped in WebRTC or is it only a spec change for now?

Bernard: Would have to check

Chris: Happy to organise joint meeting discussion between both groups

Chris: So check if shipped, get it working end to end in either WebRTC or WebCodecs, then ensure the specs are consistent across both

Media WG rechartering

Repository: w3c/charter-media-wg

#38

<ghurlbot> Issue 38 Document Picture-in-Picture (steimelchrome)

<Github> w3c/media-wg#38 : Add webcodecs quantizer mode to agenda listing.

cpn: Open question related to rechartering and the Document Picture-in-Picture.
… The Media WG was suggested in the TAG review as venue for Recommendation track progress.
… Suggestion at this stage would not be that it become a WG deliverable but rather a potential normative deliverable as we've done in the past with the Audio Focus API.
… With the goal being to avoid rechartering when spec is ready to enter the Recommendation track.
… Question here is whether we should add the document to our scope as a potential normative spec.
… François pointed out that this would change the scope of the WG a little, because the group is focused on media-related features.
… Document PiP is broader in scope.
… While there is interest from media companies in using it for media content, it's not restricted to that at all.
… Question is: is the Media WG the right venue to continue work on the spec?
… In favor: Picture-in-Picture is developed by the Media WG
… I have a concern about the broader scope though.

jernoble: The original media element supported fullscreen mode. The Fullscreen API is a more general purpose API. I think that is a WHATWG deliverable and that there are lots of parallels there.
… As an implementer, I don't know that working on the spec in the Media WG makes sense.

cpn: I think François suggested an alternate home for the spec, the WebApps WG.
… I don't think there's a barrier to landing that on the Recommendation track somewhere.
… I'll just start an email thread with Tommy and chairs to thrash this out.
… We don't want to hold up too much on that because draft charter is mostly ready otherwise.

Dale: Most use cases are media-related but argument and comparison with Fullscreen makes sense to me.

jernoble: If it becomes necessary to integrate with other specs, such as CSS, etc. it also makes sense to use a group that's more used to doing that.

cpn: I agree. My proposal would be not to do it here.

jernoble: More importantly, I think it would be even more successful in another group.

TPAC 2023 joint meetings

cpn: I think what I'm going to propose for our group is similar to last year: full morning or full afternoon to go through discussions. The other thing I'm interested in exploring is joint meetings.

Bernard: Ongoing poll to figure out who's going to show up in-person. If not enough people, my sense is that W3C guidelines are not to request TPAC time.
… I'll be remote.

cpn: I'm planning to be there in-person.

tidoust: I wouldn't worry too much about the requirement for number of people. The venue is large enough.

Bernard: Joint meeting with WebRTC will be useful. Lots of ongoing discussions.

cpn: I'll follow-up via email on the specifics of that.

Minutes manually created (not a transcript), formatted by scribe.perl version 210 (Wed Jan 11 19:21:32 2023 UTC).