W3C

Media WG call

11 April 2023

Attendees

Present
Bernard Aboba, Chris Needham, Dale Curtis, Eugene Zemtsov, Francois Daoust, Jer Noble, Peter Thatcher
Chair
Chris, Jer
Scribe
cpn, tidoust

Meeting minutes

Repository: w3c/webcodecs

Allow decoder to ignore corrupted frames

Slideset: https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0000/MEDIAWG-04-11-23.pdf

[Slide 2]

[Slide 3]

Bernard: WebCodecs encoder and decoder errors are fatal. A fatal error queues a task to close the encoder/decoder
… The issue is: Is there some way to not close?

<tidoust> #656

<ghurlbot> Issue 656 Allow decoder to ignore corrupted frames (matanui159) agenda

[Slide 4]

Bernard: Dale suggested that the text could clarify fatal vs non-fatal
… The safest thing to do could be to close it

Dale: All errors are fatal, not sure why I wrote just software there

Bernard: It doesn't interfere with resilience, FEC, redundant error coding, etc
… Discussed if we could have more tests for sending errors
… Chromium reports the wrong error type

Dale: We have a bug open to fix that

Bernard: Is it a bug? Is more info needed in the error?
… Question: should all errors be fatal? Dale makes a good point why they should be
… Potential issues with security team review if we don't make it fatal

Dale: If the author wants to handle the error and resume the decoding, could be for them to decide

Bernard: But they can't if it's closed

Dale: They can create a new decoder

Bernard: Yes. That would require a keyframe
… Second question: There are various reasons why you could get an error. Hardware decoders may error where a software decoder wouldn't
… Hardware resources could be acquired by another app. Reconfigure with prefer-hardware could then fail, then you'd have to fall back to software
… Paul asked what's the difference between reset and close, and impact on performance?

Bernard: Any opinions? I've had developers ask whether there's truly been a decoder error, or something else, such as a GPU crash
… Does only having EncodingError provide enough info?

Dale: We're limited on the information available to us. Where a software decoder is more permissive, it's in a way non-compliant with the spec
… I thought we decided among editors that it should be fatal

Bernard: Is there any objection in the WG to that?

(none)
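As discussed, once the error callback fires the decoder is closed, so recovery in application code means constructing a new VideoDecoder and resuming from the next keyframe. A minimal sketch: the pure helper below is illustrative, and the commented wiring assumes a browser context with `config`, `onFrame`, and `pendingChunks` defined elsewhere.

```typescript
// Sketch of recovering from a fatal decode error, per the discussion:
// the error callback means the decoder is closed; to resume, create a
// new decoder and restart from the next keyframe. Names other than the
// WebCodecs API itself are illustrative.

type ChunkLike = { type: "key" | "delta" };

// Pure helper: keep only the chunks from the next keyframe onward -
// a fresh decoder cannot start on a delta chunk.
function resumePoint<T extends ChunkLike>(chunks: T[]): T[] {
  const keyIndex = chunks.findIndex((c) => c.type === "key");
  return keyIndex < 0 ? [] : chunks.slice(keyIndex);
}

// Illustrative wiring (browser-only):
//
// let decoder = new VideoDecoder({ output: onFrame, error: (e) => {
//   // Decoder is closed after a fatal error; build a replacement and
//   // feed it starting at the next keyframe.
//   decoder = new VideoDecoder({ output: onFrame, error: console.error });
//   decoder.configure(config);
//   for (const chunk of resumePoint(pendingChunks)) decoder.decode(chunk);
// } });
```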

[Slide 5]

Bernard: So what to do next? Reconfiguring with prefer-hardware could fail.

Dale: Developers would have to handle it either way. We could provide more docs advising use of a more professional analysis tool
… or ffmpeg

Bernard: Is EncodingError right?
… Any other changes to the spec?

Dale: No spec change, just MDN documentation improvements. My team have been working on that

Eugene: Could add a new optional exception such as corrupted stream or corrupted chunk, to say it's something to do with the stream
… and not some kind of infrastructure issue underneath

Bernard: That would be helpful though
… Developers would appreciate that. Would it be an error message inside EncodingError?

Eugene: That error type doesn't sound right

Dale: I'm not sure why we didn't add that. Technically it's an error in the encoding...

Bernard: Not sure it's a requirement to change the type, but the extra info would be useful

Eugene: If we can see the data is noncompliant, then maybe an OperationError when it's a GPU crash or out of resources would be a useful distinction
… Hardware decoders don't always give the reason for error though

Bernard: Next step would be to see if that's feasible and prepare a PR if so

Eugene: I can do that
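From the application side, Eugene's proposed split could look roughly like the sketch below. The EncodingError/OperationError distinction is the suggestion from the discussion, not shipped behavior, and the handler names are illustrative.

```typescript
// Sketch of handling the proposed error split: EncodingError for
// noncompliant stream data vs OperationError for infrastructure
// failures (GPU crash, out of resources). Implementations today may
// report only EncodingError, as noted in the minutes.

function classifyDecodeError(err: { name: string }): "bad-data" | "infrastructure" | "unknown" {
  switch (err.name) {
    case "EncodingError":
      return "bad-data"; // corrupted chunk/stream: skip to next keyframe
    case "OperationError":
      return "infrastructure"; // e.g. GPU crash: retry, or fall back to software
    default:
      return "unknown";
  }
}
```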

Allow configuration of AV1 screen content coding tools

[Slide 6]

<tidoust> #646

<ghurlbot> Issue 646 Support for Screen Content Coding (aboba) PR exists

Bernard: We've added a PR to initialise the AV1 quantizer

<tidoust> PR #662

<ghurlbot> Pull Request 662 Enable configuration of AV1 screen content coding tools (aboba)

Bernard: This PR adds a boolean for forceScreenContentTools. Default is false
… when true, the AV1 seq_force_screen_content_tools flag is set, and then it uses the palette and intra block copy tools

[Slide 7]

Bernard: The PR adds this boolean attribute, default false, with an explanation

Eugene: Difference with the other PR?

Bernard: I rebased it due to a merge conflict

Eugene: Another one proposed adding the flag to VideoEncoderConfig. The distinction is you configure it once, but now you'd set it per-frame

Bernard: Quantizer is per-frame as well

Eugene: So need to decide where it goes: per frame or not

Bernard: Should be per frame. Content could change during screen capture, e.g., go from slides to a video presentation where you'd disable screen content tools
… I closed the other PR

Eugene: It belongs per-frame. I checked libaom, it does allow setting per-frame

Bernard: It can change, but how you know when to change it is a different story...

Dale: Quantizer makes sense as an encoder option, but the screen tools feel like a per-frame metadata thing
… Then under the hood we'd automatically do the right thing

Bernard: WebRTC does it that way, by checking whether it comes from a screen - but could be a game or sports event which are not amenable to screen content tools

Eugene: IIRC, we'll have the same for VP9

Bernard: I guess so, the AV1 tools are more sophisticated. Would that use the same kind of parameter?

Eugene: As far as I know there's a global setting, not per frame

Bernard: HEVC has screen content tools, but it's hardware only, so it doesn't make sense to add it, as it wouldn't be used
… Looking at the WebRTC code, it mostly changed the quantizer

Chris: So proposed resolution is to add this per-frame for AV1, then consider VP9 separately. It's very much codec-specific
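As a sketch, the per-frame option could surface to authors roughly as below. The `av1.forceScreenContentTools` shape follows the PR as described in the minutes, but the interfaces and the helper are illustrative assumptions, not final spec text.

```typescript
// Hypothetical shape of the proposed per-frame encode options
// (names are assumptions based on the minutes, not final spec text).
interface Av1EncodeOptionsSketch {
  forceScreenContentTools?: boolean; // default false
}
interface EncodeOptionsSketch {
  keyFrame?: boolean;
  av1?: Av1EncodeOptionsSketch;
}

// Illustrative helper: enable the screen content tools for slide-like
// capture, disable them when the capture switches to natural video
// (the "slides to a video presentation" case from the discussion).
function av1OptionsFor(contentKind: "slides" | "video"): EncodeOptionsSketch {
  return { av1: { forceScreenContentTools: contentKind === "slides" } };
}

// Browser-only usage (assumes `encoder` and `frame` exist):
// encoder.encode(frame, av1OptionsFor(currentContentKind));
```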

Extend EncodedVideoChunk metadata for SVC

[Slide 8]

<tidoust> #619

<ghurlbot> Issue 619 Consistent SVC metadata between WebCodecs and Encoded Transform API (aboba) agenda, PR exists

<tidoust> PR #654

<ghurlbot> Pull Request 654 Extend EncodedVideoChunkMetadata for Spatial SVC (aboba)

Bernard: For background: In WebCodecs we support temporal scalability, and WebRTC supports temporal and spatial scalability
… In the WebRTC encoded transform, we provide metadata for these encoded frames
… WebCodecs has encoded chunk metadata, but there's a mismatch - if you want spatial scalability you need to use the WebRTC API
… Since it's in WebRTC, why not bring it also to WebCodecs?

[Slide 9]

Bernard: First question is to compare the WebCodecs and WebRTC APIs. The WebCodecs API is more structured, not just one big blob of stuff as in WebRTC

[Slide 10]

Bernard: The SVC sub-dictionary design has a frame number (unsigned short), not the same as frame id
… frameId is a globally unique id for the frame
… It's something you want to serialize on the wire, so the sender and receiver could want the information. Frame number is a modulo 2^16 of the frame id, to use less space in the wire serialization
… When you describe dependencies you are referencing a series of frame numbers
… In real life you typically don't have 2^16 or 2^32 frame-ago dependencies
… Different names compared to the WebRTC version
… Decode targets and chain links. A forwarder keeps state, including the frame rate and spatial resolution for a particular client. This determines what layers I forward to that client
… It's useful to have this from the encoder, as the forwarder has state about the target and can compare it against the frame itself
… It makes it possible for the forwarder to quickly decide whether to forward or not
… Protect the WebCodecs decoder against things that would cause a decoder error
… Chain links. If you get a frame as a receiver, with dependencies, is it true that if I submit it to the WebCodecs decoder I should get an error?
… The dependencies might have dependencies? So you'd still get a decoder error
… Chain links look at the whole chain of dependencies, and see if you'd get an error
… Easier for the encoder to send the data than for the receiver to calculate the dependency graph, and avoid duplicate work across each client
… We're thinking this should go in the SVC dictionary
… One thing is there's no unsigned short, just unsigned long. Is this headed in the right direction?
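The frame-number scheme described above (frame number as frame id modulo 2^16) can be sketched as below. The wraparound-aware dependency resolution is an illustrative assumption about how a receiver might use the numbers, not spec text.

```typescript
// Sketch of the frame-number scheme from the slides: frameId is a
// globally unique id; frameNumber is frameId modulo 2^16 to use less
// space in the wire serialization. Dependencies reference a series of
// recent frame numbers.

function toFrameNumber(frameId: number): number {
  return frameId % 65536; // modulo 2^16
}

// Illustrative: resolve a dependency's frame number back to a full id,
// assuming dependencies are always fewer than 2^16 frames in the past
// ("in real life you typically don't have 2^16 frame-ago dependencies").
function resolveDependency(currentFrameId: number, depFrameNumber: number): number {
  const delta = (toFrameNumber(currentFrameId) - depFrameNumber + 65536) % 65536;
  return currentFrameId - delta;
}
```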

Eugene: As a superficial comment, we should make everything unsigned long. It wouldn't cost anything

Bernard: Agree

Eugene: I'm not sure when this would be implemented

Bernard: It's in the Chromium code base. The SVC modes and dependency descriptor are there

Dale: I think we'd need to check what our encoders produce. We should get one software encoder working before landing the spec change

[Slide 11]

[Slide 12]

[Slide 13]

Bernard: I think it's possible to enable it. Next steps?

Dale: Can you share a link to the Chromium code?

Bernard: Will do

Chris: Prototyping as Dale suggested?

Bernard: It's in the WebRTC spec; the concern is about the alternative version showing up in WebRTC

Dale: I feel like having it working end to end, either WebRTC or WebCodecs, would be good

Bernard: If the WG approves the approach, could follow it up in WebRTC

Dale: Would prefer to see it working first though

Eugene: We only have the temporal layer ID, as that's the only thing implemented, and we can add more dictionary entries when we're happy

Bernard: We could submit a PR to remove from WebRTC, then when it's implemented add it in both places
… It's dangerous for both to be out of sync

Chris: Has this shipped in WebRTC or is it only a spec change for now?

Bernard: Would have to check

Chris: Happy to organise joint meeting discussion between both groups

Chris: So check if shipped, get it working end to end in either WebRTC or WebCodecs, then ensure the specs are consistent across both

Media WG rechartering

Repository: w3c/charter-media-wg

#38

<ghurlbot> Issue 38 Document Picture-in-Picture (steimelchrome)

<Github> w3c/media-wg#38 : Add webcodecs quantizer mode to agenda listing.

cpn: Open question related to rechartering and the Document Picture-in-Picture.
… The Media WG was suggested in the TAG review as venue for Recommendation track progress.
… Suggestion at this stage would not be that it become a WG deliverable but rather a potential normative deliverable as we've done in the past with the Audio Focus API.
… With the goal being to avoid rechartering when spec is ready to enter the Recommendation track.
… Question here is whether we should add the document to our scope as a potential normative spec.
… François pointed out that this would change the scope of the WG a little, because the group is focused on media-related features.
… Document PiP is broader in scope.
… While there is interest from media companies in using it for media content, it's not restricted to that at all.
… Question is: is the Media WG the right venue to continue work on the spec?
… In favor: Picture-in-Picture is developed by the Media WG
… I have a concern about the broader scope though.

jernoble: The original media element supported fullscreen mode. The Fullscreen API is a more general purpose API. I think that is a WHATWG deliverable and that there are lots of parallels there.
… As an implementer, I don't know that working on the spec in the Media WG makes sense.

cpn: I think François suggested an alternate home for the spec, the WebApps WG.
… I don't think there's a barrier to landing that on the Recommendation track somewhere.
… I'll just start an email thread with Tommy and chairs to thrash this out.
… We don't want to hold up too much on that because draft charter is mostly ready otherwise.

Dale: Most use cases are media-related but argument and comparison with Fullscreen makes sense to me.

jernoble: If it becomes necessary to integrate with other specs, such as CSS, etc. it also makes sense to use a group that's more used to doing that.

cpn: I agree. My proposal would be not to do it here.

jernoble: More importantly, I think it would be even more successful in another group.

TPAC 2023 joint meetings

cpn: I think what I'm going to propose for our group is similar to last year: full morning or full afternoon to go through discussions. The other thing I'm interested in exploring is joint meetings.

Bernard: Ongoing poll to figure out who's going to show up in-person. If not enough people, my sense is that W3C guidelines are not to request TPAC time.
… I'll be remote.

cpn: I'm planning to be there in-person.

tidoust: I wouldn't worry too much about the requirement for number of people. The venue is large enough.

Bernard: Joint meeting with WebRTC will be useful. Lots of ongoing discussions.

cpn: I'll follow-up via email on the specifics of that.

Minutes manually created (not a transcript), formatted by scribe.perl version 210 (Wed Jan 11 19:21:32 2023 UTC).