Media WG meeting – 30 May 2023

Meeting minutes

<padenot> in the agenda, the Detach codec inputs link is wrong, the issue is: w3c/webcodecs#104

Detach Codec Inputs

Eugene: https://github.com/w3c/webcodecs/issues/104, there's a PR that adds a transfer list to avoid a copy
… The spec change is minimal but allows saving a copy when the frame is constructed
… PR is w3c/webcodecs#676
… Can also consider it for audio and ImageDecoder, but this was a starting point
… If we agree on the approach for video, it'll be trivial to do those also
… We add one member to VideoFrameBufferInit. It's a list, potentially it might be a single item, as we never need to transfer more than one arraybuffer
… But for consistency with structured clone and postMessage in workers, I made it a list
… In future could support separate array buffers for planes, so to future-proof it a bit, transfer list is an array
… Paul looked at the spec wording, I tried to incorporate his feedback
… We say what goes in the transfer list, we test if each item can be detached, and if not we throw. Then we detach everything in the transfer list

Youenn: What if there's something in the transfer list unrelated to the VideoFrame

Eugene: Everything gets detached, but we don't use the pixels. We give structured clone an object, and all buffers in the list get transferred. If they're in the object that's cloned they get transferred, otherwise detached
… It's not the most obvious behaviour, but consistent with other places on the web
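
The transfer-list behaviour described here can be observed with structuredClone directly. A minimal sketch, assuming a runtime that provides structuredClone (current browsers, or Node.js 17 and later); the variable names are illustrative, and a byteLength of zero is used as the sign that a buffer has been detached:

```typescript
// Two separate buffers: one referenced by the cloned object, one unrelated to it.
const pixels = new ArrayBuffer(16);
const unrelated = new ArrayBuffer(8);

// Both buffers are listed for transfer, but only `pixels` is reachable from the value.
const cloned = structuredClone({ pixels }, { transfer: [pixels, unrelated] });

// Every buffer in the transfer list is detached on the sending side...
const sourceLengths = [pixels.byteLength, unrelated.byteLength];

// ...but only the buffer inside the cloned value is available afterwards.
const receivedLength = cloned.pixels.byteLength;
```

The unrelated buffer is detached even though nothing receives its contents, which is the developer pitfall raised in the discussion.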

Youenn: Developers could make a mistake using it, but if consistent with postMessage and other places, it's OK I guess

Eugene: Any objections from anyone? If we're all happy, we intend to figure out wording, and merge. I have a prototype in Chromium, will send intent to ship?

Dan: Are GPU buffers that are mapped transferable?

Eugene: They're not

Paul: My comments are resolved. Two comments still open, for the first the sentence isn't useful, as we're transferring the data and not copying - line 3449 in the latest version

Eugene: "Use visible rectangle and layout"

Paul: You can do that, but not useful if you're copying

Eugene: I think we still need it, you might have an arraybuffer with gaps, and layout would tell you where to look for the planes. I took the wording from the copy case
… If we need to know it for where to copy from, we also need it in this case

Paul: The idea of the prose is to make a smaller copy. You have all the info about the layout so it doesn't need normative text, it can be omitted so we don't say the same thing multiple times

Eugene: I'll drop it

Paul: The other comment is just a remark, nothing to do

Dan: If we remove that prose, it might remove ability to purposely change those things, e.g., change the visible rect to 0,0
… Might be useful to preserve for transfer. We'd transfer the same backing
… It's not required to produce the same visible rect as output
… I want browsers to have the option to do it

Paul: If we think there's a compatibility issue, it needs to be in the spec

Dan: I also want to refrain from forcing browsers to not copy. E.g., the odd nv12 case, Chrome expands the size outwards

Paul: So if you specify a buffer in the transfer list, but the implementation says it cannot be done, would you prefer it to throw or silently copy?

Eugene: Think we need to silently copy, a browser might not implement this part of the spec
… We might have a buffer that doesn't work like this, e.g., a GPU backed buffer in WebGPU, or other corner cases where we can't guarantee transfer

Paul: Throwing is fine. Detach is an operation that can cause issues. In general I prefer to be explicit, but postMessage as precedent - or generally, transfer lists

Eugene: Another thing we can't transfer is SharedArrayBuffer. That might be preserved by a worker, and we don't want pixels to be changed from under the VideoFrame

Paul: Can make it throw in that case

Eugene: Dan, if you have more comments, please add to the PR
… I'll add detail on the errors thrown, and in what cases

Paul: I approved the PR, add a non-normative note

Youenn: Would be good to check edge cases around structured clone, e.g., if you have the same arraybuffer twice in the transfer list
… Ideally we'd just call structured clone, but good to check all possible cases
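
One of these edge cases can be probed with structuredClone itself. A rough sketch, assuming a runtime with structuredClone (current browsers, Node.js 17 and later); tryTransfer is a hypothetical helper written for this illustration. Per the HTML serialization algorithm, listing the same ArrayBuffer twice is rejected with a DataCloneError before anything is detached:

```typescript
// Returns "ok" if the transfer succeeds, otherwise the name of the thrown error.
function tryTransfer(value: unknown, transfer: ArrayBuffer[]): string {
  try {
    structuredClone(value, { transfer });
    return "ok";
  } catch (e) {
    const name = (e as { name?: string }).name;
    return name !== undefined ? name : "unknown";
  }
}

const frameData = new ArrayBuffer(16);

// Same ArrayBuffer listed twice in the transfer list.
const duplicateResult = tryTransfer(frameData, [frameData, frameData]);

// The failed call throws before detaching, so the buffer keeps its contents.
const lengthAfterFailure = frameData.byteLength;

// A single entry transfers normally and detaches the source buffer.
const singleResult = tryTransfer(frameData, [frameData]);
```

A VideoFrame constructor that reuses the same abstract operation would inherit this behaviour rather than needing to respecify it.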

Eugene: I'll do this. We can't use structured clone as it makes an arraybuffer out of another arraybuffer, and we need to move it to a VideoFrame. But we're aiming to be like structured clone

Paul: Why can't we use structured clone?

Eugene: It takes an object and clones or detaches. For us it's tricky as there's no visible arraybuffer after moving to the VideoFrame, it'll be an implementation detail

Paul: It's an abstract operation, take what you need from it and give it to the VideoFrame object. Call the structured serialize transfer from the HTML spec
… Get the return value, an object with properties, then use those as part of the constructor. We call the operation IsDetachedBuffer, could be something higher level rather than reimplementing

Eugene: Good idea, I'll look into it and answer in the PR

<padenot> eugene (IRC): for reference: https://html.spec.whatwg.org/multipage/structured-data.html#structuredserializewithtransfer this is the operation

Chris: Anything else on this item?

Paul: An extension, once you've encoded your image, for example a video frame, the memory can be acquired again
… So give a 4K frame to the video encoder, it optionally gives you back the memory of the 4K frame
… Ideally we can reuse the texture or buffer cycle. There's an API proposal linked in the issue, an argument in the callback
… You take a regular arraybuffer, give to encoder, it gets detached, the browser might collect it eventually. The frame is encoded so it's useless
… Proposal is to give it back to JS so it can reuse it, if you often have frames of the same size
… VideoFrame can have an arraybuffer inside it. It works for software, not so much for hardware

Eugene: What if we encode the same frame with another VideoEncoder?

Paul: Only works if you transfer
… So give the frame, so the implementation becomes the owner. Saves a large number of allocations of the same size, to speed up the transcoding case

Eugene: This would work nicely with this PR, as we'll take an arraybuffer, transfer to VideoFrame, then transfer to encode, then VideoFrame gives the buffer back. Could consider adding a new callback to give the buffers back
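
On the application side, the reuse cycle discussed here could look roughly like the sketch below. BufferPool is a hypothetical helper, not part of WebCodecs, and the release call stands in for whatever callback the proposal would use to hand a buffer back:

```typescript
// Hypothetical application-side pool of ArrayBuffers, keyed by byte length.
class BufferPool {
  private free: Record<number, ArrayBuffer[]> = {};

  // Reuse a previously returned buffer of this size, or allocate a new one.
  acquire(byteLength: number): ArrayBuffer {
    const bucket = this.free[byteLength];
    const reused = bucket ? bucket.pop() : undefined;
    return reused !== undefined ? reused : new ArrayBuffer(byteLength);
  }

  // Store a buffer handed back by the codec. Contents are not zeroed here.
  release(buffer: ArrayBuffer): void {
    if (buffer.byteLength === 0) return; // detached or empty: nothing to reuse
    const bucket = this.free[buffer.byteLength] || (this.free[buffer.byteLength] = []);
    bucket.push(buffer);
  }
}

const pool = new BufferPool();
const first = pool.acquire(4096);
pool.release(first); // stands in for the encoder returning the buffer
const second = pool.acquire(4096); // the same allocation, reused
const third = pool.acquire(4096); // pool is empty again, so a fresh allocation
```

Frames of the same size then cycle through a small set of allocations instead of allocating on every frame.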

<tidoust> w3c/webcodecs#212

<padenot> w3c/webcodecs#212 (comment)

Dan: I have implementation concerns, due to how frames are refcounted, we may not know when to return them

Paul: This was commented by someone, link in the minutes

ChrisN: Youenn, any view on this?

Youenn: Not sure, maybe copy can be done in one place?

Paul: Might want to zero the memory before handing it back, it'll be touched a lot, costly to send back to the pool. If you can just send the original frame without touching it at all, you save a reallocation, or somewhere in the JS engine having to zero it out
… As a point of reference, if you're attempting to do realtime 4K encoding, just zeroing the memory makes you over budget with DDR4

Bernard: We've seen that with gaming

Youenn: What about closing?

Paul: It's OK to give JS back the memory it knows about. But hard to have new data created from script with data from previous use
… Proposal to make it bounded, rather than reuse the same backing buffer somehow as part of another API call. If we're forced to do that, creates security or performance concerns
… It's all commented in the issue. The actual IDL change is small
… In a realtime scenario with high frame rate and fast software encoder, time durations are annoying
… It also busts the memory caches, so slows down other CPU operations

Youenn: In general we prefer hardware encoders
… I'll look at the issue

Orientation Metadata

Chris: w3c/webcodecs#351

Dan: This has been a hole in the spec for a while. We have frames with orientation metadata internally, e.g., in the container
… Rendered in video element it looks as expected, but copyTo has no way to signal it
… Could hide it, so copyTo acts like the frame is rotated, so developers don't need to know about orientation
… Other way is to expose orientation metadata in the APIs
… Orientated frames will be rare, capture from a rotated camera, we're not getting feedback about that
… Want to feed it into decoders. Is it OK to hide this, or expose the metadata which is what low level video developers are used to
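
The difference between the two options shows up in the dimensions a developer would see. A small sketch, assuming orientation is limited to rotations in 90 degree steps; the Rotation and Size types and the orientedSize function are hypothetical names for illustration, not WebCodecs API:

```typescript
// Hypothetical types: rotation in 90 degree steps, and a pixel size.
type Rotation = 0 | 90 | 180 | 270;

interface Size {
  width: number;
  height: number;
}

// Size of the frame once the rotation has been applied to the coded pixels.
function orientedSize(coded: Size, rotation: Rotation): Size {
  return rotation === 90 || rotation === 270
    ? { width: coded.height, height: coded.width }
    : { width: coded.width, height: coded.height };
}

// A landscape coded frame carrying a 90 degree rotation displays as portrait.
const displayed = orientedSize({ width: 1920, height: 1080 }, 90);
```

If the orientation is hidden, copyTo would work with the rotated size; if it is exposed, developers get the coded size plus the rotation and apply it themselves.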

Youenn: It's not that rare, in Safari iOS. We don't expose VideoFrame there yet. It's important and will happen a lot
… Might be difficult for developers to get it right. Two browsers on same device, one might handle differently to the other
… If you create a VideoFrame by hand, isn't it useful to provide the orientation metadata somehow?
… Thoughts on potential usage of metadata outside of this?

Dan: If you know the frame orientation you can configure the encoder to use it, can have savings
… So there are some reasons to expose it, but things we could get to later as people need them

Youenn: What would be the default?

Dan: Apply orientation metadata first, then encode - unless we add future API controls to change that

Youenn: OK

Paul: How would it work, to not apply the transformation?

Dan: If the encoder config matches the incoming video frame, you wouldn't need to transform

Paul: So you can't do pixel data repack, send to canvas, you lose the orientation at this point, for modifying the frame? You go outside the realm of WebCodecs so the metadata is lost

Dan: Yes. There's CSS properties, but it's a mess

Paul: Could be useful to have implicit but know the orientation anyway. I think I agree with you

Youenn: So there'd be a need for an option to say don't orient before encoding? Do not apply transformation at all

Dan: Can change every frame. Would these back and forth switches be common in practice?

Youenn: Can be quite a bit on change of phone orientation. Size of video frame is the same, so we don't reconfigure

Dan: We'd have to output a new config, a new concept, we don't currently have a non-keyframe config change. Looking at that for color anyway

Paul: An exposed implementation detail proposal is workable

Dan: Worry that per-spec we'd treat as a size change, not what you want

Paul: What is shipped currently in Chrome?

Dan: Currently encoders don't know about orientation, so they'd not see the orientation change, so they'd visually rotate

Paul: The video wouldn't change dimensions, just rotate suddenly

Dan: Something to think about. As people don't complain already, maybe we're making a copy

Bernard: Don't rely on people complaining!

Dan: Sounds like there's general support, but have extensions in the future?

Paul: Thinking about if we don't expose, but what if you use some other API than WebCodecs?

Bernard: I think it'll be a problem, create something like a CVO header extension to keep the orientation

Youenn: Also with background blur, suddenly you have to do orientation, could it be a performance issue as well?

Bernard: Yes

Chris: Seems we're tending toward wanting to expose something?

Dan: Debate on whether to include in first version or not
… I expect we'll implement eventually

Custom Error Types

ChrisN: w3c/webcodecs#669

Eugene: I was looking at how WebCodecs handles different error cases. We have a list of DOMException reasons from the existing spec
… Those reasons are pretty vague, don't cover all our needs
… WebGPU introduces a new subclass of DOMException, GPUPipelineError. Maybe WebCodecs could indicate its error cases through its own exception subclass, to indicate errors with more precision
… As a subclass of DOMException, can be done in a backwards compatible way

Paul: In Firefox we use the standard errors but also have a mechanism to log errors so developers have info without introducing fingerprinting

Bernard: We have almost no useful error info today. An example is hardware resources not being allocated, another is a data issue. Distinguishing those would be better than where we are now
… Another is GPU crashes, etc

Paul: We should do a survey of what we could do

Bernard: Chrome doesn't do what the spec says, also doesn't distinguish data and resource issues

Eugene: Do you think it makes sense to introduce a subclass to convey more info, or go with the DOMException reason names?
… We can indicate in this or that case, which DOMException name is used. Doesn't matter if name matches precisely, so long as people can determine how to act on it
… Might be fingerprinting surface

Paul: Doesn't have to though

Bernard: If you get into a lot of detail in the errors, it might be

<tidoust> [FWIW, interfaces/specs that extend DOMException: GPUPipelineError, OverconstrainedError, RTCError, WebTransportError, see https://dontcallmedom.github.io/webidlpedia/names/DOMException.html ]

Bernard: My concern is we have nothing now

Dan: We originally wanted to add recoverable and non-recoverable error paths. I'd support having an error type that distinguishes those

Bernard: Distinguish data from resource errors
… Could also use WebGPU errors, e.g., if there's a GPU pipeline error

Eugene: Those errors are very WebGPU specific, don't want to go there
… So use OperationError for resources and EncodingError for data

Bernard: sounds good

Eugene: So don't add a subclass at this stage, just clarify the names

Dan: Could be useful to add it in the future
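
From an application's point of view, the outcome of this discussion could be handled roughly as follows. A sketch only, assuming codec error callbacks receive exceptions whose name is the existing DOMException name "OperationError" for resource problems and "EncodingError" for data problems; classifyCodecError and CodecErrorKind are hypothetical names:

```typescript
type CodecErrorKind = "resource" | "data" | "other";

// Map the name of an error reported by a codec to a coarse category.
function classifyCodecError(error: unknown): CodecErrorKind {
  const name =
    typeof error === "object" && error !== null && "name" in error
      ? String((error as { name: unknown }).name)
      : "";
  if (name === "OperationError") return "resource"; // e.g., codec resources unavailable
  if (name === "EncodingError") return "data"; // e.g., malformed input data
  return "other";
}
```

An application could then retry or fall back to another codec configuration on "resource", and drop or repair the offending input on "data".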

Next meeting

Chris: Meeting in two weeks, scheduled for later in the day, let me know if you prefer this earlier time instead

[adjourned]
