IRC log of webrtc on 2022-03-15

Timestamps are in UTC.

14:34:47 [RRSAgent]
RRSAgent has joined #webrtc
14:34:47 [RRSAgent]
logging to https://www.w3.org/2022/03/15-webrtc-irc
14:34:50 [Zakim]
Zakim has joined #webrtc
14:34:56 [dom]
Agenda: https://www.w3.org/2011/04/webrtc/wiki/March_15_2022
14:34:56 [dom]
Slideset: https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf
14:34:56 [dom]
Chairs: Harald, Jan-Ivar, Bernard
15:01:46 [TuukkaT]
TuukkaT has joined #webrtc
15:02:51 [dom]
Present+ Tuukka, Riju, Jan-Ivar, Elad, Guido, Eero, Dom, JohannesKron
15:02:56 [dom]
Present+ Bernard
15:03:04 [eehakkin]
eehakkin has joined #webrtc
15:04:17 [dom]
Present+ Harald
15:04:28 [dom]
scribe+
15:04:58 [dom]
Present+ Varun
15:05:12 [dom]
Recording is starting
15:05:37 [dom]
[slide 1]
15:06:05 [dom]
[slide 3]
15:07:56 [dom]
Topic: TPAC 2022
15:07:56 [dom]
[slide 8]
15:09:58 [dom]
Dom: TPAC is being considered as a hybrid event this year - please indicate whether you think you might attend such an event in person
15:10:27 [dom]
[from online poll: 3 Yes, 4 No, 4 don't know]
15:10:42 [caribou]
regrets+
15:11:14 [dom]
Topic: -> https://github.com/w3c/webrtc-svc/ WebRTC-SVC
15:11:14 [dom]
[slide 11]
15:12:08 [dom]
Bernard: issue #68 relates to behavior of getParameters() - unclear about re-negotiation (vs before/after negotiation)
15:12:49 [dom]
... PR #69 has proposed text that clarifies that we're talking about **initial** negotiation (before/after)
15:13:14 [dom]
... if you re-negotiate, you'll still get the currently configured scalability mode
15:13:57 [dom]
Harald: wfm
15:14:31 [dom]
Jan-Ivar: is this correct? getParameters() algos are very explicit about what you get based e.g. on localDescription
15:14:41 [dom]
... some come from pending, others from current
15:15:30 [dom]
Bernard: let's say you change preference order for codecs, and you renegotiate (e.g. from VP8 with L1T2 to H264 that doesn't support scalability) - what happens then?
15:15:36 [dom]
... at what point do things change?
15:16:03 [dom]
JIB: even without setCodecPreferences, getParameters() may return different values depending on whether re-negotiation is happening or not
15:16:11 [dom]
... e.g. if you have a local offer, it might affect the results
15:16:34 [dom]
Bernard: looking at the VP8→H264 case, what should happen?
15:16:44 [dom]
HTA: as long as you're sending VP8, you should get L1T2 back
15:16:56 [dom]
... when you switch to H264, you get L1T1 back
15:17:07 [dom]
Bernard: that's what I would expect and what the text tries to convey
15:17:37 [dom]
... nothing changes until the new codec starts being used
15:17:53 [dom]
... JIB, could you write up your concern in #68?
15:18:09 [dom]
s/68/68 /
15:18:15 [dom]
RESOLUTION: Continue discussion in issue #68
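The VP8→H264 case discussed above can be sketched as follows. This is a simplified model of the behavior HTA and Bernard describe, not the spec algorithm: the reported mode follows the codec currently being sent, and falls back to L1T1 when that codec cannot carry the configured mode. The `SUPPORTED_MODES` table and the `reportedScalabilityMode` helper are hypothetical, and the H264 entry only mirrors the assumption made in the example.

```typescript
// Hypothetical per-codec table of scalability modes, for illustration only.
// The H264 row reflects the assumption in the discussion (no scalability).
const SUPPORTED_MODES: Record<string, string[]> = {
  "video/VP8": ["L1T1", "L1T2", "L1T3"],
  "video/H264": ["L1T1"],
};

// Returns the mode that getParameters() would be expected to report under
// the reading discussed above: nothing changes until the new codec is in use,
// so the answer depends only on the codec currently being sent.
function reportedScalabilityMode(sendingCodec: string, configuredMode: string): string {
  const modes = SUPPORTED_MODES[sendingCodec] ?? ["L1T1"];
  return modes.includes(configuredMode) ? configuredMode : "L1T1";
}
```

Whether a pending local offer should influence the result, as JIB asks, is exactly what this sketch leaves out.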
15:18:23 [dom]
Topic: -> https://github.com/w3c/webrtc-extensions/ WebRTC-Extensions
15:18:23 [dom]
[slide 14]
15:18:48 [dom]
Bernard: Fippo gathered a list of hardware acceleration bugs that have been encountered
15:19:00 [dom]
... which raises the question of allowing developers to disable hardware acceleration
15:19:31 [dom]
... WebCodecs provides an enum to hint about whether or not to use hardware acceleration
15:19:34 [dom]
[slide 15]
15:19:56 [dom]
Bernard: I looked into 2 approaches: setParameters, setCodecPreferences
15:20:32 [dom]
... the first one doesn't really work since the envelope of changes may not include hardware alternatives
15:20:48 [dom]
... it also only makes sense if mid-stream switch is necessary
15:21:05 [dom]
... the second approach goes through re-negotiation via setCodecPreferences()
15:21:09 [dom]
q+
15:21:31 [dom]
... [slide 16]
15:21:56 [dom]
... How would you discover this?
15:22:03 [dom]
... Media capabilities may need amendment https://github.com/w3c/media-capabilities/issues/185
15:23:33 [dom]
Dom: should this be managed by the browser rather than left for developers to detect and manage?
15:24:03 [dom]
Bernard: this would be useful *when* developers detect a problem so that they don't need to wait for browsers to react to it
15:24:22 [dom]
Florent: there are also cases where a decoder interacts badly with a specific encoder
15:24:46 [dom]
JIB: for setParameters, there are read-only properties
15:25:13 [dom]
... putting it in codeccapability (which is returned to developers) means doubling the number of entries
15:25:22 [dom]
Bernard: you may not have to return it from Capability
15:25:34 [dom]
JIB: but then it doesn't fit very well with a notion of codec preference
15:25:59 [dom]
... we've also moved fingerprinting surface to media capabilities
15:26:13 [dom]
... I wouldn't want to reintroduce concerns without good reasons
15:26:29 [dom]
... it doesn't seem necessary to include that info if it is tackled as a preference
15:27:11 [dom]
Johannes: I understand this as developers wanting to disable hardware encoding as a short-term patch until the browser gets it fixed
15:27:20 [dom]
... it sounds like a recovery mode, more than a capability
15:27:41 [dom]
... also agree it's hard for developers to use it, but that it would have its uses
15:27:56 [dom]
Present+ BenWagner
15:28:20 [dom]
Harald: routing around bugs is for specific implementations of the codec, which requires they know the specific implementation
15:28:44 [dom]
... does that point toward media capability as the right way to go?
15:29:01 [dom]
Bernard: that's where you'd find out if it's "smooth", "power efficient", "supported"
15:29:23 [dom]
Harald: if it's X's hardware encoder with software version Y, that may be the information you need to know whether or not to use it
15:29:31 [dom]
... not sure that fits with the Media Capabilities model
15:29:45 [dom]
Johannes: it would seem challenging
15:29:59 [dom]
... Also, the bugs that have been identified seem to be browser-specific
15:30:35 [dom]
... there are block-lists for this or that hardware; it may be worth investigating the possibility of moving towards dynamic blocklists from browsers
15:31:28 [dom]
Riju: we share the GPU blocklist defined in Chrome with our driver team to get them fixed platform by platform
15:31:56 [dom]
Harald: no clear resolution, but some suggested paths worth exploring
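The second approach Bernard describes (re-negotiation via setCodecPreferences()) can be sketched as a reordering of the codec list. Codec capabilities carry no hardware/software flag today, so this only shows the reordering step; `CodecLike` and `preferCodec` are hypothetical names, and which codec maps to a software implementation is left to the application.

```typescript
// Minimal structural stand-in for RTCRtpCodecCapability, so that the helper
// can be exercised outside a browser.
interface CodecLike {
  mimeType: string;
  clockRate: number;
  sdpFmtpLine?: string;
}

// Moves every entry matching `mimeType` to the front of the list, keeping the
// relative order of the matching and non-matching entries.
function preferCodec<T extends CodecLike>(codecs: T[], mimeType: string): T[] {
  const wanted = mimeType.toLowerCase();
  const first = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const rest = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return [...first, ...rest];
}

// In a browser, this would be used roughly as:
//   const caps = RTCRtpSender.getCapabilities("video");
//   transceiver.setCodecPreferences(preferCodec(caps.codecs, "video/VP8"));
//   // ...followed by a new offer/answer exchange for the change to apply.
```

Because the switch only takes effect after re-negotiation, this fits the recovery-mode use Johannes describes better than a mid-stream switch.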
15:32:10 [dom]
[slide 17]
15:32:36 [dom]
Harald: issue #99 about RTP header extension
15:32:57 [dom]
... if an implementation supports an extension, it doesn't show up in Capabilities at the moment
15:33:16 [dom]
... is this problematic? if not, no change needed; if it is, we may need to surface that it exists but is disabled by default
15:33:38 [dom]
... you can get the information by inspecting the offer, so this may not be needed
15:34:13 [dom]
Bernard: it's a convenience in the use case; there will be scenarios where you don't want to set it on by default
15:34:20 [dom]
Dom: is anyone asking for it?
15:34:51 [dom]
JIB: if this is for debugging, looking at the SDP is fine; if it's to control running code, it should be an API
15:35:17 [dom]
Harald: the most likely example would be if transport-cc is not supported, I fall back to another congestion control
15:35:38 [dom]
... I think it can be shimmed by creating an offer and dancing with a throw-away peer connection
15:37:34 [dom]
Dom: not hearing a lot of pushback, nor a lot of demand either; maybe wait until we have more demand if it can be designed in a way that is backwards compatible
15:37:46 [dom]
Harald: yes, it can be done later in a backwards compatible way
15:37:55 [dom]
RESOLVED: close #99 with no change
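The shim Harald mentions (inspecting the offer of a throw-away peer connection) can be sketched as below. The helper name `offeredHeaderExtensions` is hypothetical; it only reports extensions that appear in the offer, so anything the implementation supports but does not offer by default stays invisible to it.

```typescript
// Extracts the URIs of the RTP header extensions listed in an SDP blob.
// Each extmap line has the form: a=extmap:<id>[/<direction>] <URI> [attributes]
function offeredHeaderExtensions(sdp: string): string[] {
  const uris = new Set<string>();
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^a=extmap:\d+(?:\/\S+)?\s+(\S+)/.exec(line);
    if (match && match[1]) uris.add(match[1]);
  }
  // Duplicates across m-sections are collapsed; first-seen order is kept.
  return Array.from(uris);
}

// In a browser, the throw-away peer connection dance could look like:
//   const pc = new RTCPeerConnection();
//   pc.addTransceiver("video");
//   const offer = await pc.createOffer();
//   pc.close();
//   const extensions = offeredHeaderExtensions(offer.sdp ?? "");
```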
15:38:02 [dom]
Topic: -> https://github.com/w3c/mediacapture-screen-share/issues/209 Avoiding the “Hall of Mirrors”
15:38:02 [dom]
[slide 21]
15:38:26 [dom]
[slide 22]
15:38:57 [dom]
Present+ Youenn
15:39:29 [dom]
[slide 23]
15:40:40 [dom]
[slide 24]
15:41:19 [dom]
Elad: the proposal would be to add a new member to DisplayMediaStreamConstraints à la includeCurrentTab to hint to the UA whether or not to include the current tab
15:41:22 [dom]
[slide 25]
15:41:40 [dom]
Elad: influencing the user decision in picking display surfaces has security implications
15:41:59 [dom]
... but I argue that in this case, it is not problematic: the risks of selection are of two kinds:
15:42:26 [dom]
... - the attacker influences the user to share a surface under the attacker's control
15:42:46 [dom]
... - the attacker influences the user to share a tab with sensitive content (e.g. their bank account)
15:42:57 [dom]
... but excluding-self is orthogonal to these
15:43:00 [dom]
[slide 26]
15:43:20 [dom]
... if we agree this is worth solving, the question becomes what the default value should be
15:43:42 [dom]
... if we make it optional, this could be left as a UA dependent default
15:43:45 [dom]
[slide 27]
15:44:05 [dom]
... a potential expansion would cover additional surfaces (e.g. screen)
15:44:34 [dom]
JIB: #209 has the detailed discussion - what is the proposal we're reviewing?
15:44:56 [dom]
Elad: I suggest adding a dictionary member (either include or exclude) that serves as a hint, with no change to current behavior
15:45:18 [dom]
JIB: I like this API, but would want the default to be "false"
15:45:43 [dom]
... I don't think this is so much about hall of mirrors - a symptom that the UA could address either way
15:45:58 [dom]
... the real issue is that in many cases, self-capture is NOT the intent
15:46:09 [dom]
... long term, self-capture would be getViewportMedia
15:46:45 [dom]
... some sites want self-capture to be part of the selection - they would need to opt in
15:47:00 [dom]
... also, TAG guidance is that undefined maps to false
15:47:01 [dom]
q+
15:47:40 [dom]
Elad: re default true - agree
15:48:01 [dom]
... re alternative approaches Youenn suggests, I don't think it works for current tab (it would work for current screen)
15:48:18 [dom]
... I agree with your characterization that the root cause is if you're not ready to self capture
15:48:56 [dom]
... I suggest we don't take getViewportMedia into account since there is little visibility in terms of its adoption
15:49:13 [dom]
... I think we should avoid breaking apps, even if briefly
15:49:50 [dom]
JIB: I think we should keep that separate from what implementations do
15:50:29 [dom]
... here the question is what's the most frequent case; most sites wouldn't want it
15:50:44 [dom]
Elad: lots of self-capture happening every year; I assume a lot of it is not accidental
15:51:07 [dom]
Youenn: re security, the current spec doesn't deal much with tab capture in that regard
15:51:47 [dom]
... we're bringing more and more control to what UAs will show, and that means we need to strengthen the guidance to UAs
15:51:59 [dom]
... Chrome has some mitigations in this space that might serve as a starting point
15:52:08 [dom]
... If this is a hint, this is fine
15:52:28 [dom]
... Some implementations might remove entirely the possibility to select the tab, that's something new
15:52:56 [dom]
... hints allow pushing users towards the more meaningful choice, but leave the user in charge of the final choice
15:53:14 [dom]
... re hall of mirrors - I don't think this is solving it
15:53:28 [dom]
... some native apps have implemented current-app blurring to solve the issue
15:53:48 [dom]
... cropping would be another way to solve the issue
15:54:29 [dom]
... if it's only a hint, it's fine; but if it brings a required behavior, I don't think we should go there
15:55:39 [dom]
... also want more security guidance
15:55:55 [dom]
... and keep issue open on addressing other aspects of hall of mirrors
15:56:09 [dom]
Elad: could you help with the security guidance?
15:56:49 [dom]
Youenn: Ideally would like to get the work that Chrome has done
15:58:08 [dom]
Dom: +1 on a hint; if boolean is problematic, we can use an enum to avoid the default value fallback
15:58:33 [dom]
Elad: happy to help with getting the security considerations with guidance from Youenn on what he wants to see
15:58:44 [dom]
Harald: hearing overall support to continue in that direction, towards a hint
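Dom's point about using an enum rather than a boolean can be illustrated with a small sketch. The member name `currentTab` and its values are placeholders, not an agreed API: with a boolean, an absent member reads as false, whereas an absent enum member can mean "the UA decides".

```typescript
// Hypothetical enum-valued hint; name and values are placeholders only.
type CurrentTabHint = "include" | "exclude";

// Hypothetical shape of the getDisplayMedia() argument carrying the hint.
interface DisplayMediaOptionsSketch {
  video: boolean;
  currentTab?: CurrentTabHint; // absent means the UA applies its own default
}

// Builds the options object, leaving the hint out entirely when the app has
// no preference, so that no default is implied on the app's behalf.
function buildDisplayMediaOptions(hint?: CurrentTabHint): DisplayMediaOptionsSketch {
  const options: DisplayMediaOptionsSketch = { video: true };
  if (hint !== undefined) options.currentTab = hint;
  return options;
}
```

Either way the value stays a hint: the user remains in charge of the final choice, as Youenn asks.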
15:59:08 [dom]
Topic: -> https://github.com/w3c/mediacapture-screen-share/issues/184 Display Surface Hints
15:59:08 [dom]
[slide 30]
15:59:26 [dom]
Elad: similar to previous issue, but distinct
15:59:46 [dom]
... some apps want to hint to the UA that they are geared toward a particular display surface type
16:00:25 [dom]
... I think there is agreement that this is worth supporting
16:00:42 [dom]
... but we've struggled to find an approach that everyone likes
16:01:00 [dom]
... I'm suggesting a compromise based on the discussion which would be:
16:01:10 [dom]
... - use constraints as a mechanism
16:01:19 [dom]
... - make it a hint with UA dependent behavior
16:01:55 [dom]
Youenn: hint is fine; it could be a constraint as a model, but with an improved simpler WebIDL surface
16:02:05 [dom]
Elad: reject on "exact"?
16:02:13 [dom]
Youenn: "exact" would be ignored
16:02:38 [dom]
Harald: -1 in integrating this in the proposal - I hate irregularities
16:02:59 [dom]
JIB: +1 to Harald; "exact" is already a type error in getDisplayMedia which already narrows down the constraint mechanism
16:03:32 [dom]
... agree with reusing displaySurface
16:04:16 [dom]
... I have concerns with an app asking for a monitor - I don't think we should provide this level of control
16:04:34 [dom]
... I proposed text to steer away users from monitor capture
16:06:49 [dom]
Elad: this is a hint - UAs can decide not to follow it
16:07:01 [dom]
Dom: with a hint, UAs can provide the best experience they can
16:07:30 [dom]
... not sure the SHOULD would achieve much if the main target isn't interested in SHOULD
16:07:39 [dom]
Youenn: the SHOULD would be useful for new implementors
16:07:58 [dom]
Elad: there is merit to that
16:08:12 [dom]
... non-normative language pointing to the risk would be good
16:08:45 [dom]
JIB: the SHOULD already allows for this; given Chrome has a good motivation, this feels like an exact reason why SHOULD would be used
16:09:58 [dom]
RESOLUTION: modulo discussion on SHOULD guidance, we adopt the displaySurface constraint proposal to manage Surface Hints
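A minimal sketch of what the adopted direction could look like from an app, assuming the displaySurface constraint is passed as a bare value. The helper name `displayMediaWithSurfaceHint` is hypothetical; a bare value is used because, as JIB notes, "exact" is already a TypeError in getDisplayMedia.

```typescript
// Surface types an app might hint at.
type DisplaySurface = "monitor" | "window" | "browser";

// Builds a getDisplayMedia() argument carrying the surface type as a hint.
function displayMediaWithSurfaceHint(surface: DisplaySurface) {
  return { video: { displaySurface: surface } };
}

// Browser usage (the UA remains free not to follow the hint):
//   const stream = await navigator.mediaDevices.getDisplayMedia(
//     displayMediaWithSurfaceHint("window"));
```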
16:10:18 [dom]
Topic: -> https://github.com/w3c/mediacapture-viewport getViewportMedia update
16:10:18 [dom]
[slide 31]
16:10:36 [dom]
JIB: FYI, there is a PR up to describe getViewportMedia which I hope to bring to a call for adoption soon
16:11:05 [dom]
-> https://w3c.github.io/mediacapture-viewport/ Viewport Capture Unofficial Draft
16:11:56 [dom]
Youenn: we probably need a different set of constraints than the ones for getDisplayMedia
16:12:10 [dom]
... re audio, we need to think about whether to include system level audio or just current tab
16:12:24 [dom]
JIB: currently restricted to current tab
16:12:36 [dom]
Harald: if it can't be isolated, no audio should be captured
16:13:05 [dom]
JIB: there are pending PRs that I hope will be merged before we start the call for adoption
16:13:27 [dom]
Elad: the general intent of this work is awesome; looking forward to seeing it implemented
16:14:01 [dom]
... that said, until we see it adopted, we need to be careful in basing our decisions on this work, or consider relaxing some of the restrictions
16:14:18 [dom]
Youenn: has there been any outreach to web developers re x-origin isolation?
16:14:34 [dom]
Elad: the feedback I got from developers was this was a blocker for them
16:14:42 [dom]
Bernard: ditto
16:14:56 [dom]
JIB: I agree this is taking the long view here
16:15:10 [dom]
... hence the flexibility we're showing on getDisplayMedia
16:15:30 [dom]
... re using different constraints, we can change it when it shows as needed
16:15:42 [dom]
Youenn: displaySurface would be one case where this is needed
16:15:50 [dom]
Topic: -> https://github.com/w3c/mediacapture-extensions/ MediaCapture Extensions proposals
16:15:50 [dom]
[slide 34]
16:16:22 [dom]
Riju: this is follow up from a conversation that started at TPAC
16:16:24 [dom]
[slide 35]
16:16:46 [dom]
Riju: PR #48 is allowing in-browser face detection
16:16:58 [dom]
... when we showed this last time, the feedback included:
16:17:10 [dom]
... - tie it to VideoFrame rather than MediaStreamTrack, which the PR reflects
16:18:30 [dom]
... - future-proofing the bounding box approach - this is addressed with the Contour described in the PR, with a way for the developer to request something other than the default 4
16:19:11 [dom]
... - another request was to have a face mesh - which is now exposed as an additional property (although there is no native support for it today)
16:19:37 [dom]
... - face expression was raised as a concern, so we removed it
16:20:05 [dom]
... - making face detection work with transform stream
16:20:07 [dom]
[slide 36]
16:20:18 [dom]
Riju: we've put up an example to show how they would work together
16:21:54 [dom]
... we've done early testing that shows improved power consumption - more specific numbers to be shared soon
16:22:25 [dom]
Youenn: good to expose it on VideoFrame; but would also be good to expose in requestVideoFrame callback e.g. for use with canvas
16:22:56 [dom]
... re using "exact" constraints - I would expect "exact" not to be allowed in this
16:23:44 [dom]
... There seem to be switches to give hints to cameras - do we need several switches to allow per-algo enabling, or could we have a single "face detection" switch?
16:24:01 [dom]
Riju: e.g. "is face detection supported"?
16:24:23 [dom]
Youenn: why multiple switches if a single one is good enough, leaving it to the Web app to deal with what they're obtaining
16:24:56 [dom]
Riju: for instance, contour points would allow future support for additional more detailed contours
16:25:45 [dom]
Youenn: since the camera is doing the work, not clear we need to give more hints to the driver
16:27:44 [dom]
Riju: contour/mesh were added for extensibility
16:28:00 [dom]
Youenn: maybe reduce to what's implementable, while future-proofing it
16:29:43 [dom]
Bernard: high level questions about the API surface
16:30:05 [dom]
... I understand the supported constraints & capabilities are used to provide the basic parameters for the algorithm in the driver
16:30:17 [dom]
... videoFrame.detectedFaces is already done by the driver
16:30:46 [dom]
... as opposed to having a promise-based method to which the parameters would be given
16:30:59 [dom]
... if your camera driver doesn't support it, you wouldn't have it
16:31:23 [dom]
Riju: going through promises would impact performance and redo work the driver has already done
16:32:12 [dom]
... OS level face analysis would duplicate computation already done in the driver
16:32:33 [dom]
JIB: so, it's a camera API - only available to sources that are camera?
16:32:36 [dom]
Riju: right
16:32:57 [dom]
JIB: my concern is that there is another effort in the WICG, the shape detection API - how does it relate to it?
16:33:18 [dom]
... would be unfortunate to have to deal with face detection differently depending on the source
16:33:33 [dom]
Riju: shape detection works on images, can be called multiple times
16:33:54 [dom]
... no face tracking available, which helps detect faces across frames efficiently
16:34:25 [dom]
... face detection is based on OS level face analysis, which duplicates the driver work and is less power efficient / robust
16:35:02 [dom]
... we started from that API in our effort in this space - we feel this new approach gives much better results
16:35:19 [dom]
... FaceDetector is only supported in Windows atm; the work has stopped afaict
16:35:39 [dom]
Bernard: so you're saying the WICG work is not going ahead?
16:36:11 [dom]
Riju: I can check the status with Reilly (but my team was the one behind the implementation)
16:36:32 [dom]
Harald: I share some of JIB's worries
16:36:45 [dom]
... we have functions today that depend on high quality face detection e.g. background blur
16:36:57 [dom]
... I'm worried about having these different interfaces to solve the same problem
16:37:11 [dom]
... esp if some interfaces end up proprietary
16:37:30 [dom]
... if the proprietary interfaces provide much higher quality than what standard interfaces can provide
16:37:42 [dom]
... hence my pushback on making contours and meshes available in the API
16:38:22 [dom]
... I'm still not happy with the design that seems to be totally focused on centering this on hardware/driver resources rather than a representation API
16:38:44 [dom]
... it has a bit of that flavor, but there is still a lot of a sense of configuring the camera
16:39:01 [dom]
... also I'm surprised this only gives a 50% factor over MediaPipe
16:39:15 [dom]
... but in general, this feels like a major new way of treating media information
16:39:26 [dom]
... I'd like to see it proposed as a proposal, not as a set of API patches
16:39:56 [dom]
... with an explainer, use cases, examples - that we typically put together before agreeing to take it up
16:40:16 [dom]
Riju: no need to configure the driver
16:40:22 [dom]
... the PR includes a PR
16:40:39 [dom]
s/a PR/examples/
16:42:00 [dom]
Harald: I'm thinking of what applications would use this for, what problems it would solve
16:42:08 [dom]
Dom: what an explainer would cover
16:42:12 [dom]
Riju: I can come up with that
16:43:13 [dom]
Dom: happy to help with the logistics of making it happen
16:43:29 [dom]
Riju: is the question about whether this is useful or not?
16:43:30 [dom]
Harald: yes
16:43:41 [dom]
Bernard: or rather whether it handles all the use cases people want
16:44:31 [dom]
Jan-Ivar: e.g. tying this with camera may become obsolete or too limiting
16:44:44 [dom]
... having an API that isn't as strongly tied to hardware acceleration
16:45:52 [dom]
Harald: I'd like to have a better understanding of which apps want a rectangle around a face
16:46:08 [dom]
Youenn: encoders actually optimize around faces if such metadata are available
16:46:44 [dom]
... +1 on defining an API that can obtain metadata from the hardware or a TransformStream
16:49:21 [dom]
JIB: among other things, having less hardware-dependency allows UAs to step in
16:49:56 [dom]
[slide 37]
16:50:09 [dom]
Riju: backgroundBlur has more platform API support than replacement
16:51:51 [dom]
Youenn: iOS has the ability to switch on & off background blur, fully outside of the Web app, and fully dynamic
16:52:11 [dom]
... the Web app could not unblur if the user has set this up at the OS level
16:52:18 [dom]
... (but not vice versa)
16:52:26 [dom]
... that situation is not well supported by constraints
16:53:02 [dom]
... we may need a way to surface whether a constraint *can* be changed (and to signal when it can no longer be changed)
16:53:34 [dom]
JIB: this is a case where constraints work very well - the app states its ideal
16:53:50 [dom]
... background blur is popular, would be good to support it
16:54:15 [dom]
Youenn: I don't think "ideal" suffices to expose the situation
16:55:24 [dom]
... re backgroundBlur level - it's not settable on iOS; are there platforms that would benefit from it?
16:55:42 [dom]
Riju: no platform API supports this, but some software models have that parameter
16:56:03 [dom]
... but I understand some platforms are working towards making it settable
16:56:18 [dom]
Youenn: but without knowing the algorithm, setting a particular value would be hard for developers
16:56:36 [dom]
... we may need a boolean instead
16:57:21 [dom]
JIB: part of the question is whether this needs to be controllable by apps vs the UA
16:57:56 [dom]
Harald: in audio, we've encountered cases where manipulating settings that are supposed to be useful in the driver actually creates issues
16:57:59 [dom]
... e.g. double echo cancellation control
16:58:29 [dom]
... the most important control we have is to turn platform effects off; the second was to detect the situation to ask the user to turn it off
16:59:41 [dom]
Riju: on the last three proposals (lighting correction, face framing, eye gaze correction), any sense of interest?
17:00:30 [dom]
... the goal is to give options to developers on whether or not to use hardware capabilities
17:00:55 [dom]
Bernard: should we get back to this in April?
17:01:50 [dom]
JIB: from Mozilla's perspective, we don't have strong interest in this approach given possible interop cross-OS issues
17:01:56 [dom]
... we don't see any urgency
17:02:29 [dom]
Harald: for face detection, we have a pretty solid way forward via the explainer with use cases and justifications to support adoption
17:02:40 [dom]
... some of these additional camera controls may fit into that new document
17:03:44 [dom]
... if we accept constraints as a way to control camera drivers, grouping them together makes sense
17:04:04 [dom]
JIB: but adding individual constraints is something we've used mediacapture-extensions for in the past
17:04:22 [dom]
Youenn: the complexity of a boolean constraint is very different from the more complex API detection
17:05:17 [dom]
Dom: I'll work with the chairs to agree on a clearer path forward then :)
17:05:53 [dom]
RRSAgent, draft minutes
17:05:53 [RRSAgent]
I have made the request to generate https://www.w3.org/2022/03/15-webrtc-minutes.html dom
17:05:57 [dom]
RRSAgent, make log public
20:26:39 [Zakim]
Zakim has left #webrtc