W3C

– DRAFT –
Audiovisual Media Formats for Browsers Community Group

15 September 2022

Attendees

Present
Alan_Bird, Bhoomika_Bhagchandani, chris, Chris_Lilley, Chris_Needham, Cyril_Concolato, David_Singer, Eric_Carlson, Francois_Daoust, Hongchan_Choi, Hyojin_Song, Ingo_Hoffman, Karen_Myers, Kaz_Ashimura, marie, Pat_Griffis, Paul_Adenot, Tatsuya_Igarashi, Thomas_Guilbert, Timo_Kunkel, Wendy_Seltzer, Wolfgang_Schildbach, wseltzer
Regrets
-
Chair
Pat, Timo
Scribe
cpn, Karen

Meeting minutes

Introduction

Timo: let's do some introductions, give a short presentation and then discussion and next steps

Timo: I'm Timo Kunkel with Dolby, in the CTO office with Pat, doing imaging and HDR, and want to discuss these things

Pat Griffis, Dolby: I'm VP at Dolby in CTO office; we're neophytes in W3C
… we are media format experts
… I helped to standardize HDR in SMPTE
… we also have some audio experts
… looking forward to finding a win/win for W3C

Eric Carlson, Apple: I'm an engineer on WebKit

Dave Singer, Apple: used to be on media team, now on web team; look after W3C politics

Alan Bird, Global Business Development Leader for W3C

Igarashi, Sony

Eugene Zemtsov

Hyojin Song, LG

Wendy Seltzer, W3C Team, strategy lead and counsel

Kensaku: I work on WebRTC

Marie-Claire Forgue, W3C

Chris Needham, BBC: co-chair of M&E IG and WG

Paul Adenot, Mozilla: work on Firefox and editor of a couple specs

@@

Thomas Guilbert: work on Chromium

Cyril Concolato, Netflix, File Format chair at MPEG and Storage and Transport Format chair at AOM

Chris Lilley, W3C, CSS Color 4 and 5, W3C rep to Int'l Color Consortium

Ingo, Fraunhofer

Karen Myers, W3C

Hongchan Choi

Alan: And on Zoom?

Wolfgang Schildbach, Dolby

Bhoomika, Amazon

Eric Portis, CTO team at Cloudinary

<eeeps> Cloudinary

Riju, Chromium developer

What to do?

Timo: We made some slides to get the ball rolling

[slide 1] Why are we here?
… Audio Video is a dominant use case on the web; they are advancing, offering more features
… spatial audio formats, high dynamic range, AR/VR, all these things are coming as well
… and can benefit consumers and users
… also for multi-channel audio, complex streams

<chris> Are the slides online? Link?

[slides available afterwards]

Timo: these advanced formats are either difficult or impossible on the web
… many of the features might not be supported
… the cause can be implementation complexity, limited documentation, or lack of unique identifiers
… in other markets these formats are widely deployed
… mobile phones, sound bars, home audio
… in 2020, 58% had HDR functionality and that number has likely gone up
… So because the playback devices support this, the content providers satisfy requirements
… Consistent experiences are not yet possible
… these are opportunities for us to tap into
… there is an interest to make them possible
… we thought about some typical examples
… one is playback of video content
… one thing we talked about at the HDR & WCG Workshop was compositing of complex content
… can be spatial or temporal
… elements originate from different formats and servers
… could be HDR, different formats
… similar situation on sound side
… audio in monochannel, etc.
… graphical user interfaces

<chris> https://www.w3.org/Graphics/Color/Workshop/talks.html#compos

Timo: and less common applications that use these formats
… or could use them in future
… Go one step further, as I learned at IBC last week, and discussed in another W3C workshop last year
… is content creation and modification
… common thing consumers are already doing
… in SDR
… next step to HDR
… can we create media with the addition of common standards in advanced formats
… can we manipulate the advanced metadata
… another example is assess, preview and manage content
… for the frontend of online storage providers
… and preview that
… can do with stereo and SDR
… nice to preview in the file
… So this brings us to the problem statement
… media format support by W3C does not sufficiently cover the requirements of today's technology & content
… question is, is this true; and what should happen next to solve this issue
… and to form a CG
… and what is required to improve the quality of experience
… specify how to call open and commercial formats and all their features
… and how to uniquely identify formats
… and provide info about playback system capabilities to browser
… one thing important is to discuss privacy concerns such as fingerprinting
… and we need to keep in mind and find a solution
… that doesn't affect quality
… what is a good middle ground for both concerns
… last one is to avoid risk to interoperability due to fragmentation of features across browsers
… Another aspect we see a lot is to get more experts at the table
… talk about the technologies behind these formats
… come to W3C to discuss
… and alleviate IP fears that may stop them from participating
… we need both web experts and entertainment format experts
… we need the support and help from all of you
… happy to engage and be involved with all of this
… Summary: Current challenges with advanced media formats
… new media innovations are emerging which W3C should address
… why this new CG
… discuss how to improve quality of experience
… engage experts and companies behind formats and alleviate IP concerns
… We don't have all the answers yet, but looking forward to fruitful discussions and how to address
… thank you for listening
… I will make sure I can share those slides after the meeting
… Do you agree or disagree with what we presented
… what are challenges

Pat: we have been talking with a lot of you
… we are not looking at this just as a video problem, pixels, but also audio
… and we represent the content community
… can we take web to the next level to better integrate
… thanks to Alan Bird, Chris Needham in ME IG
… is this a topic of interest?
… your silence is assent?

David Singer: ChrisN, could you give us a background of what is in scope in groups we already have

ChrisN: capability detection for different media formats; and API in the WG
… see if gaps there for detection of support for format
… for HDR there is another CG where we have color experts developing support for how you composite color into a canvas
… for media production, we haven't started anything
… we ran workshop last year, got interest
… and we are really interested in setting something up for web-based production tools in the cloud

David: not sure

ChrisN: I think that covers it

Alan: anything, ChrisL that is not being covered?

Chris_Lilley: three values, normal, wide, and ridiculously wide
… sRGB, P3 and 2020

Pat: there is a rec

ChrisL: that is the point of the joke about 2020
… there needs to be finer-grained support for detection, e.g. lumens in the room
… cannot do turn
… people want to see content, cannot expect people to be sitting in dark rooms the whole time
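
The three values ChrisL mentions match the `color-gamut` media feature (`srgb`, `p3`, `rec2020`), and `dynamic-range` offers a similarly coarse `standard`/`high` split. The following is a minimal sketch of probing them, assuming a `matchMedia`-style function is passed in; the helper names and the fake display are hypothetical, not something presented at the meeting.

```typescript
// Minimal shape of window.matchMedia, declared locally so the sketch
// type-checks and runs outside a browser as well.
type MatchMedia = (query: string) => { matches: boolean };

type Gamut = "rec2020" | "p3" | "srgb" | "unknown";

// Probe from widest to narrowest; the first query that matches wins.
function widestGamut(matchMedia: MatchMedia): Gamut {
  const candidates: Gamut[] = ["rec2020", "p3", "srgb"];
  for (const candidate of candidates) {
    if (matchMedia("(color-gamut: " + candidate + ")").matches) {
      return candidate;
    }
  }
  return "unknown";
}

// Coarse two-valued HDR signal: the feature is either "standard" or "high".
function supportsHighDynamicRange(matchMedia: MatchMedia): boolean {
  return matchMedia("(dynamic-range: high)").matches;
}

// Hypothetical stand-in for a P3, HDR-capable display (not a real device).
const fakeP3Display: MatchMedia = (query) => ({
  matches:
    ["(color-gamut: srgb)", "(color-gamut: p3)", "(dynamic-range: high)"].indexOf(query) !== -1,
});

const detectedGamut = widestGamut(fakeP3Display);
const detectedHdr = supportsHighDynamicRange(fakeP3Display);
console.log(detectedGamut, detectedHdr);
```

In a browser the argument would be `(q) => window.matchMedia(q)`. Neither query says anything about ambient light or peak luminance, which is the finer-grained detection gap raised here.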

Pat: open standards, commercial solutions; how do we thread the needle
… the reason we are here
… at least in one area, there are gaps in how things are specified
… by staying true to IPR rules of W3C but make a better solution
… for Audio, there are discussions at Fraunhofer

KazA: team contact for ME and WoT
… wondering if we want to think about object audio, 3D video and geospatial mapping, etc., as well
… seems that scope is broader than audio format
… do you want to think about a bit broader use cases as well?

Pat: let me take that
… I am a believer in crawl, walk, run
… we have issues in 2D HDR
… if XR, AR and 3D
… so much going on
… we are trying to see what is going to happen
… cannot even say Metaverse now
… let's fix ecosystem as it is
… all these new audio formats
… let's do a solution that works for everybody
… and we'll all scratch our heads with metaverse because there are a lot of pixels
… Timo was offering some ideas

Timo: yes, these goals need to be reachable
… there are interesting opportunities
… to keep in mind

ChrisL: you asked if appropriate to form a CG?
… yes, because we don't have agreement on a solution
… at this stage, it's right

Timo: yes, we want to go from stand still to crawl

Pat: what is not clear to us, is where would we go if there were proposals to be made
… you have audio, imaging, a number of groups

Alan: assuming we can move ahead, we would find someone on the team
… standards need to be containable and deliverable
… people like Kaz and ChrisL, are good at saying where each thing needs to go and how to collaborate together
… for example, M&E and Timed Text
… so that is where expertise of tech team comes into play to help guide things
… we would want CG to discuss, build prototypes, and do incubation to see if it even works
… do we have the right five questions answered
… to do in an easier, more relaxed environment, and then we will decide the right next step where to take it
… if it is ready for standardization, might go to Audio, but if needs more incubation could go to M&E IG

ChrisN: could be very obvious stuff
… follow the necessary GitHub issues and we can evaluate it straight away

David: ChrisL was talking about batting around solutions, but what are the questions
… thinking about formats
… oh, the compression format; clearly not that
… even visual size, HDR, color format, channel count
… if audio, mono or stereo
… or bigger questions
… kind of soft; we don't have a good language for talking about them
… and sometimes have to be codec specific

Pat: HDR formats; you say codec, I think AVC, AVS, AV1 compression codecs
… in terms of image formats, the image is separate from the codec
… sometimes have codec separate from the format
… we have run into experiences
… so we want to ID the problem
… so our understanding is the CGs surface the problems, where the WGs solve the problems
… so success means we can identify the problems with enough specificity
… and there may be multiple
… audio may be different from video

Eric_Carlson: probably not
… may be some things specific to web audio
… but Media WG would be the right place to work on these things

ChrisN: other thing is W3C specifications don't mandate which codecs to support

Pat: you mean compression codecs?

ChrisN: codecs and formats

Timo: all properties going through a certain user experience
… non linearity, color experience
… all these properties that a rendering engine wants to get best image quality
… and id display devices to provide best experience
… format
… be more precise

DavidS: you used an important word, rendering
… in '90s it was ok to ask what size it was in
… and you could assume audio was in
… and assume @ but other things out of scope
… or just black art and the platforms can deal with it
… that attitude is out of date for two decades but there are residuals of the rendering format

Timo: we have not thought about fingerprinting
… which I learned about in a W3C meeting
… we are happy to learn about those challenges and find solutions
… we want to find a similar level of experience to TVs
… and see what the roadblocks are

Pat: not to jump into solution space
… in experience we have had
… we learned early on, can be SDR and all flavors of HDR
… the more we specify, the easier it is for the downstream device to know what to do, and it helps with privacy issues
… when end user knows, better experience in cloud
… it's a back and forth issue
… if group agrees, we'll go back and start to point out gaps from imaging and audio side

Rick: thanks
… I work at intersection of browser and camera capture
… what is exact thing you see as a gap
… like JPEG XL

Rick: or looking at cross processing, image process, post processing on the browser?
… not being able to uniquely ID the source content
… that creates problems in one case we have to work around
… I would not go into too much of the problem space today; that is an issue
… current solutions don't work in all cases
… we'll come back with more
… nothing to do with the compression codecs
… for Dolby, we are codec agnostic
… but the format itself for proper decoding
… we are forced to work with browser implementers to come up with one-off solutions

Rick: you want more metadata to be shown?

Pat: yes, I never met "a data" I didn't like [jokes]
… old world of NTSC, production and playback were identical
… all kinds of headaches with SDR and matching
… solution will come down to better ways to ID the browser
… better embracing without violating rules of W3C
… and doing commercial formats

EricP: can you talk about what is missing in media capabilities
… if I understand correctly

<chris> https://www.w3.org/TR/media-capabilities/

Pat: having more unique ways to describe format types
… talk about color primaries, EOTFs
… not enough that you can ID what level in certain formats
… causes all kinds of formats
… not to push on solutions, but maybe it's an identifier
… this is a 27.8
… in SMPTE we have a registry to ID content types
… that metadata
… that also extends to the audio formats
… in an increasingly diverse world would be step one
… that simple step would be a big step forward
… without violating rules of proprietary formats
… and without violating fingerprinting of user
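
For reference, this is a minimal sketch of the HDR-related fields that the Media Capabilities specification defines for a decoding query today. The types are re-declared locally so the block is self-contained; the function name, codec string, and numeric values are illustrative assumptions rather than anything proposed in the meeting.

```typescript
// Local copies of the relevant Media Capabilities enum values and dictionary
// shapes, so the sketch type-checks without DOM typings.
type HdrMetadataType = "smpteSt2086" | "smpteSt2094-10" | "smpteSt2094-40";
type ColorGamut = "srgb" | "p3" | "rec2020";
type TransferFunction = "srgb" | "pq" | "hlg";

interface VideoConfig {
  contentType: string;
  width: number;
  height: number;
  bitrate: number;
  framerate: number;
  hdrMetadataType?: HdrMetadataType;
  colorGamut?: ColorGamut;
  transferFunction?: TransferFunction;
}

interface DecodingConfig {
  type: "file" | "media-source" | "webrtc";
  video: VideoConfig;
}

// Build a query for PQ content in a Rec. 2020 gamut with static
// SMPTE ST 2086 metadata. Resolution, bitrate and frame rate are placeholders.
function buildPqQuery(contentType: string): DecodingConfig {
  return {
    type: "media-source",
    video: {
      contentType: contentType,
      width: 3840,
      height: 2160,
      bitrate: 20000000,
      framerate: 30,
      hdrMetadataType: "smpteSt2086",
      colorGamut: "rec2020",
      transferFunction: "pq",
    },
  };
}

// Illustrative HEVC Main 10 codec string; any supported contentType would do.
const pqQuery = buildPqQuery('video/mp4; codecs="hvc1.2.4.L153.B0"');
console.log(JSON.stringify(pqQuery, null, 2));
```

In a browser the object would be passed to `navigator.mediaCapabilities.decodingInfo(pqQuery)`, which resolves to `supported`, `smooth` and `powerEfficient` flags. The three enums are the whole HDR vocabulary available in such a query, which appears to be the granularity Pat describes as insufficient.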

<riju> https://www.w3.org/TR/webcodecs-codec-registry/#video-codec-registry https://www.w3.org/TR/media-capabilities/#hdrmetadatatype : Something more and more detailed ?

Pat: digress on fingerprint discussion
… need for an EDID 2.0
… end device
… depends upon population

Pat: more data, you can ID and Ndevices
… let's say more about the content and then let the browsers deal with it

ChrisN: Riju just pointed to relevant types
… references SMPTE specs
… question I have
… talking about support for proprietary formats, what kind of identifiers would we introduce?
… do we reference out to other standards bodies, or is W3C the place to define them?
… or do we expect them to come from other standards bodies

Pat: I was thinking it was W3C
… maybe you point to another entity like CTA
… and they create a registry and W3C points to it
… we are jumping to solution space already

David: what makes things at registries hard
… they tend to be binary
… but much of rendering is not binary question
… I will send to stereo, or adjust to dynamic range if I need to
… not simple binary questions any longer
… which results in best effect for the user and the best representation of the author's intent
… and it becomes a much harder problem

ChrisL: let's say it has this and this, PQ, transfer function
… then it also has to say Rec2020, using RGB, etc. and are you really saying you can do any combination of these things
… if it's HLG2100, then I can do that

Eric_Carlson: I can do that but you can send me a smaller format and would be same from user perspective

Pat: we are at crawling now
… next step is we can come back with some problem cases
… I don't want to boil the ocean
… I think there is a delta we can bring without too much complexity
… maybe things on audio and video sides
… we are re-introducing the issue because we see implementation problems

David: what could we express in the markup that allows UA and browser to choose alternative that allows the best effect
… and not download the 22.2 format when it's only stereo
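
A minimal sketch of the selection David describes, assuming the page can list alternatives with their channel counts and the user agent knows how many channels the output can render. The `AudioVariant` shape, the function name, and the URLs are hypothetical; no such markup exists in the discussion.

```typescript
interface AudioVariant {
  url: string;
  channels: number;
}

// Pick the richest variant that does not exceed what the output can render.
// If every variant is larger than the output, fall back to the smallest one
// and leave downmixing to the renderer.
function pickAudioVariant(variants: AudioVariant[], outputChannels: number): AudioVariant {
  if (variants.length === 0) {
    throw new Error("no variants offered");
  }
  const ascending = variants.slice().sort((a, b) => a.channels - b.channels);
  let best = ascending[0];
  for (const variant of ascending) {
    if (variant.channels <= outputChannels) {
      best = variant;
    }
  }
  return best;
}

// Hypothetical alternatives: stereo, 5.1 (6 channels), 22.2 (24 channels).
const offeredVariants: AudioVariant[] = [
  { url: "https://example.com/audio-22_2.mp4", channels: 24 },
  { url: "https://example.com/audio-stereo.mp4", channels: 2 },
  { url: "https://example.com/audio-5_1.mp4", channels: 6 },
];

const chosenForStereo = pickAudioVariant(offeredVariants, 2);
console.log(chosenForStereo.url);
```

A single channel-count threshold is a simplification. As noted earlier in the discussion, rendering is no longer a binary question, so a real selection would also weigh downmix quality and the author's intent.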

Timo: one thing we are thinking about is what are the options available
… can we tell our renderer how to make use of it; is there one; does browser know that
… find an elegant way with a small footprint
… these are desirable solutions
… we are not saying it needs to be this way, but to discuss what is possible

Riju: I introduced it to browsers six years ago, and it has not shipped because there were no use cases

<chris> use case is dynamic tone mapping of hdr content

Riju: I can go back to team and we can ship within this year
… sensors from browsers help you get the job done
… it has been behind a flag for six years

Pat: it is already implemented in millions of TVs based on PQ

Timo: could be an argument that we have an application
… we could use that right away if it were available, and others could use it

Pat: I think the next step is to come back with some very real problems that cause gaps in implementations
… and Pat's implementation, we need to have a unique identifier at least in the imaging space
… codecs are fine but for HDR use cases, a unique identifier would likely solve the current generation of problems
… if we can solve basic problem for video and audio

<riju> Intel uses LACE (with ALS) to optimize display https://www.intel.com/content/dam/support/us/en/documents/graphics/LACE_Graphics_Feature.pdf

Cyril: can you define a unique identifier, what can you not do with CICP today?

Pat: have to go back to the guys on what proper information they need

ChrisL: you have some content shipped in a 2020 container
… mastered in P3
… not on the list of color spaces and cannot describe it
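
For context, CICP (ITU-T H.273) describes a signal with three code points: colour primaries, transfer characteristics, and matrix coefficients. The sketch below decodes a small, deliberately incomplete subset of those code points; the table and function names are hypothetical. It illustrates the point that the triple describes the container: content mastered in P3 but carried as BT.2020 is still signalled with primaries code 9.

```typescript
// A few H.273 code points only; the real tables are much longer.
const COLOUR_PRIMARIES: { [code: number]: string } = {
  1: "BT.709",
  9: "BT.2020",
  12: "P3-D65",
};

const TRANSFER_CHARACTERISTICS: { [code: number]: string } = {
  1: "BT.709",
  13: "sRGB",
  16: "PQ (SMPTE ST 2084)",
  18: "HLG (ARIB STD-B67)",
};

const MATRIX_COEFFICIENTS: { [code: number]: string } = {
  0: "Identity (RGB)",
  1: "BT.709",
  9: "BT.2020 non-constant luminance",
};

interface CicpTriple {
  colourPrimaries: number;
  transferCharacteristics: number;
  matrixCoefficients: number;
}

// Look up a code point, marking anything outside this small subset.
function lookupCode(table: { [code: number]: string }, code: number): string {
  const text = table[code];
  return text === undefined ? "not in this subset (" + code + ")" : text;
}

function describeCicp(triple: CicpTriple): string {
  return [
    lookupCode(COLOUR_PRIMARIES, triple.colourPrimaries),
    lookupCode(TRANSFER_CHARACTERISTICS, triple.transferCharacteristics),
    lookupCode(MATRIX_COEFFICIENTS, triple.matrixCoefficients),
  ].join(" / ");
}

// PQ content in a BT.2020 container. Nothing in the triple records that the
// pictures were mastered on a P3 display.
const pqIn2020 = describeCicp({
  colourPrimaries: 9,
  transferCharacteristics: 16,
  matrixCoefficients: 9,
});
console.log(pqIn2020);
```

Mastering display information, where present, travels in separate metadata such as SMPTE ST 2086 rather than in the CICP triple.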

Cyril: didn't Apple push it?

David: I think we made a mistake in industry
… we realized that just doing coding format isn't enough
… we started to add individual pieces of info
… that don't make sense in all the combination of values
… what I think the media apps are trying to tell us
… be better off with a profile for consumption side so you get interop
… you end up with intrusive probing and all kinds of implementations

Pat: regarding profiles
… that is indeed how we do it; profiles based on the compression codec
… issues of backwards compatibility, HLG
… not go to that level of details, but not all browsers have same level of content
… but today it's a guessing game
… we should come back with specific examples
… and maybe it isn't getting down to a format type
… that is certainly one approach

Alan: maybe the next steps
… CG needs to get formed crisper
… we need a chair
… and someone needs to set up the GitHub repo
… I get lost
… in doing that
… that is our primary work space; get the email list going

ChrisN: the group is created, it has a mailing list, but it doesn't have GitHub

Kaz: talk with Ian Jacobs, CG contact

Alan: then you can start collecting these issues; decide on cadence; set your priorities
… since it's a community-driven thing

Kaz: I am team contact for WGs, but we need to talk to IJ about setting things up
… but I can help

Alan: CGs don't have a specific team contact
… CGs do not have team resources assigned

ChrisL: we can help

Pat: thank you

David: but not end up with overlap in IG and CG
… not discuss same things, so discuss with ChrisN

Timo: yes, we have done that; boil down the scope

Pat: and Chris helped us with the name
… GitHub sounds good for us
… what is a typical cadence for CGs

ChrisL: old style is emails, archive
… with GitHub you can continue to make changes in real time

Chris: I see some groups having a GitHub organization

ChrisL: that sounds heavy at the moment

Alan: some groups have a monthly phone call
… or meet in between TPACs to do things F2F, but every group is autonomous
… you may want to meet monthly or more than that
… having GitHub available allows the work to progress in an offline mode

ChrisN: only suggestion is if you propose to have a meeting, that we coordinate that so there is no overlap

Pat: yes, I would coordinate with you
… how often does TPAC meet?

Alan: once a year

Pat: ok, once a year in the Fall

David: Nothing to stop you from meeting with other groups
… you can have joint meetings during the year

Alan: and if there is a natural place, like NAB where community might be there
… sometimes groups might meet there
… or have someone host a meeting somewhere
… look to both Chrises and Kaz
… how many groups have charters?

ChrisL: we need a charter

Alan: It can just be a paragraph

ChrisN: recommend using the charter template

Pat: we create a charter?

ChrisN: yes, puts more of a description to it

Alan: and then David can get it into Apple

Pat: far be it for me to speak for Apple or Netflix!

ChrisN: Riju are you in the queue?

Riju: no more questions yet

Alan: does this look like something that is interesting to explore and makes sense?

Cyril: I would like to come back to Pat and David's point about having identifiers

<kaz> CG Charter page

Cyril: we have been oscillating in MPEG about codecs
… explosion of values
… and we have more and more manifest-based consumption of media
… one task is how to structure these identifiers and how to align better with media capabilities
… I fear it will grow and grow in every dimension
… where we just need to rationalize it

DavidS: we have overloaded on tools
… and sometimes makes it creaky

Timo: Is manifest a term in W3C?

Cyril: I was referring to the DASH manifest or HLS playlist

Pat: If we come back and show you the gaps, then we can talk about how to address them
… there is a GitHub in our future

DavidS: and you could have a W3C Calendar

ChrisL: I just sent you a message for CG to get a GitHub

Wendy: I hereby give permission

Pat: thank you, Wendy
… Wolfgang, are you on?

Pat: If we can break out the audio and imaging questions
… can be different topics on GitHub; I see them proceeding independently

ChrisN: there is a labels feature to allow you to group things

Pat: in terms of chairs, on imaging side
… Wolfgang
… so happy
… thinking we have two parallel tracks
… audio issues are somewhat different
… my proposal is to have two co-chairs

ChrisL: you will need to sign the CG

Wendy: as long as we are talking about joining the CG

<kaz> Audiovisual Media Formats for Browsers CG page

Wendy: and anyone who participates in GitHub should link to their W3C accounts

ChrisL: it tells us what organization you are in

Wendy: bureaucracy with a purpose

Pat: Wolfgang, on the audio side
… you are looking so happy

Wolfgang: a lot of good positive energy in the room

Pat: and you are happy to cover the audio issues
… are you ok to take the lead?

Wolfgang: yes, I would be happy to

Pat: we had good discussions with ChrisN and Fraunhofer this morning

DavidS: work on charter
… would be helpful to specify what group will produce: proposed solutions, etc.

Alan: you can propose solutions, but they cannot become standards
… if CG builds a prototype or something
… and we think this is a good enough starting point
… someone who is in a group needs to bring the expertise
… need the SME to go with the work

DavidS: but group may ID crisp questions, and suggested directions for the group to pursue
… make a statement about what the charges are
… and look into some suggestions

Pat: solutions sounds too "solutiony"
… identify "options"
… and then go back to producing the standard

DavidS: 90% is asking the right question

<kaz> (also wanted to suggest starting with problem statements, expectations, etc.)

ChrisN: one of roles of the chairs is to be able to assess the consensus of the group
… maybe be awkward to combine role of proposing solutions and also chairing
… we should not just default into that

Pat: yes, we have that at SMPTE
… our company policy is that chair's role is to facilitate
… we want a solution that works for everyone
… I can push on them
… hope there is enough excitement that there will be someone helping us

Alan: And we can change the chair once we get started
… for an execution chair, we change to these individuals

Pat: We welcome you as a chair

ChrisN: I did not say this
… CG charter template has some words in it about how to choose and replace chairs

Pat: I am interested to get work done
… as president of SMPTE, know it's about getting it done; it's a real world challenge
… Wolfgang did not know I would volunteer him
… For now, let's get it started
… try a bi-weekly cadence

ChrisN: other groups meet monthly
… we have regular time slots during

Cyril: May I suggest we meet not on a regular cadence? Just meet when we need to and do the work asynchronously

DavidS: you can make asynchronous progress

Alan: That could make sense
… when we all need to get together
… and not have it pre-scheduled

Cyril: we can still have deadlines and push people

ChrisL: it still has chairs pushing through the issues
… a different way of pushing people forward

Pat: I am all in favor of not having meetings

Kaz: clarify in charter how to make decision and consensus in the group, including having a meeting or not

Pat: you don't have rules on consensus

ChrisL: in general we avoid voting

Kaz: similar to W3C process document, but decision making depends upon the group (this time the CG)

Pat: what about in a WG, is there a more formal process?

ChrisL: we avoid voting unless to break a log jam

Alan: only if stuck and cannot break it out
… and even then we have one more conversation to get to point of no objection

<chris> Voting https://www.w3.org/2021/Process-20211102/#Votes

<chris> Consensus building https://www.w3.org/2021/Process-20211102/#consensus-building

Alan: sometimes why things may take longer in W3C
… people in minority may go off and fork this
… and we're all about one web and one world

Pat: Interesting, in SMPTE, you can object; have a way to resolve assent

ChrisL: you can do a Formal Objection, but it is seen as a big gun

DavidS: rough consensus in CGs is fine

Pat: you try to get consensus if no one objects, but you have to ask
… so that's fine
… ChrisL, you will help us with GitHub
… you will send a template of a charter, ChrisN

Alan: don't send me anything technical

Pat: once we get GitHub set up, I will go back and get specific problem use case number one
… number two, and everyone can start commenting on it
… until we need to meet F2F

ChrisN: W3C has WebEx and Zoom
… whether CGs have access to W3C infrastructure
… we may need to provide our own

Pat: We have some pretty easy technologies
… anything else?

Pat: does anyone object to us proceeding

Alan: Thank you everyone for your time
… Hopefully this will result in some good things

Pat: thank you

Minutes manually created (not a transcript), formatted by scribe.perl version 192 (Tue Jun 28 16:55:30 2022 UTC).

Diagnostics

Succeeded: s/Cyril, Netflix/Cyril Concolato, Netflix, File Format chair at MPEG and Storage and Transport Format chair at AOM/

Succeeded: s/Lilley,/Lilley, W3C,

Succeeded: s/about was/about at the HDR & WCG Workshop was/

Succeeded: s/@/P3/

Succeeded: s/t;/t:

Succeeded: s/for WoT/for ME and WoT/

Succeeded: s/and @/and geospatial mapping, etc., as well/

Succeeded: s/agreement/agreement on a solution/

Succeeded: s/..let's/...let's/

Succeeded: s/@:/Eric_Carlson:/

Succeeded: s/work/word

Failed: s/exel/XL

Succeeded: s/excel/XL/

Succeeded: s/@/browser and camera capture/

Succeeded: s/@/EDID/

Failed: s/@2.0/EDID2.0/

Succeeded: i/let's do some intro/scribenick: Karen/

Succeeded: i/let's do some intro/topic: Introduction/

Succeeded: s/HOG/HLG

Succeeded: s/@/Eric_Carlson/

Succeeded: i/We made some slides/topic: What to do?/

Succeeded: s/Gilbert/Guilbert

Succeeded: s/massive/mastered/

Succeeded: s/do today/do with CSCP today/

Succeeded: s/CSCP/CICP/

Succeeded: s/Alan;/Alan:

Succeeded: s/Gethub/GitHub/

Failed: s/Gethub/GitHub/

Succeeded: s/we are happy hounds/that sounds heavy/

Succeeded: s/IAB/NAB/

Succeeded: s/test based/manifest-based/

Succeeded: s/latest manifest/DASH manifest or HLS playlist/

Succeeded: s/promission/permission/

Succeeded: s/(CG)/(this time the CG)/

Succeeded: s/hit/it

Succeeded: s/rrsagent, make draft//

Maybe present: Alan, ChrisL, ChrisN, Cyril, David, DavidS, EricP, Kaz, KazA, Kensaku, Pat, Rick, Riju, Timo, Wendy, Wolfgang