13:57:25 RRSAgent has joined #me
13:57:29 logging to https://www.w3.org/2024/08/20-me-irc
13:57:29 cpn has joined #me
13:57:47 meeting: Media and Entertainment IG
13:57:54 chair: Chris_Needham
13:58:03 ohmata has joined #me
13:58:33 present+ Kaz_Ashimura, Chris_Needham
13:59:22 agenda: https://lists.w3.org/Archives/Public/public-web-and-tv/2024Aug/0005.html
13:59:56 present+ Leonard_Rosenthol
14:00:12 present+ Naomi_Schoppa
14:00:13 scribe+ cpn
14:00:18 present+ Go_Ohtake
14:01:00 present+ Hisayuki_Ohmata
14:01:07 rrsagent, make log public
14:01:11 rrsagent, draft minutes
14:01:12 I have made the request to generate https://www.w3.org/2024/08/20-me-minutes.html kaz
14:01:44 present+ Alicia_Boya_Garcia
14:02:18 present+ Francois_Daoust
14:02:38 present+ Andy_Parsons
14:03:00 present+ John_Simmons
14:03:17 rrsagent, draft minutes
14:03:18 I have made the request to generate https://www.w3.org/2024/08/20-me-minutes.html kaz
14:03:36 present+ Tatsuya_Igarashi
14:03:39 tidoust has joined #me
14:03:56 igarashi5 has joined #me
14:04:21 present+
14:05:04 present+ Ewan_Roycroft
14:05:08 leonardr has joined #me
14:05:14 alicia has joined #me
14:06:07 Johnsim has joined #me
14:07:21 topic: C2PA
14:07:21 Topic: Content Credentials
14:07:26 s/topic: C2PA//
14:07:34 Andy; I run the Content Authenticity Initiative at Adobe
14:07:43 s/Andy;/Andy:/
14:07:53 ... C2PA is a Linux Foundation project. I represent Adobe on the steering committee
14:08:10 ... I oversee our open source efforts at CAI. It's a free implementation of the C2PA spec
14:08:17 -> https://c2pa.org/specifications/specifications/2.0/index.html C2PA specs
14:08:24 ... And implementations in Adobe tools, which are based on the open source
14:08:46 rrsagent, draft minutes
14:08:47 I have made the request to generate https://www.w3.org/2024/08/20-me-minutes.html kaz
14:08:54 Leonard: I'm the CAI architect for Adobe, and chairing the technical work at C2PA
14:08:58 ...
I represent Adobe at W3C
14:09:15 Andy: There's an open source DASH player
14:09:49 ... I'll give some background on what content credentials is, then Leonard will describe the specs in more detail, and potential next steps at W3C
14:10:20 ... Two images: generative AI, fake or synthesized photos, are a thing to be concerned about
14:10:29 ... The images fooled forensic experts for a while
14:10:49 ... Around the same time was a photo of the Pentagon on fire, moved the US markets briefly
14:11:09 ... C2PA, understanding image and video provenance as a fundamental building block
14:11:30 ... We've been working on it for 4 years. The spec is moving quite quickly
14:12:09 ... It seeks to address the problem of video, audio, image transparency. C2PA doesn't concern itself with detection. Detecting manipulation by AI or other means is an arms race
14:12:38 ... We have engagement with policymakers on what regulations should require
14:13:21 ... Education about why it's important, how it works, and what it's not. Not about truth, it's about context and trust. Lock icon in the browser
14:13:43 ... Show where things came from
14:14:40 ... Facts about how media was made. Cryptographic data bound to the device, no PII. Editing in Photoshop or Premiere, capture details about what was changed
14:14:52 ... Details about who published it might be captured, in a C2PA extension
14:15:18 ... We'll focus on news, but also applies to creators, who want to indicate they don't use AI
14:15:39 ... CR symbol may soon be as recognisable as the copyright symbol
14:16:01 ... Level 2 overlay, click to show the provenance chain
14:16:45 ... There's UX guidance, the approach to video (uses video.js and dash.js), there's a blue segment, then it turns red, to show it's been tampered with
14:16:57 ... The manifest contains the data. Election integrity
14:17:20 ... C2PA steering committee members. It's a real SDO, under the Linux JDF
14:17:31 ...
It has the W3C IPR policy, and open as we can be
14:17:41 ... Google and OpenAI joined, more interesting ones to come
14:17:51 ... v2.1 draft is soon to be published
14:18:23 -> https://c2pa.org/specifications/specifications/2.1/index.html 2.1 draft page
14:18:55 ... Published a whitepaper in 2020, describing the basis. Then C2PA was formed, aligned on not inventing multiple standards
14:19:26 ... Last year, implementations, e.g., from Leica, on-device signing. Others from Sony, Nikon. Microsoft election integrity product, to let candidates sign content
14:19:49 ... News from social media companies. OpenAI implementation for DALL-E and SORA
14:20:09 ... Browser integration critical for next stage of deployment
14:20:50 Leonard: The spec is for content credentials - this is the end-user term. The internal term we use is 'C2PA manifest'
14:21:54 ... We're at v2.1 spec in public review. Three reasons we went to public review. There are significant changes in 2.1 where we want public feedback
14:22:18 ... Also feedback from orgs like W3C. I talked with Dom at W3C to figure out what that process will be
14:22:38 ... We're discussing doing a formal TAG review, and how it fits into the web, securely, philosophically
14:22:53 ... Enable us to come together to solve provenance
14:23:26 ... Taking the standard through the ISO fast track. It'll be like Unicode and ICC Profiles, where C2PA will continue to maintain the standard and ISO will publish
14:23:41 ... Group meets in November, when we'll have v2.1 final
14:24:12 ... Before defining the spec, we set design goals
14:24:48 rrsagent, draft minutes
14:24:49 I have made the request to generate https://www.w3.org/2024/08/20-me-minutes.html kaz
14:24:56 ... These are the key ones. Don't invent anything new, use existing standards. Don't require cloud storage, or even being online
14:25:20 ... Also applies to digital ledgers or blockchains, we're agnostic
14:25:55 ...
It has to work across the history of an asset, multiple tools and vendors. Has to work for all asset formats: audio, video, images, documents, AI models
14:26:24 ... It's being used in areas we didn't envision, supply chain data, also financial groups who see it as a general mechanism for establishing provenance of something
14:27:34 ... Core technologies we built on: JSON-LD, CBOR, CMS (digital signatures), COSE. Also JUMBF from JPEG, universal box format. It's a BMFF-compatible serialisation model for extending JPEG, but format-agnostic
14:27:55 ... That's how we bundle the data in a C2PA manifest
14:28:39 ... Let's look inside a manifest. There's a manifest store: one or more manifests that describe the history. Each has 3 major chunks: assertion store, claim, claim signature
14:29:20 ... Box structure. Example UI shows data from the manifest, let users understand the provenance
14:29:47 ... Assertions are individual pieces of information. Hard binding, hashes, how to securely bind the manifest to the content
14:30:21 ... There are 4 or 5 hashing methods, depending on the format. For BMFF or other box-based formats, like RIFF or TIFF, we support a box-based hashing
14:30:49 ... For BMFF you can say which boxes are included and which are excluded. Works nicely for non-fragmented MP4 or small assets
14:31:16 ... When there are multiple mdat boxes (pre-computed data that's streamed, come to live streams later)
14:32:02 ... It allows you to have hashes of smaller chunks, per the DASH demo earlier, you can show per-chunk whether it's modified
14:32:39 ... If relevant, you can add actions, which tell the user what took place: was it created from scratch, what edits took place
14:32:58 ... Ingredients, where you compose assets from different assets, e.g., bringing in audio, video, captions
14:33:16 ... Each ingredient could have provenance, so you get a tree through each to the final asset
14:33:26 ...
If you go to contentcredentials.org you'll see examples
14:33:43 ... Other kinds of metadata. XMP, IPTC, metadata, include in the tamper-evident package
14:34:20 ... All the assertions are individually hashed, included in the claim, with info about the claim generator (the hardware or software that does this job)
14:34:50 ... Then sign with an X.509 certificate. You have a merkle tree, hashes of hashes, in the JUMBF serialisation, and embedded in the asset
14:35:19 ... There's functionality specific to generative AI. Identify whether the entire asset or a portion of it came from gen AI
14:35:32 ... IPTC source type, regions of interest (spatial or temporal)
14:36:20 ... You can be specific, e.g., Adobe has an AI that does lip-syncing and translations, so you can identify that gen AI was used, and to do the lip-sync and not anything else
14:36:52 ... We have ingredients and recipes. A prompt, information about the model, and the tools used
14:37:05 q?
14:37:45 ... In C2PA we focused on the how and not the who - the camera, software tools, this is the recipe for how the manifest came to be
14:38:05 ... Info about human and organisation identity was in the 1.0 standard, based on W3C verifiable credentials
14:38:30 ... We didn't have the expertise. So we partnered with CAWG, it's separate, not affiliated with C2PA, but we have good relationship
14:38:52 ... They focused on org and human identity problem in the C2PA ecosystem, things connected to authorship and ownership
14:39:09 ... There's a "do not train" assertion, also in CAWG
14:39:18 ... CAWG focuses on the how and now who
14:39:55 ... Trust model, X.509 certificates, same as the web and PDF. Same algorithms, SHA hashing, elliptic curves, etc
14:40:22 ... We've extended with hardware attestation, as creation can be done in hardware, on camera, or Google Pixel phones
14:40:40 ... We need to make sure hardware devices can have secure attestations.
14:40:51 ...
There's also soft bindings to address durability
14:41:17 ... C2PA has its own trust list, like there's a trust list from the CA Browser Forum
14:41:29 ... IPTC and Project Origin have their own trust list
14:42:07 ... If all data is correct, the manifest is well formed, if it passes the trust list, it's valid
14:42:47 ... Put all this together, you have trust signals. It's not binary, not true or false. Each person makes their own trust decision
14:43:06 ... As a human, look at the information, and decide if you trust the asset. This is key to our thinking
14:43:43 ... We support many file formats. Lots of work done in video and imaging, but audio less so. We want community interest, support for audio formats
14:44:04 ... Can be referenced in the cloud, blockchain, DLT
14:44:17 ... Link headers in an HTTP request to connect
14:44:32 q+
14:45:25 ... We're working on durable content credentials, to discover removed credentials
14:45:41 ... Core spec is normative. We have 8-10 informative docs, from task forces
14:46:03 ... Andy chairs the conformance TF, we have one on AI/ML, another on live video, actively working
14:46:12 ... Watermarking, and user experience
14:46:42 -> https://c2pa.org/specifications/specifications/2.1/index.html#_technical_specifications Ver. 2.1 Tech specs
14:46:42 ... Extensions: CAWG, IPTC
14:46:57 -> https://c2pa.org/specifications/specifications/2.1/index.html#_technical_specifications Ver. 2.1 Guidance & Informative docs
14:47:20 Francois: You mentioned focusing on how, not who. If BBC produce a media asset, they describe how the asset came to be, but how is it tied to BBC?
14:47:45 s|-> https://c2pa.org/specifications/specifications/2.1/index.html#_technical_specifications Ver. 2.1 Tech specs||
14:47:51 i|Guidance|-> https://c2pa.org/specifications/specifications/2.1/index.html#_technical_specifications Ver.
2.1 Tech specs|
14:47:58 rrsagent, draft minutes
14:47:59 I have made the request to generate https://www.w3.org/2024/08/20-me-minutes.html kaz
14:48:06 Leonard: C2PA focuses on the core spec. Other groups build extensions, for the C2PA ecosystem. One group is CAWG, building the identity component for the C2PA ecosystem
14:48:30 ... It will incorporate the identity, so BBC would use their credentials (verifiable credentials or X.509) to establish their identity
14:48:43 ... There are representatives from BBC working in CAWG on this.
14:48:53 -> https://github.com/creator-assertions Creator Assertions WG
14:49:06 ... Identity is key, but C2PA wasn't well set up to do it
14:50:02 Andy: Think of these in combination: C2PA trust list, ensures conformance, then the identity is added and takes responsibility for signing. Both signatures are bound to the asset
14:50:03 q?
14:50:05 q+
14:50:08 ack tid
14:50:09 johnsim has joined #me
14:50:10 ack k
14:50:49 Kaz: Thank you for presenting. C2PA can handle live streaming, the mechanism supports chunk by chunk?
14:51:19 Leonard: You can do individual chunk-based manifests, heavy but doable, but we think one manifest with individual ...
14:51:20 q?
14:51:22 q+
14:52:14 Alicia: You mentioned policy. Would help to understand the goals, what are you looking for from them?
14:52:56 s/with individual .../for the whole streaming data/
14:52:59 Andy: C2PA isn't a policy-making organisation, but we are called on to give input. Regulators see value in this. But we tell them it's not a panacea, so we explain what it isn't more than what it is
14:53:52 ... Catalysts to make this ubiquitous. Browser support, mobile device support, and policy makers for AI, as watermarking not enough. NIST in the US, executive order to figure out standards to apply to AI and the disinformation problem
14:54:00 ohmata has joined #me
14:54:20 ...
When provenance is required, on AI transparency, legislators indicate this is the standard to adopt
14:55:05 Leonard: We're not pushing, but there to support. EU has text and data mining requirements, so W3C has a TDMRep group. That group and us have been helping EU to describe the concept of opting-out
14:55:21 ... We participate as one of many, to help policymakers understand, define terminology for their needs
14:55:51 Andy: Social media is another catalyst
14:56:21 Alicia: You mentioned mobile devices. Is that like browser support, in applications, new metadata in all pictures. What are the implications?
14:56:51 Andy: Privacy-preserving metadata
14:57:23 Alicia: Someone with a C2PA device, can provide data from moment of capture. But doesn't imply use for all capture
14:58:09 Alicia: Authors proving something not made with AI. Sounds like proving a negative, how does it work? Could be a picture of a picture
14:58:57 Leonard: The user doesn't state they didn't use AI, they used what type of camera. EXIF isn't tamper resistant
14:59:20 ... Nothing is tamper proof, so we say tamper evident
14:59:27 s/resistant/evident/
15:01:11 Alicia: You're putting trust in technology for what's a social problem
15:05:21 Leonard: It is a social problem. If you trust a news source, you trust their video, but if you don't trust them
15:05:40 Chris: Different roots of trust, compared to web, with CA Browser Forum
15:07:56 -> https://cabforum.org CA/Browser Forum
15:09:10 Leonard: We've been trying to use the same model
15:10:42 cpn: what would be the next step?
15:11:00 ... is there something we should wait (e.g., 2.1 review end)?
15:11:16 ... outcome from the TAG review?
15:11:37 s/ (e/, e/
15:11:41 s/)?/?/
15:11:56 s/cpn:/Chris:/
15:12:53 Leonard: looking at the API aspect might be worthwhile
15:12:58 ... DASH integration also
15:13:17 Chris: what would be the specific expectation for audio?
15:13:47 Leonard: regarding MP4, fragments can be handled
15:14:42 ...
don't have equivalents for audio
15:14:57 Alicia: MP3 is complicated
15:15:15 ... there is some hack, though
15:15:26 Chris: this is a huge topic
15:16:06 ... how much of the provenance mechanism to be understandable for regular people?
15:16:15 ... and browser integrations
15:16:46 Leonard: we've been very clear that creator has no control
15:17:17 ... for security purposes, etc.
15:17:48 ... browsers are different questions
15:18:05 ... but there are already 3 different browser plug-ins
15:18:35 ... Adobe and Microsoft for Chrome, and another for Safari
15:18:48 Chris: this is interesting
15:19:03 q?
15:19:09 ack c?
15:19:11 ack c
15:19:22 ... any other final questions?
15:19:50 Alicia: what can an end user get for video about "Who"?
15:20:03 ... C2PA itself is focusing on "How"
15:20:32 ... e.g., photo suppliers
15:20:45 Leonard: could be included as part of the Ingredients
15:21:01 Alicia: wasn't very clear to me
15:21:29 ... what is included in the data?
15:21:38 Leonard: let me show...
15:21:51 ... I'm on a content credential site
15:22:05 ... (shows an example)
15:22:12 ... this data has a very long history
15:22:18 ... that is "provenance"
15:22:40 ... this tells me what can be found
15:23:02 ... what tool was used
15:23:15 ... Actions like combined, cropping, ...
15:23:47 ... can click one of the "Ingredients" to see the information
15:24:16 ... (shows one image which doesn't have the credential)
15:24:32 ... we can go back to the RAW file
15:24:59 Alicia: what does "Issued by" mean?
15:25:16 Leonard: signed by Adobe Photoshop 24.1.1
15:25:54 ... regarding "Who", the CAWG is working on that
15:26:16 ... verified by Verifiable Credentials or social media accounts
15:26:43 ... this example has many Ingredients
15:27:32 ... here, nothing is mandatory other than the Hash
15:28:08 Alicia: what if someone checks in a picture using Adobe Photoshop?
15:29:05 Leonard: Photoshop will be certificated
15:29:15 ...
but it depends on implementations
15:29:44 ... each implementation will get its certification
15:30:21 Alicia: makes more sense than giving one specific certificate to Photoshop
15:31:06 Leonard: we ourselves are not working on certificate
15:35:04 Chris: Thanks for your discussion. Let's keep in touch.
15:35:48 Leonard: Please reach out to me if you have any questions
15:35:56 [adjourned]
15:35:58 rrsagent, draft minutes
15:35:59 I have made the request to generate https://www.w3.org/2024/08/20-me-minutes.html kaz
16:25:15 tidoust has joined #me
18:32:58 Zakim has left #me
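
Note on the manifest structure Leonard described at 14:28:39 to 14:34:50 (assertion store, claim, claim signature; assertions individually hashed into the claim; a hard binding hash tying the manifest to the content). The following is only a rough sketch of that idea, not the C2PA format or any C2PA library API. All function and field names are hypothetical. JSON stands in for the CBOR/JUMBF serialisation, and an HMAC with a demo key stands in for the COSE signature made with an X.509 certificate.

```python
import hashlib
import hmac
import json

# Hypothetical stand-in for the signer's key. Real claim signatures use
# COSE with an X.509 certificate, not a shared-secret HMAC.
SIGNING_KEY = b"demo-key-not-for-real-use"


def digest(data: bytes) -> str:
    """SHA-256 hex digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def canonical(obj) -> bytes:
    """Deterministic JSON encoding (stand-in for CBOR/JUMBF)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(claim: dict) -> str:
    """Toy signature over the serialised claim."""
    return hmac.new(SIGNING_KEY, canonical(claim), hashlib.sha256).hexdigest()


def build_manifest(asset: bytes, assertions: dict, claim_generator: str) -> dict:
    """Build a toy manifest: assertion store, claim, claim signature."""
    store = dict(assertions)
    # Hard binding: a hash of the content itself, stored as an assertion.
    store["hard_binding"] = {"alg": "sha256", "hash": digest(asset)}
    claim = {
        "claim_generator": claim_generator,
        # Each assertion is hashed individually and listed in the claim.
        "assertion_hashes": {
            label: digest(canonical(body)) for label, body in store.items()
        },
    }
    return {"assertion_store": store, "claim": claim, "claim_signature": sign(claim)}


def verify(asset: bytes, manifest: dict) -> bool:
    """Check signature, then assertion hashes, then the hard binding."""
    claim = manifest["claim"]
    if not hmac.compare_digest(sign(claim), manifest["claim_signature"]):
        return False
    store = manifest["assertion_store"]
    if set(store) != set(claim["assertion_hashes"]):
        return False
    for label, body in store.items():
        if digest(canonical(body)) != claim["assertion_hashes"][label]:
            return False
    return store.get("hard_binding", {}).get("hash") == digest(asset)
```

Because the signature covers the hashes of the assertions, and one assertion carries the hash of the content, a change to either the content or any assertion makes `verify` return False. That is the "tamper evident" property discussed at 14:59:20, shown here in toy form only.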