14:55:01 RRSAgent has joined #me
14:55:01 logging to https://www.w3.org/2021/09/27-me-irc
14:55:06 Zakim has joined #me
15:01:00 ChrisLorenzo has joined #me
15:01:25 present+ Kaz_Ashimura, Chris_Lorenzo, Chris_Needham, Rob_Smith
15:02:13 Meeting: Media Timed Events / Unbounded VTT Cues
15:02:21 Chair: Chris_Needham
15:02:24 scribenick: cpn
15:03:02 nigel has joined #me
15:03:06 present+ Gary_Katsevman
15:03:58 RobSmith has joined #me
15:04:07 Agenda: https://www.w3.org/events/meetings/257432ab-e123-4986-bcbd-a006e9ddbf2c
15:05:19 calvaris has joined #me
15:05:29 present+ Xabier_Rodriguez_Calvar
15:07:19 rrsagent, make log public
15:07:25 rrsagent, draft minutes
15:07:25 I have made the request to generate https://www.w3.org/2021/09/27-me-minutes.html kaz
15:07:55 Gary: At the last meeting, we concluded that the way things are now, there's no benefit to having unbounded cues in the text format
15:08:00 topic: Unbounded cues in WebVTT
15:08:57 Present+ Nigel_Megitt
15:09:12 s/topic: Unbounded cues in WebVTT//
15:09:12 Gary: The reason is that, to support seeking to the middle of a stream, as far as we can tell you'd have to copy each unbounded cue into every VTT segment. Otherwise you'd have to load all the text tracks since the beginning of time, which isn't reasonable
15:09:23 i/At the/topic: Unbounded cues in WebVTT/
15:09:29 rrsagent, draft minutes
15:09:29 I have made the request to generate https://www.w3.org/2021/09/27-me-minutes.html kaz
15:10:22 Rob: I've read the minutes from last time, and the discussion about updating cue attributes
15:10:35 ... There's a requirements document
15:10:54 https://github.com/w3c/media-and-entertainment/blob/master/media-timed-events/unbounded-cues.md
15:11:30 -> https://github.com/w3c/media-and-entertainment/pull/77 PR with updates to unbounded cues
15:11:41 Rob: From an unbounded cues point of view, requirement 1a is what unbounded cues do
15:12:01 ...
There was some discussion about changing other cue attributes, but I don't think we had any use cases for that. Has that changed?
15:13:08 Gary: I think the main thing with other attributes is that it's probably fine if narrow, but the worry is about preventing extending in the future to allow cues to be updated
15:13:43 ... If we narrow the use case to updating unbounded cues to become bounded, so only the end time is set and nothing else changes, that could be restrictive enough not to be an issue
15:14:14 Rob: I'd generally agree. The scope should be limited as to what can be changed. There aren't use cases for changing other attributes
15:14:29 ... We shouldn't rule it out
15:14:41 Gary: There are some for live captioning, but it's too early
15:15:11 Rob: That can be done with unbounded cues, in a different way where the update is done as a new cue, linked at a higher level. It's implementation-specific how to do that
15:15:32 ... This comes back to matching by start time and content, which would allow content to be updated
15:15:48 ... I'd argue that changing the start time or content should create a new cue, rather than changing an existing cue
15:15:58 ... from the point of view of the VTT file format
15:16:19 ... There isn't a mechanism to change existing cues. I don't think the syntax supports updating cues currently
15:16:32 Nigel: The discussion last time didn't identify a reason to do it
15:16:58 ... We don't have a delivery mechanism in WebVTT. For video we have segments, and we can bound the VTT cue time to the segment interval
15:17:13 ... If necessary, repeat the cues. Then you don't run into acquisition issues doing that
15:17:39 ... Having some kind of external model: if you need to update the state of a metadata entity, you can do that in segmented delivery in the same way
15:18:09 ... Updating from chapter 1 to chapter 2 without needing to hunt back for old cues
15:18:52 Rob: I agree, we don't want to have to look back previously.
I didn't understand what you meant by metadata. Are you meaning a (time, value) pair?
15:19:09 Nigel: The entity you're modelling has a lifecycle, which is application specific
15:19:31 Rob: Are you treating changes as instantaneous events at a point in time, or as a value with duration?
15:20:08 Nigel: The information you send can be bounded to an interval, but what it's about can be changing in time
15:20:50 ... The cue has a duration, but the entity it describes may not have the same duration. It's a model maintained in the client application
15:21:28 Rob: For a single segment that starts in chapter 1 and then moves to chapter 2: is there an instantaneous event that says "chapter 2 starts now"?
15:22:00 Nigel: In that application, I'd probably build it by saying there's always an active cue that describes the current chapter
15:22:39 Rob: WebVMT supports that: values can be set in an interval and unset at a later time
15:22:52 ... Unbounded cues allow that to be solved
15:24:39 Chris: Would it help to write this down?
15:25:38 ... A worked example could be helpful
15:26:44 ... There's an open PR to the use case document: https://github.com/w3c/media-and-entertainment/pull/77/files
15:26:58 Gary: This describes the sports score example, and live captioning
15:30:25 Chris: The requirements in the document might not be useful
15:31:24 Gary: You should always represent unbounded cues as multiple bounded cues, and the unbounded-ness isn't in the cues
15:32:08 ... You might overrun by a second when the cue becomes bounded
15:32:49 ... WebVTT gets delivered a segment at a time. You can make cues the duration of the segment. By the time you're ready to deliver the segment, you can set the actual end time rather than the end-of-segment time
15:33:49 Rob: I agree; the issue is that the unbounded cue has an unknown end time, so you're setting a bounded cue with a known end time
15:34:02 ...
so the problem comes if you set a bounded cue across the current segment
15:34:12 Gary: Yes, you're not sending ahead of time
15:34:43 Nigel: For live, the content is segmented
15:35:04 Gary: Fragmented MP4 is basically the same, with small chunks
15:35:38 Rob: That differs from the measurement observation use case. If you take a temperature measurement now, you don't know when the next one will be
15:35:56 ... When the next one arrives, you can update it to the next value
15:36:27 ... In a live case, just use the last known value. But when you replay it, you can interpolate from the last to the next value
15:36:59 ... That makes it simple for implementations: just take a sequence of (time, value) pairs
15:37:38 Gary: The way I'd represent a time measurement in WebVTT is each measurement covering the preceding period of time, looking back instead of looking forward
15:37:57 Rob: In the case I described, there's no need to look back
15:38:21 Gary: Some people use cues with the same start and end time to represent an event, maybe that's the answer
15:39:08 Rob: That's the way WebVMT deals with discontinuities in the data: if there's a break in the data, where there's no value, make an instantaneous cue to say there's no data
15:39:40 Gary: If you seek to the middle of the video, how do you know the state? Do you need to parse all the history?
15:39:43 Rob: Yes
15:39:51 Gary: That's a requirement we're trying to avoid
15:40:28 ... With segmented WebVTT it'll parse just the current segment rather than any previous segments
15:41:10 Rob: You could solve that with unbounded cues: if you're assembling segments retrospectively and there's an active unbounded cue, it's reasonable to assume it's still active in the next segment
15:41:30 Gary: Yes, and we concluded that you'd have to copy cues from segment to segment
15:41:58 ...
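[Editor's illustration, not discussed in the meeting: the measurement model Rob describes, last-known-value live and interpolation on replay, can be sketched in Python. The `value_at` helper and the sample readings are hypothetical.]

```python
from bisect import bisect_right

def value_at(samples, t):
    """Return the reading at time t from a sorted list of (time, value) pairs.

    Live playback can only hold the last known value; on replay, once the
    next sample is available, we can interpolate between the previous and
    next values, as described in the discussion.
    """
    times = [s[0] for s in samples]
    i = bisect_right(times, t)
    if i == 0:
        return None                     # before the first measurement
    t0, v0 = samples[i - 1]
    if i == len(samples):
        return v0                       # live case: last known value
    t1, v1 = samples[i]                 # replay case: interpolate
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

# Hypothetical temperature readings taken at unpredictable intervals.
readings = [(0.0, 20.0), (60.0, 22.0)]
```

For example, `value_at(readings, 30.0)` interpolates to 21.0 on replay, while `value_at(readings, 90.0)` falls back to the last known value, 22.0.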
In the discussion, having that extra signal of unboundedness wasn't adding much, as you have to copy the cues between segments anyway
15:43:09 Rob: You'll either know the cue ends within the segment, or it ends at the same time as the segment
15:43:21 s/Rob/Gary/
15:43:45 Rob: So it seems unbounded cues can be handled using bounded cues in segmented WebVTT
15:44:42 Chris: Does the client coalesce the cues into a contiguous long cue?
15:45:11 Gary: It doesn't. From a previous FOMS discussion, we talked about writing a Note to describe avoiding flicker in rendering
15:45:19 ... That may be something we want to do as part of this work
15:45:33 Rob: That could be on a per-use-case basis
15:46:00 Gary: It's not specific to the format, it's about player implementations, so it belongs in a Note instead of in the spec
15:46:31 Rob: For timed metadata you wouldn't want it to repeat
15:47:00 Nigel: Good point. The ability to say that a cue is the same as a previous one is orthogonal, but could be worth looking at
15:47:28 ... You need a contract between the producer and consumer of the files
15:47:54 ... All of the specs at the moment only define well-formedness in a single file, not across multiple files
15:49:14 Chris: Rob, what was your understanding of live distribution?
15:49:54 Rob: For WebVMT, if you have recordings from a sensor on a resource-limited device, send readings as they're taken. Unbounded cues help, because you don't know when the next reading will be
15:50:25 ... So being able to supersede a value with a new value, recorded such that you can interpolate in playback
15:53:19 ... If you record an unbounded cue at time A with a value, you can supersede it with another cue at time B, using an identity to link those two things together
15:53:57 Chris: Is there an example we can look at?
15:54:12 Rob: It's an open item to add that. It's been discussed but not added to the document
15:54:54 q+ to ask about response to David and webvmt/webvtt alignment?
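[Editor's illustration, not discussed in the meeting: the per-segment model Gary describes, where an unbounded cue is copied into each segment it spans and bounded at the segment edge until its real end time is known, can be sketched in Python. The helper name and parameters are hypothetical; `latest_time` stands in for the end of the most recently produced segment.]

```python
def bound_cue_to_segments(cue_start, cue_end, segment_duration, latest_time):
    """Represent one cue as a list of bounded (start, end) cues, one per segment.

    cue_end=None models an unbounded cue: it is repeated in every segment
    up to latest_time, bounded at each segment boundary. Once the real end
    time is known, pass it as cue_end and the final copy is shortened.
    """
    effective_end = cue_end if cue_end is not None else latest_time
    cues = []
    seg_start = int(cue_start // segment_duration) * segment_duration
    while seg_start < effective_end:
        seg_end = seg_start + segment_duration
        cues.append((max(cue_start, seg_start), min(effective_end, seg_end)))
        seg_start = seg_end
    return cues
```

With 6-second segments, an unbounded cue starting at t=2 in a stream produced up to t=20 becomes `[(2.0, 6.0), (6.0, 12.0), (12.0, 18.0), (18.0, 20.0)]`; once the end time 5.0 is known, the same cue collapses to `[(2.0, 5.0)]`.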
15:55:17 Chris: Also with WebVMT, use WebSockets for live delivery?
15:55:33 Rob: Yes
15:57:56 Chris: Let's follow up on that on another call
15:58:31 Gary: So we can confirm to David that no syntax changes are to be made, and for the unbounded case you copy cues between segments, and they're bounded by segments
15:59:01 WebVMT live interpolation examples: https://github.com/webvmt/community-group/issues/2#issuecomment-708529659
15:59:54 Chris: So we have:
16:00:14 ... 1. A proposed model for delivering unbounded cues in segmented VTT
16:00:38 ack gkatsev
16:00:38 gkatsev, you wanted to ask about response to David and webvmt/webvtt alignment?
16:00:47 ... 2. A client processing model to describe how cues are coalesced (write as a Note)
16:01:07 ... 3. How to identify cues across segment boundaries?
16:01:23 ... 4. (possibly) Live delivery over WebSockets or other non-segmented media delivery
16:02:35 Chris: Would MPEG also need to have a solution for identifiers across segments?
16:04:42 Gary: I don't think so, at this stage
16:04:50 Chris: What's next for this group?
16:05:15 Gary: Consider whether to adopt some WebVMT syntax changes into WebVTT. I'm unsure, but it's an interesting topic
16:06:19 Rob: I'll need to look into live streams
16:06:32 Gary: Is the idea that you merge documents client-side from multiple streams?
16:07:11 Rob: Yes, there's a video stream and a VMT stream. The way it currently works, a drone embeds metadata into the MPEG file, so there's a post-processing step to export that into WebVMT
16:07:29 ... That's the main case I've been looking at so far. The live case would also be interesting
16:08:04 Gary: Longer term, it's useful to think about live captioning and the potential for updating cues, e.g. a stenographer who wants to correct text already sent
16:08:46 Rob: Or voice recognition.
You can mis-hear things and then go back and correct them, with additional later context
16:09:08 Gary: 608 captions have some ability to do that
16:09:33 Topic: Next meeting
16:10:46 Chris: TPAC is coming up. Could we meet on the 11th?
16:11:11 Kaz: I can't make the 11th, but you can go ahead
16:13:11 Chris: I'll send an invite
16:13:11 [adjourned]
16:13:23 rrsagent, draft minutes
16:13:23 I have made the request to generate https://www.w3.org/2021/09/27-me-minutes.html cpn
16:13:28 rrsagent, make log public
16:14:21 present+ Gary_Katsevman, Chris_Needham, Chris_Lorenzo
16:14:25 rrsagent, draft minutes
16:14:25 I have made the request to generate https://www.w3.org/2021/09/27-me-minutes.html cpn
16:14:50 present+ Rob_Smith
16:14:52 rrsagent, draft minutes
16:14:52 I have made the request to generate https://www.w3.org/2021/09/27-me-minutes.html cpn
16:17:50 rrsagent, bye
16:17:50 I see no action items