Copyright © 2009 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document defines the Ontology for Media Resource 1.0, a core vocabulary to describe media resources on the Web. It is defined based on a common set of properties which covers basic metadata to describe media resources. Further it defines semantics-preserving mappings between elements from existing formats. The ontology is supposed to foster the interoperability among various kinds of metadata formats currently used to describe media resources on the Web.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is the First Public Working Draft of the Ontology for Media Resource 1.0 specification. It has been produced by the Media Annotations Working Group, which is part of the W3C Video on the Web Activity. The Working Group expects to advance this specification to Recommendation Status.
Please send comments about this document to the public-media-annotation@w3.org mailing list (public archive).
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
1 Introduction
1.1 Purpose of this specification
1.2 Formats in scope
1.3 Formats out of scope
2 Terminology
3 Property value type definitions
3.1 Basic property value types
3.1.1 URI
3.1.2 Date
3.2 Complex property value types
4 Property definition
4.1 Core property definitions
4.1.1 Description of the approach followed for the property definitions
4.1.2 Core properties
4.2 Property mapping table
4.2.1 Rationale regarding the mapping table
4.2.1.1 Semantic Level Mappings
4.2.1.2 Syntactic Level Mappings
4.2.1.3 Mapping expression
4.2.2 The mapping table
A References
B References (Non-Normative)
C Acknowledgements (Non-Normative)
This section is informative.
This document introduces the Ontology for Media Resource 1.0, a core vocabulary for describing media resources, targeted mostly at media resources on the Web. It is based on a common set of properties that covers the basic metadata needed to describe media resources. For example, creator is a common property supported in several existing metadata formats and is therefore part of the core vocabulary defined by the Ontology. Further, the Ontology defines mappings between elements from existing formats and our list of properties. Ideally, the mappings should be semantics-preserving, but this is not achieved in the first version of the Ontology (Media Ontology 1.0): because the properties in the mapped vocabularies differ in nature, their extensions do not overlap exactly. For example, the property dc:creator from Dublin Core and the property exif:Artist defined in EXIF are both mapped to the property creator of our Ontology, but the extension of the property in the EXIF vocabulary (the set of values that the property can refer to) is more specific than that of the Dublin Core property. Mapping back and forth with our ontology as the reference will hence induce a certain loss of semantics. This is inevitable if we want to achieve a certain amount of interoperability.
The Ontology, with its property definitions and mappings, provides the basic information needed by targeted applications (see Use Cases and Requirements for Ontology and API for Media Object 1.0) to support interoperability among the various kinds of metadata formats related to media resources, and particularly media resources on the Web. In addition, the ontology will be accompanied by an API that provides uniform access to all elements defined by the ontology.
The initial version of this document contains only a limited set of properties with their corresponding mappings to the vocabularies listed in section 1.2 Formats in scope. Nevertheless it is being published with the aspiration to gather wide feedback on the general direction of the Working Group. In particular we would like to encourage feedback on section 4 Property definition.
This specification defines an ontology for cross-community data integration of information related to media resources, with a particular focus on media resources on the Web. The purpose of the ontology is to help circumvent the current proliferation of video metadata formats by providing full or partial translation and mapping towards existing formats.
The following table lists the formats that were selected by the working group as in-scope, along with the identifiers which are used as prefixes to identify them in this specification.
Note:
This specification is based on a review of existing formats and the properties they provide. This review does not aim to be complete, and this specification does not aim to cover all properties defined in these formats. The choice of properties is motivated by their wide usage.
Identifier | Format | Example | Reference |
cl11 | CableLabs 1.1 | cl11:Writer_Display | Cablelabs 1.1 |
cl20 | CableLabs 2.0 | cl20:Producer | Cablelabs 2.0 |
dig35 | DIG35 | dig35:ipr_name/ipr_person@description='Image Creator' | DIG35 |
dc | Dublin Core | dc:creator | Dublin Core |
ebucore | EBUCore | ebuc:creator | EBUCore |
pmeta | EBU P-Meta | pmeta:Contribution | EBU P-META |
exif | EXIF 2.2 | exif:Artist | EXIF |
frbr | FRBR | frbr:Person | FRBR |
id3 | ID3 | id3:TCOM | ID3 |
iptc | IPTC | iptc:Creator | IPTC |
it | iTunes | it:©ART | iTunes |
lom21 | LOM 2.1 | lom21:LifeCycle/Contribute/Entity | LOM |
ma | Core properties of MA WG | ma:creator | 4 Property definition |
media | Media RDF | media:Recording | Media RDF |
mrss | Media RSS | mrss:credit@role='author' | Media RSS |
mets | METS | mets:agency | METS |
mpeg7 | MPEG-7 | mpeg7:CreationInformation/Creation/Creator/Agent | MPEG-7 |
nmix | NISO MIX | nmix:ImageCreation/ImageProducer | MIX |
qt | Quicktime | qt:©dir | QuickTime |
media | SearchMonkey Media | media:type | MediaMonkey |
dms | DMS-1 | dms:Participant/Person | DMS-1 |
tva | TV-Anytime | tva:CreditsList/CreditsItem | TV-Anytime |
txf | TXFeed | txf:author | TXFeed |
vra40 | VRA Core 4.0 | vra40:agent | VRA |
xmp | XMP | xmpDM:composer | XMP |
yt | YouTube Data API Protocol | yt:author | YouTube Data API Protocol |
This section is normative.
Media Resource: Any Resource (as defined by [RFC 3986]) related to media content. Note that [RFC 3986] points out that a resource may be retrievable or not. Hence, this term encompasses the abstract notion of a movie (e.g. Notting Hill) as well as the binary encoding of this movie (e.g. the MPEG-4 encoding of Notting Hill on my DVD), or any intermediate level of abstraction (e.g. the director's cut or the plane version of Notting Hill). Although some ontologies (FRBR, BBC) define concepts for such different levels of abstraction, our ontology does not commit to any classification of media resources.
A property is an element from an existing metadata format for describing media resources, or an element from the core vocabulary defined in this Working Group. For example, the Dublin Core creator element and the Media Ontology creator element are properties. A property links a Media Resource with a value: dc:creator links a given representation with the value of its creator (Dublin Core specifies: "Examples of a Creator include a person, an organization, or a service."). This value can be specified as a simple string or as the URI representing the creator. The set of properties selected to be part of the Media Ontology Core vocabulary is listed in section 4 Property definition.
The notion of Mapping refers to the description of relations between elements of metadata schemas; in our case the mapping concerns the Vocabularies "in scope", and the properties of the core vocabulary of the Media Ontology. These Mappings are presented in section 4.2 Property mapping table.
Property value types are the types of values used in a property. Property value types are defined in section 3 Property value type definitions. They rely mostly on XML Schema data types [XML Schema 2].
URI ("Uniform Resource Identifier") is defined in [RFC 3986]. This specification uses the term URI because it is well known; however, the term is used here to mean IRI ("Internationalized Resource Identifier") [RFC 3987], that is, a URI that may contain non-escaped characters other than ASCII. The data type is anyURI.
A Date value is represented using the XML Schema dateTime data type.
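As a rough illustration of the two basic value types, the following Python sketch checks whether a string looks like an absolute URI/IRI and parses a Date value. It is an approximation under stated assumptions: the function names are hypothetical, the IRI check is a shallow syntactic test rather than full [RFC 3987] validation, and `datetime.fromisoformat` accepts only a subset of the XML Schema dateTime lexical forms.

```python
from datetime import datetime
from urllib.parse import urlsplit


def looks_like_absolute_iri(text: str) -> bool:
    """Shallow check (an assumption, not RFC 3987 validation):
    the value has a scheme and either an authority or a path."""
    parts = urlsplit(text)
    return bool(parts.scheme) and bool(parts.netloc or parts.path)


def parse_date_value(text: str) -> datetime:
    """Parse a common subset of xsd:dateTime, e.g. 2009-06-18T10:30:00Z.

    The trailing 'Z' is rewritten because older Python versions do not
    accept it in datetime.fromisoformat.
    """
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


# Non-ASCII characters are allowed because the value is treated as an IRI.
print(looks_like_absolute_iri("http://example.org/vidéo.ogv"))
print(parse_date_value("2009-06-18T10:30:00Z").isoformat())
```

A production implementation would more likely rely on a dedicated XML Schema datatype library than on these helpers.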
This section is normative.
The following information is available for each property:
rough description of purpose
mappings to existing formats
Editorial note
This core list of properties is neither closed nor intended to be exhaustive; the group is still considering the rationale for including properties in the final list. In addition, this table will be elaborated with types and further information.
Name | Type | Description |
ma:contributor | tbd | A pair identifying the contributor and the nature of the contribution. E.g. actor, cameraman, director, singer, author, artist (Note: subject see addition of contributor type) |
ma:creator | tbd | The authors of the resource (listed in order of precedence, if significant) |
ma:description | tbd | A textual description of the content of the resource |
ma:format | tbd | MIME type of the resource (wrapper, bucket media types) |
ma:identifier | tbd | A URI that identifies a resource, which can be either a "Resource" (abstract concept) or a "Representation" (instance/file). See 4.4 Annotating Media Fragments |
ma:language | tbd | Specifies a language used in the resource. Recommended best practice is to use a controlled vocabulary such as [RFC 4646] |
ma:publisher | tbd | Examples of a Publisher include a person, an organization, or a service. Typically, the name of a Publisher should be used to indicate the resource |
ma:relation | tbd | A pair identifying the resource and the nature of the relationship. E.g. transcript, original_work |
ma:keyword | tbd | An unordered array of descriptive phrases or keywords that specify the topic of the content of the resource |
ma:title | tbd | The title of the document, or the name given to the resource |
ma:genre | tbd | Genre of the resource |
ma:createDate | tbd | The date and time the resource was originally created. (for commercial purpose there might be an annotation of publication date) |
ma:rating | tbd | A pair identifying the rating person or organization and the rating (real value) |
ma:collection | tbd | The name of the collection from which the resource originates |
ma:duration | tbd | The actual duration of the resource |
ma:copyright | tbd | The copyright statement. Identification of the copyright holder (DRM is out of scope for MAWG) |
ma:location | tbd | A location associated with the resource. Can be the depicted location or shot location |
ma:compression | tbd | Compression type used, e.g. H264. Note: possible to use extended mime type, see [RFC 4281] |
ma:frameSize | tbd | The frame size. For example: w:720, h: 480 |
ma:targetAudience | tbd | A pair identifying the issuer of the classification (agency) and the classification. E.g. parental guide, targeted geographical region |
ma:locator | tbd | A URI at which the resource can be accessed (e.g. a URL, or a DVB URI) |
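To make the table more concrete, the following Python sketch shows how a description using a handful of the core properties might be held in memory. Since the property types are still marked "tbd", every type below is an assumption, and the class name, field names and example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class MediaDescription:
    """Hypothetical container for a few ma: core properties.

    All types are assumptions; the specification lists them as "tbd".
    """
    identifier: str                                     # ma:identifier, a URI
    title: Optional[str] = None                         # ma:title
    creator: List[str] = field(default_factory=list)    # ma:creator, in order of precedence
    contributor: List[Tuple[str, str]] = field(default_factory=list)  # ma:contributor as (who, role)
    keyword: List[str] = field(default_factory=list)    # ma:keyword
    format: Optional[str] = None                        # ma:format, a MIME type
    frame_size: Optional[Tuple[int, int]] = None        # ma:frameSize as (width, height)
    locator: Optional[str] = None                       # ma:locator, a URI


# Placeholder values only; they do not describe a real resource.
example = MediaDescription(
    identifier="http://example.org/media/42",
    title="Example clip",
    creator=["A. Author"],
    contributor=[("B. Actor", "actor")],
    format="video/mp4",
    frame_size=(720, 480),
    locator="http://example.org/media/42.mp4",
)
```

Pairs such as ma:contributor are modelled here as tuples only for brevity; a complex property value type could equally be a small class of its own.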
As a first step in building the Media Ontology, a set of properties commonly supported by the aforementioned vocabularies has been listed. This list, henceforth referred to as the "Core Media Properties list", is the basis for matching vocabularies. Its namespace is "ma", for Media Annotation. We provide a first set of proposed mappings between the vocabularies taken into account and this list. These mappings have a dual nature: semantic and syntactic.
The mappings are "one way" so far, i.e. each mapping expresses a relationship between one or more items from the vocabulary considered and one or more properties from our list. For example, in XMP, both xmpDM:copyright and dc:rights (as part of the XMP standard) are mapped to ma:copyright; in EXIF, the Copyright property is mapped to ma:copyright. No semantic relationships can be inferred between the properties in XMP and in EXIF from these mappings. This "Core Media Properties list" can be considered as the minimal requirement for describing media content. The mappings that have been taken into account have different semantics; the properties of the different vocabularies can be:
Exact matches: the semantics of the two properties are equivalent in most of the possible contexts. For example, ma:title matches exactly vra:title.
More specific: the property of the vocabulary taken into account has semantics that cover only a subset of the possibilities expressed by the property defined in this Working Group. For example, in DIG35, ipr_names@description and ipr_person@description are more specific than the property ma:publisher to which they are mapped.
More generic: the inverse of the above; the property of the vocabulary taken into account has semantics that are broader than those of the property defined in this Working Group. For example, the DIG35 location is more general than ma:location.
Related: the two properties are related in a way that is relevant for some use cases, but this relation has no defined semantics. For example, in Media RSS, media:credit is related to ma:creator.
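The four kinds of semantic mapping listed above can be sketched as a small data structure. The following Python example is illustrative only: the class and function names are hypothetical, and the prefixed source names are written as they appear in the examples above (the dig35: prefix on the DIG35 elements is an assumption).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class MatchType(Enum):
    EXACT = "exact"
    MORE_SPECIFIC = "more specific"
    MORE_GENERIC = "more generic"
    RELATED = "related"


@dataclass(frozen=True)
class Mapping:
    source: str       # property of the mapped vocabulary
    target: str       # core property in the ma namespace
    match: MatchType  # how the source relates to the target


# One-way mappings taken from the examples in the text.
MAPPINGS = [
    Mapping("vra:title", "ma:title", MatchType.EXACT),
    Mapping("dig35:ipr_names@description", "ma:publisher", MatchType.MORE_SPECIFIC),
    Mapping("dig35:ipr_person@description", "ma:publisher", MatchType.MORE_SPECIFIC),
    Mapping("dig35:location", "ma:location", MatchType.MORE_GENERIC),
    Mapping("media:credit", "ma:creator", MatchType.RELATED),
]


def mappings_to(target: str) -> List[Mapping]:
    """Return every mapping whose target is the given core property."""
    return [m for m in MAPPINGS if m.target == target]


for m in mappings_to("ma:publisher"):
    print(m.source, "is", m.match.value, "than", m.target)
```

Because the mappings are one way, the structure deliberately offers no lookup from one source vocabulary to another: two sources sharing a target says nothing about how they relate to each other.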
Syntactic level mappings declare the correspondence between two properties that are semantically equivalent but have different syntactic expressions. The most evident case is date formatting, but others may appear.
Editorial note
Currently the mapping table does not contain information about syntactic mappings for most of the formats; this information will be added in a following version.
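As an illustration of a syntactic level mapping for dates, the Python sketch below converts a date written in the EXIF style (YYYY:MM:DD HH:MM:SS) into the XML Schema dateTime lexical form. The function name is hypothetical, and using the result for a date-valued core property such as ma:createDate is an assumption, since the mapping table does not yet specify syntactic mappings.

```python
from datetime import datetime

# EXIF writes date and time with colons in the date part and a space separator.
EXIF_DATETIME_FORMAT = "%Y:%m:%d %H:%M:%S"


def exif_datetime_to_xsd(value: str) -> str:
    """Convert an EXIF-style date string to an xsd:dateTime string.

    The EXIF value carries no time zone, so the result has none either.
    """
    parsed = datetime.strptime(value, EXIF_DATETIME_FORMAT)
    return parsed.isoformat()


print(exif_datetime_to_xsd("2009:06:18 10:30:00"))  # prints 2009-06-18T10:30:00
```

The semantics of the value are unchanged by the conversion; only its lexical form differs, which is what distinguishes a syntactic mapping from a semantic one.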
The mapping expression corresponds to the concrete implementation or representation of the mappings defined in the previous paragraph, both at a semantic level and at syntactic one.
In the context of the W3C Semantic Web Activity, SKOS (Simple Knowledge Organization System) is currently a Candidate Recommendation that defines a vocabulary for representing Knowledge Organization Systems (i.e. vocabularies) and the relationships amongst them. In SKOS, the mapping properties that we take into account in the mapping table are expressed as skos:exactMatch, skos:narrowMatch, skos:broadMatch and skos:relatedMatch. A more fine-grained definition of the properties is still needed: we need to agree on the properties' names, define their formal characteristics (whether they are symmetric, etc.) and the type of value expected, in order to enable more efficient concrete mappings in the API.
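One possible way to serialise such mappings is as Turtle statements that use the SKOS mapping properties. The Python sketch below is an assumption-laden illustration, not the Working Group's chosen representation: the ma namespace IRI and the source IRI are placeholders, the function names are hypothetical, and the core property is placed in subject position so that skos:narrowMatch points at a more specific source and skos:broadMatch (the property name defined in the SKOS vocabulary) at a more generic one.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
MA_NS = "http://example.org/ma#"  # hypothetical; no namespace IRI is given in the text

# The kind of semantic mapping, keyed to the SKOS property used to express it.
SKOS_PROPERTY = {
    "exact": "skos:exactMatch",
    "more specific": "skos:narrowMatch",
    "more generic": "skos:broadMatch",
    "related": "skos:relatedMatch",
}


def to_turtle(ma_property: str, relation: str, source_iri: str) -> str:
    """Build one Turtle statement with the core property as subject."""
    return f"ma:{ma_property} {SKOS_PROPERTY[relation]} <{source_iri}> ."


def turtle_document(statements) -> str:
    """Prefix declarations followed by one statement per mapping."""
    header = [f"@prefix skos: <{SKOS_NS}> .", f"@prefix ma: <{MA_NS}> .", ""]
    return "\n".join(header + [to_turtle(*s) for s in statements])


# Placeholder source IRI standing in for the DIG35 location element.
print(turtle_document([
    ("location", "more generic", "http://example.org/dig35#location"),
]))
```

Note that the SKOS mapping properties are defined for concepts, so applying them to properties of metadata formats is itself a modelling choice that would need to be agreed upon.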
Editorial note
The mapping table is to be placed here in some form.
The following mappings are established between various multimedia metadata formats, with the core vocabulary of the MA WG as pivot. This list of formats is neither closed nor intended to be exhaustive; the group is still considering the rationale for including and excluding formats in the final mapping table.
This document is the work of the W3C Media Annotations Working Group.
Members of the Working Group are (at the time of writing, and by alphabetical order): Werner Bailer (K-Space), Tobias Bürger (University of Innsbruck), Eric Carlson (Apple, Inc.), Pierre-Antoine Champin ((public) Invited expert), Jaime Delgado (Universitat Politècnica de Catalunya), Jean-Pierre EVAIN ((public) Invited expert), Ralf Klamma ((public) Invited expert), WonSuk Lee (Electronics and Telecommunications Research Institute (ETRI)), Véronique Malaisé (Vrije Universiteit), Erik Mannens (IBBT), Hui Miao (Samsung Electronics Co., Ltd.), Thierry Michel (W3C/ERCIM), Frank Nack (University of Amsterdam), Soohong Daniel Park (Samsung Electronics Co., Ltd.), Silvia Pfeiffer (W3C Invited Experts), Chris Poppe (IBBT), Víctor Rodríguez (Universitat Politècnica de Catalunya), Felix Sasaki (Potsdam University of Applied Sciences), David Singer (Apple, Inc.), Joakim Söderberg (ERICSSON), Thai Wey Then (Apple, Inc.), Ruben Tous (Universitat Politècnica de Catalunya), Raphaël Troncy (CWI), Vassilis Tzouvaras (K-Space), Davy Van Deursen (IBBT).
The people who have contributed to discussions on public-media-annotation@w3.org are also gratefully acknowledged.