ISSUE-403: Feedback on the mapping from Tim Lebo
- State:
- Product: Mapping PROV-O to Dublin Core
- Raised by: Daniel Garijo
- Opened on: 2012-06-09
- Description:
Regarding:
"To be more precise, we define provenance metadata as metadata providing provenance information according to the definition of the W3C Provenance Incubator Group"
Why are you still using the XG's definition? Does PROV-WG still not provide one that you like? Should PROV-WG be explicit about their definition of provenance (since its materials will become Recommendation and XG's will not)?
"For the complex mappings, we take the following approach: "
is confusing. Is one of the "three parts" enumerated above "complex"? Ah, yes: the third.
I suggest drawing that connection more clearly.
The points in the second half of the paragraph:
". A rationale for these two steps is that the mappings in stage 1 are context free and do not depend on the existence of any other statements. On the other hand, by employing the patterns developed for stage 2, any kind of generated PROV data could be cleaned up at a later point, for instance after the integration with provenance information from a different source, which could be advantageous. "
really should be promoted to the first half of the paragraph. It takes too long to determine what the distinction is between the two phases.
The use of blank nodes is disturbing. Please make it clear that the bnodes exist only during the processing that you suggest, and that bnodes are not produced in the resulting PROV or DC records.
Direct mappings:
-1 dct:references rdfs:subPropertyOf prov:wasDerivedFrom .
+1 dct:creator rdfs:subPropertyOf prov:wasAttributedTo .
+1 dct:rightsHolder rdfs:subPropertyOf prov:wasAttributedTo .
-1 (casting a broad to a specific) dct:date rdfs:subPropertyOf prov:generatedAtTime .
+1 dct:Agent owl:equivalentClass prov:Agent .
-1 (reverse these) prov:hadOriginalSource rdfs:subPropertyOf dct:source .
+1 prov:wasRevisionOf rdfs:subPropertyOf dct:isVersionOf .
Voting for all of them (in
+1 dct:Agent owl:equivalentClass prov:Agent.
-1 dct:references rdfs:subPropertyOf prov:wasDerivedFrom .
+1 dct:rightsHolder rdfs:subPropertyOf prov:wasAttributedTo .
+1 dct:creator rdfs:subPropertyOf prov:wasAttributedTo .
+1 dct:publisher rdfs:subPropertyOf prov:wasAttributedTo .
+1 dct:contributor rdfs:subPropertyOf prov:wasAttributedTo .
+1 dct:isVersionOf rdfs:subPropertyOf prov:wasDerivedFrom .
+1 dct:isFormatOf rdfs:subPropertyOf prov:alternateOf .
+1 dct:replaces rdfs:subPropertyOf prov:tracedTo .
+1 dct:source rdfs:subPropertyOf prov:wasDerivedFrom .
-1 dct:date rdfs:subPropertyOf prov:generatedAtTime .
I would support reversing the above. As it is, you are casting a general "any date you wish" into a very specific meaning.
At first glance, the following are concerning. If the same instance has all of these properties, then it was generated at many distinct times. Perhaps your complex mappings tease this out.
-1 dct:issued rdfs:subPropertyOf prov:generatedAtTime .
-1 dct:dateAccepted rdfs:subPropertyOf prov:generatedAtTime .
-1 dct:dateCopyrighted rdfs:subPropertyOf prov:generatedAtTime .
-1 dct:dateSubmitted rdfs:subPropertyOf prov:generatedAtTime .
-1 dct:modified rdfs:subPropertyOf prov:generatedAtTime .
The following casts a range into an instant of time.
-1 dct:valid rdfs:subPropertyOf prov:generatedAtTime .
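The concern about the date properties above can be made concrete with a small sketch. It is not part of the mapping: it uses plain Python tuples instead of an RDF library, a hypothetical document `ex:doc` with made-up dates, and a single application of the RDFS subproperty rule (rdfs7) rather than a full closure. Under those assumptions, one entity ends up with two distinct generation times.

```python
# Proposed (and contested) direct mappings, reduced to two date properties.
SUBPROPERTY_OF = {
    "dct:issued": "prov:generatedAtTime",
    "dct:modified": "prov:generatedAtTime",
}


def entail_subproperties(triples, subproperty_of):
    """Apply RDFS rule rdfs7 once: (s p o) and (p subPropertyOf q) entail (s q o)."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in subproperty_of:
            inferred.add((s, subproperty_of[p], o))
    return inferred


# Hypothetical record: the same document has an issue date and a modification date.
doc_triples = {
    ("ex:doc", "dct:issued", "2012-01-10"),
    ("ex:doc", "dct:modified", "2012-06-09"),
}

closure = entail_subproperties(doc_triples, SUBPROPERTY_OF)
generation_times = sorted(
    o for s, p, o in closure
    if s == "ex:doc" and p == "prov:generatedAtTime"
)
print(generation_times)  # one entity, two different generation times
```

Reversing the direction of the subproperty axioms, as suggested for dct:date, would avoid this inference, because a dct date would then no longer entail a generation time.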
-1 prov:hadOriginalSource rdfs:subPropertyOf dct:source .
I would support reversing the above. PROV is pointing to a subset of the sources that dct:source intends to cite. dct:source is the union of hadOriginalSource and any of its derivations (and more, perhaps).
+1 prov:wasRevisionOf rdfs:subPropertyOf dct:isVersionOf .
For readability, I'd reverse the order of these:
dcprov:CreationActivity rdfs:subClassOf
    prov:Activity, dcprov:ContributionActivity .
dcprov:ContributionActivity rdfs:subClassOf
    prov:Activity .
For readability, I'd reverse the order of these:
dcprov:CreatorRole rdfs:subClassOf
    prov:Role, dcprov:ContributorRole .
dcprov:ContributorRole rdfs:subClassOf
    prov:Role .
If we reapply the SPARQL queries from the complex mappings twice, do we get two un-identified blank nodes that should be identified?
If so, this leads to a proliferation of bnodes that should be avoided. If the queries are only meant to be informative, and those bnodes are to be appropriately named to avoid duplication, then I suggest this be clearly stated.
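To illustrate the question, here is a minimal sketch under stated assumptions. It simulates a much simplified dct:creator rule over plain Python tuples rather than running the actual SPARQL queries. `fresh_bnode` mimics the CONSTRUCT behaviour of minting a new blank node for every solution, and `stable_iri` is a hypothetical naming scheme (a hash of the matched bindings) that is not proposed anywhere in the mapping.

```python
import hashlib
import itertools

_bnode_counter = itertools.count()


def fresh_bnode():
    """Mimic SPARQL CONSTRUCT: every solution gets a brand-new blank node label."""
    return f"_:b{next(_bnode_counter)}"


def stable_iri(doc, agent, kind):
    """Hypothetical scheme: derive the node name from the matched bindings."""
    digest = hashlib.sha1(f"{doc}|{agent}|{kind}".encode("utf-8")).hexdigest()[:12]
    return f"ex:{kind}-{digest}"


def apply_creator_rule(graph, mint):
    """For every (?doc dct:creator ?ag), add an activity associated with ?ag."""
    out = set(graph)
    for doc, pred, agent in graph:
        if pred == "dct:creator":
            act = mint(doc, agent)
            out.add((act, "rdf:type", "dcprov:CreationActivity"))
            out.add((act, "prov:wasAssociatedWith", agent))
    return out


def count_activities(graph):
    """Count nodes typed as dcprov:CreationActivity."""
    return sum(
        1 for _, p, o in graph
        if p == "rdf:type" and o == "dcprov:CreationActivity"
    )


source = {("ex:doc", "dct:creator", "ex:alice")}


def mint_bnode(doc, agent):
    return fresh_bnode()


def mint_iri(doc, agent):
    return stable_iri(doc, agent, "creation")


# Apply the same rule twice, as in the question above.
with_bnodes = apply_creator_rule(apply_creator_rule(source, mint_bnode), mint_bnode)
with_iris = apply_creator_rule(apply_creator_rule(source, mint_iri), mint_iri)

print(count_activities(with_bnodes), count_activities(with_iris))  # 2 1
```

In this simulation the blank-node variant yields two activity nodes for the same creation, while the named variant collapses to one because both runs produce identical triples.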
In section "List of dc terms excluded from the mapping",
I suggest to organize by descriptive vs. provenance metadata. That way I can review your categorization more easily, AND focus on only the provenance metadata (which is the point of the mapping).
No bibliography for (DCMI Usage Board, 2010b) or (DCMI Usage Board, 2010a)
You don't reference the URL ?
It seems like you could include the content of and directly in the "primer" - the redundancy is dissonant.
Why three complex mappings in the primer? Why now fewer?
The organization across 4 pages makes it difficult to determine "what is where". I think the content as it is could stand on its own as one document.
Where is stage 2 of the complex mappings?
Are there implementations of your complex mapping?
The following order makes more sense to me:
dcprov:PublicationActivity rdfs:subClassOf prov:Activity .
dcprov:ContributionActivity rdfs:subClassOf prov:Activity .
dcprov:CreationActivity rdfs:subClassOf prov:Activity, dcprov:ContributionActivity .
dcprov:ContributorRole rdfs:subClassOf prov:Role .
dcprov:PublisherRole rdfs:subClassOf prov:Role .
dcprov:CreatorRole rdfs:subClassOf prov:Role, dcprov:ContributorRole .
Are the following used in the complex rules? It would be very nice to show which rules each specialization is used in. Similarly, it would be nice to group rules by their use of PROV terms, and by "in the where" versus "in the construct". A navigation like this would really bring the material together nicely.
dcprov:PublicationActivity rdfs:subClassOf prov:Activity .
dcprov:ContributionActivity rdfs:subClassOf prov:Activity .
dcprov:CreationActivity rdfs:subClassOf prov:Activity, dcprov:ContributionActivity .
dcprov:ContributorRole rdfs:subClassOf prov:Role .
dcprov:PublisherRole rdfs:subClassOf prov:Role .
dcprov:CreatorRole rdfs:subClassOf prov:Role, dcprov:ContributorRole .
Is the following a copy-paste error (publisher is never mentioned)?
Section: dct:publisher
CONSTRUCT {
  ?doc a prov:Entity ;
    prov:wasAttributedTo ?ag .
  _:out a prov:Entity ;
    prov:specializationOf ?doc .
  ?ag a prov:Agent .
  _:act a prov:Activity, dcprov:PublicationActivity ;
    prov:wasAssociatedWith ?ag ;
    prov:qualifiedAssociation _:assoc .
  _:assoc a prov:Association ;
    prov:agent ?ag ;
    prov:hadRole dcprov:PublisherRole .
  _:out prov:wasGeneratedBy _:act ;
    prov:wasAttributedTo ?ag .
} WHERE {
  ?doc dct:creator ?ag .
}
spacing is off in:
The rightsHolder is different, here we propose to omit the activity and just add the rights holder to the entity by means of
prov:wasAttributedTo. This mapping could actually be omitted as the statements can be inferred from the direct mapping.
CONSTRUCT {
  ?doc a prov:Entity .
  ?ag a prov:Agent .
  ?doc prov:wasAttributedTo ?ag .
} WHERE {
  ?doc dct:rightsHolder?ag .
}
I recommend expanding variable names to be more readable (e.g., ?ag to ?agent).
Is there a reason why you use "_:iss_entity" instead of just the "[]" syntax? Smearing a node across the CONSTRUCT makes it more difficult to read. You used the "[]" syntax in:
[ a prov:Generation ;
  prov:atTime ?date ;
  prov:activity _:act . ] .
- Related Action Items:
- No related actions
- Related emails:
- Re: PROV-ISSUE-403 (Feedback_TL): Feedback on the mapping from Tim Lebo [Mapping PROV-O to Dublin Core] (2012-07-05)
- Re: PROV-ISSUE-403 (Feedback_TL): Feedback on the mapping from Tim Lebo [Mapping PROV-O to Dublin Core] (2012-07-04)
- Re: PROV-ISSUE-403 (Feedback_TL): Feedback on the mapping from Tim Lebo [Mapping PROV-O to Dublin Core] (2012-07-04)
- PROV-ISSUE-403 (Feedback_TL): Feedback on the mapping from Tim Lebo [Mapping PROV-O to Dublin Core] (2012-06-09)
Related notes:
No additional notes.