See also: IRC log
phil: did a review of use cases this morning. not too much change, missed one that christoph added.
https://www.w3.org/community/rax/wiki/Draft_Material#Data_acquisition_from_job_postings_via_GATE
phil: thanks a lot for adding this, christoph - can you give a brief description?
christoph: sure. have not yet managed to share the descriptions. I have more material, and will get it done to share this
... will also add more concrete examples. Application setting is: we collect job postings in the form of plain text from the web
... we do named entity recognition with GATE, and we get XML output
... beginning and end of each token are annotated
<clange> text text text <start/>recognised entity<end/> text text
christoph: see above XML example. this has to be translated to RDF
<clange> <start id="foo"/>
<clange> <start href="#foo"/>
christoph: start and end tags look like the above
<clange> ids or refs (forgot which direction) are in these start/end tags
christoph: we are using an XSLT-based tool I developed (Krextor) to create RDF. it is quite hard
christoph: with XPath it is hard to select elements between start and end tags
... that is a bit tricky, you need a good knowledge of XPath, the sibling axes etc.
... in the context of a European project, in which another partner is doing the extraction
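Illustration (not from the meeting): a minimal sketch of the selection problem described above. It assumes the start and end markers are siblings under the same parent, that `<start/>` carries the `id` and `<end/>` the matching `href` (the direction was not confirmed in the discussion), and it uses a hypothetical wrapper element `<s>` and a hypothetical `http://example.org/` namespace. It relies on Python's standard library rather than Krextor or XSLT, so it shows the idea only, not the tool's actual method.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the wrapper <s> and the id/href direction are assumptions.
SAMPLE = ('<s>text text text <start id="foo"/>recognised entity'
          '<end href="#foo"/> text text</s>')


def entities_between_milestones(xml_text):
    """Return {id: text} for the text between each <start/> and its <end/>.

    Only handles markers that are siblings under the same parent element.
    """
    root = ET.fromstring(xml_text)
    found = {}
    for parent in root.iter():
        open_id, parts = None, []
        for child in parent:
            if child.tag == "start":
                # Text directly after the start marker is stored as its tail.
                open_id = child.get("id")
                parts = [child.tail or ""]
            elif child.tag == "end" and open_id is not None:
                if child.get("href") == "#" + open_id:
                    found[open_id] = "".join(parts).strip()
                    open_id = None
            elif open_id is not None:
                # An ordinary element between the markers: keep text and tail.
                parts.append("".join(child.itertext()))
                parts.append(child.tail or "")
    return found


def to_ntriples(entities, base="http://example.org/entity/"):
    """Serialise each entity as one rdfs:label triple (hypothetical vocabulary choice)."""
    label = "http://www.w3.org/2000/01/rdf-schema#label"
    lines = []
    for ident, text in entities.items():
        literal = (text.replace("\\", "\\\\")
                       .replace('"', '\\"')
                       .replace("\n", "\\n"))
        lines.append(f'<{base}{ident}> <{label}> "{literal}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(entities_between_milestones(SAMPLE)))
```

Markers that sit at different nesting depths would need a document-order walk instead of a per-parent loop, which is roughly where the XPath sibling axes stop being sufficient.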
phil: is this similar to Martynas' case?
christoph: in terms of XPath complexity, yes
... general XML to RDF transformation issue?
https://github.com/fsasaki/its20-extractor/tree/master/wikipedia-extractor
<philr> felix: I've written various converters
<philr> ...it is always special case issues
<philr> ...XML has various ways to include content
<philr> ...special purpose handling is somewhat unavoidable
<philr> ...example documents with guidance would be useful
scribe: may be useful to give guidance on how to handle various cases
christoph: there are patterns, e.g. parent child relations in XML and RDF properties
... for this you can provide high-level translation patterns
<philr> clange: High level translation is possible with simple parent-child relationships
<philr> felix: mixture of text and element nodes is challenging
<clange> fsasaki: handling of specific links (specific to wiki markup)
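Illustration (not from the meeting): a minimal sketch of the parent-child pattern mentioned above, in which each child element becomes a property of the resource minted for its parent. The namespace `http://example.org/`, the sample document, and the way nested resources are named are all hypothetical. The second helper only detects mixed text and element content, which is the case described as challenging; it does not attempt to translate it.

```python
import xml.etree.ElementTree as ET

EX = "http://example.org/"  # hypothetical namespace

# Hypothetical sample document, for illustration only.
SAMPLE = ("<posting><title>Data Engineer</title>"
          "<employer><name>ACME</name></employer></posting>")


def element_to_triples(elem, subject, triples=None):
    """Map each child element to a property of its parent's resource.

    Triples are plain (subject, predicate, object) tuples; this sketch does
    not distinguish IRI objects from literal objects.
    """
    if triples is None:
        triples = []
    for index, child in enumerate(elem):
        predicate = EX + child.tag
        if len(child) == 0:
            # Leaf element: its text becomes a literal value.
            triples.append((subject, predicate, (child.text or "").strip()))
        else:
            # Nested element: mint a resource for it and recurse.
            obj = f"{subject}/{child.tag}{index}"
            triples.append((subject, predicate, obj))
            element_to_triples(child, obj, triples)
    return triples


def has_mixed_content(elem):
    """True if an element holds both child elements and non-whitespace text."""
    if len(elem) == 0:
        return False
    if (elem.text or "").strip():
        return True
    return any((child.tail or "").strip() for child in elem)


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    for triple in element_to_triples(root, EX + "posting/1"):
        print(triple)
```

A converter could call `has_mixed_content` first and fall back to special-purpose handling when it returns true.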
phil: in the FREME project we are also doing named entity recognition on plain text. our services are capable of returning Turtle files, but we can cover many formats
https://api-dev.freme-project.eu/ckeditor-dev/ckeditor/samples/freme.html
various types of output, inline or external using json-ld
<scribe> ACTION: felix to provide examples of round tripping as done in the freme project [recorded in http://www.w3.org/2016/11/25-rax-minutes.html#action01]
<philr> felix: to collect information on what better tooling is needed
<philr> ...best practices and standardization
<philr> ...1.5 hour session on requirements
<philr> clange: is there more I can do if I do not attend the summit?
<philr> felix: it would be good if someone from your organization could attend
<philr> ...questionnaire to bdva members but want input from companies
<philr> Is there a fee to join bdva?
felix: yes, will send info on that
<clange> fsasaki 14:29: EU is not necessarily interested in new standards being developed, but in existing standards to be _applied_ in a better way
thanks, clange
discussion on automationML use case
felix will send around further info on BDVA
next meeting 9th of December
phil cannot make it, christian to chair
Scribe: fsasaki
Present: philr, felix, timea, christoph
Regrets: christian, gerard, jose
Agenda: https://lists.w3.org/Archives/Public/public-rax/2016Nov/0008.html
Date: 25 Nov 2016
Minutes: http://www.w3.org/2016/11/25-rax-minutes.html
People with action items: felix