14:57:00 RRSAgent has joined #hcls
14:57:00 logging to http://www.w3.org/2010/08/30-hcls-irc
14:57:08 kennyluck has left #hcls
14:57:08 Zakim, this will be BioRDF
14:57:08 ok, kei; I see SW_HCLS(BioRDF)11:00AM scheduled to start in 3 minutes
14:57:26 agenda +introduction [Kei]
14:57:45 agenda+ +paper submission [All]
14:57:54 take up next agendum
14:59:11 SW_HCLS(BioRDF)11:00AM has now started
14:59:17 + +1.832.386.aaaa
15:00:00 +Kei_Cheung
15:01:44 michael has joined #hcls
15:02:39 + +1.206.484.aabb
15:02:52 matthias_samwald has joined #hcls
15:03:20 +[IPcaller]
15:03:35 zakim, [IPcaller] is matthias_samwald
15:03:35 +matthias_samwald; got it
15:04:23 kei has joined #HCLS
15:05:49 scribenick: matthias_samwald
15:06:09 kei: over the last days i spent time on editing the discussion section
15:06:39 +mscottm
15:06:53 ... smoother, more readable. i did not delete the old text, so others can have a look and possibly add. i highlighted it in yellow.
15:07:57 ... we made a lot of good progress
15:09:00 mscottm has joined #hcls
15:09:12 ... how far are we with the examples?
15:09:33 lena: all federated queries are now also on the demo
15:09:34 http://ibl.mdanderson.org/~mhdeus/BioRDF/microarray/sparql_endpoint.html
15:09:48 lena: on that page, you can find the six demo queries
15:10:28 kei: is this the full gene list?
15:10:37 lena: this is a small subset
15:11:15 http://mibupload.com/u0PSbD.xml
15:11:38 (never mind this link)
15:11:47 kei: the first queries are querying the gene lists themselves
15:12:03 ... what type of brain regions, disease etc.
15:12:23 ... also something about the data themselves (same software package etc.)
15:12:30 ... looks pretty good to me
15:12:50 ... the last queries focus more on query federation
15:13:00 ... these queries also make use of the origins of the datasets
15:13:23 lena: see Q4
15:13:44 ... as scott suggested, i used the VoID vocabulary.
15:14:50 lena: if you click on the query, the SPARQL query is automatically entered.
15:16:29 lena: for the federated queries -- it does not work on some browsers
15:17:39 michael: would it be possible to have a query that returns all the genes in the final gene list?
15:18:21 michael: i.e., the simple final gene list for a certain experiment.
15:18:44 michael: most of these queries do not return much information, it would be nice to know what the basic information available is
15:19:47 lena: for the first gene list there are 162
15:20:01 lena: i can use VoID for that kind of information
15:21:48 egon: the vocabulary predicate you are using on diseasome, what are you doing with that?
15:22:55 lena: what we are doing for all datasets that have been annotated with that vocabulary.
15:23:11 ... gives us certainty that we are finding the things we are looking for
15:23:27 egon: but are you referring to diseasome as a dataset? it is not.
15:24:21 I am talking about this part of a query: OPTIONAL { [ rdf:type void:Dataset ; void:sparqlEndpoint ?srvc2 ; void:vocabulary ; dct:issued ?issued2 ] . FILTER (?issued2 > ?issued)
15:24:27 kei: is it a language or vocabulary? how do you use VoID so that a machine knows that this dataset has to do with this certain set of diseases?
15:25:15 s/egon/scott/
15:25:35 (sorry, audio quality is quite bad, hard to discern people)
15:27:09 void guide: http://semanticweb.org/wiki/VoiD
15:29:29 kei: the concern is: when we are using VoID to describe data origins, how is the data provenance actually captured by VoID?
15:29:54 lena: you can describe subjects (e.g. "gene").
15:31:27 I can't hear what Lena said
15:32:03 +EricP
15:33:14 +351 21 4469852
15:33:58 - +1.832.386.aaaa
15:34:38 Thank you Eric!
15:34:54 the part of Lily Tomlin's "Operator" will be played by ericP today
15:36:04 lena: to find all datasets that have to do with genes, i would have to figure out the URI -- it is easier with the VoID vocabulary.
15:36:17 scott: diseasome is not a vocabulary, but a dataset
15:36:18 Zakim, who is speaking?
15:36:30 ericP, listening for 10 seconds I heard sound from the following: EricP (63%)
15:37:07 lena: you still have to say what is a disease by indicating a full URL
15:37:37 ... we need a dataset of genes AND diseases
15:38:13 scott: VoID just gives you means to talk about a graph
15:39:27 kei: would it be a lot of work to switch to that use of subject?
15:39:42 ... to make the semantics a bit more understandable
15:39:59 lena: okay. this is easy to change.
15:43:11 uname -a
15:43:43 (lena and eric talk about issues with 32 vs. 64 bit version of federation software)
15:45:11 eric: but this is not critical for the paper
15:46:10 kei: for the paper the review period is quite sure (in the next 3 weeks)
15:46:53 eric: the second query we are working on, i got too many resolutions from one endpoint, we need to figure out how to solve this.
15:49:07 kei: the query that makes use of PharmGKB is a good example. we do not need to get into biological details for this paper, though.
15:49:54 lena: there would not be enough papers, also the reviewers would not understand.
15:51:40 scott: about the NCBO sparql endpoint: i don't know if there is a way with this microarray scenario. we would need an appropriate vocabulary (e.g. for diseases). but this is a level of provenance that is not fully formalized on the NCBO sparql endpoint.
15:52:42 ... e.g., a query that finds all datasets about neurodegenerative diseases -- that would be possible
15:54:03 ... another example: if you have a list of neurodegenerative diseases (based on ontology), then you can find data from other neurodegenerative diseases
15:55:53 lena: we could trim the list of diseases in Q4 to only neurodegenerative diseases
15:57:41 kei: in terms of the paper, how do we go about finalizing it?
15:58:16 lena: we have to calculate ~1 page for abstract, 1 page for references
15:59:10 lena: most of the results can be deferred to links to web pages
15:59:36 kei: lena, you are the person to do the first cut
16:00:58 kei: still, it is interesting to talk about the data model in the paper and give some examples
16:01:43 lena: i would say keep the diagram, lose the triples. i will make these changes.
16:02:12 kei: the deadline is friday, at some point we need to convert it to the IEEE format. when do we make that switch?
16:02:57 lena: my suggestion is to switch to IEEE on wednesday and have everyone read it.
16:03:12 lena: on thursday we can still have a conference call and see if we all agree
16:07:00 scott: i would like to have some slides about this work that i can present at Oxford Global Pharma conference in october
16:07:30 -mscottm
16:08:54 -Kei_Cheung
16:08:54 Uh oh - on hcls2 Zakim, I get "This passcode is not valid."
16:08:54 -matthias_samwald
16:08:57 - +1.206.484.aabb
16:09:04 Meeting: BioRDF
16:09:05 Can you help, Eric?
16:09:16 michael has left #hcls
16:09:24 Chair: Kei Cheung
16:09:24 RRSAgent, please draft minutes
16:09:24 I have made the request to generate http://www.w3.org/2010/08/30-hcls-minutes.html kei
16:09:33 Scribe: Matthias Samwald
16:09:33 RRSAgent, please make log world-visible
16:09:46 ericP - can you help with hcls2 (Terminology)?
16:10:05 RRSAgent, please draft minutes
16:10:05 I have made the request to generate http://www.w3.org/2010/08/30-hcls-minutes.html matthias_samwald
16:10:41 now it's there. the "make world visible" has to come before the "draft minutes" command.
16:11:11 (i just figured that out recently...)
16:12:02 -EricP
16:12:03 SW_HCLS(BioRDF)11:00AM has ended
16:12:05 Attendees were +1.832.386.aaaa, Kei_Cheung, +1.206.484.aabb, matthias_samwald, mscottm, EricP
16:56:05 matthias_samwald has left #hcls
17:12:32 oops
17:17:27 (triple ?targetID ?OverExpressedGenLabel)
17:17:30 (triple ?drug ?targetID)
17:17:33 (triple ?drug ?drugName)
17:17:35 (triple ?drug ?actionMechanism)
17:17:38 (triple ?drug ?disease)
17:17:41 (triple ?disease ?diseaseTarget)
17:17:48 ?drugName = "H6PD"
17:18:08 ?OverExpressedGenLabel
17:18:39 SELECT+DISTINCT+%3FtargetID+%3FOverExpressedGenLabel+%3Fdrug+%3FdrugName+%3FactionMechanism+%3Fdisease+%3FdiseaseTarget+%0AWHERE%0A%7B%0A++%3FtargetID+%3Chttp%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugbank%2FgeneName%3E+%3FOverExpressedGenLabel+.%0A++%3Fdrug+%3Chttp%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugbank%2Ftarget%3E+%3FtargetID+.%0A++%3Fdrug+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23label%3E+%3FdrugName+.%0
17:20:17 Querying for
17:20:17 query=SELECT+DISTINCT+%3FdiseasomeGene+%3FgeneLabel+%3Fdisease+%3FdiseaseName+%0AWHERE%0A%7B%0A++%3FdiseasomeGene+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23label%3E+%3FgeneLabel+.%0A++%3Fdisease+%3Chttp%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource%2Fdiseasome%2FassociatedGene%3E+%3FdiseasomeGene+.%0A++%3Fdisease+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23label%3E+%3FdiseaseName+.%0A%7D%0A yielded 9740 results.
17:22:31 http://ibl.mdanderson.org/~mhdeus/BioRDF/microarray/diseasome_query_results.htm
17:30:09 http://swobjects.svn.sourceforge.net/viewvc/swobjects/branches/sparql11/lib/SWObjects.cpp?revision=1236&view=markup&sortby=date#l1476
17:30:16 kennyluck_ has joined #hcls
18:38:55 hi Lena, no progress. http://www.w3.org/2001/sw/rdb2rdf/directGraph/ co-opted my attention
18:42:02 lol
18:42:23 i have almost a new full dataset for you :-)
18:42:41 in the meantime, keep crashing my computer trying to parse it
19:02:35 that's the 64-bit machine, right?
19:07:22 no, windows
19:07:39 ( crashing probability is higher :-) )
19:08:18 going home.. will probably be back online in about 1 hour
21:07:06 egonw has joined #hcls
23:38:54 kennyluck has joined #hcls
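
The URL-encoded string logged at 17:20:17 is hard to read as posted. The following is a minimal Python sketch, assuming only the standard library, that decodes it back into SPARQL text. The names `ENCODED_QUERY` and `decode_query` are illustrative and do not appear in the log; the encoded string itself is copied from the 17:20:17 entry, without the `query=` prefix.

```python
from urllib.parse import unquote_plus

# Query string as logged at 17:20:17 (the "query=" prefix is dropped).
ENCODED_QUERY = (
    "SELECT+DISTINCT+%3FdiseasomeGene+%3FgeneLabel+%3Fdisease+%3FdiseaseName+"
    "%0AWHERE%0A%7B%0A"
    "++%3FdiseasomeGene+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23label%3E"
    "+%3FgeneLabel+.%0A"
    "++%3Fdisease+%3Chttp%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource"
    "%2Fdiseasome%2FassociatedGene%3E+%3FdiseasomeGene+.%0A"
    "++%3Fdisease+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23label%3E"
    "+%3FdiseaseName+.%0A"
    "%7D%0A"
)


def decode_query(encoded: str) -> str:
    """Turn a form-encoded query value back into plain text.

    unquote_plus converts '+' to a space and %XX escapes to characters,
    which matches how the query was encoded for the HTTP request.
    """
    return unquote_plus(encoded)


decoded = decode_query(ENCODED_QUERY)
print(decoded)
```

The decoded text is a SELECT over three triple patterns joining gene labels, the diseasome `associatedGene` predicate, and disease labels. The earlier string at 17:18:39 can be decoded the same way, but it is cut off in the log, so the result would be an incomplete query.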