<addison> Scribe: addison
ivan: did update the document
<ivan> https://w3c.github.io/rdf-dir-literal/
ivan: has now all the options we discussed and more
<r12a> manu, you may need to look also at this: https://github.com/w3c/rdf-dir-literal/wiki/Draft-ideas-related-to-string-metadata-storage-options
<r12a> and this, manu: https://github.com/w3c/rdf-dir-literal/issues/7#issuecomment-498268872
<chaals> scribe: chaals
Manu: Why do we need this? Can't we get it from language information?
Addison: There is low-quality data all over the place, sometimes tagged properly, sometimes not.
... There are parallels to what we have internally all over the Web. There are places where I might have direction information, where language information is at best a guess.
... I am currently relying on low-quality heuristics, but when I have accurate data, I don't want them to apply.
<addison> chaals: so there are a bunch of use cases
<addison> ... not a standard requirement for all content
<addison> ... e.g. comments on youtube
<addison> ... think about how json handles strings
<addison> ... json doesn't break strings down well
<addison> ... and devs don't like doing that
<addison> ... plain strings that don't do that
<addison> zakim ivan
<addison> ivan: went through use cases you (addison) posted
ivan: read through the use cases and wiki page. I conclude that we won't have a solution that makes everyone equally happy.
<addison> ... won't have estimation that makes everyone happy
<addison> ... going back to completely ignoring?
<addison> ... at point where we cannot produce an optimal solution
<addison> ... have to make suboptimal
<addison> ... which one is a good middle way
<addison> ... and what is reasonable
<addison> ... the data that we have like for verifiable claims
<addison> ... comes from various databases
<addison> ... or human authoring
<addison> ... data is at beginning of process
<addison> ... not like we have base direction already and don't want to lose
<addison> ... for overwhelming use cases for RDF this issue doesn't arise
<addison> r12a: which issues?
<addison> ivan: "I extract data from form"
<addison> ... that's really a problem, but what's the percentage of rdf data that comes this way
Addison: We see a lot of data produced in this way… content coming from multiple places.
<addison> addison: is that true? not mostly dead datasources?
<Zakim> manu, you wanted to note that the direction feels clear to him, then.
<addison> manu: feel that we can knock some items off
<addison> ... -d and -x not a good option, not right
<addison> ... thing that convinced me was addison saying lang info is accurate and sometimes dir is
<addison> ... ensure that we solve long term
<addison> ... provide tooling for right i18n decisions
<addison> ... atomizing info so not co-mingled in a later-damaging way
<addison> manu: way done before was to separate
<addison> ... defer to the folks in i18n
<addison> ... don't fix with LocalizableString
<addison> ... fix LangString
<addison> ... fix implementations first, okay if takes years
<addison> ... fix json/json-ld/etc.
<addison> ... express direction separate from language and uniform
<addison> ... so that would mean in json-ld, e.g. verifiable credentials
<addison> ... use value/language/direction, 3 things you put together
<addison> ... put together in the same way
<addison> ... benefit further to align with HTML, using 'lang' and 'dir'
<addison> ... aliasing might be done in json-ld
<addison> ... simple use cases yes, more complete use cases need markup
<addison> ... workable, years to update rdf
<addison> ... but do impl in short time
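[Illustrative sketch of the value/language/direction pattern described above, with direction kept separate from the language tag. The class name `LocalizedString`, the method `to_json_object`, and the key names `value`, `lang` and `dir` (borrowed from the HTML attribute names mentioned above) are assumptions for illustration only; no agreed JSON or JSON-LD keyword is implied.]

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LocalizedString:
    """Hypothetical container: a string plus separate language and direction."""
    value: str
    lang: Optional[str] = None   # BCP 47 language tag, if known
    dir: Optional[str] = None    # base direction, if known

    def __post_init__(self):
        # Assumption: only "ltr" and "rtl" are accepted in this sketch.
        if self.dir not in (None, "ltr", "rtl"):
            raise ValueError(f"unsupported direction: {self.dir!r}")

    def to_json_object(self) -> dict:
        # Emit only the metadata that is actually present, so absent
        # information is never replaced by a guess.
        obj = {"value": self.value}
        if self.lang is not None:
            obj["lang"] = self.lang
        if self.dir is not None:
            obj["dir"] = self.dir
        return obj


# An Arabic sentence that happens to begin with a Latin-script word.
title = LocalizedString(
    value="HTML \u0647\u064a \u0644\u063a\u0629", lang="ar", dir="rtl"
)
print(json.dumps(title.to_json_object(), ensure_ascii=False))
```

[Keeping the three fields side by side means a consumer never has to parse the language tag to recover the direction.]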
[I think Manu is jumping the agenda, but I agree, with a proposal for some more intermediate guidance…]
<addison> r12a: there has been a strong tendency to keep the info in the language tag
<addison> ivan: there are cases where we disagree
<addison> ... for coming several years is that json/json-ld world would be separated from rdf
<addison> ... many not care, but I do
<addison> ... syntactically speaking, we can put in json-ld
<addison> ... when put into rdf, ignored
<addison> ... ideal is to fix rdf
<addison> ... in the meantime this is a hack I don't like
<addison> ... rdf dataset not same example
<addison> ... and there is no rdf wg currently
<addison> ... could last several years
<addison> ... not a good thing
<addison> ivan: influence community, in meantime rely on bcp47
<addison> r12a: looking at 2 possibilities
<addison> ... 1 is change lang string
<addison> ... 2 use bcp47 language info
<r12a> https://github.com/w3c/rdf-dir-literal/issues/7#issuecomment-498755489
<addison> ... more recent discussion linked here
<addison> r12a: not sure bcp47 language tags are that easy to use
<addison> ... unless every time they have a script tag
<addison> addison: (interjecting) grumpy if we put script subtags by fiat
<addison> r12a: putting -d or -x putting script info into tag
<addison> ... problem is, my original concern when writing string-meta
<addison> ... use the script subtag to key off
<addison> ... produce a formalism to use script
<addison> ... addison, you changed that
<addison> ... can detect from normal bcp47 tag, but don't think that's so easy
<addison> ... many many languages you need to check
<addison> ... it's complicated
<addison> ... if you had a rule, you'd need to inspect the language every time to determine dir
<addison> ... not as straightforward as first strong
<addison> ... only need direction if needed
<addison> r12a: 1. not as straightforward to use bcp47
<addison> ... 2. if we add scripts (and that's what mark is suggesting)
<addison> ... have to have cutoff point for languages in/out
<addison> chaals: so want to test propositions
<addison> ... not clear that problem statement is described
<addison> ... vast majority of content direction is obvious
<addison> ... first character tells you
<addison> ... there are exceptions within vast majority
<addison> ... there are cases where script and sometimes language
<addison> ... are mixed
<addison> ... the first thing that appears might not be the semantically dominant direction
<addison> ... expect majority of these are mixed script
<addison> ... and numbers
<addison> ... would like to support manu's proposal
<addison> ... push rdf hard to fix this
<addison> ... none of us control turning up an rdf wg
<addison> ... don't know how fast we can fix this problem
<addison> ... my sense from experience on AB and W3M, would get sympathetic hearing
<addison> ... for getting concrete proposal to change spec, hard part is working with implementers
<addison> ... think we should be taking that path
<addison> ... believe that for mixed script cases
<addison> ... is correct and adds nothing new if you add a language tag
<addison> ... or language+script
<addison> ... to assert a direction
<addison> ... not infallible but something that can be done
<addison> ... shouldn't rely on everywhere, but makes things easier in imperfect world
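[Illustrative sketch of the "first character tells you" heuristic discussed above, usually called first-strong detection. This is a simplification, assuming that only the Unicode bidi classes L, R and AL count as strong and ignoring the isolate-skipping rules of the full Unicode Bidirectional Algorithm. The function name `first_strong_direction` is hypothetical.]

```python
import unicodedata
from typing import Optional


def first_strong_direction(text: str) -> Optional[str]:
    """Return "ltr" or "rtl" from the first strongly directional character.

    Returns None when the string has no strong character at all
    (for example, only digits or punctuation).
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None


# The common case: the first character gives the right answer.
print(first_strong_direction("Hello"))
print(first_strong_direction("\u05e9\u05dc\u05d5\u05dd"))

# The mixed-script exception: an Arabic sentence that starts with a
# Latin-script word is guessed as "ltr".
print(first_strong_direction("HTML \u0647\u064a \u0644\u063a\u0629"))
```

[The last call shows the failure mode raised above: the first thing that appears is not the semantically dominant direction.]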
<Zakim> manu, you wanted to note that the JSON-LD world deviated from RDF for a while before (RDF Datasets...)
<addison> chaals: magick extensions to BCP47 for direction not a good idea
<addison> manu: +1 to chaals
<addison> ... modified json-ld before rdf to push rdf
<r12a> chaals, you can't mix FS heuristics with language information if lang information has precedence, but you do want language information on all strings, so you'd never do FS heuristics
<addison> ... at that time it was controversial
<addison> ... did happen after vigorous debate
<addison> ... needed something to start snowball down mountain
<addison> ... it's a different use case, but same general idea
<addison> ... two choices; one is a hack and second is good long term solution
<addison> ... the hardest part is getting implementation up to date
<addison> ... then rdf spec won't match reality
[if you don't *have* language information - even though you want it - you might well keep an FS-heuristic routine around for the cases you come up against]
<addison> ... hopefully an easy way to get w3c process to generate those docs
<addison> ... use cases and requirements should be our goal
<addison> ... need to switch reality to solving the right way
<addison> ... and let specs catch up
<addison> ... from practical standpoint
<r12a> [chaals, that's not what i meant - maybe i should join the queue]
<addison> ... if we went down this direction/path
<addison> ... one VC spec would defer to string-meta
<addison> ... draft language in string-meta
<addison> ... this design pattern works
<addison> ... json-ld would need a direction keyword
<addison> ... and luckily there is a WG
<addison> ... that's the second hard thing.
<addison> ... pointing to string-meta is easy
<addison> ... json-ld achievable in a month or two
<addison> ... third thing is dataset normalization
<addison> pchampin: start with question for manu
<addison> ... when mentioning dataset signature
<addison> ... based on nquads
<addison> ... considering extending to put direction metadata
<addison> ... so that wouldn't be nquad per spec
<addison> manu: special kind only used for canonicalizing for digital signature
<addison> ... could pull into rdf if they decide to do this way
<addison> ... would be a special thing separate in rdf space
<addison> pchampin: think it is a good idea to put good practice in json-ld now
<addison> ... agree here that having separate direction attribute
<addison> ... have two options for rdf conversion
<addison> ... either lose information
<addison> ... or try to encode this information somehow in rdf
<addison> ... what I was going to propose; I think greg kellogg proposed
<addison> ... a bcp47 extension subtag as a temporary way
<addison> ... being very explicit that we'd deprecate it as soon as RDF is updated
<addison> ... think that would be a smooth path
<addison> ... cannot rely on temporary implementation to sign things
<addison> ... that is a valid concern
<addison> ... but would push on using temporary solution
<addison> ... the private -x solution
addison: Direction-setting? Are we homing in on something?
<addison> ivan: think we have an agreement
<addison> ... extending langstring in rdf is not only the ideal solution, but we should be working toward it
<addison> ... not sure how it will happen
<addison> ... will talk to ralph tomorrow
<addison> ... to see what seems to be quickest way of getting there
<addison> ... where we disagree
<addison> ... 1. more pessimistic than chaals in time it will take to get done
<addison> ... rdf concepts doc; update turtle, nquad, sparql
<addison> ... get rdf wg through AC when AC doesn't want to work on rdf
<addison> ... next question I think we disagree is what to do in meantime
<addison> ... think there is some disagreement
<addison> ... putting something in json-ld and then mapping in private use
<addison> ... once genie is out of bottle...
<pchampin> I agree this is a risk
<addison> ... and just losing in rdf is not attractive
<addison> ... we say that we are working on final solution; hope to get charter out there
<addison> ... help us do work, members
<addison> ivan: agree on what to do in the meantime
<addison> r12a: chaals, everything you said was what I said; agree
<addison> ... if we don't have direction, use first-strong
<addison> ... point I wanted to make was metadata, if you have it, always trumps heuristics
<addison> ... heuristics not always accurate; metadata exists to provide
<addison> r12a: think there are problems with using standard bcp string
<addison> ... if use private use, that says "this is a hack"
<addison> ... could use in a way where only use when needed
<addison> ... however still not great
<addison> ... better to have separate metadata
<addison> ... otherwise have to parse language strings
[where there is insufficient metadata, you expect to fail. If you have some heuristics that reduce your failure rate, go for it, but you still expect failures]
<Zakim> manu, you wanted to ask if losing directionality when converting to RDF / canonicalizing is a catastrophic thing? and to provide concrete path forward
<addison> manu: want to get to concrete list of things
<addison> ... verifiable credentials will point to string-meta for what right thing is
<addison> ... can provide language
<addison> ... that is going to presume that there will be a direction tag in json-ld at some point
<addison> ... can continue in json-ld
<addison> ... then canonicalization; that's like impl details
<addison> ... can resolve in json-ld
<addison> ... make decision on what we'll do, such as -x-dir
<addison> ... convinced that's wrong
<addison> ... or talk about other methods, nquads etc
<addison> ... come to solution that will preserve info
<addison> ... mechanism only thing up in the air?
<addison> ... would i18n be happy with VC?
<Zakim> chaals, you wanted to suggest we propose an RDF WG charter specifically scoped to solving this problem (in practice, if there are other obvious errata, they should be added)
<addison> chaals: same approach as manu
<addison> ... should put opinionated statements in specs, starting with VC
<addison> ... anticipate dir in json-ld
<addison> ... have to mark at risk, depends on getting langstring in rdf core done
<addison> ... clear about the status
<addison> ... might be experimental-yet-recommended
<addison> ... more valuable to start at bottom before propagating
<addison> ... if broken, get info quickly
<addison> ivan: don't like the "hard pushing" style
<addison> ... that is, anticipating things will happen in json-ld
[also, I am less sceptical than Ivan about getting AC approval for a well-scoped RDF WG to do a concrete repair task…]
<addison> ... main point manu didn't say, before we do anything else we need to figure out if we can get a proper wg in rdf
<addison> ... happy to try
<addison> ... don't underestimate problems with getting rdf work approved
<addison> ... talk to ralph to gauge reaction
<addison> ... might need a draft charter
<addison> pchampin: when I proposed to use private extension
<addison> ... understand are not favorable to that solution
<addison> ... understand why
<addison> ... still think we might carefully craft a transition path until have proper path
<addison> ... one way to contain, whenever you see '-x-dir' you have to parse as direction metadata
<addison> ... those -x extensions should only appear in rdf and should be converted to metadata
<addison> ... not propagated to html for example
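[Illustrative sketch of the containment idea described above: when a language tag read from RDF carries a private-use direction marker, split it back into a clean language tag plus separate direction metadata before passing it on. The exact subtag syntax is not given in this discussion; the form `<lang>-x-dir-ltr` / `<lang>-x-dir-rtl` and the function name `split_private_use_direction` are assumptions for illustration only.]

```python
import re
from typing import Optional, Tuple

# Assumed (hypothetical) convention: the tag ends in "-x-dir-ltr" or "-x-dir-rtl".
_DIR_SUFFIX = re.compile(r"^(?P<lang>.+?)-x-dir-(?P<dir>ltr|rtl)$", re.IGNORECASE)


def split_private_use_direction(tag: str) -> Tuple[str, Optional[str]]:
    """Split a language tag into (clean_language_tag, direction_or_None).

    Tags without the assumed private-use marker are returned unchanged
    with a direction of None.
    """
    match = _DIR_SUFFIX.match(tag)
    if match is None:
        return tag, None
    return match.group("lang"), match.group("dir").lower()


# The marker is stripped at the RDF boundary, so it is not propagated further.
print(split_private_use_direction("ar-x-dir-rtl"))
print(split_private_use_direction("he-IL-x-dir-rtl"))
print(split_private_use_direction("en"))
```

[After this step the direction travels as separate metadata, for example as an HTML `dir` attribute, and the language tag handed to downstream consumers no longer carries the private-use subtag.]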
[I am *very* skeptical of genies going back into bottles - it seems to be much harder than anyone ever thinks it will, and often seems not to happen after all]
<manu> I am also incredibly skeptical of that
<manu> Once you have a tool, people use it
<manu> Yes, exactly, let's do the right thing
<pchampin> until we provide them with a better tool :)
<manu> yes, but then we can never deprecate the old tool :)
<addison> ivan: I will talk to ralph
[I can start an RDF-WG charter proposal…]
<addison> ... have to go down usual process
[Think the normative text needs to be in Rec-track specs, and string-meta is a copy that shows what to do there]
<addison> manu: will put PR into string-meta
<addison> no follow up meeting
<addison> ivan: put discussion in one place
<addison> ... have a separate mailing list?
[how about we just agree to make a concerted effort as individuals to make sure that this is documented and people know where conversations are happening?]
<addison> addison/richard to update string-meta as the explainer for this
Present: addison, chaals, ivan, r12a, manu, pchampin
Scribes: addison, chaals
Agenda: https://lists.w3.org/Archives/Member/member-i18n-core/2019May/0040.html