See also: IRC log
Goutam presents his three layer scheme
CL: a comment
... we have covered the content domain level already in our requirement documents
... the second level, the sentence level
... information you suggest like "this is a question"
... that information would ensure accurate translation
... that level can be covered by a host schema
... for example the TEI has elements which could cover the sentence level
... as for the third level, the word level, that is related to terminology work
... e.g. you might say "that term is the expression of a concept 'bank'"
<SebastianR> an example of TEI word-level markup:
CL: we addressed the terminology realm in the requirement document
<SebastianR> <u trans="smooth" who="PS1BY">
<SebastianR> <s n="25">
<SebastianR> <w type="DTQ">what</w>
<SebastianR> <w type="VBZ">'s </w>
<SebastianR> <w type="EX0">there </w>
<SebastianR> <w type="TO0">to </w>
<SebastianR> <w type="VVI">put</w>
<SebastianR> <c type="PUN">, </c>
<SebastianR> <w type="VVD">took </w>
<SebastianR> <w type="AT0">an </w>
<SebastianR> <w type="AJ0">extra </w>
<SebastianR> <w type="CRD">twenty </w>
<SebastianR> <w type="CRD">thousand </w>
<SebastianR> <w type="PRP-AVP">on </w>
<SebastianR> <w type="PRP">from </w>
<SebastianR> <w type="AT0">the </w>
<SebastianR> <w type="NN1">beginning</w>
<SebastianR> <c type="PUN">?</c>
<SebastianR> </s>
<SebastianR> </u>
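Editorial note: a minimal sketch of how a tool might read word-level markup of the kind pasted above. It assumes the fragment is well-formed XML without a namespace and uses only Python's standard library; the function name `tagged_tokens` is invented for illustration and the fragment is shortened.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the fragment pasted into the log (a few tokens only).
TEI_FRAGMENT = """
<u trans="smooth" who="PS1BY">
  <s n="25">
    <w type="DTQ">what</w>
    <w type="VBZ">'s </w>
    <w type="VVI">put</w>
    <c type="PUN">?</c>
  </s>
</u>
"""

def tagged_tokens(xml_text):
    """Return (token, type) pairs for every <w> and <c> element in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for elem in root.iter():
        if elem.tag in ("w", "c"):
            # Trailing spaces inside <w> are part of the source; strip them here.
            pairs.append(((elem.text or "").strip(), elem.get("type")))
    return pairs

tokens = tagged_tokens(TEI_FRAGMENT)
print(tokens)
```

The word level of the three-layer scheme would correspond to the `type` attribute here; the sentence level to information carried on `<s>`.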
FS: the categories of Goutam's schema can also be expressed as attributes
... e.g. <s type="praying">
RI: So you want to use three attributes which should be available everywhere?
GO: No, I will explain
Goutam explains the proposed scheme
CL: We agree that these three levels of information will give us a lot of benefit
GO: You will get meaningful output
CL: I do not agree that it will solve every problem of translation
... e.g. dialogue systems and machine translation systems
... they do not just know about pos, sentence cat, domain
... but really about complex transfer conditions
... you need that information to do accurate work
... our scope is not to tell people who build parsers how to do that
... what you propose can be a part of our guidelines
... which show that you cannot do an accurate translation without such information
... so as a guideline for schema authors: please provide that information
... then people like the TEI people can see if they have covered that topic
... or people who develop new schemas will read the ITS guidelines and create their schemas in that way
FS: would anybody object to putting this into the guidelines?
CL: I would put it into the guidelines and add a fourth level
... the guidelines should say
... please provide as much context as possible (i.e. "context" as a fourth level)
... "please don't give every sentence to a translator separately, but give the translator the context"
... e.g. the translator should see not only the content of a XUL element, but also the other parts of the XUL document
YS: how about dialect specification?
... would that be part of the requirement for lang / locale specification?
RI: we should mark it up for language
... a question on the purpose of the three layer scheme:
... do you expect content authors to mark up those layers?
GO: It might be, but not necessarily
RI: so this markup would be used by a linguistic person?
GO: Maybe even a "simple" person
... e.g. students who study grammar, first language / second language grammar
YS: I would not use such markup because I'm bad at grammar ...
CL: I share Yves' feelings
... some authors have difficulty providing this information
RI: Is this for use by machines?
... if that is the case, the tokens have to be machine recognizable
... it seems to be difficult for an ordinary person to use such information
CL: on RI's question whether this is for humans or for machines
... I think information about the domain, sentence type or specific words
... will help translators to do better quality work or to do the work quickly
... if they know that a word belongs to a specific domain
... they can go to a terminology database and check the word
... so even for human translators this might be helpful
... e.g. "this is a computer interface string" is helpful information
... to my understanding, the human use scenario is not only for translators
... but also for authors or for quality assurance
RI: That is a different topic for quotation
... the example you give with term databases
... a machine, not a human, will look up the database
CL: As for terminology
... the translator has to be made aware of the fact that s.t. is a term
FS: that is then the terminology requirement
YS: I propose to have an action item to work on the document Goutam started
<scribe> ACTION: Goutam to continue work on the document he started to see if we should put that into the guidelines, including the aspect of language / dialect identification [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action01]
YS: we talked about pointing to XSLT
... I think there are different kinds of mapping
... i.e. a 1:1 mapping
... sometimes we have to map elements to attributes
... so how could we address that
... e.g. s.t. like the DITA translate would be easy to map
FS: how about the question when the mapping takes place?
... e.g. "on the fly" during processing or before / after processing?
YS: We don't have to specify that
FS: Yes, we can leave that to the people who use the mapping
YS: let's collect examples what kind of mapping we need
SR: If you say that the translate attribute of ITS maps to DITA translate
... that could only be a clue for a human
CL: One of our requirements is to mark up terminology
Dita localization aids: http://www-306.ibm.com/software/globalization/topics/dita/localization.jsp
CL: I saw simple mapping of proprietary language identifiers to official ones
... e.g. people would use numeric values to identify languages
RI: so you map values, right?
CL: yes
... we need a list of data categories that are to be mapped
... e.g. translatability, constraints
... and then a list of the mappings
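Editorial note: the value mapping CL describes (proprietary numeric language identifiers mapped to official ones) can be sketched as a plain lookup table. The numeric values and the function name `to_official` are hypothetical, invented only for illustration; the minutes do not record any concrete identifiers.

```python
# Hypothetical proprietary numeric identifiers mapped to official language tags.
# The numbers are invented for this sketch and do not come from any real system.
PROPRIETARY_TO_OFFICIAL = {
    "1": "en",
    "2": "de",
    "3": "ja",
}

def to_official(lang_id, table=PROPRIETARY_TO_OFFICIAL):
    """Map a proprietary identifier to an official tag; return None if unmapped."""
    return table.get(lang_id)

print(to_official("2"))
print(to_official("99"))
```

Returning `None` for unknown values leaves it to the caller to decide whether an unmapped identifier is an error, which fits the point that the WG does not want to prescribe how a mapping is realized.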
YS: we don't want to force people to use ITS if they already have the information
RI: But that is a different use case, right?
... if DITA does exactly the same thing, what do we have to do?
SR: DITA attributes are not in a namespace
... so it would be no problem if we use that
... in DTDs, you could hard wire prefixes
RI: hard wired means "changing the schema"?
YS: if we are stepping out of the namespace realm
... we might run into clashes
SR: by proposing the automatic mapping
... that is a burden for the processing application
YS: true, but the tools can be very generic
FS: would it be a possibility to approach the DITA people and make an agreement with them on what one should use for "translate"?
YS: it would be mainly the case in terminology
FS: example with architectural forms: http://www.w3.org/People/fsasaki/EML2005sasa0411.html section 4.3.1
CL: If we establish what the indicator of translatability is
... that would be very helpful
... the "equiv" would be helpful for people who are in the process and the people who use this
... of course we might have the problems which RI and SR mentioned
... so we could provide a container for mapping
... and have suggestions how to fill the container
... e.g. with xslt
FS: so an "extensible" container?
RI: that sounds like localization property stuff
YS: to some degree
... the problem is: we have schemas
... which we cannot process because there is no generic way of applying their l10n related information to the ITS sensitive tools
RI: somebody has to do that at some point
YS: yes
... and the schema is the best place to have that information
RI: Another issue
... if you put that into our schema, the dita schema might change
<YvesS> ..FS to show the example with Architectural forms.
RI: Would that not be localization properties work?
YS: In some way
... if there is a W3C way of mapping
... we could just adopt it
... I want to say "img" is a graphic
... as the tool processes "img"
... it should be processed like a graphic
RI: That is not a tag set again, that is localization properties
YS: yes, but we have that existing requirement
... we thought we had a common goal, but maybe not
CL: We will solve the need to provide information about correspondences
... we recommend that to people and have an element / attribute that points to a mapping
... then we say that people can consider different things like xslt or architectural forms
YS: That is one part
... in addition we need to look at the type of mapping we need
... I want to know what will be mapped
... I want a pointer to xslt
... and s.t. that says "what is mapped"
<its:mapping>
<its:mappingdesc>some desc</its:mappingdesc>
<its:map>
here some xslt
</its:map>
</its:mapping>
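Editorial note: a rough sketch of how an ITS-aware tool might read such a container, separating the description of "what is mapped" from the mapping body. The namespace URI below is a placeholder, since the minutes do not define one, and `read_mapping` is a hypothetical helper, not part of any ITS draft.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace: the minutes do not state the ITS namespace URI.
ITS_NS = "http://example.org/its"

CONTAINER = f"""
<its:mapping xmlns:its="{ITS_NS}">
  <its:mappingdesc>some desc</its:mappingdesc>
  <its:map>here some xslt</its:map>
</its:mapping>
"""

def read_mapping(xml_text):
    """Return (description, mapping body) from an its:mapping container."""
    root = ET.fromstring(xml_text)
    ns = {"its": ITS_NS}
    desc = root.findtext("its:mappingdesc", default="", namespaces=ns)
    body = root.findtext("its:map", default="", namespaces=ns)
    return desc.strip(), body.strip()

description, mapping_body = read_mapping(CONTAINER)
print(description)
print(mapping_body)
```

Keeping the description separate from the body means a tool can report what is mapped without having to interpret the mapping mechanism itself (XSLT, architectural forms, or something else).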
RI: If you have xslt stylesheets you would tie that to a specific version of DITA
CL: should that be another section of the WD?
... we have discussed this far enough
... we should be able to make a statement about the mapping
... we should not prescribe how the mapping is realized
YS: just a place holder that the mapping exists
<scribe> ACTION: CL and FS to decide who will edit the mapping section of the ITS implementation WD [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action02]
YS: we have three documents: requirements, ITS guidelines, ITS specification
YS: no editor for the guidelines yet
... AZ mentioned he would do some editing work
... I will do some editing of the ITS guidelines as well
<scribe> ACTION: YS as the initial editor for the ITS guidelines, Diane helping [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action03]
RI: before you do any editing, please read:
http://www.w3.org/International/xmlspec/002/documentation/styleguide.html
http://www.w3.org/International/xmlspec/002/documentation/xmlspec-i18n-dtd.html
http://www.w3.org/International/xmlspec/002/documentation/i18n-docs-processing.html
... and follow the guidelines
FS: don't spend much time on the status section
RI: that is important only before publication
YS: In September, we would like to publish the first WD of the ITS specification
... the second publication of the req. document is in November
... the first publication of the "ITS techniques" (before called "ITS guidelines")
... so we don't have to change our deadlines now
SR: You would need to write an "ODD2XMLSPEC.xsl"
xmlspec i18n dtd from http://www.w3.org/International/xmlspec/002/xmlspec-i18n.dtd
i18n specific elements: http://www.w3.org/International/xmlspec/002/i18n-elements.mod
http://www.w3.org/International/xmlspec/002/i18n-extensions.mod
<r12a-sophia> http://www.w3.org/International/xmlspec/002/documentation/i18n-docs-processing.html#xmlspeci18n-files
YS: let's continue that discussion by email
http://www.w3.org/TR/2003/WD-xquery-full-text-requirements-20030502/
http://www.w3.org/TR/xquery-full-text/
FS: proposal to have a focus on the ITS specification and ITS techniques
... the req document should only be updated from time to time
YS: how about the wiki editing?
... how do we keep track of the changes if we publish a new WD?
... do we have to change everything in the wiki?
... in the document with div, del, ins?
... that takes a lot of time
CL: does everybody need to modify the req documents in the wiki?
... maybe we could say we move away from the wiki
RI: If you have a contentious subject
... there is a lot of mail discussion
... it is difficult to summarize discussions
... as for the wiki, you can see what is being talked about
FS: how to handle the ITS techniques and the ITS specification?
... also handling in the wiki? i.e. converting ODD (possibly ODD) into the wiki
YS: that is a general problem for all three documents
RI: what would you do with an image?
bugzilla example: http://www.w3.org/Bugs/Public/show_bug.cgi?id=1334
http://cgi.w3.org/cgi-bin/html2txt?url=http://www.w3.org/International/Overview.html
<SebastianR> Christian/Felix: grab http://users.ox.ac.uk/~rahtz/its.zip and see the Makefile
<SebastianR> (that is the ODD demo to see if you can reproduce)
<YvesS> YS: we will discuss requirements
<YvesS> .. and does anybody have other requests?
<YvesS> ACTION: For YS to post message about meeting f2f Dec-14 to 16 (noon). [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action04]
classification parameters:
1) should the req be in the techniques doc / in the specification doc?
2) is the req sensitive to the scope problem we discussed at the f2f?
http://esw.w3.org/topic/its0506ReqConstraints in spec, sensitive to scope
http://esw.w3.org/topic/its0503ReqSpan in spec, not sensitive to scope
http://esw.w3.org/topic/its0503ReqEntities part of techniques doc
http://esw.w3.org/topic/its0503ReqLangLocale part of the techniques document
http://esw.w3.org/topic/its0503ReqTermIdentification probably techniques doc, depends on how we develop it
http://esw.w3.org/topic/its0504ReqPurposeSpecMap we don't know yet
<scribe> ACTION: felix to ask w3c if there is a methodology for mapping existing / under development [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action05]
http://esw.w3.org/topic/its0908LinguisticMarkup we don't know
http://esw.w3.org/topic/its0504ReqCulturalAspects maybe a technique, but we don't know yet
http://esw.w3.org/topic/its0504ReqLinkedText
YS: part of the techniques, with a "?"
... good practice would be to provide an attribute to give feedback to the translator
SR: like an alt tag on a link which is specific for ITS
FS: so part of the specification?
http://www.w3.org/People/fsasaki/EML2005sasa0411.html
example for such a link:
<para> If you create a typing error like "strs(s)",
you will get the message
<xref id="resfile.resx">
<subst>
<search> {0}</search>
<replace> &lt;Filename&gt;</replace>
</subst>
</xref>.</para>
discussion about the linked text requirement
http://esw.w3.org/topic/its0505ReqBidi specification
RI: this is one driver of the original ITS work
... originally we said to SSML folks that they need bidi markup for accessibility
... they asked us for a coherent way of doing that
... so we started this effort: ITS (initially)
... it would be nice to have this as part of the xml ns, but that is not likely to happen
YS: so this is part of the spec and the techniques doc
FS: And this is not part of the scope issue
http://esw.w3.org/topic/its0505Translatability part of the spec and the techniques
YS: and we do need scope
http://esw.w3.org/topic/its0505WordCount
SR: things like bidi are more part of i18n, most of the other stuff we talked about is part of l10n
YS: this would be a guideline / technique
... SR said that we need to make the difference between universal things (like bidi) and l10n specific things
RI: for some things we might say "please use these tags" ..
... there might be s.t. like "please don't do this", like translatable text in attributes
... and the third category would be "here is s.t. you could use"
YS: like the ITS tag set?
RI: yes
... and we would make clear what aspect would be important
YS: back to metrics: what should it be?
... metrics does not enhance the localizer, I think
http://esw.w3.org/topic/its0505ReqAttrAndTrans
YS: this is a guideline
SR: a guideline of good practice and an instruction
http://esw.w3.org/topic/its0505ReqNamingScheme
YS: please avoid s.t. like: <Message001>Cannot open the file.</Message001>
... more and more the names are the same as the content
... or s.t. generic because they use non-xml tools for the generation of xml
RI: they should use IDs for ids, and not the name of the element
this is guidelines
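Editorial note: a minimal sketch of the naming-scheme point, assuming the problem is element names that carry a serial number (as in `<Message001>`). It rewrites such elements to a generic name and keeps the original name as an `id` attribute, in line with RI's remark. The regular expression, the lowercase generic name, and the function name `normalize` are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Element names consisting of letters followed by digits, e.g. "Message001".
SERIAL_NAME = re.compile(r"^([A-Za-z]+?)(\d+)$")

def normalize(xml_text):
    """Move serial-numbered element names into an id attribute on a generic element."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        match = SERIAL_NAME.match(elem.tag)
        if match:
            elem.set("id", elem.tag)            # keep the old name as the identifier
            elem.tag = match.group(1).lower()   # generic element name, e.g. "message"
    return ET.tostring(root, encoding="unicode")

source = "<messages><Message001>Cannot open the file.</Message001></messages>"
result = normalize(source)
print(result)
```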
http://esw.w3.org/topic/its0505ReqLocNotes
SR: this is like its:info
... you might want to say "who said that"?
YS: so that means specification, and it has to do with scoping
http://esw.w3.org/topic/its0505ReqWhiteSpaces
YS: explains the req
from the xml rec:
The value "default" signals that applications' default white-space processing modes are acceptable for this element; the value "preserve" indicates the intent that applications preserve all the white space.
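Editorial note: a short sketch of how a tool could detect the `xml:space` attribute quoted above, using Python's standard library. The sample document and the helper name `preserve_requested` are invented for illustration. The check only looks at the element itself and does not implement the inheritance of `xml:space` to descendant elements.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the predefined XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

DOC = """<doc>
  <p>normal text</p>
  <code xml:space="preserve">  keep   this  </code>
</doc>"""

def preserve_requested(elem):
    """True if this element itself carries xml:space="preserve"."""
    return elem.get(XML_SPACE) == "preserve"

root = ET.fromstring(DOC)
flags = {child.tag: preserve_requested(child) for child in root}
print(flags)
```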
YS: so this would be a guideline
http://esw.w3.org/topic/its0506ReqMultilingualDoc
YS: it is an issue for the localization process
... and a guideline
SR: It depends on how you manage the process
http://esw.w3.org/topic/its0506ReqRuby
YS: part of the specification
RI: Steve wants to have a different ruby spec
... which is not so presentation oriented
... I want to have a different level of conformance
<YvesS> .. three levels would be better
<YvesS> RI: wonder if we should separate attribute and element in scoping (even for translatability).
RI: we don't want to provide a tag set for bad practice
... but we can show them how to get out of trouble
YS: We don't have a solution for attributes, so we can only have the element content case in the spec
http://esw.w3.org/topic/its0506ReqDateTime
SR: what is the value of knowing it is a date?
... you can just use the data type "date"
... is it different from marking up technical terms as terms?
RI: it gives you the date itself
... i.e. a machine could transform it into a specific calendar etc.
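Editorial note: RI's point that a marked-up date "gives you the date itself" can be sketched as follows, assuming the markup carries the value in ISO 8601 form. The helper name `render` and the two output patterns are illustrative assumptions; converting between calendars would need more than the standard library shown here.

```python
from datetime import date

def render(iso_value, pattern):
    """Parse a machine-readable ISO 8601 date and render it with a strftime pattern."""
    return date.fromisoformat(iso_value).strftime(pattern)

# The same underlying date, rendered in two common field orders.
day_first = render("2005-09-21", "%d.%m.%Y")
month_first = render("2005-09-21", "%m/%d/%Y")
print(day_first, month_first)
```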
YS: I put that as a guideline, and we see what will happen
<scribe> ACTION: Sebastian to introduce to the wg the l10n / i18n aspects of the TEI [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action06]
http://esw.w3.org/topic/its0509ReqNestedElements
YS: goes to the guidelines
<scribe> ACTION: SR to put a comment on http://esw.w3.org/topic/its0509ReqNestedElements in the wiki [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action07]
http://www.w3.org/TR/2005/WD-ws-i18n-20050914/
<scribe> ACTION: Felix to make proposals by mail for a shortcut for the namespace of the ITS spec wd [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action08]
action items for monday: http://www.w3.org/2005/09/19-i18n-minutes.html#ActionSummary
<scribe> ACTION: to contact Deborah A. Lapeyre (DITA committee) about the relation between ITS / DITA [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action09]
action item for tuesday: http://www.w3.org/2005/09/20-i18nts-minutes.html#ActionSummary
<scribe> ACTION: RI to check for hosting the f2f near Oxford (December, 14-16 (noon)) [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action10]
YS: drop http://www.w3.org/2005/09/19-i18n-minutes.html#action04 and http://www.w3.org/2005/09/19-i18n-minutes.html#action03
... these are not necessary anymore
GO: a different topic: computational or "semantic" linguistic markup
YS: Thanks to everybody
GO: Thanks to you all
... I was happy to be able to come