W3C

– DRAFT –
Invisible XML

12 December 2023

Attendees

Present
Jim Saiya, John, Michael, Norm, Steven
Regrets
Bethan
Chair
Steven
Scribe
Norm

Meeting minutes

Review of agenda

The group welcomes Jim.

Accepted.

If there's time, or for next time, prolog and version changes

Previous Actions

2023-01-10-f: Norm and Michael to do a bit of revision to the XML vocabulary

Continued

2023-10-17-a: SP to document the modified IRI/URI grammar

Continued

2023-11-28-a: SP to make pull request adding his proposed addition to section 3.3 ("Names are case-sensitive.").

Continued

2023-11-28-b: SP as group chair to ask W3C to publish the ixml spec as report.

Steven reports that he generated the static version and requested a URL from W3C staff contact.

Continued.

2023-11-28-c: SP to review the EBNF to BNF document.

Continued

2023-11-28-d: Norm to tidy 'Extraneous material in section on Serialization'

Done.

2023-11-28-e: MSM to write a proposal regarding full stops in names.

MSM: I wrote a proposal, but several of us wrote proposals that defined the wrong language!

MSM: I'm putting together a test catalog based on the sample names; I'll add a few more. Started but unfinished.

Continued.

JL: This is a niggle that came up when you're writing a parser. A name that ends in a full stop causes real problems with lookahead.
… We decided that we'd forbid full stops at the end of names.

Status of implementations

Norm: Nothing exciting to report.

JL: I'm continuing to work on detecting regex and using them in the parser; managed to do it successfully for repeat1, but repeat0 is causing me trouble.

Steven: I have nothing to report.

MSM: No news here either.

Status of testing and test suites

MSM: There are pull requests related to the test suite that will come up shortly.

Publication plans

Already discussed.

Review and resolution of bug reports and technical issues

Pull request #224 Mitigating ambiguity in mod357

These are related to the performance tests.

ACTION: MSM to review the open pull requests from Gunther Rademacher and respond to them

MSM: This PR is about the mod357 grammar. Right now, it asks that the parser tell you which number the number is divisible by.
… So it's ambiguous on 15, for example, which is divisible by 3 and 5.
… I didn't attempt to filter out ambiguities. The way we've addressed that in the past is to add your results to the list of acceptable results.
… Gunther suggests instead, since the specifics don't matter, that we just make it return "yes" or "no" instead of reporting the reason.
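
To illustrate the ambiguity under discussion, here is a minimal sketch in Python. It is not the mod357 grammar itself; it works on integers rather than parsing digit strings, and the function names `reasons` and `accepts` are hypothetical. It only shows why reporting the divisor gives several acceptable answers for a number such as 15, while a yes/no answer is unique.

```python
def reasons(n):
    """List which of 3, 5 and 7 divide n (the 'why' that the current test reports)."""
    return [d for d in (3, 5, 7) if n % d == 0]


def accepts(n):
    """Yes/no variant: True if n is divisible by at least one of 3, 5 or 7."""
    return bool(reasons(n))


# 15 has two possible reasons, so a result that reports the reason is ambiguous.
print(reasons(15))
# The yes/no variant has exactly one answer for every input.
print(accepts(15), accepts(8))
```

With the yes/no form, a test harness would not need a list of alternative acceptable results for inputs like 15.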

JL: Is this an "am I passing the test or not" rather than "what is the parse tree"?

MSM: The purpose of all these performance tests is to provide a simple grammar that demonstrates performance issues.
… It's relatively simple for humans to understand but hard to optimize.

Steven: If the only thing is checking performance then it doesn't matter what it produces.

MSM: I will think about it, but I'm coming to believe that he's probably right.

Steven: Gunther is not a member of the group, should we not propose that he join and come to these calls?

MSM: I can certainly ask.

JL: I've been doing lots of stuff on performance; once you know that you're getting the right results, you don't care about the results anymore, you only care about performance.
… What I really want to know is just how long did it take. And, of course, what is the point at which your parser breaks.
… The performance tests are in a slightly different ballpark; we might want to think about describing them differently.

MSM: That's probably worth adding to the README file and the descriptions of the tests in the test catalog.
… It may be worthwhile to tinker with your test harness so that you're only timing the parsing rather than the parsing and the results.

JL: When you're doing a conformance test, you're compiling the grammar, parsing against the grammar, serializing, and then comparing that result. In most of my cases, the parsing is the one that takes the time.
… So you're focusing on just trying to get the parsing piece fast.

MSM: You might not even want to bother writing the output to a file.
… I compare the XDM instance and then serialize for the record; but I don't serialize and then do the comparison.

JL: I don't either, but I think of the XDM as a serialization of the parse tree.

MSM: Depending on what your parser produces, extracting a parse tree can be part of the performance measurement.
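
The idea of timing the phases separately could be sketched as follows. This is an assumption-laden illustration, not any implementation's test harness: `compile_grammar`, `parse`, and `serialize` are hypothetical callables supplied by the caller, standing in for whatever a given processor exposes.

```python
import time


def time_phases(compile_grammar, parse, serialize, grammar_text, input_text):
    """Run the three phases and return (output, timings in seconds per phase).

    All three callables are hypothetical stand-ins for a processor's own API.
    """
    t0 = time.perf_counter()
    grammar = compile_grammar(grammar_text)
    t1 = time.perf_counter()
    tree = parse(grammar, input_text)
    t2 = time.perf_counter()
    output = serialize(tree)
    t3 = time.perf_counter()
    timings = {
        "compile": t1 - t0,
        "parse": t2 - t1,
        "serialize": t3 - t2,
    }
    return output, timings
```

A performance run would look only at `timings["parse"]`, while a conformance run would compare `output` with the expected result. If extracting a parse tree is a separate step in a given parser, it could be timed as a fourth phase in the same way.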

PR #228 Correct several typos in the text of the EBNF/BNF working paper

Accepted.

Issue #198 Check test suite browser is updated on additions

Norm: I'd like the group to accept invisibleXML/ixml#229 to close the issue.

Accepted.

Issue #210 Extraneous material in section on Serialization

Norm: Fixed by invisibleXML/ixml#221

Accepted.

Problems with the prolog

https://lists.w3.org/Archives/Public/public-ixml/2023Oct/0004

Steven posts

Steven: I don't like the syntax; it upsets me but I don't have a solution for it.
… The second I think we should fix.

Scribe reports "the second" is: the semantics. The default is version 1.0, but if I say ixml version "I really don't care" you get errors.

JL: Is there a sense that the prolog should be a property table of some form?

Steven: I don't know how to answer that question.
… It's really the potential long lookahead before you know if you're looking at a rule or a version. The ixml grammar begins with the word "ixml".

JL: So you could have a case of "ixml" followed by an arbitrary amount of stuff.

Steven: A simple solution is just to say that "ixml version" has to be two words with a single space.

MSM: I confess that the rhetorical effect of the example is enhanced by the length of the comment, but technically speaking I have "ixml" that could be the beginning of a rule or the prolog.
… I might consume a lot of whitespace before the version. It doesn't trouble me that much. But it's also true that some sort of bracketing around the prolog wouldn't trouble me very much.
… I'm used to SGML and XML documents beginning optionally with metadata.

Norm: Why is it special here?

MSM: Because this is the one place where I've read something that could be a name and you don't know if it's a name or the prolog.
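
The lookahead concern can be made concrete with a small sketch. It assumes, as a simplification, that whitespace is whatever Python's `str.isspace` accepts and that comments are nested `{...}` spans; the function names are hypothetical and this is not a conforming ixml tokenizer. It counts how many characters must be scanned after an initial "ixml" before it is clear whether the input starts with a prolog or with a rule named "ixml".

```python
def skip_ws_and_comments(text, pos):
    """Advance past whitespace and nested {...} comments; return the new position."""
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
        elif text[pos] == "{":
            depth = 0
            while pos < len(text):
                if text[pos] == "{":
                    depth += 1
                elif text[pos] == "}":
                    depth -= 1
                pos += 1
                if depth == 0:
                    break
        else:
            break
    return pos


def classify_start(text):
    """Return (kind, lookahead): kind is 'prolog', 'rule' or 'unknown';
    lookahead is the number of characters skipped after the word 'ixml'."""
    if not text.startswith("ixml"):
        return ("unknown", 0)
    pos = skip_ws_and_comments(text, len("ixml"))
    lookahead = pos - len("ixml")
    if text.startswith("version", pos):
        return ("prolog", lookahead)
    if text[pos:pos + 1] in ("=", ":"):
        return ("rule", lookahead)
    return ("unknown", lookahead)


print(classify_start('ixml version "1.0".'))
print(classify_start('ixml {a long {nested} comment} = "a".'))
```

The second call has to skip the whole comment before it can decide, which is the unbounded lookahead being described. Requiring "ixml version" to be two words separated by a single space, or bracketing the prolog, would bound that scan.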

Steven: It could be just the quoted version number.

MSM: I want a label to go with the data.

Steven: We don't have to decide this now, but I'd like to let it sink in for a little while.

MSM: Before we go, how do other people feel about the possibility of some sort of bracketing?

Norm: I'm okay with that. I want the prolog to be flexible enough to hold other things but I'm okay with bracketing.

ACTION: Steven to propose alternative syntaxes for the prolog.

Version number naming

Steven reports https://lists.w3.org/Archives/Public/public-ixml/2023Oct/0005

Steven: If there's no version, it uses 1.0; if it can recognize a version, it uses that; if it doesn't recognize the version, then it'll do whatever it can.
… I think it should be: if it can recognize the number and process it, it should use that version, otherwise it should use whatever version is available.

JL: Don't we report a version mismatch in the state?

Steven: My proposal is that the processor can use any version it wants; if there's a version and it can use that version, it should use that one.

MSM: There is a keyword in the state attribute for version mismatches.

JL: So there's a mechanism to report that it worked but it may not be the version you expected.

MSM: I think I'm sympathetic to this. If you didn't label it with a specific version, let the parser use what it wants. There's an analogy with the XML version declaration, and my recollection is that at least some people say the reason that XML 1.1 was doomed to failure was that the spec required people to assume 1.0.
… A number of very smart people have told me that the reason that turned into a disaster is that we botched the handling of version numbers.
… It was botched from the start, but then it was made worse when "you may fail" was changed to "you must fail".
… The advice I disregarded at the time was that a processor be allowed to fail instead of requiring that it soldier on.
… I thought early failure was too important to give up, but that was wrong. I wish I knew exactly what lessons to draw from that for this case!
… If you don't recognize the version number, we should probably say that you have to try anyway.

MSM: The absence of a version number should not be assumed to be 1.0.

Steven: The absence of a version number should mean use the best version you've got.

JL: I agree. If you specify a version, then that's what you want.
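
The position emerging in this discussion could be sketched as below. This reflects the proposal as discussed, not text from the specification. The function name `choose_version`, the convention that `supported` is ordered from oldest to newest, and the version string "1.1" in the usage lines are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def choose_version(declared: Optional[str], supported: Sequence[str]) -> Tuple[str, bool]:
    """Return (version_to_use, mismatch_flag).

    `supported` is assumed to be ordered oldest to newest, so the last
    entry is treated as the processor's best available version.
    """
    if not supported:
        raise ValueError("processor must support at least one version")
    best = supported[-1]
    if declared is None:
        # No version declared: use the best version available, no mismatch.
        return best, False
    if declared in supported:
        # Declared and recognized: use exactly that version.
        return declared, False
    # Declared but not recognized: carry on with the best version, flag it.
    return best, True


print(choose_version(None, ["1.0", "1.1"]))
print(choose_version("1.0", ["1.0", "1.1"]))
print(choose_version("I really don't care", ["1.0", "1.1"]))
```

The boolean stands in for whatever a processor reports for a version mismatch, such as the keyword in the state attribute mentioned above.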

"Flag the version mismatch but soldier on."

Next meeting

No meeting on 26 December, 2023. Next meeting is 9 January.

MSM gives regrets for 23 January and 6 February.

Norm volunteers to handle agendas if MSM can't.


Summary of action items

  1. MSM to review the open pull requests from Gunther Rademacher and respond to them
  2. Steven to propose alternative syntaxes for the prolog.
Minutes manually created (not a transcript), formatted by scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).
