15:02:33 <RRSAgent> RRSAgent has joined #htmlt
15:02:33 <RRSAgent> logging to http://www.w3.org/2011/05/03-htmlt-irc
15:02:47 <krisk> RRSAgent, make logs public
15:03:08 <krisk> zakim, this is htmlt
15:03:08 <Zakim> sorry, krisk, I do not see a conference named 'htmlt' in progress or scheduled at this time
15:03:20 <plh> zakim, list conferences
15:03:20 <Zakim> I see SW_(SPARQL)10:00AM, XML_ET-TF()11:00AM, Team_(community)15:00Z, Team_(webtv)14:00Z, WAI_UAWG(CHAIRS)10:30AM active
15:03:24 <Zakim> also scheduled at this time are VB_VBWG()10:00AM, DIG_weekly()11:00AM, SW_RIF()11:00AM, WAI_PFWG(HTML_TF)11:00AM, Team_(Sandisk)11:00AM, SW_HCLS(COI)11:00AM, W3C_(COMM)9:30AM,
15:03:27 <krisk> Conf call has expired
15:03:28 <Zakim> ... RWC_WebEven()11:00AM, T&S_XMLSEC()10:00AM, I18N_ITS IG()11:00AM
15:03:35 <plh> oops
15:03:49 <plh> ACTION: plh to renew HTMLT Call
15:04:03 <MikeSmith> MikeSmith has joined #htmlt
15:04:06 <krisk> let's do this all on IRC
15:04:23 <krisk> Next time we can have a call setup
15:04:40 <plh> fine by me
15:05:08 <krisk> Agenda http://lists.w3.org/Archives/Public/public-html-testsuite/2011May/0000.html
15:05:14 <plh> by the way, I believe the html chairs asked for a report on the latest version on the test suites
15:05:27 <krisk> That is correct
15:05:27 <plh> how many tests are approved, submitted, etc.
15:05:50 <krisk> I told Paul Cotton and Sam Ruby that I would do this report...
15:06:20 <krisk> ...so that they can present this to the AC as part of  Last Call
15:06:43 <krisk> I believe the AC meets next week?
15:06:46 <krisk> is that correct?
15:06:54 <plh> nope, week after
15:07:02 <plh> the report will probably be on May 16
15:07:49 <krisk> Agenda Item#1 Bugs on approved tests
15:08:04 <krisk> No new bugs
15:08:21 <jgraham> Basically the report is that we are far from done. Hopefully it is clear to people that raw numbers don't mean much
15:09:13 <krisk> The dataset of intrest is for tests approved and submitted
15:09:25 <plh> well, raw numbers do help showing progress
15:09:28 <krisk> s/submitted/submitted tests/
15:09:49 <jgraham> I don't really understnad why that is interesting. Without context - like how much of the spec we are covering - it doesn't seem to mean much
15:09:53 <krisk> Recall that it has only been a little of a year since we approved the first test
15:10:08 <krisk> ...and now we have almost 1000 approved tests
15:10:12 <jgraham> Sure, I'm not saying that we should panic or anything
15:10:23 <krisk> ..and alot more participants submitting tests
15:10:52 <plh> how many submitted tests do we have in the pipeline?
15:11:25 <plh> it might also be good to start enumerating section where we don't have tests
15:11:26 <jgraham> Well the meta reflection tests are like 10,000 tests iirc
15:11:30 <krisk> This is good data to show that the TF has made alot of progress in the last ~17 months
15:11:46 <jgraham> Of course they are auto-generated
15:11:54 <plh> please, focus on the progress in the last 6 months
15:12:00 <jgraham> The html5 parser tests are ~8000
15:12:11 <krisk> so alot of tests...
15:12:30 <jgraham> But really these numbers don't mean much without knowing what they cover (and so what is not covered)
15:12:39 <plh> alright, then we need to talk about how we're approving those 18,000 tests then
15:13:04 <krisk> that is correct - numbers are good but not a realistic view of spec coverage
15:13:18 <plh> yes, having a view of spec coverage would be good indeed
15:13:30 <krisk> Sure it's good to have a large amount of attribute reflection tests
15:13:52 <jgraham> The meta reflection tests are the easy case (although no one has stepped up to do the review yet). The html parser tests are about 50% autogenerated and about 50% handwritten
15:14:16 <plh> btw, by test, you mean test assertions, right?
15:14:36 <krisk> Note that though the Audio/Video tests are much smaller in total numbers they have more value
15:15:21 <jgraham> plh: I'm not quite sure how you are defining the terms
15:15:42 <jgraham> I mean "result as recieved by the test harness"
15:15:52 <plh> ok
15:15:59 <jgraham> i.e. a test is something that gives a single pass/fail result
15:17:09 <krisk> Agenda item #2 Expectations moving forward for the testing task force as the spec moves to last call and beyond
15:19:15 <jgraham> Expectations?
15:19:39 <krisk> The HTML5 spec has specific progress dates
15:20:32 <plh> my own expectation would be that we target a first release of the html test suite next year, and work backward from there to setup deadlines for test submission, bug reports, etc.
15:20:46 <jgraham> I think that is highly unrealistic FWIW
15:20:51 <krisk> The currently we don't have an expectation on getting the test suite into a specific state that map to these dates
15:20:57 <jgraham> I know this won't make me popular :)
15:21:07 <plh> what is unrealistic?
15:21:20 <plh> you don't like delivery dates? :)
15:21:22 <jgraham> A first release next year
15:21:26 <plh> why not?
15:21:35 <jgraham> For a start I don't think the concept of release makes sense
15:21:41 <plh> I'm not saying the test suite will be complete
15:21:56 <plh> of course it does, otherwise people are too lazy
15:21:57 <krisk> We have two basic options...
15:22:02 <jgraham> And second I think that by next year we won't even nearly have full spec coverage
15:22:19 <plh> which is ok
15:22:35 <jgraham> So I don't understand what "release" would mean
15:22:52 <plh> well, we publish working drafts for html5
15:23:01 <plh> I think we should do the same for the test suite
15:23:14 <jgraham> With my Opera hat on, the rate at which we write tests is likely to be a function of the rate at which we implement features and very little else
15:23:14 <plh> call it a beta release or something
15:23:31 <krisk> I also think we should start to think about do the same type of release for the test suite
15:23:35 <plh> in that case, why don't we have more tests for html5 forms?
15:23:41 <plh> you guys did some implementation
15:23:57 <jgraham> Because when we did that implementation a few years ago we sucked more at releasing tests
15:24:07 <jgraham> We are getting better at that now :)
15:24:15 <plh> ah, that's good to hear
15:24:35 <plh> but having target dates will help everyone understand when to submit tests
15:24:56 <MikeSmith> MikeSmith has joined #htmlt
15:25:00 <jgraham> I really don't think that is true for the tests themselves
15:25:07 <plh> and we need a target for the test suite since we'll want to move out of CR
15:25:18 <jgraham> What *might* be helpful is a target date for the infrastructure
15:25:21 <plh> I don't believe that we'll get anymore at all without having target dates
15:25:44 <plh> ah, the infrastructure comes before that
15:26:05 <krisk> I expect that the co-chairs will ask for a few targeted dates for the test suite 
15:26:17 <plh> and my hope is that we can have a first version of the infrastructure sooner rather than later
15:26:18 <krisk> Even if it's not complete
15:26:35 <jgraham> So it would be nice to have a target date to, say, be able to work out what fraction of the spec has test coverage
15:26:42 <plh> I'll ask for target dates for sure. I'm not going to trust the html wg to get their act together without target dates
15:27:16 <jgraham> Until we have some clue about that it seems crazy to talk about target dates for finishing the tests themselves; we won't even know how far done we are!
15:27:21 <krisk> My experience is that if we don't pick some dates then we'll never complete the work so that it maps to the overall HTML5 schedule
15:28:18 <plh> the goal of the first release isn't to be complete. it's to tell people that they need to submit for it by a certain date. of course, it won't be complete
15:28:55 <krisk> I would think that even if we did something as simple as....
15:28:56 <plh> if people care about the results of that first release, they'll have a date to know when to submit tests
15:29:19 <plh> I'd expect the group will do 3 or 4 releases over 2 years
15:29:26 <jgraham> I don't think that's the right model for contribution
15:29:55 <krisk> Take test submissions for 3 months...
15:30:08 <plh> James, alright, then how do you propose that we get folks to send more tests?
15:30:22 <krisk> Then ask for a request for review for 1 month....
15:30:25 <jgraham> Most people who are contributing aren't doing it for the goodness of the testsuite itself. They are doing it because they are making the tests anyway and realise that they can have move value as a shared resource
15:30:50 <plh> actually, most people don't even know they can submit tests
15:31:10 <krisk> I don't think that is the case anymore...
15:31:15 <jgraham> plh: Make it clear exactly where we are missing tests. Make the process for writing and submitting tests clear
15:31:58 <jgraham> If people say they are implementing specific features ask if they intend to contribute their tests to the testsuite
15:32:00 <krisk> Looking at other WG HTML5 seems to have a very large number of tests and the spec is not even at last call
15:32:37 <krisk> I believe that if we propose a schedule and dates for approving tests it will only improve the situation
15:32:59 <plh> Kris, yes, you're right, but I'm still disappointed by the level of contributions. no offense to you guys, but I wish we would get a lot more tests from the community
15:33:35 <jgraham> Me too, but I don't know how to make people want to spend their free time writing tests
15:33:45 <jgraham> It's not sexy like writing performance tests
15:33:59 <jgraham> and it's not really easy either
15:34:22 <plh> giving them deadlines increases tests on their TODO list
15:34:52 <krisk> We don't have to decide today
15:34:59 <plh> without clear steps, they won't feel as motivated
15:35:06 <plh> Kris, correct
15:35:09 <jgraham> Are there even people with "write HTML5 onformance tests" on their TODO list who are not already contributing?
15:35:43 <jgraham> I imagine most people who could be doing it haven't even considered it. Or have good reasons not to.
15:35:52 <plh> they are plenty of people with "I wish HTML5 would work better" on their TODO list
15:36:24 <jgraham> Yeah. Evangelising those people is a noble goal.
15:37:01 <jgraham> I'm not sure I know how to do it though. I don't think setting deadlines for not-very-meaningful releases helps much though
15:37:26 <jgraham> First they have to see that writing tests is the number one way to help browsers suck less
15:37:31 <krisk> Well we can switch and see what happens....
15:38:09 <jgraham> Then they have to learn to read the spec closely. Then they have to learn the process. Then they have to find time to do it.
15:38:39 <krisk> I do understand it doesn take a good amount of time to participate and add value
15:39:15 <krisk> Though I believe that a set schedule could help get tests reviewed on a more consistent schedule
15:40:13 <krisk> Though I wanted to bring this up to understand each of your views
15:40:14 <jgraham> I'm not sure that lack of a deadline is the main reason that the meta-reflection tests haven't been reviewed yet, for example
15:40:57 <jgraham> I think in that case the main reason is that it's a lot of tedious work and no one gets paid to do it
15:41:34 <jgraham> When it's not a lot of work or is interesting, reviews often happen faster
15:42:12 <krisk> Then maybe it's best to not write a bunch of super complex tests that take a ton of time to review...
15:42:36 <plh> I'd rather approve by default all the meta-reflection tests after a specific date, rather than waiting indefinitively
15:42:37 <jgraham> Personally I would much rather have the tests
15:42:38 <krisk> Looking at some of the parser tests they look like they could be reviewed in smaller batches
15:42:54 <jgraham> Yeah, the parser tests batch up very easilly
15:43:33 <krisk> At TPAC Firefox and Apple seemed to favor having a schedule
15:43:51 <krisk> ...seems like this could work
15:44:01 <plh> after all, that's what css 2.1 did for their 9000 tests. those tests got auto-approved after a certain, the browser vendors started to pay attention because of the timeline.
15:44:15 <plh> s/certain/certain date/
15:44:28 <jgraham> With CSS2.1 what I actually saw was all the problems coming out at the end 
15:44:39 <krisk> Note that we are trying to not have all the tests show up and get approved at the very end
15:44:44 <jgraham> so there was huge churn in the TS just before they went to PR
15:44:59 <plh> james, yes, but they did come because they had a schedule.
15:45:12 <plh> without the schedule, we would still be wondering about those tests
15:45:26 <krisk> Note that the schedule came in the last year of so....
15:45:48 <jgraham> That happened because they decided they needed two completely interoperable implementaions to get to PR.
15:46:02 <plh> yep, that was part of the schedule
15:47:05 <krisk> Let's not make a decision right now in this meeting...
15:47:33 <jgraham> If you are suggesting that we make implemetation reports early then that isn't such a bad idea, although it has to be presented in a careful way to avoif giving a misleading picture
15:47:50 <jgraham> In particular it should be presented per-test rather than per-UA
15:48:06 <plh> I wasn't suggesting that, but I'm open to the idea.
15:48:46 <krisk> I would add that alot of customers (not specifically browser vendors) see alot of value on have a set of known good HTML5 tests
15:49:14 <jgraham> I think we all agree that having good tests is good for many people
15:49:21 <plh> James, did you see the results for css 2.1? How do they look for you?
15:49:34 <jgraham> I don't remember seeing the final results
15:49:34 <krisk> I don't like the CSS2.1 resutls
15:49:42 <plh> http://www.w3.org/Style/CSS/Test/CSS2.1/20110323/reports/results.html
15:49:53 <plh> kris, what's wrong with them?
15:50:33 <plh> I like the fact that the tests are presented by sections
15:50:49 <jgraham> Well that layout won't scale to the number of tests we expect anyway
15:51:17 <jgraham> I would present by number of failing implementations.
15:51:28 <plh> James, eh, you guys have to beef up your implementations to handle the number of tests :)
15:51:42 <jgraham> The tests that fail in the most implementations are most likely to need attention
15:51:47 <jgraham> So put them first
15:52:00 <krisk> It's odd that they results are for RC6?
15:52:02 <jgraham> The ones that everyone passes are boring so put them last
15:52:09 <krisk> Why not for the final test suite?
15:52:18 <jgraham> Don't display which implementations pass/fail by default
15:52:52 <plh> re having fail tests first, that sounds ok to me.
15:53:08 <plh> not what you mean by not display which implementations by default
15:53:22 <plh> Kris, dunno about this RC6. 
15:53:47 <plh> s/not/not sure/
15:53:50 <jgraham> At this stage presenting who fails what is not the most important thing. It's only interesting to implementors
15:53:58 <jgraham> So they can fix their bugs
15:54:12 <jgraham> Finding out which tests need work is more important
15:54:17 <krisk> Well it's also intresting to customers that want to choose a UA
15:54:29 <jgraham> Not really
15:54:41 <plh> we can certainly have different views in any case
15:55:54 <krisk> Anyhow I think we should still think about moving to a schedule and publishing the test suite on a regular basis
15:56:11 <MikeSmith> +1 to publishing on a regular schedule
15:56:29 <plh> agree, but I wouldn't necessarily push for it before later in this year
15:57:00 <krisk> Let's adjourn - it's almost 9am (pacific time)
15:57:03 <jgraham> Since I seem to be in the minority, I will check I understand what you all mean
15:57:13 <jgraham> What would publishing actually entail?
15:57:45 <krisk> I'll send it out to the list...
15:57:57 <jgraham> OK
15:58:02 <krisk> basically take test submissions for a set amount of time...
15:58:21 <krisk> Then stop taking submissions for a set amount of time...
15:58:27 <MikeSmith> oh
15:58:51 <MikeSmith> I am not sure about the value of the "stop taking submissions for a set amount of time" part
15:59:19 <jgraham> I am now more opposed than I was before. Why would we want to stop anything?
15:59:24 <MikeSmith> I guess I also need to understand what we mean by "publishing"
15:59:26 <krisk> the reason to stop taking submissions is to get consensus on the submitted tests
16:00:00 <jgraham> Even if we want to do that, stopping taking submissions is unnecessary
16:00:04 <krisk> if you submit tests during review period (when we stop taking tests) the would be part of the next release
16:00:10 <plh> ah, we don't stop taking submissions, we just set a timeline on a specific set of submitted tests to be reviewed
16:00:14 <jgraham> One would just specific a given rev. in hg
16:00:33 <krisk> or move to using a branch
16:00:42 <plh> like the html spec, we don't stop taking comments
16:00:55 <plh> we just set a timeline on when those comments will be addressed
16:01:12 <krisk> During the review period - people can raise objects and have tests removed/fixed due to feedback
16:03:03 <jgraham> This feels like it is going to end up in cat-herding territory
16:03:21 <jgraham> I don't see what the mechanism is for forcing people to do the reviews
16:04:27 <plh> we don't force them, we simply tell them that, if they don't review, we'll simply approve the tests
16:04:35 <plh> and move on
16:05:21 <jgraham> We already do that without set publication dates on a submission-by-submission basis
16:05:32 <jgraham> and if we find bugs later we have to unapprove
16:05:43 <jgraham> So I haven't understood what changes
16:06:10 <plh> not sure if we're going to reach a conclusion today
16:06:17 <plh> and I need to go
16:06:23 <krisk> IRC is not the best communication tool for process changes
16:06:29 <jgraham> Sorry for keeping you :)
16:06:35 <krisk> I'll send something out to the list
16:06:48 <krisk> let's adjourn
16:07:13 <krisk> rrsagent, generate minutes
16:07:13 <RRSAgent> I have made the request to generate http://www.w3.org/2011/05/03-htmlt-minutes.html krisk
16:19:57 <Ms2ger> Ms2ger has joined #HTMLT
17:28:40 <Zakim> Zakim has left #htmlt
17:45:35 <plh> plh has left #htmlt