Meeting minutes
w3c/aria-at - #945 - Rethinking the wording for assertion verdicts
jugglinmike: The working mode doesn't have the term verdict yet, but it's one we intend to add.
jugglinmike: The working mode refers to verdicts as supported, not supported, etc.
jugglinmike: Automation refers to the verdicts as good output, no output, incorrect output
jugglinmike: I have a proposal for a new set of terms: acceptable, omitted, contradictory
JS: I like the new terms you proposed. In terms of bubbling up the results, I wonder if no support, partial support, supported is clearer
MK: That's why I wanted to use numbers
MK: Partial support could mean anything from a little support to almost full support
JS: I agree but if something is 90% supported, the remaining 10% could still make it unusable
MK: I agree. Unless we have multiple layers of assertions, we don't need numbers. We also want to be diplomatic
MK: I think your solution is pretty solid
MK: We just need to decide if we extend the use of these terms, or bubble them up
jugglinmike: Yes, bubbling up is something we need to consider. In the case where a feature is all supported except for one assertion, it's not fully supported. For verdicts that can be in three states, understanding why something is partially supported is tough. I'm not sure bubbling up can work if we are looking for a percent score
MK: Yeah, supported needs to be binary
JS: I think we need all three states
MK: What do the responses tell us? Either there is some support there or there isn't. Then the reason is either that someone tried to support it, or they didn't
MK: If you're measuring something with a percentage, then each underlying value needs to be binary
JS: For the reports, are there three levels of support or two?
MK: Any level of support above the assertion level is a percentage.
MK: At the test level and at the AT level, everything will be a percentage
MK: So we would say, using Mike's terminology: at the assertion level, if the response is omitted or contradictory, that counts as a 0. If it's acceptable, it counts as a 1.
MK: We could run other reports that say what percent is contradictory and what percent is omitted
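The scoring MK describes can be sketched in a few lines. This is a hypothetical illustration only; the function names and verdict strings are assumptions, not the ARIA-AT app's actual code:

```javascript
// Map each assertion verdict to a binary score: "acceptable" counts as 1,
// while "omitted" and "contradictory" both count as 0.
// (Hypothetical sketch; the real app's data model may differ.)
const verdictScore = (verdict) => (verdict === "acceptable" ? 1 : 0);

// Roll assertion scores up into a percentage at any higher level
// (test level, AT level), as described in the discussion above.
const percentSupported = (verdicts) => {
  const passing = verdicts.map(verdictScore).reduce((sum, s) => sum + s, 0);
  return verdicts.length === 0 ? 0 : (100 * passing) / verdicts.length;
};

// Example: 2 acceptable out of 4 assertions -> 50
console.log(percentSupported(["acceptable", "omitted", "acceptable", "contradictory"]));
```

A separate report on what percent is contradictory versus omitted would just filter on those verdict strings instead of mapping them all to 0.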
MK: I don't know that we need to bubble up these terms in the reports we have now
MK: We don't need terms for the working mode, it's just level of support
jugglinmike: I do think the working mode uses the terms supported and not supported.
MK: I can get rid of that
MK: I have some other issues for the working mode, particularly #950. I think we need to work on another iteration of the working mode and share it with the community
MK: We could have a binary state for assertions, and get rid of contradictory
JS: I agree, but we should rewrite the terms
JS: Let's add this to the agenda for the CG meeting Thursday
jugglinmike: What I'm hearing is, we like the terms I proposed, but we may not need three terms
JS: It will make the testing easier if we just have two states/terms
MK: Okay, but if this task isn't on the critical path, I want to be conscious of that
JS: This could speed up the process
MK: But it's not a blocker; we can talk about enhancements in the near future
Michael Fairchild: Is there a third state where we publish a report with some of the data missing?
JS: Not really, but we need to consider this.
JS: If there is a situation where only 50% of tests have been completed, what does that look like for percent supported?
MK: We made a decision to change the working mode, and to get rid of the three output terms
MK: The question, before we change the UI, is do we go from three states (acceptable, omitted, contradictory) to two?
w3c/aria-at - #946 - Disambiguating 'Test Plan Run' in the Working Mode
MK: I'll comment on this issue and we can move it forward outside this meeting
Review of rationale for omitting explicit references to automation from the Working Mode - we touched on this during the 2023-05-22 meeting and agreed that Matt's perspective was critical
jugglinmike: This came up two weeks ago. We were talking about my task of describing how automation layers onto the working mode, but the working mode doesn't describe automation
jugglinmike: James was not convinced of the utility of organizing our work that way
jugglinmike: As this has been a theme of my work, I want to make sure we are aligned on our direction
JS: Yes I wasn't sure what we were trying to achieve.
JS: For our tests, it doesn't matter if the responses are entered by a human or a machine. But the results may need to be checked by a human. We are a long way away from automation checking responses, interpreting them, and providing its own verdicts
JS: Even if we get to that point, many years from now, I still think it's valuable to have a human check the responses.
JS: The automation may be able to say a response is unexpected, but it won't be able to categorize how it's unexpected
MK: I asked Boaz about abstracting the working mode. I want to make sure the working mode states how the business works. It's the process for generating the spec, but it's not an operations manual
MK: I think that there are some things about how the group currently uses the app that can be written into documentation.
MK: I think later on we can decide what a human does or a machine does, but that is outside the set of principles of the work
JS: That makes sense, but let's make that very clear. For someone new to the project, we want them to understand both angles, not just one dry article that describes who does what
MK: I still think we should get the roles out there
MK: The working mode does need to specify who does what, e.g. "Directors need to approve this." The scope of the working mode needs to include scope of authority
JS: Okay I agree. The work that happens day to day is more practical, how the app works, what it does well, etc. I do think there is a disconnect between the working mode and how we actually do things. This is partially what Mike is bringing up.
JS: We need a document that outlines governance, and another document that defines how we work
JS: The governance document is more abstract; you can't go directly from it to an implementation. There needs to be a step in between
MK: I agree, we are slowly building towards this. The wiki work I recently did describes how we write tests and how we onboard people. We don't have much in the way of app documentation
jugglinmike: There is one thing that comes to mind that is fundamental to the work I'm doing. When we talk about roles, who is responsible for initiating automation? I've been assuming that's a test admin's job; if that's the case, then we have to talk about what the test admin is doing. There's another framing, however, that changes what we build, which makes it the tester's responsibility: can Matt assign Louis some tests, and then Louis runs that automation?
MK: It's a feature design question; we can say it both ways.
MK: We could make a feature where a tester uses AT to generate a response, and then adjusts it to be correct, and submits it as part of a manual test
MK: So right now, I believe we said our MVP for automation is: somebody, we didn't say who, can collect responses for a test plan run
MK: The automation will know what AT to spin up, what the tests are, and run them
MK: We can add to that an MVP Prime: if there is a previous run of that same plan, flag any differences in the responses
MK: That's so we can identify regressions. If a new version of Chrome comes out, automation can recheck everything and say, yep, it's still supported
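The regression check MK sketches might look roughly like the following, assuming responses are keyed by command. The names and data shapes here are hypothetical illustrations, not the app's actual API:

```javascript
// Compare responses from a previous run of a test plan against a new
// automated run, and flag any commands whose AT response changed.
// (Hypothetical sketch; the real app's data model may differ.)
const flagRegressions = (previousRun, newRun) =>
  Object.keys(newRun).filter(
    (command) => previousRun[command] !== newRun[command]
  );

// Example: one command's response changed between runs.
const previous = { Tab: "button, Submit", DownArrow: "list, 3 items" };
const current = { Tab: "button, Submit", DownArrow: "listbox, 3 items" };
console.log(flagRegressions(previous, current)); // ["DownArrow"]
```

Flagged commands would then go to a human for review, consistent with JS's point above that automation can say a response changed but not categorize how it is unexpected.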
jugglinmike: So, for the short term: can I propose a change to the working mode that would capture a test admin collecting AT responses?
MK: I don't think we need that
jugglinmike: Right now the working mode just describes running the tests. We need to split up running tests and assigning verdicts, and we need to define the actors who will do these things
JS: The more we abstract these details, the more it becomes vague.
JS: If we're saying this level of detail needs to go into another document, that's something else
MK: So a test admin can run tests, but there's nothing in the working mode that says what a tester needs to do to run a test, even if the first thing they do is press a button and the AT runs the test.
MK: The working mode doesn't care what buttons to press or what the scope of the test is. Running a test can be: I ran a test and got the same results as the AT. We can write a manual to describe that process, which is what we do now.
jugglinmike: So what we are saying is there is no change to the working mode.
MK: Yes I don't see a need to change.
MK: The working mode says the goal of the work is to make judgments about a test: how are the screen readers behaving, acceptable or not? That is the role of the tester. The test admin's role is to make sure they agree with what the testers are doing, and to resolve conflicts when there are any. The working mode doesn't say what buttons to press or how many characters to enter
jugglinmike: So should we give running automation to just the test admin, or to everyone?
MK: Whatever you think is better and faster
JS: I don't think human testers need to be involved in that
JS: The pattern we follow now, granted testers assign themselves to tests, but for the most part, we gather info on who is willing to do what tests, then we assign the tests and work to resolve conflicts. The test admin is the gatekeeper to make sure everything stays on track
JS: We don't want people assigning themselves to things we're not ready to review
JS: I see automation in a similar light, once in place it may make this easier. The more we use the system the more it may know what we want, but there still is a manual element of having humans run tests and review conflicts.
MK: I'm good with a conservative approach; we should roll out the smallest, simplest, least risky/most useful approach. Let's not give too much power to everyone on day 1
jugglinmike: I'm envisioning that the test admin can see who has been assigned to a test plan, but now they have a new ability to say, collect new responses for this tester.
jugglinmike: As responses came in they would be entered in the correct places.
jugglinmike: If we make space for the system to have errors, we can retry certain commands
jugglinmike: In the case where there is an issue, we can have another tester run a particular assertion and compare results
MK: I think so. We may want to do it like this: the test plan is in the queue, and instead of assigning the test, there's just a "run" button that creates an unassigned data set; when it's done, we can assign someone to it who will complete and validate the report.
MK: Please put together a design proposal and let's go through it. I think you are on the right track