W3C

– DRAFT –
ARIA-AT Automation and Community Group Meetings

26 September 2022

Attendees

Present
Glen Gordon, James Scholes, jcraig, Matt_King, michael_fairchild, Sam Shaw, zcorpan_
Regrets
-
Chair
Matt_King, michael_fairchild
Scribe
s3ththompson

Meeting minutes

jcraig: at this time, the AppleScript interface is the best-supported mechanism for automating... it already has a hardened security profile

James Scholes: just to clarify, it wouldn't be a webpage... it would be something like webdriver

jcraig: right, that's still not possible at this time with our security model

s3ththompson: when we discussed last week, we discussed the difference between sending raw keypress input vs. higher-level semantic commands... does this escape the concern about keypresses and OS security layer?

jcraig: right, there is definitely functionality in AppleScript today that could be used to implement some sort of normalized commands across multiple screenreaders

s3ththompson: we had previously considered having 2 phases in the spec draft: raw keypresses (easier to spec) and then higher-level commands (harder to spec), but this conversation suggests maybe we should reprioritize those sections to be able to accommodate running a prototype on Apple as you mentioned

zcorpan_: right, we talked about prioritizing high-level commands in the Automation API... the question now is whether we should work towards agreeing on a set of normalized commands, vs. just implementing vendor specific commands

Matt_King: i'm slightly hesitant about the normalized commands... we want to make sure the tests can correlate 1:1 with how the user would experience it. it's one thing to say: the user would press these keys. it's another to say: we have to figure out what those commands are that are triggered by those keys

James Scholes: yeah, i'm wondering about how we can make sure we're exercising the full suite of commands that users might use to trigger an action... i would worry that on VoiceOver we might only be using one AppleScript method, where on Windows, we're using multiple different keypresses
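
Illustration (not discussed in the meeting): the distinction between raw keypresses and normalized commands, and the concern that one normalized command may hide several user-facing key chords, could be modeled roughly as below. This is only a sketch. The reader name, command name, and key chords are hypothetical placeholders and are not real bindings of any screen reader.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyPress:
    """A raw key chord as the user would press it (placeholder key names)."""
    keys: tuple


@dataclass(frozen=True)
class SemanticCommand:
    """A vendor-neutral command name (hypothetical)."""
    name: str


# Hypothetical binding table: one normalized command may be reachable
# through several different key chords on a given screen reader.
BINDINGS = {
    ("ExampleReader", SemanticCommand("announce_cursor")): [
        KeyPress(("Mod", "F3")),
        KeyPress(("Mod", "Shift", "A")),
    ],
}


def chords_for(reader, command):
    """Return every known chord that triggers `command` on `reader`."""
    return BINDINGS.get((reader, command), [])


def untested_chords(reader, command, exercised):
    """Return the chords for `command` that no test has pressed yet."""
    return [c for c in chords_for(reader, command) if c not in exercised]
```

Calling `untested_chords` with the set of chords a test suite has already pressed lists the user-facing paths that a command-only test would leave unexercised.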

jcraig: i'm definitely not trying to convince you the applescript way is a preferred way to do it... just that it's available today. i can't promise a timeline to implement anything different there.

Matt_King: so if we needed extra commands that aren't in the AppleScript library today, is that something we could potentially add?

jcraig: possibly. there is an escape hatch right now by navigating through the web menu

Matt_King: are there options for Control-Option-F3, Control-Option-F4 (announce current VoiceOver cursor, announce current keyboard cursor)?

jcraig: there's `keyboard cursor object` with `bounds` and `text under cursor`

Matt_King: those provide the text programmatically, but can applescript actually trigger an utterance?

jcraig: when VO is enabled, if you're driving a VO cursor... if you're using the query interfaces through AppleScript, it will return the string

Matt_King: i think a next step here on our side would be to do an analysis to determine what we can map from AppleScript
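
Illustration (not discussed in the meeting): such an analysis could be as simple as comparing the commands the tests need against the commands the scripting interface exposes. The sketch below assumes both lists have already been collected by hand. Every command name in it is a made-up placeholder, not an entry from the actual AppleScript dictionary.

```python
def coverage(required, available):
    """Split `required` commands into covered and missing, with a ratio."""
    required = set(required)
    available = set(available)
    covered = sorted(required & available)
    missing = sorted(required - available)
    # With nothing required, treat coverage as complete.
    ratio = len(covered) / len(required) if required else 1.0
    return covered, missing, ratio


# Placeholder inputs for illustration only.
required = ["announce_vo_cursor", "announce_keyboard_cursor", "next_item"]
available = ["announce_vo_cursor", "next_item", "previous_item"]

covered, missing, ratio = coverage(required, available)
```

The `missing` list is the part that would need either new scripting commands or a different automation route.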

Glen Gordon: so we're reworking the behavior in JAWS right now... in the near future, we will be able to interpret user-injected keystrokes as if they were real keystrokes... this will open things up to programmatic commands

Matt_King: does anyone know Windows' approach to the security issues at hand here?

Glen Gordon: definitely something i'm thinking about... right now, there are definitely limitations, but ultimately when you install a low-level keyboard hook, things are bad

Glen Gordon: but i think we're confusing issues, the issue is not: can an app intercept keys... it's can it inject keys?

jcraig: the issue is if VO could be made to perform arbitrary system actions (e.g. creating new users, etc.) by injected keyboard commands... right now we know the difference between HID-level keypresses and injected HID-level keypresses and we ignore the latter
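
Illustration (not discussed in the meeting): the policy described here, acting on hardware keypresses and ignoring injected ones, can be sketched as a filter on the event source. This is a toy model and does not reflect how VoiceOver or any operating system is implemented. The `Source` enum, the `KeyEvent` type, and the `allow_injected` flag are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    HARDWARE = "hardware"  # came from a physical input device
    INJECTED = "injected"  # synthesized by another process


@dataclass(frozen=True)
class KeyEvent:
    key: str
    source: Source


def accept(event, allow_injected=False):
    """Return True if the screen reader should act on this event."""
    return event.source is Source.HARDWARE or allow_injected


events = [KeyEvent("F3", Source.HARDWARE), KeyEvent("F3", Source.INJECTED)]

# With the default policy only the hardware event is handled.
handled = [e for e in events if accept(e)]
```

An opt-in flag such as `allow_injected` is one way an automation mode could be exposed without changing the default behavior.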

Glen Gordon: the difference is that it's harder to get UI Access... if screenreaders help an attacker reach that, it's a problem

Glen Gordon: you have to be code-signed and you have to be in a secure file location like Program Files

James Scholes: that makes sense and that's why the limitations of the portable version of NVDA are what they are

https://www.irccloud.com/pastebin/hDCF2osD/zoom_log

Matt_King: so it sounds like on NVDA there are no keypress issues, on JAWS it sounds like we should proceed with the understanding that we will be able to trigger keypresses in a forthcoming version of JAWS

Glen Gordon: correct, with the caveat that it might be behind a setting depending on how we reflect on this convo and make a decision

Matt_King: that's great, that's enough for our use cases

s3ththompson: and James Craig, it sounds like your recommendation at least in the short term is to consider AppleScript to be the primary bridge with which we'll be able to automate VoiceOver?

jcraig: that's correct, yes

Matt_King: okay, so the next step there will be to do a coverage analysis and come back with our findings

James Scholes: we've been developing some AppleScript shortcuts for human users so we can help push that forward too

jcraig: by the way, James Scholes thanks for your comments on toggle https://github.com/w3c/aria-at/issues/784

Feedback about API section for keypresses

Minutes manually created (not a transcript), formatted by scribe.perl version 192 (Tue Jun 28 16:55:30 2022 UTC).
