WebDriver F2F TPAC 20 September 2016 -- 20 Sep 2016

<scribe> Scribe: ato

<scribe> Scribe: Andreas Tolfsen

<scribe> ScribeNick: ato

W3C process and moving towards Rec

plh: There have been changes to the process.
... There is a lot of leeway.
... Please don’t ask me what the process is, but ask me what you want to do.
... Tell me what you want to do, and I will try to make the process of getting there as painless as possible.

AutomatedTester: We’re quite confident that the spec will be done by March.
... We’re more unsure where implementations will be.
... As we implement things, we will find bugs. There will be bugs everywhere.
... What would happen in that situation?
... Obviously we can apply for an extension, which hopefully wouldn’t be that much of an issue.

plh: Depends how many times you’ve applied for an extension.

simonstewart: Many.

plh: That may be a problem.

AutomatedTester: The specification seems to go into an immutable state.
... What is the process for errata?

plh: Don’t move to Rec before you have a good answer to how you are going to maintain it.
... Figure out how you will do that first, please.
... You should already publish your draft for your next version before you move to Rec.
... We want to avoid that specs are unmaintained.
... We want you to have a draft for the next version already, before you move to Rec.
... Version links, completely broken.

AutomatedTester: It’s easy iterating over what we have in GitHub.

plh: That’s what I expect. For you to continue iterating.
... Should you continue as an incubator? [???]
... Please continue maintaining your GitHub repo.
... Depending on whether your changes are substantive or not, you can push to rec every month.
... People will think you are crazy, but you can.
... What kind of guarantees do you want to provide the world about your API? Also for implementors.

ato: I don’t think anyone is suggesting we will push API changes to a new version of the Rec.

plh: No, but if you find a security issue in your spec, we will need to update the spec.

AutomatedTester: Say we got to the point where we feel our specification is ready, but we are waiting for implementors to implement it.
... And our charter expires.
... What happens?

plh: Implementors want guarantees about stability of API.
... The director doesn’t expect things to be perfect.
... He expects it to be good enough.
... Most WGs develop test suites.
... To give reassurance.
... How many tests to write? I don’t know.

AutomatedTester: Not all vendors are running WPT.
... Even within Mozilla we don’t run the WebDriver spec tests.

plh: I thought the goal of WebDriver was to get past that.

AutomatedTester: We have a test suite that is divergent from the spec.

ato: That is not the case…

AutomatedTester: Potentially we get to a point where our test suite is ready, we have this divergent test suite, and we’re fixing a more spec-based test suite.

plh: Meaning that you’re confident you have two conforming implementations?

AutomatedTester: Well that’s what I mean. How do we know?
... We can use the Selenium project’s test suite.

jgraham: I think from this point of view the Selenium test suite is irrelevant.
... I understand that there are issues running the tests in CI
... But if you think pushing the specification to Rec is important, you need to push internally in your company to run those tests.

plh: You don’t need to run it in CI.

AutomatedTester: No but if you want to prove it…

plh: Do you know what you’re missing?

jgraham: We’re missing _the_ test suite.

plh: I thought you had some tests.

jgraham: We did, but they were imported from Selenium and were incorrect and have been removed.
... There’s now no infrastructure blocker for someone going in and writing tests for this.
... The question is just people doing it.
... There are people in this room that want to keep the March deadline, and I hope they are stepping up to the task of writing tests before that time.

JohnJansen: Yes, I like having a deadline. That forces us to be active.

AutomatedTester: I don’t think we need any W3C resources.

plh_: W3C wants to help. WebDriver is important.

jgraham: For Mozilla, there are a series of technical steps we need to take to make it possible to run these tests in our CI.

jgraham: Then, as we do implementation, we will fill out the test suite.

plh_: But you can’t expect this to happen organically.

AutomatedTester: I can start splitting out the specification in parts that need support.

plh_: You need to find out where the gaps are in your specification.

<scribe> ACTION: AutomatedTester to create a project plan for tests [recorded in http://www.w3.org/2016/09/20-webdriver-minutes.html#action01]

plh_: We can provide you with a generic testing plan.
... It’s not written anywhere either.

AutomatedTester: I don’t think we need that.
... We need people not to step on each other’s toes.
... It’s always possible to hire contractors.

JohnJansen: You want to make sure that everyone is looking at the proper version.

plh_: Yes, and the “proper” version is different to different people.
... There is no single answer.
... Are you in CR?

JohnJansen: No.

AutomatedTester: We have two major things that are being landed at the moment.
... Actions and changes to New Session, and there might be one or two things that have come out of the last few days.

plh_: Give me a time estimate.

jgraham: MikeSmith said you try to stay in CR for as little time as possible.

plh_: We need a _reasonable_ amount of time to review the document.

AutomatedTester: I’ll pull a date out of my head. January.

plh_: Who is it relevant to? Who do we ask?
... Who needs to read this in CR?

AutomatedTester: I don’t know.
... We’re not changing JS APIs.

plh_: We want the review to discover serious issues with your spec. Like if certain groups or people can’t use it.

AutomatedTester: In the charter it says [reads from charter].

plh_: I think probably security review is going to be one of the most important.

AutomatedTester: Vendor security teams have reviewed things underway.
... Does that make any difference?

plh_: We have missed problems in the past.
... It’s not going to be enough.

AutomatedTester: More questions?

plh_: I have one for you.
... When MikeSmith was saying keep the CR as late as possible…
... Don’t go into CR until you know how to get out of it.
... There is a price to be paid to be in CR.
... It is more painful to change the specification once it’s in CR.
... Because you have patent commitment &c.

AutomatedTester: Thank you.

plh_: Thank you.

AutomatedTester: Break now

<juangj_> Scribe: juangj_

Handshake strictness and Postel’s Law

simonstewart: reminder that as a general principle, you should ignore requests that contain extra stuff

jgraham: yes that’s how the spec has always been written

simonstewart: don’t expect this is contentious, just mentioned it because there have been bugs
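
[Illustration, not part of the discussion: a minimal sketch of the "ignore extra stuff" principle, assuming a remote end that picks out only the parameters it knows about. The helper name and the field names are hypothetical.]

```python
# Hypothetical helper: the name and the example parameters are made up.
def extract_known(body, known):
    """Return only the known parameters from a request body, ignoring extras."""
    return {key: body[key] for key in known if key in body}


# A request that carries one expected field and one unexpected field.
request_body = {"url": "https://example.org/", "unexpectedExtra": 42}
params = extract_known(request_body, ["url"])
print(params)
```

[The unknown key is dropped rather than rejected, so a local end that sends extra data does not cause an error.]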

/status endpoint

simonstewart: particularly for large server processes, you start the server and it goes and does a bunch of stuff while you have some coffee
... it’s possible that the httpd is accepting connections while this stuff is happening, even though it’s not really ready to serve
... nice to have an endpoint that says ‘200 ok’ when ready

ato: any reason not to just not respond to the request and block until ready?

<juangj> Scribe: juangj

simonstewart: hard in practice
... other use cases are for grid, for nodes to come up but indicate whether they’re really healthy

juangj: argument against /status in sapporo was that it’s an implementation detail of server management
... however, intermediary nodes are arguably in the business of server management, so it would be convenient

ato: still don’t quite get the use case here

simonstewart: useful for things like grid to detect end node health; generally not useful for endpoint nodes themselves

ato: is there some more rest-ful way we can do this, with a HEAD request perhaps

jgraham: if we did this, there should be agreement on what the response should look like and how it behaves in different situations

ato: not sure what an endpoint gives us that a HEAD request doesn't

simonstewart: HEAD request to where

ato: HEAD to /session? or to /

sam_u: seems like we’re just discussing the name of the endpoint now, not whether it’s valuable

ato: i see the use case for distributed systems, but…
... much of this depends on what it means to be “OK”
... not convinced an http status can convey the several states

jgraham: in principle this is useful. there should be an applicable http status somewhere, there are a lot of them
... if there was an http status that was unambiguously what we want for alive-but-not-healthy cases, that would be more theoretically pure

simonstewart: in cases where you’re ok and ready to start new sessions, 200 is fine, and in all other cases 503 is probably fine
... states are, e.g., “able to create new sessions”, “running, but unable to create sessions”, perhaps “shutting down” (e.g., draining a cluster of nodes) — might want to distinguish between reasons for being unable to create

jgraham: hm, actually, http status really applies to the current request. returning 503 is strange when we were clearly able to handle the /status request, even though it theoretically would not be able to handle a request to a different endpoint

sam_u: no reason you couldn’t say whatever you want to say in the body of the 200 response

ato: if there are going to be 4-ish states, then yes we should just put info in strings in the body

sam_u: do we still want, e.g., architecture, os name, in the /status response? (it’s in there in the json wire protocol)

ato: doesn’t really make sense for an intermediary node, which has multiple sessions

simonstewart: what’s actually important is just to know whether the node can create new sessions so local ends can poll for it

ato: [looking for relevant status codes]

jgraham: i thought we agreed to just use 200, with a json blob and status field
... could we just have a json blob, either {“ready”: true} or {“ready”: false}? and return ‘ready’ if a request to ‘new session’ would currently succeed

ato: wonder if we could supplement with additional, implementation-defined message
... of course we can’t stop intermediaries from adding more fields, but “ready” and “message” at least seem sufficient

sam_u: for the “ready” boolean, how sure do we need to be that we’re actually ready?
... e.g., chrome doesn’t start up well when windows isn’t logged in

ato: thought we decided this was only useful for intermediary nodes?

jgraham: don’t think we decided that

ato: not sure there’s a use case for this in an end node then. you could just attempt to create a session and get a ‘not created’ error

sam_u: don’t want to actually have to create a new session just to find out if you *could* create it

jgraham: don’t think we should define endpoints that only apply to intermediaries. intermediary and endpoint nodes should serve the same endpoints

simonstewart: simple use case for endpoint node would be to return ready=false when you’ve reached the maximum session count

sam_u: perhaps ready=false says it will definitely fail, ready=true means give it a shot, but doesn’t promise it will definitely succeed

jgraham: necessarily must be purely informational; there is always a race here

ato: 2 conditions: would “POST /session” succeed, and has max session count been reached

juangj: is it required to fail when you’re at the limit, or are you permitted to kill an old session and make a new one?

jgraham: it should be a fail

proposed resolution: add /status endpoint. response is {“ready”: boolean, “message”: implementation-defined explanatory string}

juangj: is “message” actually optional or is it required, but possibly empty?

ato: i guess we don’t have any optional keys

RESOLUTION: add /status endpoint. response has two keys: {“ready”: boolean, “message”: implementation-defined explanatory string}. ready = true if the node believes that a ‘POST /session’ will succeed in creating a new session
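
[Illustration, not part of the discussion: a minimal sketch of a response body with the two resolved keys. How a node decides readiness is left to the implementation; the inputs `active_sessions`, `max_sessions` and `starting_up`, and the message strings, are assumptions made for this example.]

```python
import json


def status_body(active_sessions, max_sessions, starting_up=False):
    """Build a /status response body with the keys "ready" and "message".

    All three parameters are hypothetical inputs; a real node would decide
    readiness however it sees fit. "message" is always present, possibly
    empty, since the group noted there are no optional keys.
    """
    if starting_up:
        return {"ready": False, "message": "still starting up"}
    if active_sessions >= max_sessions:
        return {"ready": False, "message": "maximum session count reached"}
    return {"ready": True, "message": ""}


print(json.dumps(status_body(0, 1)))
print(json.dumps(status_body(1, 1)))
```

[As noted in the discussion, the value is purely informational: "ready": true does not guarantee that a later ‘POST /session’ succeeds, because the state can change between the two requests.]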

<ato> juangj: r+

ato: do we want to do something similar with ‘GET /sessions’?
... it’s convenient but there is no clear use case right now

jgraham: in the absence of clear use case, i’m opposed from a security standpoint, because it could make hijacking sessions easier

simonstewart: we do do this in Selenium server already, at /wd/hub; shows a list of active sessions so you can quit them more easily

AutomatedTester: definitely useful for intermediary nodes so you can clean up sessions that you thought should be dead. not clear this is useful for endpoint nodes

ato: you could specify this, and then perhaps in endpoint nodes you could return 403 Forbidden

AutomatedTester: concern that if endpoints return garbage here, then it breaks transparency
... there’s also no reason to put something in the spec that people aren’t willing to implement

simonstewart: people could talk to their infosec teams

jgraham: wary of adding new features and more work at this point

brrian: i did used to have this endpoint and apple infosec did not like it

jgraham: oh well then that settles that, doesn’t it?

RESOLUTION: we will not add a /sessions endpoint for a list of active sessions, because of security concerns about divulging active sessions

sam_u: are session IDs supposed to be secret in general?

brrian: can’t keep them perfectly secret because people can sniff local traffic, of course. but having a /sessions endpoint means a webpage could grab it more easily

<scribe> ACTION: simonstewart to mention sensible use of TLS in the Security appendix [recorded in http://www.w3.org/2016/09/20-webdriver-minutes.html#action02]

Plans for a test suite

AutomatedTester: we need to work out what we currently have in the test suite, which i appreciate is not much
... should create something to track who is working on what, and share what still needs to be done
... figure out who has what tests, where, and how to migrate them to a common place
... assuming for chrome’s tests, like mozilla’s, it’s non-trivial to migrate them?

sam_u: [agrees]

ato: skeptical of just mass-migrating tests from the selenium test suite
... should certainly run the selenium tests as a reference, but really important is that we ensure normative language is covered appropriately

simonstewart: vast majority of our users are selenium users, so selenium tests are an indication we can migrate people without blowing up every test in the world

ato: but also important to have tests to ensure conformance of implementations

simonstewart: both things important

JohnJansen: in your experience, is there value in starting with the old tests?

ato: no, don’t think so. better to examine spec text/structure, consider each step of each algorithm and consider ways to break them

AutomatedTester: example: marionette was returning floats for element coordinates, selenium was expecting ints
... main takeaway is that we need to have a project plan so that we don’t step on each other’s toes. obviously won’t work to just say “work on whatever you want to work on”

ato: want to avoid situation where we import a mass number of tests into WPT that haven’t been reviewed carefully, and where we don’t know which parts of the spec each test is intended to cover

jgraham: less concerned about that. for our particular case, much more so than with the general web platform, we could get pretty far just with code coverage

JohnJansen: how would we implement this plan? just like a wiki table of spec sections?

[general discussion of the plan for the plan]

ato: the old tests are still in the WPT repo. you can still go look at them for reference if you want. we can delete once we have spec coverage of those areas

brrian: can you talk about the organization of the tests? right now there are 2 big files — are you envisioning more, smaller files?

ato: was thinking one file per chapter

simonstewart: how about directory per chapter? some complex things like Actions will be gargantuan

ato: let’s not worry about that at the moment. we can modify that when it becomes a problem

brrian: doesn’t seem like it would scale. one file = one person working at a time?

jgraham: not really worried about the merge for people just contributing new tests

[questions about actually checking out the repository]

[disbelief about some ridiculous git nonsense]

[debate about merits of embedding data: URIs in tests vs. putting HTML in separate files]

Promises in executeScript

simonstewart: i’m not sure how i’d do this in a local end that isn’t JS

some confusion about what the proposal actually is

AutomatedTester: (reading the bug out loud)

<JohnJansen> https://www.w3.org/Bugs/Public/show_bug.cgi?id=28060

jgraham: we clearly are not going to remove executeAsyncScript
... if we want to finish anything, this is not the time to be writing a new algorithm to deal with this

ato: FTR i do think it’s an interesting idea

jgraham: this seems nice but should be triaged to level 2
... to clarify, proposal is that the promises should be settled entirely on the remote end

<gitbot> [webdriver] jgraham opened pull request #332: Implement a GET endpoint for the timeouts (master...get_timeouts) https://github.com/w3c/webdriver/pull/332

<gitbot> [webdriver] jgraham opened pull request #333: Add a status endpoint. (master...status) https://github.com/w3c/webdriver/pull/333

juangj: currently possible to deal with promises using executeAsync; just .then(callback)

but not necessary to add special functionality for it now

RESOLUTION: defer bug 28060 to level 2
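
[Illustration, not part of the discussion: a minimal sketch of the ".then(callback)" approach mentioned above, written as a Python helper that builds the script text a local end could pass to executeAsyncScript. It assumes the usual convention that the completion callback is the last entry in `arguments`; the helper name is hypothetical.]

```python
def promise_to_async_script(promise_expression):
    """Wrap a JS promise expression so it can be run via executeAsyncScript.

    Assumes the async-script callback is the last entry in `arguments`.
    Rejections are not handled in this sketch, so for a rejected promise
    the callback is never invoked.
    """
    return (
        "var done = arguments[arguments.length - 1];\n"
        "(" + promise_expression + ").then(done);"
    )


script = promise_to_async_script("Promise.resolve(42)")
print(script)
```

[The promise is settled entirely in the page; only the resolved value travels back through the callback.]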

<JohnJansen> scribe: JohnJansen

GetText()

current situation is that there has not been any review of the HTML spec

rniwa_: it is currently not interoperable

jgraham: it has landed

juangj: there are still interop concerns
... white space
... probably solvable when comparing text to known constant
... work we'd like to not have to ask
... ran some tests on google codebase that will fail

AutomatedTester: multiple browsers?

juangj: firefox and chrome
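
[Illustration, not part of the discussion: one possible reading of the white space point above is that a test can normalize white space before comparing element text to a known constant. This is a sketch under that assumption; the two sample strings are hypothetical, not real browser output.]

```python
def normalize_whitespace(text):
    """Collapse runs of whitespace to single spaces and trim both ends.

    str.split() with no arguments splits on any run of whitespace,
    including newlines and tabs.
    """
    return " ".join(text.split())


# Hypothetical text returned by two browsers for the same element.
text_a = "Hello\n  world "
text_b = "Hello world"
print(normalize_whitespace(text_a) == normalize_whitespace(text_b))
```

[This only hides differences in white space; it does not address differences in which text is considered visible.]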

ato: concern is also innerText
... even though it might not be interop amongst browsers
... don't want to spec the javascript result for visible text
... it is an approximation

AutomatedTester: as CSS progresses hopefully a platform API covers those
... difficult for us to play "catch up"
... groups driving things forward
... what we consider visible now, might change
... so we should normatively reference those specs

juangj: do we expect browser to converge?

AutomatedTester: guessing so, that's why it was added to HTML
... but no idea right now

rniwa_: vendors are likely working towards interop

ato: if we spec something ourselves, it might not be the right thing for the future
... remove it entirely or norm ref the html spec

juangj: as Jim says, there is concern with futureproofing

AutomatedTester: with getText() we have a path for interop, with isDisplayed, we don't

simonstewart: this is not a trivial thing to fix
... real world backcompat concerns that will hinder adoption

wilhelm_: this is a case where quirks mode approach might help

simonstewart: the javascript representation works right now, and would allow vendors to move slowly toward interop

ato: we would be using something not standards compliant

jgraham: there is nothing saying you have to use the C++ algorithm of getText
... you could use script injection that is equivalent
... in the end node you can do whatever you like
... that is indistinguishable from the algorithm

simonstewart: output of innertext is not the same as the algorithm
... are there other specs that depend on this?

jgraham: no, but if it breaks website, they get priority

rniwa_: innertext is very important to us
... we could not change it; it would break user expectations

ato: better then to have a migration path

rniwa_: likely we would argue very hard not to change our implementation
... lot of uses on the web [lists...]

AutomatedTester: selenium's version does not consider a lot of specs currently

simonstewart: that's why we are where we are

AutomatedTester: lots of cases where you "copy" and selenium gets it wrong

simonstewart: no, it's correct for Selenium
... lots of test examples with single word, new line, new lines
... built up a lot of code for this
... but we have not defined behavior for some stuff (css text)

jgraham: is this the argument:
... this is not implementable
... we will not implement because it will break too much stuff, then we cannot have this in the spec
... if you want to know what the text is actually like, then use script

simonstewart: it's not either/or
... we can have a solution
... we need to specify this with consideration for the current users who have written tests for 10 years
... it would be nice to have something interoperable
... but at no point can we throw existing users under the bus
... therefore 'quirks mode' seems good

jgraham: that's two endpoints, then
... essentially we are being told we have to have the selenium text

ato: maybe use capabilities?
... if you are conformant with innertext, then use innertext
... at some point in the future Selenium 4

simonstewart: we've made an effort to keep our changes from breaking users

ato: we need to have some migration path

simonstewart: agree

ato: at some point we need to move to the standard

simonstewart: I think we can define it simply by including the javascript as an appendix
... we can recommend to users

jgraham: dislike "capability" and sort of dislike "javascript" in the appendix
... we should be able to find evidence of people using innerText

ato: people won't be using it because it's not conformant

jgraham: seems like we can only ship the selenium solution because people don't want the un-interopable solution

JohnJansen: people want innertext to be interoperable, they do want to use this, but cannot

simonstewart: we need the old as well as a way to move forward

wilhelm_: if we don't have it, we will patch it forever

jgraham: seems like that could wait 6 months
... why are we talking about this in v1

rniwa_: it will not be interoperable in the next 3 years

ato: we are trying to be standard for the future and backward compat

jgraham: right, so let's wait
... let's not add work TODAY for work that will not be useful for YEARS

rniwa_: it's also likely if we do this today, people will write ua sniffing and will break in the future

jgraham: we would never change what Selenium does today
... when in years' time we have users asking for this
... then we add innertext at that point

simonstewart: there are places where we say "browser can do what it wants..."

jgraham: very few

simonstewart: this applies to link text and partial link text
... as well

jgraham: if we add this then the endpoint changes
... maybe at that point you look at a global session setting

<simonstewart> jimevans: nope

<simonstewart> The current proposal is to put the Selenium JS for “get text” into an appendix

<simonstewart> and then use that for now

simonstewart: writing this as prose is not going to work, should just have the code sample

wilhelm_: how sane is the current algorithm?

rniwa_: not very
... just found two bugs today in it

RESOLUTION: Include the current Selenium solution in the spec and revisit innerText for V2

ato: we will likely end up with the same situation with even more tests in the future

jgraham: true, but the innertext solution is not currently a solution

simonstewart: should we submit our tests to the HTML working group?

JohnJansen: yes. if we have good tests that will help them, we should

wilhelm_: our selenium algorithm may actually INFORM the html spec and help

simonstewart: we don't WANT the selenium solution, but need it.

<scribe> ACTION: figure out if our algorithm and tests help with innertext interop [DONE] [recorded in http://www.w3.org/2016/09/20-webdriver-minutes.html#action03]

break for lunch

until 2

<simonstewart> scribe: simonstewart

extensions

[AutomatedTester looks around]

mikepie introduces himself. Works at MS with JohnJansen. He and kmag2 are part of the browser extension WG

mikepie: thinks there are some use cases where webdriver is good for extensions
... where people want to test extensions on sites
... where extensions need to be tested cross-browser
... testing combinations of extensions
... offer compliance tests for extension support

mikepie demoes extensions

mikepie: additions include other ways that users need to interact with elements (such as click or send keys)
... for example, translate text via a right click menu
... also interact with browser chrome via a “page action”
... also similar browser actions that initiate some kind of action
... last piece is essentially a new window. Extension creates a new popup that it can present and user interact with

mikepie demonstrates suggestions by JohnJansen for extensions in WebDriver spec

AutomatedTester: can’t see anything too contentious now

sam_u: agrees with what’s been said
... chromedriver has some support for chrome extensions. Used by google for testing some extensions already

mikepie: skips to section 5 of the spec (capabilities)
... talks about “enabledExtensions” capability, which contains a list of extensions

simonstewart: points out that relative paths might not work on a distributed system

mikepie: known issue: if you specify a folder, this should be the folder that contains the manifest json file

<mikepie> https://mikepie1.github.io/browserext-1/webdriver.html#capabilities

sam_u: chrome extensions have an ID that’s based on the key in the manifest. Is that part of the extensions work?

mikepie: will be discussed on Thursday about how to generate those IDs

sam_u: chrome unpacks a crx and takes the manifest and pulls out the ID

The id is stable for an extension

sam_u: will that work well with your extensions api?

mikepie: good question. We can discuss that on Thursday
... in section 6 (sessions), no modification that’s needed for extensions here, but the top-level browsing context may need to be expanded to handle extension pop-ups
... section 8 (window handles) would need to return the handle for an extension pop-up too
... same goes for 8.3 (switch to window)
... also 8.4 (get window handles)

sam_u: in chrome, extensions have another tab that they can use (“webview”)
... runs it as a separate render process, designed for embedding HTML into a page.
... in chromedriver that appears as a window

rniwa_: safari does something similar
... you want to be able to interact with the page in the popup

mikepie: the page within the webview would appear as the window

sam_u: does this exist in other browsers? And do we want to add this to windows or frames?
... feels like it’s something that’s better as windows

kmag2: asks for clarification on windows in webdriver

<sam_u> https://developer.chrome.com/apps/tags/webview

AutomatedTester: discusses how marionette works

sam_u: the popup looks like an iframe to a user, but to everything else it looks like a window. It won’t inherit the permissions of the extension (eg. may not be allowed to use a camera)

kmag2: so it’s like an iframe in a web page but sandboxed?

ato: from the webdriver point of view, it’s just another top-level window. It’s just another top-level browsing context

rniwa_: in some browsers, a pop-up may appear, but you can still switch tabs
... is a popup associated with a particular tab in chrome?

sam_u: only in the sense that it’s inside a tab
... we used “window” because it’s easier to implement

kmag2: from a user point of view, it may look like an iframe. Could be considered a different window
... not defined

JohnJansen: it “feels” like a separate window

kmag2: user expects popups to close when window owning the extension closes

mikepie: section 17 (“extensions”) — new section
... this is about invoking separate actions. First is “getExtensionActions”, which lets you know that there’s an action that can be used. Returns the text of the action (currently the title property)
... it also returns badge text as a separate string
... takeExtensionAction will activate the action
... another open issue is what’s returned from “getExtensionActions”. Currently returns a json object of “id”, “text”/“title”, “icon” and “badge”
... you may want extra information to handle similarly named extensions

kmag2: leaning towards generating an ID for each extension at runtime for the lifetime of the session

mikepie: useful for translated strings too

kmag2: thinks some things are missing. Being able to link a popup to an action would be useful. Also getting the location of the action (between browser actions and element actions). In firefox there’s also a drop down menu
... also in firefox and old versions of chrome, extensions could be hidden and disabled

mikepie: getExtensionContextMenuItems, specific to an individual element

mikepie describes actions available through context menu

kmag2: thinks some changes could be made. Both chrome and firefox support checkbox items, so being able to read those would be useful
... items can be enabled and disabled in all browsers
... most browsers also support separators in context menus
... context menus for actions. There’s currently no way to open them

AutomatedTester describes the “Actions” section and right clicking

AutomatedTester: “this is where it gets tricky”. The webdriver spec only handles content
... at mozilla, we have support for going into the browser chrome
... All we need to do is switch the context
... talks about extension methods

ato: handling chrome in firefox only works because firefox’s UI is XUL
... thinks proposal sounds reasonable

mikepie: describes how to activate an extension context menu
... other than one change to rename section to “Browser extensions”

AutomatedTester: I don’t think this needs to be in the webdriver spec. Would be better in the extensions spec
... allows the spec and how to test it to be combined in the same place
... and the browser tools and testing wg should be part of the review process
... the webdriver spec gives you primitives to get you 80% of the way there. WGs can then extend the spec as necessary
... that’s because they’re the subject matter experts. We know the WebDriver community, and we can provide help as necessary

jgraham: additional point: if you need to patch any current text, that should be in the webdriver spec

ato: this is a discussion we’ve not yet had
... there are other WGs that want to do something similar. Permissions and bluetooth are examples
... also for our spec. Aiming for REC by March. Open question for us, how the process of adding new things to the spec will look for us

wilhelm_: this is a good opportunity to talk about that

kmag2: the additional browsing contexts should be in the webdriver specs

AutomatedTester: we do have context menus “kind of” supported.

jgraham: right click is supported, but we don’t say what that should do

ato: in marionette, we do synthetic input. Executing a right click shouldn’t open a native menu
... we could support that in marionette, but if you want interaction with the context menu, we should allow providing access to certain items for security reasons
... that’s why I think it makes sense to have a separate end point for selecting an item in the context menu
... you’re the first WG to come to us with a ready made proposal

mikepie: thank you. We’ll separate out the bits that need to be separated

AutomatedTester: feel free to email our mailing list

kmag2: do have an issue with how extensions are enabled, disabled and installed. Also opening special parts of the UI

AutomatedTester: briefly touched on that with “new session”. Provides a common mechanism for passing in preferences and extensions

kmag2: thought extensions could be installed at run time

sam_u: also stuff to do with handling permissions at run time

ato: marionette also has a need for that
... standardising that API sounds reasonable

kmag2: since you mentioned permissions, is there anything for dealing with user permission prompts?

AutomatedTester: no
... hard for us to describe how we’ll handle coat hangers until we know how they’re going to be implemented

ato: but permissions would be good to support

AutomatedTester bemoans lack of common behaviour between browser implementations for accepting permissions

Discussion on existing browser support for extensions

AutomatedTester: webdriver spec has a section on privacy. WebDriver should not impact users of the machine. Ideally installed extensions need to be removed

JohnJansen: we should call that out

ato: is there a temporary extension?
... available for the lifetime of a browser

JohnJansen: that’s how we handle extensions now

Discussion about security and privacy

mikepie and kmag2 thank everyone

discuss send keys and determine best course of action for unicode entries

AutomatedTester: but first, let’s have a break

sam_u: we have visitors from Google to talk about send keys. In addition, we could talk about testing keyboard input events.

<ato> RRSAgent: please draft the minutes

brrian: describes how the high level sendKeys command works by iterating over all characters
... doesn’t work for unicode. IME spec is not doing much.
... has suggested using “grapheme clusters” instead
... what we want to figure out is what the right thing to do is
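A minimal sketch of the two splitting strategies under discussion, assuming a deliberately simplified notion of a cluster (a base character plus any combining marks that follow it). Full grapheme cluster segmentation as defined by Unicode has more rules than this, and the function names are hypothetical.

```python
import unicodedata


def split_code_points(text):
    """Split a string into individual Unicode code points."""
    return list(text)


def split_simple_clusters(text):
    """Group each base character with the combining marks that follow it.

    Simplified on purpose: real grapheme cluster segmentation also
    handles Hangul syllables, emoji sequences and more.
    """
    clusters = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks.
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters
```

With the input `"e\u0301a"` (an "e" followed by a combining acute accent, then "a"), the first function yields three items and the second yields two, which is the difference a per-character sendKeys loop would observe.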

Gary Kacmarcik (garyk): mentions unicode

jgraham: describes the high level “send keys” method

ato: and it has specialisation for certain form elements

garyk: talks about keyboard layout differences between US and other keyboard layouts

jgraham: the way webdriver works at the moment is that we assume that the keyboard is a 104-key US keyboard
... there is some possibility that later webdriver will support different keyboard layouts
... main thrust of the question: what’s the correct way to break up a string into keystrokes?

garyk: if you approach it that way, you’re stuck with US English. Because outside of that you need knowledge of all keyboard layouts

rniwa_: mentions complex scripts

jgraham: if you send something that’s not represented on a US keyboard, we’ll send a keyboard event with the key code set to 0
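A rough sketch of the fallback described here, assuming a partial US-layout lookup that only covers ASCII letters and digits. The helper name is hypothetical, and a real implementation would also need a table for punctuation and modifier keys.

```python
def us_key_code(char):
    """Return a legacy key code for a character on an assumed US layout.

    Only ASCII letters and digits are covered in this sketch; anything
    else, including characters not present on a US keyboard, yields 0.
    """
    if len(char) == 1 and char.isascii() and char.isalnum():
        # Legacy key codes for letters and digits match the uppercase ASCII value.
        return ord(char.upper())
    return 0
```

Under these assumptions `us_key_code("a")` gives 65, while `us_key_code("é")` and `us_key_code("日")` both give 0, which is the case rniwa_ objects to below.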

rniwa_: this will make webdriver useless for testing anything outside of the US
... talks about testing sites outside of the US
... to make this feature useful in Chinese or similar languages, you need IME methods

garyk: name is a little misleading, but it sounds useful.
... keys come as individual events.

garyk is the coeditor of the UI events spec. Added support for code and key to events

garyk: the common keys are compatible, but other things aren’t
... key value (what user sees), and code (scan code)
... we’re aiming to avoid bizarre edge cases. Had a situation where hardware changes broke things in interesting ways
... right now there’s no possible way to do this. Currently doing manual testing. Want to be able to test US International keyboard with dead keys.
... suggests injecting keys at the system level, then we can validate the physical keys being sent

rniwa_: on some platforms, it’s impossible to send simulated system level keyboard events

garyk: if that’s the case, we can’t test on those platforms.

rniwa_: another way to do this is an emulated mark text and selection in the browser.
... you send the keys to the browser as webdriver does already, but you need a slightly higher level API, offering key events and expected text, with ordered events

garyk: discusses how text input processing is done by the browser

jgraham: in the existing situation, the lowest level API, the key events, handwaves a little about how events are fired.
... you’re injecting something into the browser event loop that filters through the system that would generate the same web key events

simonstewart: describes local end implementations and how keyboard input is done
... thought there was an API on Windows to figure out what the keycode and scancode from a keyboard are

rniwa_: nope :)
... describes difficulty of figuring out keyboard layouts

garyk: but if an OS can’t guess the keyboard, no-one can
... Windows has an API to fill out the rest of the events given the scan code

rniwa_: thinks there’s an API to do that too.
... this handles the keyboard problem, but not things like how an OS does accents
... iOS, OS X and Windows are all different

https://seleniumhq.github.io/selenium/docs/api/java/index.html?org/openqa/selenium/package-summary.html

https://seleniumhq.github.io/selenium/docs/api/java/org/openqa/selenium/remote/server/handler/ImeActivateEngine.html

simonstewart links to Selenium WebDriver APIs

<ato> Scribe: ato

simonstewart: It was added to allow us to do proper localisation testing.
... At a lower level than sendkeys.
... Call low level actions API, and that would use the IME.
... But that again assumed a particular keyboard layout.

wilhelm_: Predefined keyboard layouts? Can we feed it with that?
... If you catch 50, it’s infinitely better than what we have now.

garyk: There are six physical keyboard layouts.

rniwa_: Europeans are all 102.

garyk: There are locale differences.
... [explains about keyboard layouts]
... Then locales on top of these.
... Literally dozens.
... Pressing physical key, look it up in the locale, generate the key
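The lookup described here can be sketched as a table keyed by physical key and locale. This is an illustration only: the tables cover four keys, the physical key names follow the `code` values from the UI Events work mentioned earlier, and the locale tags and function name are assumptions.

```python
# Tiny excerpts of three layouts: physical key (code) -> generated key value.
LAYOUTS = {
    "en-US": {"KeyQ": "q", "KeyA": "a", "KeyY": "y", "KeyZ": "z"},
    "fr-FR": {"KeyQ": "a", "KeyA": "q", "KeyY": "y", "KeyZ": "w"},  # AZERTY
    "de-DE": {"KeyQ": "q", "KeyA": "a", "KeyY": "z", "KeyZ": "y"},  # QWERTZ
}


def key_for(code, locale):
    """Look up the key value a physical key produces in a given locale."""
    return LAYOUTS[locale][code]
```

The same physical key, `"KeyQ"`, produces "q" under the US table and "a" under the French one, so a test that presses a physical key has to state the locale it expects.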

<simonstewart> Final API from Selenium WebDriver: https://seleniumhq.github.io/selenium/docs/api/java/org/openqa/selenium/WebDriver.ImeHandler.html

garyk: Browser compat testing like I do is not the common use case here.

wilhelm_: I want to test the simple use case.
... Virtual IMEs are fine for the common use case.

garyk: That wouldn’t work for my use case.
... There are different ways of getting access to characters.
... E with accent: type dead key and that other key over there.
... That is what I want to write a test case for.
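A minimal sketch of the dead key case, assuming a hypothetical key naming scheme (`dead_acute`, `dead_grave`) and using Unicode NFC normalisation to combine the pending accent with the next key. Real platforms differ in what they emit when the following key cannot take the accent; that is not modelled here.

```python
import unicodedata

# Hypothetical dead key names mapped to the combining mark they apply.
DEAD_KEYS = {"dead_acute": "\u0301", "dead_grave": "\u0300"}


def type_keys(keys):
    """Turn a sequence of key names into the text they would produce."""
    out = []
    pending = None  # combining mark waiting for the next base key
    for key in keys:
        if key in DEAD_KEYS:
            pending = DEAD_KEYS[key]
            continue
        if pending is not None:
            # Compose base + accent into a single precomposed character if one exists.
            key = unicodedata.normalize("NFC", key + pending)
            pending = None
        out.append(key)
    return "".join(out)
```

For example, `type_keys(["dead_acute", "e"])` yields "é" as a single precomposed code point, which is the kind of result a test for this input path would assert on.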

rniwa_: I don't think it's necessary to define particular IME behaviour.
... Testing each behaviour separately does make sense.
... Type the character, then insert another code point for accent.
... Three or four different types of accents.
... People probably want to test these four different cases.
... But they probably don't want to go through the native layer and look these up. It's fine to predefine.
... Single IME that you can feed with some configuration should probably be fine.
... Something that _behaves_ like a Korean IME.
... Chinese IME is _very_ different from Taiwanese IME.

brrian: Most developers have no idea what a key code is.

rniwa_: Portuguese accent testing, Japanese kana input mode, Thai: Every testing machine would need to be natively set up to run that test.

garyk: You’re right. You would need physical keyboard and a lot of locales.
... But WebDriver needs to be able to press physical key and set the locale.
... It’s the concern of the test and the tester to make the right assertions.

brrian: That sounds limiting.
... What level do we aim at here?

garyk: What would the convenience method for Japanese users be?
... We have a convenience method for US users.

rniwa_: I don’t think sending text is good enough…

[discussion about what users want]

garyk: We have a simple API and an advanced API. It would be good if the advanced one actually was sufficiently advanced to support my use case.

rniwa_: If you implement autocompletion you need to generate out the right characters so you get the right composition.

garyk: Typing the simple characters, even in Japanese, is rather straight forward.

rniwa_: If you know the locale.

garyk: Yes.

rniwa_: Suggestions are in the heuristic space, so you can never rely on that.

wilhelm_: [explains about his idea about idealised IME]

rniwa_: User wants these characters, then replace the first with this. If you can specify that, you can generate a lot of the actions a user wants to make on a keyboard.

wilhelm_: When you write the test, you know what you expect.

garyk: Could have a dummy IME for Japanese
... Simple default conversion that never tries to be adaptive.
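One way to picture a non-adaptive dummy IME is a fixed lookup table with greedy longest-match conversion. The sketch below assumes a tiny romaji-to-hiragana table of ten entries and a hypothetical function name; it has no candidate selection or learning, which is the point.

```python
# Fixed conversion table; deliberately tiny and never updated at run time.
KANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
}

# Try longer romaji sequences first so "ka" wins over "a".
_KEYS_BY_LENGTH = sorted(KANA, key=len, reverse=True)


def convert(romaji):
    """Convert romaji to hiragana using only the fixed table above."""
    out = []
    i = 0
    while i < len(romaji):
        for key in _KEYS_BY_LENGTH:
            if romaji.startswith(key, i):
                out.append(KANA[key])
                i += len(key)
                break
        else:
            # No table entry matches: pass the character through unchanged.
            out.append(romaji[i])
            i += 1
    return "".join(out)
```

Because the table never changes, the same input always produces the same output, so a test author knows in advance what to assert.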

rniwa_: Sometimes you want to test specific sequence of events.
... When cursor moves to the left, it changes something, and then breaks that model.
... Arbitrary conversion sequence needs to be able to be coded, I think.

brrian: Virtual IME, basically.
... This doesn’t sound high level any more.

garyk: Or just a really simple one!

rniwa_: The key is how to specify the sequence of keyup and keydown, along with current composable text.

garyk: Order of key composition events, order of keyup keydown events, swapping, composition update events that are firing [plus a few more things]

rniwa_: Extending current keysend command and add a current composition state

garyk: ?

rniwa_: sendKey("k")
... Also, here’s the current composing character
... Then user types 8, here’s the current composing character [x].
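A hedged sketch of what pairing each key with its composition state could look like. Everything here is hypothetical: the `KeyStep` type, the function name, and in particular the event ordering, which is a placeholder and not a statement of what browsers do or what any specification requires.

```python
from dataclasses import dataclass


@dataclass
class KeyStep:
    key: str        # key being pressed and released
    composing: str  # composition text after this key ("" if none)


def expected_events(steps):
    """List the (event, data) pairs a test might expect, in one assumed order."""
    events = []
    composing = False
    for step in steps:
        events.append(("keydown", step.key))
        if step.composing:
            if not composing:
                events.append(("compositionstart", ""))
                composing = True
            events.append(("compositionupdate", step.composing))
        elif composing:
            # Composition text went away, so the composition is over.
            events.append(("compositionend", ""))
            composing = False
        events.append(("keyup", step.key))
    return events
```

A test could then describe typing "k" followed by "a" as `[KeyStep("k", "k"), KeyStep("a", "か")]` and compare the generated list against events recorded by script in the page.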

jgraham: It sounds like the way that this could potentially
... We have this low level API
... I want to keydown A keyup A on this virtual keyboard.
... Each keyboard should have a way to define that the next event on this keyboard should go into IME mode.
... Then press "k", "a" key in this IME.
... You’ve put JS in the page that listens for these events.

garyk: Also you can enter and exit the IME via the keyboard.
... You don’t need a special API.

jgraham: The reason I want it to be specific.
... Is then obviously not browser/platform independent.

garyk: That seems reasonable.
... What does the Selenium IME API do?

simonstewart: Once you activate the IME, you’re on your own.

rniwa_: On a Mac there’s no such thing as an IME.
... If you disable key combinations, you select something from a menu.

garyk: I’m not uncomfortable with the idea that there’s a specific machine dedicated to do this.
... Assumption I’m willing to make.

rniwa_: I think that will work for European languages, for sure.
... This is the same problem with testing spell checking.
... It learns, and you need to reset the state.

garyk: There are probably hacky ways to do this too. Overwrite some file.
... Without reinstalling the OS.

<scribe> ACTION: Add note to remind web browsers to disable their spell checking and autocorrection. [recorded in http://www.w3.org/2016/09/20-webdriver-minutes.html#action04]

ato: [explains how synthesised events work in Firefox]

brrian: Safari uses nsEvents.
... As if they were processed by the OS.
... Slightly different than what Gecko does. Slightly lower level.

sam_u: In Chrome we do it in EventSource::SendEventToProcessor.
... This is before it gets to Blink.

garyk: Sounds cross platform.

sam_u: Maybe minus Android.

AutomatedTester: Do we want to be splitting grapheme clusters or [unicode code points]?

garyk: That’s not going to be good enough for Japanese or deep testing. But if it works it works.
... You can split on unicode code point.
... How do you specify things in the low level API?

jgraham: For keys, you say you want key down then a value that is like a character, and if that is a character that it can match on a US keyboard, then use that.
... That's the only thing at the moment.
... Not something that will change in the next six months.
... But may change in the future.
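For illustration, a key down and key up with a character value can be expressed as a small JSON payload. The sketch below assumes the payload shape used by the WebDriver actions API (a "key" input source carrying `keyDown` and `keyUp` actions with a `value`); the helper name and the default source id are hypothetical, and details were still in flux at the time of this discussion.

```python
import json


def key_press_actions(text, source_id="keyboard"):
    """Build an actions payload that presses and releases each character in turn."""
    actions = []
    for ch in text:
        actions.append({"type": "keyDown", "value": ch})
        actions.append({"type": "keyUp", "value": ch})
    return {"actions": [{"type": "key", "id": source_id, "actions": actions}]}


# Serialise for sending as a request body.
payload = json.dumps(key_press_actions("ab"))
```

Note that the payload carries a key character and not a key code or scan code, which is the distinction drawn in the exchange that follows.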

garyk: It would also be nice to not try to look up and just do the thing requested.

brrian: You send a key code _not_ a scan code.

jgraham: You send a key character, actually.

simonstewart: We can change that at some point.

jgraham: I think that API is extensible enough that we can change all the data in it.

simonstewart: You have services like Sauce that run their machines in US data centres.
... With users in Germany with German layout.
... Their test will fail either locally or remotely.

jgraham: Scan code won’t work.

simonstewart: Yes.

wilhelm_: You can assume US

garyk: Only the scan code, and expect the system to figure out the rest, then that would be different.

jgraham: Yes.
... This scan code sounds useful but that it is not something we want to add today.
... Something we may want to add in the future.
... Probably don't need to figure out all the details by this point.

garyk: By the time you do that it will not be useful for the work we are doing.
... I’m going to continue with manual testing.

RRSAgent: stop listening

RRSAgent: off

Attendees

Contents