WebDriver-BiDi – 12 January 2022

Meeting minutes

<jgraham> AutomatedTester is going to be late I think, so we need a volunteer to scribe

<simonstewart> G'day folks. I've a meeting at work that's bumping into the editorial meeting today. I will be late, but I'll dial in when I can

<simonstewart> Well, that meeting :)

Sandboxed Script Execution

jgraham: turn real talk into text

github: https://github.com/w3c/webdriver-bidi/issues/144

jgraham: We've talked about sandboxed script execution before. If you haven't paged it in, we want some way to run scripts in a way that you can access the page DOM, but the JS properties you set aren't visible to content scripts, and you're not affected by what the content script has done.

jgraham: This is relevant for the ability to implement "find element" with JS without worrying about querySelector being overridden.
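
The isolation property described above can be illustrated with a toy model. This is only an analogy written in Python, not how any browser implements sandboxes: `Dom`, `Realm`, and `pristine_query_selector` are hypothetical names, and per-realm dictionaries stand in for JS globals. It shows two realms sharing one DOM while keeping separate globals, so an override made by page content does not affect a "find element" helper running in the sandbox.

```python
# Toy model only: real browsers use isolated worlds / sandboxes, not dicts.
# Every name below is hypothetical and exists purely for illustration.

class Dom:
    """Shared page DOM, modelled as a flat list of (tag, text) nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)


def pristine_query_selector(dom, tag):
    """Stand-in for the built-in querySelector: first node with this tag."""
    return next((node for node in dom.nodes if node[0] == tag), None)


class Realm:
    """A script realm: private globals plus a reference to the shared DOM."""

    def __init__(self, dom):
        self.dom = dom
        self.globals = {"querySelector": pristine_query_selector}

    def find_element(self, tag):
        # The lookup goes through this realm's own globals.
        return self.globals["querySelector"](self.dom, tag)


dom = Dom([("h1", "Title"), ("p", "Body")])
content = Realm(dom)  # realm the page's own scripts run in
sandbox = Realm(dom)  # separate realm for automation scripts

# The page overrides querySelector, but only in its own realm.
content.globals["querySelector"] = lambda dom, tag: None

# The DOM is shared: a node added from the sandbox is visible to the page.
sandbox.dom.nodes.append(("div", "added by automation"))
```

In this model `sandbox.find_element("p")` still finds the paragraph even though the content realm's lookup has been broken, and the appended `div` is visible from both realms.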

jgraham: Browsers already have a capability like this. The existing primitives aren't the same cross-browser, but getting convergence will be tricky, so we'll probably want some non-normative description on what authors can rely on.

jgraham: I've put up a draft PR at https://github.com/w3c/webdriver-bidi/pull/158

jgraham: the model is we have a target (realm or context ID) and in the case of a browsing context you get to specify a sandbox name. If it exists, use it, otherwise create a new one.

jgraham: or you can specify the realm ID if you already know it.

jgraham: This is different to CDP in that there's no explicit "create sandbox" command, although you can recreate it by running an empty sandbox script. It's a superset of CDP.

foolip: what about this is especially complicated or worth considering tradeoffs carefully?

jgraham: first, do people think this is an acceptable API surface? or should we have explicit creation like in CDP? the tradeoff here is you can make mistakes by making typos and create a new sandbox.

jgraham: the actual API surface feels fairly straightforward though.
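
A minimal sketch of the get-or-create behaviour discussed above, assuming a target is either a realm ID or a browsing context ID plus a sandbox name. The field names (`realm`, `context`, `sandbox`), the `SandboxRegistry` class, and the realm ID format are hypothetical and not taken from the draft PR.

```python
import itertools


class SandboxRegistry:
    """Hypothetical bookkeeping for sandbox realms, keyed by context and name."""

    def __init__(self):
        self._by_name = {}  # (context_id, sandbox_name) -> realm_id
        self._ids = itertools.count(1)

    def resolve(self, target):
        """Map a target to a realm ID, creating a sandbox on first use."""
        if "realm" in target:
            # The caller already knows the realm ID: use it if it exists.
            if target["realm"] not in self._by_name.values():
                raise KeyError(f"no such realm: {target['realm']}")
            return target["realm"]
        key = (target["context"], target["sandbox"])
        if key not in self._by_name:
            # No explicit "create sandbox" command: first use creates it.
            self._by_name[key] = f"sandbox-realm-{next(self._ids)}"
        return self._by_name[key]


registry = SandboxRegistry()
first = registry.resolve({"context": "ctx-1", "sandbox": "helpers"})
again = registry.resolve({"context": "ctx-1", "sandbox": "helpers"})
typo = registry.resolve({"context": "ctx-1", "sandbox": "helprs"})
```

Here `first` and `again` name the same realm, while the misspelled `helprs` silently gets a new one, which is the tradeoff raised above.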

jgraham: in the PR, there are just issues in the places where you'd expect an explanation of how the sandbox stuff works.

jgraham: I've put up https://github.com/w3c/webdriver-bidi/pull/169 with some of the support we might need for this.

foolip: does one PR depend on another?

jgraham: https://github.com/w3c/webdriver-bidi/pull/169 is the one to review, as the actual thing to land. #158 isn't ready for review.

foolip: I guess the upside of the API surface is you save a roundtrip and it's easier to use?

jgraham: Yes. What Puppeteer does is create a sandbox upfront and use that for everything. Given that, why not just have that hardcoded as a sandbox name?

foolip: is there any notion of creating or garbage collecting a sandbox?

jgraham: the lifetime is tied to the window.

foolip: sounds good

foolip: I'd suggest requesting review from folks on the PR (Brandon isn't here, for example)

Communication channel from the browser to the client

github: https://github.com/w3c/webdriver-bidi/issues/157

sadym: there are a few approaches here

sadym: we could make it bi-directional, or just one direction with a binding

sadym: bindings would trigger an event on the BiDi client side.

foolip: this question was previously tied up with sandbox scripts and communicating with them. with the bindings approach, will we talk to sandbox scripts and content scripts in the same way?

sadym: yes, and if the client wants to talk to multiple sandboxes they have to implement that.
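
One way a client might implement that routing itself is sketched below. It assumes each injected script includes its own sandbox name in the data it sends; the `SandboxEventRouter` class and the `sandbox` / `data` payload fields are hypothetical and not part of any proposal discussed here.

```python
from collections import defaultdict


class SandboxEventRouter:
    """Hypothetical client-side demultiplexer for binding events."""

    def __init__(self):
        self._handlers = defaultdict(list)  # sandbox name -> callbacks

    def on(self, sandbox, handler):
        """Register a callback for events tagged with this sandbox name."""
        self._handlers[sandbox].append(handler)

    def dispatch(self, event):
        """Deliver an event to the callbacks registered for its sandbox tag."""
        # .get avoids creating empty entries for unknown sandbox names.
        for handler in self._handlers.get(event["sandbox"], []):
            handler(event["data"])


router = SandboxEventRouter()
from_helpers, from_metrics = [], []
router.on("helpers", from_helpers.append)
router.on("metrics", from_metrics.append)

router.dispatch({"sandbox": "helpers", "data": "ready"})
router.dispatch({"sandbox": "metrics", "data": 42})
router.dispatch({"sandbox": "unknown", "data": "dropped"})  # no handler
```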

foolip: I agree that this is a good starting point, with just a way to emit an event.

foolip: there are questions about how exactly bindings should work still

jgraham: an earlier suggestion was that sandboxes get a postMessage. the new suggestion is bindings which is a method you can invoke. that works in both sandbox and content scripts.

jgraham: an issue we discussed last time is that once we have bootstrap scripts, we won't have the chance to pass in that binding. then maybe we need to pass it in when we configure the page load script. or an interface that exists only for page load scripts.

jgraham: but Brandon's proposal in the GitHub issue was to make the protocol level bits look the same regardless of the API

jgraham: at the protocol layer we seem to have broad agreement, possibly apart from bikeshedding the name.

jgraham: there's a question about what it should look like in the API

foolip: what is API and protocol in this discussion?

jgraham: the API is what an injected script sees

sadym: scripts initiated by a sandbox are seen by the client too? (scribe didn't get it all)

jgraham: for bootstrap scripts, we could pass in a function to support arguments (which is how bindings are passed)

foolip: ok, so you'd pass in a function to establish the binding, similar to how you now do it post-load for regular scripts?

jgraham: yes

foolip: there was previously discussion about what a binding looks like. Is a binding a function which if invoked does the magic to emit the event?

jgraham: yes
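
A minimal sketch of a binding as just described: a function that, when invoked, emits an event to the client. The event name `example.bindingCalled`, the `params` shape, and the `make_binding` helper are hypothetical; `send` stands in for whatever transport the browser uses to reach the BiDi client.

```python
import json


def make_binding(name, send):
    """Return a callable that emits one client-bound event per invocation."""

    def binding(payload):
        # Hypothetical event shape; the actual naming was still undecided.
        send(json.dumps({
            "method": "example.bindingCalled",
            "params": {"name": name, "payload": payload},
        }))

    return binding


outbox = []  # stands in for the connection to the client
notify = make_binding("notify", outbox.append)

# What an injected script (sandbox or content) would do with the binding.
notify({"clicked": "#submit"})
```

The same callable could be handed to a script as an argument, which matches the idea of passing a function in to establish the binding.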

WECG Coordination

Simeon: I'm a co-chair of the web extensions CG

Simeon: It started as a core set of APIs common to all browsers

Simeon: As part of that, some CG members have requested capabilities for testing web extensions. I'm relatively new to the standardization process. We're not pursuing a standard, but trying to find the commonalities first.

foolip: do you have some details on testing web extensions? (I haven't looked at the email thread, sorry)

Simeon: what we've heard is (1) browser UI interactions, like clicking the extension icon in the toolbar; there are action buttons in that menu. This can't currently be emulated. Today it's tested by opening a new tab, but that requires additional work for the sizing.

Simeon: (2) permissions are granted to extensions, and that capability can't be emulated. I've reached out to the DevTools team, but it's not clearly in scope of WebDriver / BiDi.

foolip: have you seen https://w3c.github.io/permissions-automation/?

jgraham: Having thought about this for only a minute, at a high level we've previously said that WebDriver is mostly focused on web content. But if others want to add extensions to the BiDi spec the right place to do that is probably in their own standards.

jgraham: So the web extensions spec could define the WebDriver APIs it needs.

jgraham: But we're happy to be involved and give review. But it seems unlikely those of us here are the right people to produce the right design.

jgraham: clicking UI buttons sounds a bit scary. maybe you don't want to emulate finding the button and clicking it, but maybe you want an API that behaves as if the button was clicked.

jgraham: for other parts of the problem, maybe you want events for the extension lifecycle. and maybe extension scripts could be a target for places to inject WebDriver code.

jgraham: it would be useful to have a "this is what we need" for the WECG, and I'd be happy to review drafts/proposals.

<jgraham> permissions spec: https://w3c.github.io/permissions-automation/

foolip: is BiDi support what you're looking for, or current WebDriver spec?

Simeon: we've talked about current WebDriver, because that's what the current testing stack is

simonstewart: back in the day geckodriver had the ability to switch between contexts. there's probably some prior art there. but the best way to do this is probably a WebDriver extension

Simeon: is there a good example for how to extend WebDriver?

simonstewart: there's a section in the WebDriver spec

<simonstewart> The section on extending the original webdriver spec is here: https://w3c.github.io/webdriver/#extensions-0

jgraham: WebDriver classic is command/response so it probably works for clicking a button. but it won't allow you to get events for the extension lifecycle

Simeon: that's useful. It sounds like the best course for us is to review how WebDriver extensions work, gather objectives into a document and then reach out to this group

jgraham: that sounds reasonable

Simeon: with Manifest V3, Chrome is changing background pages to service workers. someone asked for the ability to stop a service worker to test that it will respond as expected to certain events.

foolip: https://github.com/web-platform-tests/wpt/issues/6866 from back in 2017 is a request for an API to terminate service workers

Alright, if there was something just now that needs to be scribed, can someone do that?

I can hear now, but not be heard I think.

jgraham: a lot of things like network intercept don't have any standards now. we'd definitely welcome collaboration on that sort of thing.

Simeon: I suspect we'll overlap

jgraham: we'll save the final agenda item for next time

jgraham: we have an editorial meeting in 2 weeks, next WG meeting in ~1 month
