Web Platform Tests, Day 2, TPAC 2019 -- 17 Sep 2019

<zcorpan> RRSAgent: this meeting spans midnight

<jorydotcom> jugglinmike are you dialed in?

<foolip> jugglinmike: you can try joining now

Test Automation

<JohnJansen> Title: Browser Testing and Tools, TPAC 2019

<foolip> jgraham: talked with WebXR people yesterday, they're seemingly the first to define a test-only API which has multiple implementations

<AutomatedTester> scribenick: foolip

jgraham: there might also be tests that break for the wrong reason, like assuming that the mojo stuff is there
... our intent, depending on foolip et al, is to make the mojo part transparent so that you don't have to explicitly include it
... but maybe some metadata somewhere to say for which directories that injection needs to happen
... in the end, it'd work as if you set a pref and get a test-only API
... that probably covers it for mojo

<odejesush> present*

reillyg: want to say sorry, my work was the first to use mojo, and others have copied that. some haven't hidden their use of mojo as much
... question of how such tests should fail if the browser doesn't support the API or mojo
... if we can get infra improvements to get a guide for how this should work, then that's the right way forward
... more long term, APIs like this should probably be webdriver extensions, but open questions about capabilities of it
... but as long as we have general agreement about how to structure to not cause issues, that's fine

jgraham: if we can get to the point where we don't need an explicit mojo include, we'd just lint to check that you don't
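
A lint of the kind jgraham mentions could be sketched roughly as below. This is only an illustration in Python, not the lint that exists in wpt: the function name `find_explicit_mojo_includes`, the regular expression, and the sample path in the usage note are all assumptions made for the example.

```python
import re

# Assumed pattern: a <script> tag whose src attribute mentions mojo and ends in .js.
MOJO_SCRIPT_RE = re.compile(
    r"<script[^>]*\bsrc\s*=\s*[\"'][^\"']*mojo[^\"']*\.js[\"']",
    re.IGNORECASE,
)


def find_explicit_mojo_includes(source):
    """Return (line_number, line) pairs for lines that include mojo via <script>."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if MOJO_SCRIPT_RE.search(line):
            hits.append((number, line.strip()))
    return hits
```

Calling it on a test file's text returns an empty list when there is no explicit include, so a lint runner could report one error per returned pair.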

reillyg: the pattern you can see in wpt isn't the way it should be
... the most contentious was scripts that just included mojo using <script>

reillyg: but for scripts that detect if mojo is available and then load mojo stubs, is that acceptable, or should impl and files be moved elsewhere?

jgraham: it should be possible to make it almost completely transparent, so that you can just use the test API
... in the short term the pattern of check if mojo is available seems reasonable
... if we have a pattern that we can point to as "this is what you should do" then that's a good step on the way

Hexcles: sounds about right
... general principle is that /gen/ shouldn't be checked into wpt, and we're looking at ways of archiving and releasing those somewhere, along with Chromium

Hexcles: as for impl of test APIs, impl itself can be in wpt I think

<jorydotcom> JohnJansen did you also have a question

boaz: hi Mike

reillyg: only concern is I want to avoid it being too magical, figuring out how and why they got loaded was a bit of a journey
... needs to be clearly at the top of a test

jgraham: in other browsers that don't do the same thing, it wouldn't be accurate

reillyg: but everyone would have it implemented somehow, and pointing out how it works would be good

jgraham: maybe a meta flag for the manifest
... for gecko it'd be behind a pref

reillyg: if I'm developing in gecko, I'd like to know what pref I have to enable

jgraham: we could do a <meta name="something"> to communicate

jugglinmike: last year part of the discussion was whether APIs would be exposed to web devs, a compelling reason to use webdriver
... if in Chromium it's done by injecting code that won't happen
... wonder if that's still a goal
... can we reach a state where we just run the browser with a flag?

jgraham: agreement that where we can use webdriver that is better
... but some of these APIs can add significant binary bloat, increasing download size
... adding to webdriver is always good, historically we have cases tested internally that should have been tested using webdriver
... but there will always be cases where it's not possible or expedient to figure out the webdriver API for a specific thing
... probably makes sense to have some test-only APIs in some cases long term

reillyg: the use of mojo is out of expedience and development speed, adding webdriver surface is a lot of work
... first cut using mojo will be easier. would like to see webdriver for some of these things, but value isn't there for all of them
... for webusb, what I need to test in depth is very different to what a web developer needs to test their code
... I want to mock the underlying OS, they want to mock user interaction with permissions UI

jgraham: for webxr, in the long term probably want some automation solution, but hard to get that done quickly

foolip: what guidance should we give to people who want to know what path to take: webdriver or test-only API?

jgraham: I think, prefer webdriver, and if you're doing test-only API it needs to have a spec
... it's not an invitation to expose internal guts that can't be implemented by anyone else
... haven't had cases of multi-vendor interest until now, webxr being the first.

<zcorpan> ato: we can use the same https://docs.google.com/spreadsheets/d/1cqPK6ze2OCLsho4twJHNLZUPktfejIiiDlMwv0TaZBg/edit?usp=sharing

jgraham: when we do get more interest, I'm sure we'll get feedback on test-only APIs accidentally relying on internal details
... you should look to get as much feedback on your test-only API as the rest of the spec

boaz: in Chrome, how do you decide whether to pick webdriver or test-only API?

foolip: we don't have guidance and don't know, that's why I ask :)

reillyg: writing the test-only API was informative for crystallizing the behavior I wanted, recommend doing that

<jugglinmike> q

jgraham: it would be *nice* to know we don't depend on insufficiently defined API
... when the second impl comes along it turns out you have to make a lot of changes
... I imagine we'll have the same problem with test-only APIs

boaz: anything we can do to avoid that?
... like not contributing tests that can only run in one browser?

jgraham: think that's difficult, deeper question is how valuable are tests for features with only one impl?

jugglinmike: that's the real interesting thing
... whether webdriver or test-only is best discussed with Browser Testing and Tools WG
... I'm not sure how much to push for/against test-only APIs
... should we require when we add a new dir in wpt that the spec says something about automation?
... that may be too drastic, but would be good with more clarity for reviewers

jgraham: people probably agree that it's OK to complain about tests using non-std APIs
... but people are determined to write tests for their experimental feature, hard to stop them from doing that
... difficult to do that gatekeeping
... if there are cases that have a good spec for the web-exposed part but not for the testing we should push back

reillyg: I'd be sad to see this group reject tests for features that aren't standardized yet

jgraham: by "standardized" I mean just "spec exists", as opposed to just an explainer. this isn't about the standards track, it's about making an attempt to make a standard

reillyg: in early stages of development you'll write some code, some tests, some spec, seems reasonable to have tests before there's a formal spec
... whether those are in wpt or kept private, is a reasonable question
... I would like to see more linting, testing that we aren't depending on random things
... some level of linting to check you aren't assuming things that aren't well-defined

foolip: tentative tests are a thing

jgraham: but not tentative directories

ato: the webdriver group has engaged with other groups, like permissions
... recurring problem is lack of motivation for implementation
... webdriver covers an enormous part of the browser, so people on webdriver don't have expertise to implement things specific to say webxr
... should encourage tests to think hard about testing, but some things are very hard with current webdriver protocol
... some things also might not make sense to test using webdriver
... jugglinmike mentioned what would be useful for web devs also, part of our responsibility to think about that

boaz: agree that writing tests early is great, and that the material should be shared, last thing we want to do is to keep that from happening
... for some of these experiments, would another place make sense?
... for tests that depend on things that are proprietary

jgraham: two cases. if you depend on proprietary APIs don't upstream.
... if you're depending on something you have no intent to standardize
... but for something like std-toast the tentative test mechanism works

boaz: but isn't there material that can't be run in all browsers?

jgraham: unclear if there are tests that will never work even if they implement the feature. that's the line: would this be testable if the browser implemented the spec as written?
... there have been cases where changes make the tests wrong and not per spec, browser-specific by accident

Hexcles: a bit concerned about adding yet another place for staging
... in chromium we already have wpt_internal
... this is for things that are very early stage or unlikely to be testable
... a level between that and wpt is too much to organize
... but tentative directories is reasonable
... like if you don't yet have a spec draft, mark the whole directory as tentative

<ato> Hexcles: ++

Hexcles: agree with jgraham that the bar is if it could be tested in other browsers, that's good enough for wpt
... need not require that it can already be tested in multiple impls
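
The tentative idea discussed here can be sketched as a small path check. This is an illustration only: the `.tentative` filename flag is the existing mechanism foolip refers to, while treating a directory named `tentative` as marking everything beneath it is just one assumed way the directory proposal could work. The function name and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def is_tentative(test_path, allow_directories=True):
    """Return True if a test path is marked tentative by filename flag or directory."""
    path = PurePosixPath(test_path)
    # Filename flags sit between the base name and the extension,
    # e.g. "basic.tentative.https.html" has flags ["tentative", "https"].
    flags = path.name.split(".")[1:-1]
    if "tentative" in flags:
        return True
    if allow_directories:
        # Assumed convention: any parent directory literally named "tentative".
        return "tentative" in path.parts[:-1]
    return False
```

With `allow_directories=False` the check falls back to the filename flag alone, which makes it easy to compare the two policies on the same list of paths.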

https://cs.chromium.org/chromium/src/third_party/blink/web_tests/wpt_internal/

jgraham: we have the same thing, a directory for things we can't export for some reason
... there may also be a wpt.fyi feature request to search or filter for tentative tests, other than by the test name

`!tentative` should work

zcorpan: what's the effect of having tentative tests?

jgraham: informative only

zcorpan: so not different in wpt.fyi
... wonder what tentative directories would be for?

JohnJansen: hello jugglinmike
... a long time ago we had vendor folders in wpt

jgraham: *laugh*

JohnJansen: the idea itself was sound, an indication they weren't meant to run in other browsers
... sounds similar to tentative folder or another staging repository

https://wpt.fyi/results/old-tests/submission/Microsoft

JohnJansen: coming up with ideas here for how to store non-all-browser tests is irrelevant if we can't enforce that all tests follow that pattern

jgraham: rule for wpt is that in theory all tests should run in all browsers

<JohnJansen> +1

jgraham: let's clear the queue and break

ato: I'm not aware of the history of vendor directories, but in principle sounds like a good idea to run others' tests
... at Mozilla we have tests in mozilla-central where we might want to reuse the wpt harness
... and turns out some of the wpt tooling is quite good

zcorpan: in response to JohnJansen, how to enforce?
... one way is lint checker, which jugglinmike already did for mojo tests

<jgraham> Zakim: close the queue

zcorpan: think jgraham reviewed that, so we have a lint already for mojo tests

<jgraham> close the queue

zcorpan: we may need a policy written down

foolip: we also have a lint for LayoutTests-specific APIs

boaz: can we have an action item for that?

foolip: you have edit access now :)

NavidZ_: webdriver pointer actions, we have that now.
... but some things depend on the browser or the platform, like disabling flinging
... because I don't want the page to scroll along some random curve
... second is platform-specific things like #ifdefs in the code
... browsers match platform conventions, e.g. with how stylus works
... context menus are also different on different platforms
... should we expose those differences to tests?

<zcorpan> https://github.com/web-platform-tests/wpt/pull/18509 is the lint check PR

<zcorpan> break till 11

<AutomatedTester> scribenick: AutomatedTester

<JohnJansen> https://github.com/web-platform-tests/rfcs/pull/24

RFC: Test Editor

foolip: this session is for the rfc for the test editor

<jgraham> scribe AutomatedTester

<ato> Reminder, this is the current seat map: https://docs.google.com/spreadsheets/d/1cqPK6ze2OCLsho4twJHNLZUPktfejIiiDlMwv0TaZBg/edit#gid=0

fremy: let's talk about the history of this rfc. As part of my work there was a need to quickly make wpt tests, typically from reducing a webcompat issue
... one of my ideas was to use something like jsfiddle that would make it quick and easy to make a wpt test
... it would just have the basics for the doc
... and then you can add simple assertions.
... people do not want to have to fork wpt/ github repo. They just want to quickly make a test.

fremy: you can go from "i need to create a quick test" to "here is a PR" and make it as seamless as possible

<ato> ScribeNick: AutomatedTester

fremy: this is how it started and as part of CSSWG we have expanded it

JohnJansen: there was a request from several people outside of MS to move it away from MS to wpt
... I have renamed the repo to editor
... and we have been getting a bunch of questions on how we can use it
... the request then is to see if we can move forward and what is required to close the rfc

foolip: is there someone willing to deploy and maintain it? including infra?

JohnJansen: we have modified the azure instance to do all of this

<fremy> https://wptest.center

foolip: is there someone to work on this?

<JohnJansen> https://github.com/web-platform-tests/editor

JohnJansen: yes

<JohnJansen> https://wptest.center/#/new

fremy: what does it take to make this project useful for people at large? Workflow changes? Is there value for this group? How can we manage the contribution story?
... I am happy to continue contributing to this and make the changes to make it work
... but it would be good to make sure that there are other vendors willing to participate

jorydotcom: This moved out of MS github to wpt github.
... have we done all the due diligence on the move?

JohnJansen: yes, MS legal agreed
... I have been using it to test reductions that I have made on web compat issues
... we were in a state where reductions were more important and therefore no one was doing wpt for these issues

jgraham: the people in this group are less likely to use this project and this might be the wrong group to figure out what is missing and what features could be missing
... having said that speaking to the webcompat or MDN people might mean the feature requests would come from people adjacent to this group

JohnJansen: the request in the RFC is to make editor available from this group for other people to use
... we could take it on and see how it goes over the year: if it goes well we support it, and if it doesn't we just archive the repo

fremy: the CSSWG likes it, and we have changed the process so that when you start changing a spec it needs a test. We used this tool to remove the friction in getting new tests. TabAtkins has used this in their workflow and seems happy so far
... we need to make tools so that editors don't need to change their workflow to make more wpt tests

<BitBot> (14rfcs) [issue] boazsender opened 13#30: Create an RFC for what material is and is not allowed in WPT - https://git.io/JeOR5

fremy: so if we can make this more official then people are likely to use it more

jgraham: there is a community building thing here. people could use it but there are "things" that need fixing. But we can build a community and see how it goes. We can link to the docs but not say it is the preferred way

foolip: is it going to need someone to support it?

fremy: is MS going to support it?

JohnJansen: it depends on how much work but we can have a look to see how it goes. If MS fails in the contribution then we can re-evaluate it

foolip: if it is going to be called an editor then it needs to be able to fix tests that are already in the repo

fremy: this is currently being hosted on its own environment to manipulate it. E.g. slow network.

foolip: so edit and then run on the CI.

fremy: yes

jgraham: if you are writing a test then you can use wpt-test.live to host it
... you can reuse server side features this way

foolip: unless you are changing the resources in the backend you can do a lot of tests now

jgraham: [describes how the resources could work as an example and use the wpt]

fremy: for a new test it is nice to have the test resources (css, js) split and easy to use

foolip: to resolve this RFC, we do have some questions that need answering

<TabAtkins> While there are lots of cool things the editor could do as well, I can say with confidence that its current functionality is extremely useful for writing tests for myself. It's surprisingly hard to do the test-commit dance by hand.

foolip: like CoC.

JohnJansen: there was a question about the reference to mongodb

Hexcles: there is an issue with google people needing to have mongo and we can't

fremy: you could have it locally but you don't really use it. in azure we use a doc storage
... you just need the mongodb protocol

Hexcles: are there other alternatives that use the mongodb protocol?

fremy: yes

foolip: so this could be abstracted away
... but it uses mongodb npm package?

fremy: that's the compat issue

jgraham: who implemented it?

Hexcles: we can just double check with legal but we can't install the database

foolip: there are details that we don't know, but we need to figure this out

[discussion about what database needs to be installed locally]

fremy: we don't need a database locally. We just needed something that was able to store documents and to make sure that it doesn't limit you to the MS cloud document storage

foolip: if you run the tests locally with no mongodb it just errors so we can talk to legal
... so it just needs a kv storage

fremy: yes, it's really simple
... the use of the database is for login and storing the tests during dev

AutomatedTester: could we move to sqlite?

jgraham: yes but the model is different

fremy: for no database we could just write to files, that's how it used to work. It's really really simple
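
The "just write to files" option could look something like the sketch below. It is not the editor's actual storage code: the class name `FileDocumentStore`, the one-JSON-file-per-key layout, and the key rules are assumptions chosen to show how little a key-value document store needs.

```python
import json
from pathlib import Path


class FileDocumentStore:
    """Hypothetical key-value document store backed by one JSON file per key."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        # Keys are restricted to simple names so they cannot escape the root directory.
        if not key or not all(c.isalnum() or c in "-_" for c in key):
            raise ValueError("invalid key: %r" % (key,))
        return self.root / (key + ".json")

    def put(self, key, document):
        """Serialize the document as JSON and write it to the key's file."""
        self._path(key).write_text(json.dumps(document), encoding="utf-8")

    def get(self, key, default=None):
        """Return the stored document, or default if the key has no file."""
        path = self._path(key)
        if not path.exists():
            return default
        return json.loads(path.read_text(encoding="utf-8"))
```

Because callers only see `put` and `get`, a MongoDB-protocol backend or a cloud document store could sit behind the same two methods, which is the abstraction foolip asks about.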

foolip: there is the CLA question

fremy: we should delete it

JohnJansen: agreed
... we can create individual issues on the repo and approve this rfc

foolip: we just need to make sure that some things are done in order
... things that need to happen. 1) change the wpt CoC 2) delete the cla 3) pick a URL

3) will not block the RFC

foolip: is there a hostname we should use

jgraham: we can have it under web-platform-tests.org

foolip: so editor.web-platform-tests.org

fremy: MS is going to continue to do hosting
... and then TabAtkins, Anton and I will continue to look at it

foolip: is there going to be a big change to make it an editor and not just a reduction tool

fremy: you would need to have a file list

[technical discussion on editor and how that would work]

ACTION: JohnJansen to file issues for 1) change the wpt CoC 2) delete the cla

<Boaz> http://web-platform-tests.live/

lukebjerring: one thing to mention, there is an implicit flow and how it copies it to web-platform-tests.live

<Boaz> will be http://wpt.live

lukebjerring: the flow for http://web-platform-tests.live/ will allow resources to work for people

fremy: I was unaware of this process
... we can discuss how this could work and make/fix features in the editor

CoC

<zcorpan> github: https://github.com/web-platform-tests/rfcs/pull/17

Boaz: there was a PR for the CoC that was opened a year ago. I and jorydotcom are happy to be points of contact
... and we need to get consensus on the RFC
... and make wpt a good place to contribute

jorydotcom: I spoke to gsnedders about some of the process. They had concerns about 2 people from the same company

jgraham: and I have spoken to gsnedders about this. One of the concerns was that a CoC enforcing group would need to reflect the diversity that we want, and be a large diverse group
... we need to make sure that we cover the diversity groups, and while the people offering are more diverse than this group, it doesn't reflect where we want to be

jorydotcom: absent the diversity we want, the people on the group need to make sure that the relevant people have the relevant training

Boaz: the right thing to do is to also add the process for people to volunteer on the CoC group

foolip: is the RFC process for adding/removing people?

jgraham: that is unclear. We need to have that properly documented
... and not everything needs to be documented in the public
... e.g. if there are concerns about a moderator being in the team, that doesn't need to be in the public sphere.

plh: we have moderators that feel supported to jump in and make the relevant comments to protect the group
... and it shouldn't really be from the core team

jorydotcom: I can share how the TC39 CoC works. People can email saying that they would like to join, then the CoC team will discuss, and then the CoC team emails a private list for TC39 only to say who is being added

jgraham: that process works for TC39 since they have closed membership, whereas we don't have that process and tooling for that
... we don't have a closed mailing list
... we would need to have a process where someone is nominated, self nominated is fine, and feedback would come back. The announcement needs to be public but the feedback is private

<Zakim> MikeSmith, you wanted to comment

MikeSmith: I want to ask if the moderators responding is private?

Boaz: no this is for the process for joining the moderators
... this sounds like we need to make the process and move it forward
... since we are from the same company we would like to get more people to join so it is not just us

jgraham: I can't speak for gsnedders as they have the concerns and they would need to give it
... I think that if we don't have it perfect and iterate with the idea of where we want to be then it is a good start

zcorpan: I want to discuss something with the moderators. What is the consequence if someone violates the CoC? A ban seems harsh and warnings are not harsh enough. We need to know how this could work

jorydotcom: There is a limited set of things that we can do as a group of volunteers. You need to build a ladder from "you broke our norms and here are some docs on what we agree and we want you in the community" to a full ban. In TC39 we do have temp bans for X months and keep that in a private file.

zcorpan: that is the thing that we need to record a history for a specific person on each violation
... and what was done and said to that person

Boaz: we can also explicitly add addendums on what we need to do

jgraham: the Rust community has a moderation policy on top of the CoC
... and what the ladder looks like and describes what the valid responses can be

<jgraham> https://www.rust-lang.org/policies/code-of-conduct#moderation

jgraham: e.g. keep responses private if you have been moderated

fremy: how would you appeal?
... if the group is small then the appeals might cause issues

Boaz: next steps?

AutomatedTester: we need to get gsnedders opinion on this

jgraham: I am not sure the RFC process is valid for this...
... the consensus is that it is not perfect but it is a good start and we need to iterate on this to make sure things aren't blocked

<scribe> ACTION: create process for adding moderators via RFC

<foolip> Alright, we should start the wpt.fyi session now

<ato> We should just start.

<foolip> scribenick: lukebjerring

<AutomatedTester> scribenick lukebjerring

WPT.FYI features, etc

<foolip> scribenick foolip

<foolip> lukebjerring: overview of the triage metadata project:

<ato> ScribeNick: foolip

lukebjerring: we've created https://github.com/web-platform-tests/wpt-metadata parallel to wpt
... it's out-of-tree metadata about the results, which wpt.fyi consumes to let people triage their searches
... there's a `link:something` atom that you can filter by
... work also underway to create UI for adding links when you spot the failures
... had some feedback which made us revise, new plan is to use GitHub login and have cookie to do stuff on your behalf
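
As a rough illustration of what filtering by a `link:something` atom involves, the sketch below matches results against triage links. The record shapes (`test`, `product`, `url`, `status` keys), the function name, and the example URLs are all assumptions for the example; they are not the wpt-metadata file format or the wpt.fyi implementation.

```python
def filter_by_link(results, links, needle):
    """Keep results that have a triage link whose URL contains needle."""
    kept = []
    for result in results:
        for link in links:
            same_test = link["test"] == result["test"]
            # Assumed rule: a link without a product applies to every browser.
            same_product = link.get("product") in (None, result["product"])
            if same_test and same_product and needle in link["url"]:
                kept.append(result)
                break
    return kept
```

Keeping the links in a separate list from the results mirrors the out-of-tree idea: the results are never modified, and the metadata is only consulted at query time.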

foolip: why is it important to do anything on the user's behalf?

jgraham: don't necessarily need to use GitHub API call with user's credentials, but it is a requirement that we know who the user is
... wpt.fyi wants to validate with GitHub who you are

lukebjerring: feature requests in CSS working group...
... they'd like to be able to search by flags in a test
... I've filed https://github.com/web-platform-tests/wpt.fyi/issues/1491 for labeling of results
... then a bot could extract the flags from the wpt repo's tests
... then search by label on wpt.fyi

jgraham: similar to what we do for manifest
... one way would be to put it into the manifest and update from that, but we'd worry about bloating it

lukebjerring: question of whether we need a separate repo
... main reason is we expect a lot of bloat that's only for wpt.fyi
... also have requests to filter by test type, and another for `<link rel="help">` links

jgraham: historically the only information in the manifest is what you need to run the tests
... now 18 MB or so, could easily grow much bigger

lukebjerring: another kind of metadata is flakiness
... we want a few different ways to add flaky metadata to the repo, and have that information be used for filtering in wpt.fyi

AutomatedTester: does flakiness apply to all browsers?

lukebjerring: it's like the triage links, you can specify which browser

<lukebjerring> lukebjerring: other suggestions welcome

<lukebjerring> jgraham: labels should be generic enough to cover most stuff

<ato> ScribeNick: lukebjerring

jgraham: wondering whether the tradeoff of keeping it separate from the wpt repo is justified
... could have a single job that has the right creds and only worry about the one repo

foolip: it's not too bad to use tokens

jgraham: GitHub actions give you a token more securely

foolip: there is an API you can call to trigger an action anyway

jgraham: one repo may end up being the right trade-off

foolip: would have to drop the files on import/export

jgraham: could just leave it there. info might be useful. you could atomically add a test and its metadata at the same time

foolip: taskcluster jobs will just have to be restarted, which wouldn't be the case for one repo

lukebjerring: also an issue with file renames, could catch it straight away

jgraham: we could add lint rules etc

lukebjerring: it's not clear whether we would break consumers of the wpt META.yml files

foolip: would we nest another file tree? or re-use the existing META.yml files

lukebjerring: one upside of separation is grabbing the whole repo as an archive without any bloat

foolip: deriving labels from the tests themselves, having bots that update the metadata, etc... could get messy

jgraham: initially we were only considering human-generated stuff. now we're trying to crawl data from the tests and exposing it

foolip: an extended manifest with derived info jgraham suggested could be an option here

jgraham: --extended manifest could just include extra data. if you need it you use it, if you don't, you don't
... also potentially gives you a complete list of tests that should have been run, which will help with validation of reports
... sounds like if we're extracting from the tests, we shouldn't have it in the META.yml files. that'd be... difficult / potentially divergent
... larger version of the manifest is the better option
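
The two-manifest idea can be sketched as below: one pass over the tests produces a base manifest with only what a runner needs and an extended one that also carries derived data. The field names (`url`, `type`, `timeout`, `spec_urls`, `flags`) and the function name are assumptions for illustration, not the real manifest schema or the wpt manifest tooling.

```python
# Hypothetical split: fields a runner needs versus fields only a dashboard wants.
RUN_FIELDS = ("url", "type", "timeout")


def build_manifests(tests):
    """Return (base, extended) manifests from a {path: info} mapping."""
    base = {}
    extended = {}
    for path, info in tests.items():
        # The base manifest keeps only what is needed to run the test.
        base[path] = {key: info[key] for key in RUN_FIELDS if key in info}
        # The extended manifest keeps everything, including derived data.
        extended[path] = dict(info)
    return base, extended
```

Both outputs list the same set of test paths, so the extended copy can double as the list of tests that should have been run when validating a report.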

JohnJansen: what would be the extra info?

jgraham: spec URL(s), some parsed meta tag info
... reason it's not done is just the sheer bloat. performance is also an issue, but can be solved with some incremental updates

foolip: manifest is huge, but most people don't update it, they download the latest one

jgraham: Windows struggles because of the amount of io

jgraham: we'll just silently produce two copies of the manifest, wpt.fyi will be the only extended manifest consumer.
... sounds like this addresses the needs.

foolip: so having the meta in wpt would solve the issues ? are we agreeing that we should converge the two?

lukebjerring: yep. and reuse the existing META.yml files, no need for parallel dir

foolip: deleting irrelevant META now becomes a bit more of an ask? lots of spam

Hexcles: We can have a GitHub action to archive out all the META.yml in isolation. maybe even the extended manifest in the same repo.

foolip: there's an item here for triaging regressions in major releases

lukebjerring: what's the feature request here? helper for building version-comparison query/diff?

Hexcles: difficulty is that WPT is changing over time, as well as the browsers

jgraham: just compare stable and beta on an ongoing basis.
... could come up with a deliberate overlap for older stable versions for a while.

Hexcles: we have different flags for beta. what we need is apples to apples comparison
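
One way to read "apples to apples" is to compare only runs taken at the same wpt revision and only tests present in both. The sketch below does that; the run shape (`wpt_sha` plus a `results` mapping of test to status) and the function name are assumptions for the example, not how wpt.fyi stores or diffs runs.

```python
def find_regressions(stable_run, beta_run):
    """Return tests that pass in stable but not in beta, for the same wpt revision."""
    if stable_run["wpt_sha"] != beta_run["wpt_sha"]:
        # Different revisions mean the tests themselves may have changed.
        raise ValueError("runs used different wpt revisions")
    stable = stable_run["results"]
    beta = beta_run["results"]
    # Only tests that ran in both browsers versions can be compared fairly.
    shared = stable.keys() & beta.keys()
    return sorted(t for t in shared if stable[t] == "PASS" and beta[t] != "PASS")
```

This does not account for differing browser flags between channels, which is the other half of the concern raised here.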

jgraham: We've tried to get beta and stable as close to vanilla as possible
... so far we haven't had anyone requesting this
... should the bugs be triaged exactly the same? I think probably they should.

foolip: browser specific failures are probably more useful, but the beta comparison is still good.

lukebjerring: let's talk about manual results submission
... hard part is aggregating conflicting submissions (which was a feature request)

jgraham: wpt.fyi is very run-centric. submitting one result at a time is not

JohnJansen: the model that they're using is running the test, saying pass/fail, maybe wait a long time, run another test, and then aggregate it all next to the automatically collected results

jgraham: we can have this convo with them, but you'll probably find that if it takes lots of manual data munging to get it all into one view that doesn't aggregate with the automatic data, they won't use it, they'll use the existing tool

JohnJansen: they used to have a locked test suite (locked to a SHA) and they would not change the rev of the suite

Hexcles: so they locked multiple components to different SHAs?

JohnJansen: yes because they're interested in whether it most recently passed

jgraham: If you can take a SHA, collapse by product (single column), that would fix the way wpt.fyi treats runs separately

JohnJansen: how would collapsing different runs in the UI handle conflicting results?

lukebjerring: most recent wins?

JohnJansen: I doubt they'd be OK with that, they'd want most authoritative wins
... they actually have a voting UI
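
A minimal sketch of the two conflict policies raised above ("most recent wins" versus "most authoritative wins") when several manual submissions for one product at one SHA are collapsed into a single column. The `Submission` record, its `authority` rank and the function names are hypothetical and chosen only for illustration; this is not wpt.fyi's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """One manually submitted result (hypothetical shape, not wpt.fyi's)."""
    test: str
    status: str       # e.g. "PASS" or "FAIL"
    timestamp: int    # seconds since epoch
    authority: int    # assumed rank of the submitter; higher is more trusted


def collapse(submissions, key):
    """Keep one submission per test, choosing the one with the largest key."""
    chosen = {}
    for sub in submissions:
        current = chosen.get(sub.test)
        if current is None or key(sub) > key(current):
            chosen[sub.test] = sub
    return {test: sub.status for test, sub in chosen.items()}


def most_recent_wins(submissions):
    return collapse(submissions, key=lambda s: s.timestamp)


def most_authoritative_wins(submissions):
    # Ties on authority fall back to the newer submission.
    return collapse(submissions, key=lambda s: (s.authority, s.timestamp))


runs = [
    Submission("/a.html", "FAIL", timestamp=100, authority=2),
    Submission("/a.html", "PASS", timestamp=200, authority=1),
    Submission("/b.html", "PASS", timestamp=150, authority=1),
]
```

With this sample data the two policies disagree on `/a.html`: the newer submission says PASS, while the higher-ranked one says FAIL. A voting UI would amount to a third `key` function.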

<fantasai> yo

<JohnJansen> yo

<ato> RRSAgent: Please draft the minutes

<gsnedders> RRSAgent: make the minutes

QUIC

<ato> ScribeNick: AutomatedTester

Hexcles: introduction: the Chromium team is asking for QUIC support. The problem is that python2 doesn't have the crypto support
... we want to see if there is a way that we can use a Go implementation

jgraham: what is the difference from H2?

Victor: what does it have in it?

jgraham: [explains details of H2 support]

Victor: QUIC and H2 are very different due to TLS support

Victor: if you look at the QUIC impl matrix, there are 20 on there and 10 well done

Victor: [describes different example impls]

jgraham: one question: what are the use cases for testing, and what API are you thinking of?
... traditionally we have been testing the interaction between protocol and features but not the protocol directly

Victor: tests are for WebTransport, which is similar to WebSocket but over QUIC

jgraham: What would the test look like?

yhirano_: we have a python script that ...(websocket mention)

jgraham: if something needs to be in the API, what would you write it in?

Victor: that would be in Go or Rust

jgraham: we currently don't have any compiled-code step; we can add one, but it's not there at the moment
... there is some impl complexity that needs to be supported

<gsnedders> https://wicg.github.io/web-transport/ is the specification in question, right?

Hexcles: could the QUIC team give us a prebuilt binary and compiled tests?
... and could there be a separate repo for the tests?

jgraham: ideally we need to have it in a single repo

yhirano_: we would have to have the handlers somewhere that can be updated regularly

Victor: is the requirement no compilation, or that it must be Python?

jgraham: there is no precedent for compiled code in wpt

Hexcles: but the language doesn't matter here, whether it is Rust or Go

jgraham: the complexity is that Firefox CI does Rust not Go
... and Google will probably have the opposite problem

ricea: [describes how we could have python over the go and connections to wptserve]
... or we could do the handlers in JS but it would be horrible
... my concern is that if we ship it we are limiting who can contribute to fix it, while if we ship the build script then more people could add things. if there is a compiled step, then having a handler in a non-compiled language makes sense

gsnedders: my concern is that if we want WebKit more involved, they have pushed back when there have been runtime dependencies

jgraham: we could have a process for compiling and then uploading to github for upstream, the browser version would be very different
... understanding the constraints for everyone would simplify things.

youennf: we have multiple servers why should we have it
... but at the end of the day the value of tests will win

<Hexcles> https://github.com/web-platform-tests/wpt/issues/19114

<Hexcles> ScribeNick: Hexcles

ergonomics of test writing

reillyg: I've found it difficult to debug WPT

foolip: is it the error message or wpt.fyi that's confusing?

reillyg: e.g. the "expand" button on wpt.fyi

lukebjerring: [explains the UI]

jgraham: what is the ideal state?

reillyg: summary of all the tests failed per browser (referring to Blink layout test viewer: expected, actual, diff)
... make it as obvious as possible -- a link to failing tests, especially in PR

lukebjerring: wpt.fyi GitHub Checks are not great (see discussion yesterday)

jgraham: the "diff" on wpt.fyi is different from the diff in Blink CI -- we don't have baselines

lukebjerring: perhaps expand the results by default

reillyg: want a summary page of all browsers instead of five different links

lukebjerring: probably better to build a new view for the PR use case

jgraham: collapse everything into one single check

lukebjerring: open issue: attach a list of run IDs to a check run instead of using e.g. "chrome@SHA" to avoid a race condition and to be able to show pending status

<ato> RRSAgent: Please draft the minutes.

<ato> RRSAgent: Please draft the minutes

shimazu asked about subtest timeouts (one subtest timing out makes the whole test time out)

jgraham: promise_test is especially problematic because they run sequentially
... if a subtest times out the subsequent tests won't run

lukebjerring: "difficult to determine subtest timeout" isn't a great argument. Being able to set a subtest timeout is better and more meaningful than a harness timeout

jgraham: allowing authors to define subtest timeout is error-prone and flaky (e.g. a virtual environment can be paused)

lukebjerring: but step_timeout takes multiplier into account; we can do the same for subtest timeout
... or we can set "short" "long" etc. timeout instead of concrete numbers
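
A rough sketch of the named-timeout idea above: authors pick a class such as "short" or "long" instead of a concrete number, and the runner scales it by the same kind of multiplier that step_timeout is said to respect. The class names, base values and the `subtest_timeout` function are assumptions made for illustration, not existing testharness.js or wptrunner API.

```python
# Hypothetical base budgets in seconds; the names mirror the "short"/"long"
# idea from the discussion, the numbers are made up for illustration.
BASE_TIMEOUTS = {"short": 2.0, "normal": 10.0, "long": 60.0}


def subtest_timeout(kind: str, multiplier: float = 1.0) -> float:
    """Return the effective timeout for a subtest class on this configuration."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return BASE_TIMEOUTS[kind] * multiplier
```

A slow configuration (for example a debug build) would pass a larger multiplier, so the test file itself never hard-codes wall-clock numbers.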

reillyg: slow test is usually a test with too many subtests
... shall we prefer splitting those
... can the harness monitor the progress of subtests?

jgraham: it sounds like a good design and possible, but we don't do that now (harness sends a JSON blob at the end)
... it doesn't solve your promise_test case?

reillyg: combine the two: the harness monitors the promise chain

jgraham: good idea

lukebjerring: e.g. similar to Travis (timeout if no output in 10m)
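
The "keep running as long as there is progress" idea can be sketched as an idle watchdog. This assumes the harness reports each subtest result as it completes, which per the discussion it does not do today (it sends one JSON blob at the end). `ProgressWatchdog` and its methods are hypothetical names; the clock is injectable so the behaviour can be checked without real waiting.

```python
import time


class ProgressWatchdog:
    """Times out only after a period with no subtest results (hypothetical)."""

    def __init__(self, idle_limit, clock=time.monotonic):
        self.idle_limit = idle_limit      # seconds allowed without progress
        self.clock = clock
        self.last_progress = clock()
        self.results = []

    def report(self, name, status):
        """Record a subtest result and reset the idle timer."""
        self.results.append((name, status))
        self.last_progress = self.clock()

    def timed_out(self):
        """True once more than idle_limit seconds passed with no report."""
        return self.clock() - self.last_progress > self.idle_limit
```

A runner would poll `timed_out()` while the test is active; a file with many slow subtests keeps running as long as results keep arriving, while a hung promise chain is cut off after `idle_limit` seconds.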

jgraham: there's a lot of effort; testdriver.js & wptrunner need to be re-engineered
... testdriver.js already has a queue

reillyg: improve the communication between harness and wptrunner. the test can declare the total number of subtests

jgraham: there are more complicated cases

foolip: for now we can split tests

shimazu: yes we are doing that
... one more thing: service worker un/registration is expensive, which becomes a big overhead if we split promise tests
... we use a promise test to do the setup

foolip: this is the same as IDL harness test

lukebjerring: this is not a great pattern

foolip: we don't have an async setup function
... [explains how this breaks async_test]

jgraham: yeah the lifecycle is tricky
... if we say "if you use async_setup you can only use promise_test" then we might be fine
... we can have promise_setup!

foolip: and emit errors if authors mix wrong setup/tests
... write an RFC!

lukebjerring: I imagine it's very common to have lots of promise tests stuffed in a test file to avoid the setup overhead

foolip: can we just call promise_setup setup? if it returns a thenable then it's a setup for promise_test

<gsnedders> jgraham: everyone loves things that are polymorphic on their return type, right?

jgraham: put this as an option in the RFC

lukebjerring: I actually sent an RFC earlier for explicit naming / against polymorphic parameters
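
To make the "polymorphic on return type" option concrete, here is a loose Python analogue in which an awaitable stands in for a JavaScript thenable. `run_setup` is a hypothetical helper, not testharness.js API: it runs the setup callback and, only if the callback returned something awaitable, waits for it and reports that the file is in "async" mode.

```python
import asyncio
import inspect


async def _wait(awaitable):
    """Await the value returned by an asynchronous setup callback."""
    await awaitable


def run_setup(setup_fn):
    """Run setup_fn; if it returns an awaitable, wait for it to finish.

    Returns "async" or "sync" so a harness could reject mixed test types.
    """
    result = setup_fn()
    if inspect.isawaitable(result):
        asyncio.run(_wait(result))
        return "async"
    return "sync"
```

The explicit alternative is a separately named entry point (promise_setup in the discussion), which avoids inspecting the return value at all.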

<scribe> ACTION: foolip to RFC for async setup() (or promise_setup())

<scribe> ACTION: jgraham to file an issue to keep running as long as they’re producing subtest results

foolip: switching topic to single page tests
... [explains single page tests]
... it was added earlier to reduce boilerplate (similar to mochitests)
... useful for async tests (saves a layer of wrapping)
... it's become problematic because of accidental opt-in
... ~600 tests that fail strangely because they accidentally trigger single page mode (e.g. they trigger an exception before starting any test)

jgraham and foolip debating what result type is / and should be in this case

foolip: [explains again when single page mode is triggered -- see RFC / docs]

https://github.com/web-platform-tests/rfcs/pull/28

lukebjerring: what's the motivation?

jgraham: not for async tests, but e.g. something that responds to multiple events, each of which is a callback that needs to be wrapped in t.step
... if you don't wrap the callbacks, it ends up as a harness error
... so the real use case is to avoid wrapping every single callback

foolip: my proposal is an explicit opt-in
... the real issue is difficulty to write correct async tests
... if we solve that we can get rid of single page tests altogether

jgraham: I'm historically opposed to adding explicit opt-in to a feature which aims to avoid boilerplate

lukebjerring: there are so few single-page tests (~130), so adding a one-line opt-in isn't too bad

jgraham: if there are only 130 of these -- although it seems to me this is a useful feature -- people are not using the feature much so it doesn't really matter?

odejesush: another issue: sometimes people forget an event listener (e.g. for bluetooth disconnection)

foolip: is the problem just forgetting to write an await?

reillyg: perhaps a linter?
... rewrite all in TypeScript!

(laugh)

foolip: [back to single-page tests] can we require setup.. explicit done?

jgraham: that sounds like a bad opt-in

foolip: to fix 9000(?) tests, we can either add the t.step wrapper or turn them into single-page tests
... 9000 is the number of tests that have only one subtest
... they are not necessarily broken
... but if there's a harness error in such case we could fix it by adding the t.step wrapper or turning them into single-page tests

2020 priorities

https://docs.google.com/document/u/1/d/1gie7LFb6cAUfabY86MYuWM7m7ux_FaKhDkLdpz0zWkg/edit?usp=sharing

foolip: last year we wrote lots of things
... but nobody looked at it over the year
... though we managed to get lots of them done

jgraham: this is not going to be something that has teeth, but something to align on

lukebjerring: let's enumerate stuff that's important

Hexcles: can we start by going over action items?

jgraham: they are concrete things to do but not goals
... need to find the underlying theme

<MikeSmith> actions?

[please refer to https://docs.google.com/document/d/1gie7LFb6cAUfabY86MYuWM7m7ux_FaKhDkLdpz0zWkg/edit]

[people are discussing wording of the goals]

<ato> RRSAgent: make minutes

<denis> RRSAgent: make minutes

<zcorpan> https://bocoup.github.io/wpt-disabled-tests-report/

i/scribenick AutomatedTester/Topic: QUIC

- DRAFT -

Web Platform Tests, Day 2, TPAC 2019

17 Sep 2019

Attendees

Contents

Test Automation

RFC: Test Editor

CoC

WPT.FYI features, etc

QUIC

ergonomics of test writing

2020 priorities

Summary of Action Items

Summary of Resolutions

Scribe.perl diagnostic output