Copyright © 2002-2004 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
The principal goal of this document is to help W3C Working Groups to develop more useful and usable test materials. The material is presented as a set of organizing guidelines and verifiable checkpoints. This document is one in a family of Framework documents of the Quality Assurance (QA) Activity, which includes the other existing or in-progress specifications: Introduction, Operational Guidelines, and Specification Guidelines.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is a W3C Working Draft made available by the W3C Quality Assurance (QA) Working Group for discussion by W3C members and other interested parties. For more information about the QA Activity, please see the QA Activity statement.
This update to the QA Framework: Test Guidelines is incomplete and still very much a work in progress. However, it does reflect changes resulting from feedback received since the previous version was published. A significant number of outstanding issues and comments have yet to be resolved. Where appropriate, these are indicated in the text by Editorial Notes that appear as follows:
Editorial note. ...body of note, etc....
Additional feedback and comments are encouraged. This Working Draft also does not reflect issues about the QA Framework that have arisen during the Candidate Recommendation period of Operational Guidelines [QAF-OPS] and Specification Guidelines [QAF-SPEC]. The QA Working Group is currently considering the disposition of these issues, and potential changes to the QA Framework family of documents.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
The QA Working Group Patent Disclosure page contains details on known patents related to this specification, in conformance with the 24 January 2002 CPP as amended by the W3C Patent Policy Transition Procedure.
You may email comments on this document to www-qa@w3.org, the publicly archived list of the QA Interest Group [QAIG]. Please note that your comments will be publicly archived and available; do not send information you would not want to see distributed, such as private data.
The scope of this specification is a set of requirements for Test Materials that, if satisfied, will enhance the usability and usefulness of the test materials. It covers the analysis and coverage of specifications, the prioritization and management of test cases, test frameworks, and result reporting.
The goal is to help W3C Working Groups (WGs) and test developers to develop test materials that provide consistent, reproducible results within a well defined and clear scope.
This specification addresses two classes of product: conformance test materials - including conformance test suites, validation tools, conformance checklists, and any other materials that are used to check or indicate conformance - and certain metadata about these test materials (for example, the results of reviewing or testing them). For a list of some of the types of specification for which test materials may be developed, see the QA Framework: Specification Guidelines.
The intended audience of these guidelines is developers of conformance test materials for W3C specifications. However, they should also be of interest to W3C Working Group members and to users of test materials.
While it is preferable that test development begin early in the process, these guidelines are intended for newly chartered and existing Working Groups alike. WGs whose work is already in progress should review this document and incorporate its principles and guidelines into their test materials as much as possible.
The development of test materials helps to enhance the quality of specifications by identifying aspects of the specification that are ambiguous, contradictory or unimplementable. By helping to improve the clarity of the specification, the quality of implementations is also improved. Conformance test materials also help to ensure that independent implementations conform to the specification, thereby increasing the likelihood that these implementations are interoperable.
This document is part of a family of QA Framework documents designed to help the WGs improve all aspects of their quality practices. The QA Framework documents are:
Although the QA Framework documents are interrelated and complement each other, they are independently implementable. For example, the anatomy of a specification is related to the type of test materials that will be built, hence there is an interrelationship between this document and the Specification Guidelines. The reader is strongly encouraged to be familiar with the other documents in the family.
The Framework as a whole is intended for all Working Groups, as well as developers of conformance materials for W3C specifications. Not only are the Working Groups the consumers of these guidelines, they are also key contributors. The guidelines capture the experiences, good practices, activities, and lessons learned of the Working Groups and present them in a comprehensive, cohesive set of documents for all to use and benefit from. The objective is to reuse what works rather than reinvent and to foster consistency across the various Working Group quality activities and deliverables.
This specification applies a model of organizing guidelines and verifiable checkpoints to test materials. Each guideline includes:
Editorial Note. Provide an overview of the flow of guidelines.
The checkpoint definitions in each guideline define what needs to be done in order to accomplish the guideline. Each checkpoint definition includes:
Each checkpoint is intended to be specific enough that someone can implement it and verify that it has been satisfied. A checkpoint will contain at least one, and may contain multiple, individual requirements that use RFC 2119 normative keywords. See the Conformance section for further discussion of requirements and test assertions. Two separate appendices to this document, [TEST-CHECKLIST] and [TEST-ICS], present all checkpoints in tabular form, sorted in their original order and sorted by their priorities, for convenient reference. The latter is an Implementation Conformance Statement (ICS) pro forma for this specification.
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY ", and "OPTIONAL" will be used as defined in RFC 2119 [RFC2119]. When used with the normative RFC2119 meanings, they will be all uppercase. Occurrences of these words in lowercase comprise normal prose usage, with no normative implications.
Unusual terms in these framework documents are defined when first used. When used in this specification, terms have the meaning assigned in the "Definitions" chapter and the QA Glossary [QA-GLOSSARY]. Terms in Definitions may supplement or build on the definitions in the generic QA Glossary, further refining them in the context of these Test Guidelines; they will not contradict the generic definitions. Terms herein that also appear in the QA Glossary include a link to the QA definition.
Some checkpoints are more critical than others for producing a high quality, testable standard that is a sound basis for successfully interoperable products. Therefore each checkpoint is assigned a priority level based on the checkpoint's impact on the quality of the specifications produced by the Working Groups.
[Priority 1]
Critical/essential. These checkpoints are considered to be basic requirements for ensuring an acceptable level of quality and usability in the test materials.
[Priority 2]
Important/desirable. Satisfying these checkpoints, in addition to the Priority 1 checkpoints, should significantly improve the quality and usability of the test materials.
[Priority 3]
Useful/beneficial. Satisfying these checkpoints, on top of all the others, will further improve the quality and usability of the test materials.
Working Groups choose to develop conformance tests for a variety of reasons:
The term Quality Assurance is commonly used to refer to a wide variety of software testing activities, including:
Conformance testing
Testing conducted to verify that an implementation conforms to a formal specification (typically one defined by a standards organization).
Functional testing
Testing conducted to verify that software meets its functional requirements (which may include, but typically exceed, conformance to formal standards).
Interoperability testing
Testing to verify that two or more software products are capable of interacting with each other, perhaps via a communications or messaging protocol, or by exchanging data through some other means. Note that conformance to specifications is a necessary, but probably not sufficient, condition for two systems to be interoperable.
Performance testing
Testing conducted to evaluate the compliance of a system or component with specified performance requirements.
Stress testing
Testing conducted to evaluate a system or component at or beyond the limits of its specified requirements.
Usability testing
Testing conducted to evaluate the ease with which users can learn and use a software product.
These guidelines primarily address conformance testing (a later revision of this document may address other types of testing). However, since conformance testing focuses on verifying conformance to (compliance with) formal specifications, the principles and practices outlined in this document should prove applicable to other forms of testing that can be expressed in these requirements-based terms.
Working Groups adopt a variety of strategies for developing specifications and their associated tests. Some adopt a formal 'waterfall model' of development whereby conceptualization, requirements analysis, design, implementation, and testing follow each other in a linear fashion. Others prefer a more exploratory or iterative approach, developing tests in parallel with the specification, while still others might prefer a 'test-driven development' approach, whereby test cases are developed first and the implementation afterwards. The latter two approaches can be particularly effective in helping to improve the quality of the specification, and in clarifying whether proposed features are actually implementable.
These Guidelines address the results of the test development process (the test materials and their associated metadata) rather than the process by which tests are developed. Each of these development models can be an appropriate way of developing conformance tests, and in practice many conformance tests are developed using the iterative development model, through which the process of test development helps to clarify and improve the quality of the specification by identifying ambiguous and unimplementable requirements.
However, if tests are developed before or concurrently with the relevant portion of the specification it is essential that they be reviewed after the specification has reached its final form in order to verify that they do actually test behaviour that is defined in the specification. Tests that were developed to explore functionality that was ultimately eliminated from or redefined within the specification, while useful, cannot be considered conformance tests. (Depending on the circumstances it might be possible to modify them so that they do test specified behaviour.)
Editorial Note. To be supplied.
In order to determine the testing strategy or strategies to be used, a high-level analysis of the structure of the specification (the subject of the test suite) must be performed. The better the initial analysis, the clearer the testing strategy will be.
The test documentation MUST identify the specifications to be tested.
Many specifications make reference to and/or depend on other specifications. It is important to determine the extent to which the specification under test can assume that referenced specifications have already been conformance-tested and conversely, the extent to which referenced specifications must be explicitly tested.
The test suite documentation MUST define its scope and intended purpose.
When developing test suites it is critical to understand their purpose and scope.
The scope describes the areas covered by the test suite, thereby indicating the applicability of the test suite, the motivation and objectives, and coverage. For example, the intended purpose and coverage of tests for a CR document may be different from those for a Recommendation.
The test suite documentation MUST describe the testing approach.
Different areas of the specification under test may require different testing approaches (for example, low-level API testing as opposed to higher-level user-scenario testing). The test-suite documentation should explain any partitioning of the specification (and other referenced specifications) and the testing approach adopted for each partition.
As recommended in the QA Specification Guidelines, an important prerequisite of test development is the identification of test assertions from the specification.
Test assertions MUST be identified and documented.
Conformance testing involves the verification of normative statements within specifications. If such statements ("test assertions") cannot be unambiguously identified, the validity of the corresponding tests will be in dispute.
In some cases assertions can be directly identified within the specification (that is, by identifying text from within the specification). In other cases assertions may be derived from the specification, perhaps automatically.
The Working Group MUST define the required set of metadata that will be associated with test assertions. This MUST include at least the following data:
It must be possible to uniquely identify assertions, and to map them to a particular location, or to particular text, within the specification.
Note that the metadata defined in this checkpoint is not exhaustive; Working Groups are encouraged to define additional metadata as appropriate. For example, some Working Groups may choose to record whether specified behaviour is ambiguous, contradictory, incomplete, untestable, or deprecated.
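For example, a minimal assertion record might look like the following Python sketch. The field names are illustrative only; the actual required set is whatever the Working Group defines:

    from dataclasses import dataclass

    @dataclass
    class TestAssertion:
        """One testable statement identified within, or derived from, the specification."""
        assertion_id: str   # unique identifier, e.g. "sec4.2-a3"
        spec_section: str   # location of the source text within the specification
        spec_text: str      # the normative text on which the assertion is based
        derived: bool       # False if quoted directly, True if derived from the text

A record of this shape satisfies the uniqueness and mapping requirement above: the identifier is unique, and the section reference maps the assertion back to a particular location in the specification.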
Editorial Note. These checkpoints overlap with 10.1 and 10.2 from SpecGL ("Provide test assertions" and "Provide a mapping between the specification and the test assertions list"). Is it appropriate to duplicate them here?
When test materials are submitted to the Working Group they will need to be cataloged, reviewed, and tested before they can be documented, packaged, and published.
In order to facilitate these activities it is advisable to provide appropriate metadata for managing test materials (including how test materials are identified, where they will be stored, and how they can be selected and filtered according to various criteria).
The Working Group MUST define (or reuse, if existing) the required set of metadata that will be associated with test materials. This MUST include at least the following data:
Well defined test metadata will allow tests to be filtered and selected (whether at 'build time', while a collection of tests is being constructed, or at 'run time' when they are executed), will facilitate the test development and review processes, and is essential for the definition of a test execution process (see Guideline 4). 'Build time' test selection makes it possible to select for execution only those tests that are applicable to the implementation under test. (Tests for unsupported features should be filtered out.) Runtime test selection allows test execution to focus on particular areas of interest or concern.
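As an illustration of both selection modes, consider the following sketch; the metadata field names are hypothetical, not mandated by this checkpoint:

    # Each test carries metadata; the field names below are illustrative only.
    tests = [
        {"id": "t001", "feature": "core",       "assertions": ["sec1-a1"]},
        {"id": "t002", "feature": "optional-x", "assertions": ["sec5-a2"]},
        {"id": "t003", "feature": "core",       "assertions": ["sec2-a4"]},
    ]

    # 'Build time': keep only tests applicable to the implementation under test.
    supported = {"core"}
    applicable = [t for t in tests if t["feature"] in supported]

    # 'Run time': narrow execution to tests exercising a particular assertion.
    focus = [t for t in applicable if "sec2-a4" in t["assertions"]]
    print([t["id"] for t in focus])  # ['t003']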
Some Working Groups may choose to create some test metadata elements even before test development begins, as an aid to planning and managing the test development process. For example, it may be helpful to provide the test purpose and description as a form of test-specification.
Note that the metadata defined in this checkpoint is not exhaustive; Working Groups are encouraged to define additional metadata as appropriate. Some examples are:
Note that as these Guidelines become more widely adopted, it is increasingly likely that Working Groups will be able to adopt existing metadata schemas rather than having to define their own. When the Working Group requests test submissions, it should also request that the appropriate metadata be supplied.
Once the appropriate metadata has been defined, automation, perhaps by providing a web-based interface to a database backend, would simplify the process of organizing, selecting, and filtering test materials.
Editorial Note. Comment on automation belongs in ExTech?
Editorial Note. The term "test materials" seems clumsy in this context (the majority of metadata elements refer to "tests").
Editorial Note. Add to ExTech the recommendation that definition of metadata be the responsibility of a single individual or a small group - don't "design by committee".
The test materials management process MUST provide coverage data. At a minimum, a mapping of test cases to assertion ids MUST be published, and the percentage of assertions for which at least one test-case exists MUST be calculated and published.
In order to thoroughly test an implementation, each area of functionality must be systematically tested. Test developers and the Working Group must know what has been covered, what still needs to be covered and what is not applicable. A mapping of test cases to test assertions is the best way to track this.
In more thorough test suites it is likely that a single assertion would be associated with more than one test case. Note that while the coverage metric specified above provides a measurement of breadth of coverage, it says nothing about the extent to which assertions are thoroughly tested (depth of coverage).
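For example, the breadth-of-coverage percentage required by this checkpoint can be computed directly from the mapping of test cases to assertion ids, as in this sketch (the data shapes are assumed):

    # Hypothetical mapping of test case ids to the assertion ids they exercise.
    test_to_assertions = {
        "t001": ["sec1-a1", "sec1-a2"],
        "t002": ["sec1-a2"],
        "t003": ["sec3-a1"],
    }
    all_assertions = {"sec1-a1", "sec1-a2", "sec2-a1", "sec3-a1"}

    covered = {a for ids in test_to_assertions.values() for a in ids}
    breadth = 100.0 * len(covered & all_assertions) / len(all_assertions)
    print("%.0f%% of assertions have at least one test case" % breadth)  # 75%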
Editorial Note. Now that we've dropped the previous Checkpoint 3.1 (Define the process for managing test materials) this checkpoint comes as something of a surprise (what process for managing test materials?).
The test materials management process MUST be automated.
Automation of the test materials management process, perhaps by providing a web-based interface to a database backend, will simplify the process of organizing, selecting, and filtering test materials.
The process for executing tests MUST be well defined and documented.
The test-execution process should be repeatable and reproducible; different test execution runs on the same system, even if performed by different testers, should return identical results. This can only be ensured by explicitly specifying the selection of tests to be run, the mechanisms for invoking them, and the way in which test results should be interpreted.
Clear and unambiguous instructions on how to run the test suite should be supplied, so that no two users who run the same test suite under the same conditions, following the instructions, obtain different results (the test suite should be deterministic).
The test material documentation MUST specify where test results may not be repeatable or reproducible.
If some test results are potentially not reproducible or repeatable, this MUST be documented.
If test results may vary depending on the order in which tests are executed, or for any other reason, this must be documented to allow results to be interpreted correctly, and to enable different test runs to be compared. Wherever possible, the documentation should explain what must be done in order to ensure that the results are reproducible and repeatable to the greatest extent possible.
The documentation for the test execution process MUST explain how to filter out tests that should not be run for a particular implementation.
Only those tests that are applicable to the implementation under test should be run. It must be possible to filter out those tests that do not apply.
Before deciding which tests to select and run for an implementation, it should be clear what the tests test and what the implementation is designed to support. In addition, the set of tests that were filtered, chosen, and run should be documented before conformance is assessed.
Test execution MUST be automated in a cross-platform manner.
If feasible, automating the test execution process is the best way to ensure that it is repeatable and deterministic, as required by Checkpoint 4.1. If the test execution process is automated, this should be done in a cross-platform manner, so that all implementers may take advantage of the automation. The automation system must support running a subset of tests based on various selection criteria, as suggested by Checkpoint 4.3.
It is important that the test suite be executable on as many platforms as possible; to this end, strive to make the test suite run independently of any particular platform.
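A deliberately minimal harness sketch follows; the directory layout, the use of exit codes, and the "pass"/"fail" strings are assumptions of the sketch, not requirements of this checkpoint. Because it relies only on the Python standard library, it runs unchanged on any platform with a Python interpreter, and it also captures the outcome of each test (see the next checkpoint):

    import subprocess
    import sys
    from pathlib import Path

    def run_suite(test_dir, selected):
        """Run each selected test script and record its outcome."""
        results = {}
        for name in selected:
            script = Path(test_dir) / name
            # Exit code 0 is taken to mean "pass" -- a convention of this
            # sketch only; real test materials must document their own.
            proc = subprocess.run([sys.executable, str(script)],
                                  capture_output=True, text=True)
            results[name] = "pass" if proc.returncode == 0 else "fail"
        return results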
Editorial Note. If an automation framework is supplied and available on the implementation platform, should it be required, or should testers be permitted to provide their own and still make conformance claims? (See Issue #27)
If the test execution process is automated, the automation system MUST also capture test results.
If tests are executed automatically by some kind of test harness it will be a relatively simple matter to design the harness to capture test results for review by the person executing the tests, and for later automated processing.
However test materials are developed (typically in a distributed manner, by vendors who are interested in the specific technology), they must be reviewed, tested, documented, and packaged together with other materials (documentation, data files, perhaps a test harness) into a "test suite", which itself must be tested before it is finally released.
The test material metadata MUST include the results of reviewing the tests.
Working Groups usually state requirements for the submission of test materials. These will typically include a request that certain metadata be provided with the test materials (see Checkpoint 3.1 above), and may include additional requirements, for example that assertion lists and coverage information be provided. Thorough review of the submitted materials will help to ensure that these additional requirements have been met. The results of this review should be included in the test materials metadata.
Under some circumstances it may be appropriate or even necessary that test materials not meet all of the submission requirements. Reviewers should be granted adequate discretion to deal with cases such as these. Note also that "review" need not necessarily imply that each test was scrutinized by a human reviewer. An automated scanning and parsing process may be sufficient in some cases, whereas in other cases the review might consist of executing a large group of tests against multiple implementations, or otherwise testing them in an automated manner.
The test materials MUST include documentation explaining how they are to be used.
Guideline 4 requires that the test execution process be repeatable and reproducible. This will typically require the creation of some end-user documentation that explains to testers how to use the test materials.
The test materials MUST be packaged together with all applicable user documentation, licenses, data files, tools, and test harness into a comprehensive test suite.
Editorial Note. As written, this conformance requirement is untestable. We need to specify what is required and what is optional, and must also define terms.
Guideline 4 requires that the test execution process be repeatable and reproducible. This would be difficult to achieve if it were left to the users of the test materials to determine which materials are required, and how to combine them.
A simple tar or zip file, containing all relevant test materials and documentation, should be sufficient to meet the requirements of this checkpoint.
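A release script along the following lines would meet that bar (the file layout and naming scheme are hypothetical):

    import zipfile
    from pathlib import Path

    def package_suite(root, version):
        """Bundle all test materials under `root` into one versioned zip archive."""
        archive = "testsuite-%s.zip" % version
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            # Include everything: tests, documentation, licenses, data files.
            for path in sorted(Path(root).rglob("*")):
                if path.is_file():
                    zf.write(path, path.relative_to(root))
        return archive

Versioning the archive name also anticipates the requirement below that test suites be published with version numbers.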
Editorial Note. Checkpoints 5.2 and 5.3 overlap with 5.1 from OpsGL ("Ensure test materials are documented and usable for their intended purposes").
The Working Group MUST create and publish a test plan for testing the test materials and the test suite as a whole. The results of executing this test plan MUST be published.
If tests are buggy or incorrect this will be noticed eventually, and will lead to their exclusion from the test suite. Similarly, if the test suite as a whole has problems (if it is difficult to install, or has inaccurate documentation) then these problems will be reported by users. It is preferable to discover and correct these problems before the test suite is released rather than to wait for users to discover them while testing their implementations. The test materials must therefore be tested to ensure that they function correctly, and that they correctly test what they claim to test, and the test suite as a whole must be tested to ensure that it is usable and that it performs as designed. Publishing the test plan and test results will help to ensure that users of the test suite have confidence in its quality.
Editorial Note. Should we create separate checkpoints addressing testing the individual tests and testing the test suite as a whole?
Editorial Note. Add recommendation to ExTech that the test suite be beta-tested before release.
The test materials MUST include a mechanism for users to provide feedback.
Test suites can reach high quality only if they themselves are thoroughly tested. This testing process should include real-world testing by users and potential users of the test materials, and the results of this testing should be fed back to the test suite developers.
Test suites MUST be published with version numbers.
Test suites must be explicitly released (rather than continuously evolved by constantly updating a website), and must be versioned. If they are not, test runs will not be repeatable and reproducible, and it will be impossible to effectively compare the results of different test runs.
A well defined and consistent mechanism for reporting test results is essential to ensure that the test materials are easy to use and that it is possible to compare the results of executing the tests on different implementations. This will make it easier to compare implementation feature sets to determine interoperability. It also provides data that supports the process of identifying overall and feature-specific levels of support among implementations.
Tests MUST report their execution outcome in a consistent manner. At a minimum, tests MUST report the following statuses:
If tests do not report their outcome, or if they do so in an inconsistent manner, it will be difficult or even impossible to unambiguously determine the results of a test execution run.
These result statuses are derived from EARL (the Evaluation And Report Language).
It is not necessary for tests to automatically report status for this checkpoint to be met. It would be sufficient, in the case of manually executed tests, for the test execution procedure to unambiguously define how the person executing the tests should determine the test execution status.
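As a sketch, a consistent outcome vocabulary might be represented as follows; the specific status names here are assumptions, modelled loosely on EARL's outcome vocabulary rather than quoted from it:

    from enum import Enum

    class Outcome(Enum):
        """Result statuses modelled loosely on EARL's outcome vocabulary."""
        PASSED = "pass"
        FAILED = "fail"
        CANNOT_TELL = "cannotTell"        # outcome could not be determined
        NOT_APPLICABLE = "notApplicable"  # test does not apply to this implementation
        NOT_TESTED = "notTested"          # test was filtered out or never run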
When tests fail, they MUST report on the reason for failure.
In order to improve an implementation the developers must know what is failing and how it is failing in order to fix it. The more details that are provided about the failure, the easier it is for the developers to locate and fix the problem. Failing tests should report whatever information they have about the nature of the failure (for example, they should report the conditions they expected to encounter and what actually occurred).
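Extending the outcome sketch above, a failing test could carry a structured reason alongside its status (field names hypothetical):

    from dataclasses import dataclass

    @dataclass
    class TestResult:
        test_id: str
        outcome: "Outcome"  # the Outcome enum from the previous sketch
        reason: str = ""    # on failure: expected versus actual behaviour

    result = TestResult(
        test_id="t002",
        outcome=Outcome.FAILED,
        reason="expected a parse error for an undeclared entity; "
               "the document was accepted without error",
    )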
Define an interface to enable publishing the results of test execution runs.
A well defined and standardized mechanism for publishing the results of test execution runs will facilitate the comparison of different test results. Working Groups must define a standard form for results reporting, and make the necessary style sheets available to implementers. This will facilitate their publication on the web.
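Continuing the sketches above, one possible (assumed, not normative) shape for such an interface is a simple XML serialization that a Working Group's style sheet could then render for publication on the web:

    from xml.etree.ElementTree import Element, SubElement, tostring

    def results_report(results):
        """Serialize TestResult records (previous sketch) into simple XML."""
        root = Element("testrun")
        for r in results:
            entry = SubElement(root, "result",
                               id=r.test_id, outcome=r.outcome.value)
            if r.reason:
                entry.text = r.reason
        return tostring(root, encoding="unicode")

    # results_report([result]) produces, e.g.:
    # <testrun><result id="t002" outcome="fail">expected a parse error ...</result></testrun>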
This section defines conformance of Working Group Test Materials to the requirements of this QA Framework guidelines specification. The requirements of this guidelines specification are detailed in the checkpoints of the preceding "Guidelines" chapter, and apply to the Test Materials produced by Working Groups.
The following parts of this document are normative:
Text that is designated as normative is directly applicable to achieving conformance to this document. Informative parts of this document consist of examples, extended explanations, terminology, and other matter that contains information that should be understood for proper implementation of this document.
This specification is extensible. That is, adding conformance-related information and structure to the test materials in ways beyond what is presented in this specification is allowed and encouraged. Extensions to this specification MUST NOT contradict or negate the requirements of this specification.
The rationale for allowing Working Groups to define extensions to these test guidelines is that these requirements are considered to be the minimal requirements for developing effective conformance test materials. Doing more than the minimum is not only acceptable, but beneficial. Extensions also allow Working Groups to tailor their test materials more closely to the technology and their specific needs.
Within each prioritized checkpoint there is at least one conformance requirement. These are prefaced by "Conformance requirements" and highlighted in a different style.
Editorial Note. Insert statement about test assertions.
This section defines three degrees of conformance to this guidelines specification:
A checkpoint is satisfied by fulfilling all of the mandatory conformance requirements. Mandatory requirements are those that use the conformance keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", or "SHALL NOT".
Note that it is allowed (even encouraged) to implement checkpoints in addition to those required to satisfy a particular degree of conformance (A, AA, or AAA). The more checkpoints that are satisfied, the better. However, there is no additional conformance designation for such intermediate collections of checkpoints (i.e., for all checkpoints of a given level plus some, but not all, of the checkpoints of the next levels).
The minimum recommended conformance to this specification is A-conforming -- all Priority 1 (critical/essential) checkpoints satisfied.
Test materials conform to the QA Framework: Test Guidelines at degree X (A, AA, or AAA) if the Working Group meets at least all degree X conformance requirements.
An assertion of conformance to this specification -- i.e., a conformance claim -- MUST specify:
Example:
Editorial Note. Insert example conformance claim.
The checklist for this specification ([TEST-ICS]) is the Implementation Conformance Statement (ICS) pro forma for this specification. Any assertion of conformance to this specification MUST link to a completed ICS.
The checkpoints of this guidelines specification present verifiable conformance requirements about the test materials that Working Groups produce. As with any verifiable conformance requirements, users should be aware that:
Passing all of the requirements to achieve a given degree of conformance — A, AA, or AAA — does not guarantee that the subject test materials are well-suited to or will achieve their intended purposes.
This section contains terms used in this specification, with functional or contextual definitions appropriate for this specification. See also [QA-GLOSSARY]. Some terms in this section have been borrowed or adapted from other specifications.
The following present and former QA Working Group and Interest Group participants have contributed to this document:
Major revision based on written feedback and face-to-face discussions.
Added Patrick's text for GL 1 and 2. Wrote text for GL 3 and 4. Rewrote GL 5 and merged it into GL 3 and 4 and tried to address some of the comments made by Mark and Sandra in e-mail.
Moved the definitions section to its new location and updated it.
Started work on new introduction using the same format as the other framework documents.
Added new outline to document; commented out a number of sections that need editing.
Edited and improved the Introduction (goals, motivation, document structure).
Updated the definition of the checkpoint's Priorities.
Corrected abstract, SOT.
Changed the goal of the document and the wording of the checkpoints/guidelines to focus on testing strategy, moving all the tips on tactics to ExTech.
Added examples to most of the checkpoints.
Incorporated all the editors' comments.
Rewrote several checkpoints in GL 5; defined results verification and reporting as part of the test framework to remove redundant checkpoints.
Rewrote guidelines 6 and 7 to focus on the strategy of test development and testing, rather than on tactics.
Updated the conformance section.
Expanded introduction, added motivation, etc.
Added examples to the checkpoints in the GL1, 2, 3.
[MS] Changed the text of many checkpoints to make them verifiable.
[DD] First pass on Introduction, added more text to the checkpoints in the GL3-5.
Fixed definitions of priorities.
Fixed the glitch with the "Test Areas" guideline.
Added clarification to CP 1.1, 1.2, 1.5 (removed "vague"), 1.6.
Added short prose to each checkpoint.
First draft outline.