WebGPU Shading Language

W3C Working Draft,

This version:
https://www.w3.org/TR/2022/WD-WGSL-20221031/
Latest published version:
https://www.w3.org/TR/WGSL/
Editor's Draft:
https://gpuweb.github.io/gpuweb/wgsl/
Previous Versions:
History:
https://www.w3.org/standards/history/WGSL
Feedback:
public-gpu@w3.org with subject line “[WGSL] … message topic …” (archives)
GitHub
Inline In Spec
Editors:
(Google)
(Google)
Former Editors:
(Apple Inc.)
(Google)
Participate:
File an issue (open issues)
Tests:
WebGPU CTS shader/

Abstract

Shading language for WebGPU.

Status of this document

This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

Feedback and comments on this specification are welcome. GitHub Issues are preferred for discussion on this specification. Alternatively, you can send comments to the GPU for the Web Working Group’s mailing-list, public-gpu@w3.org (archives). This draft highlights some of the pending issues that are still to be discussed in the working group. No decision has been taken on the outcome of these issues including whether they are valid.

This document was published by the GPU for the Web Working Group as a Working Draft using the Recommendation track. This document is intended to become a W3C Recommendation.

Publication as a Working Draft does not imply endorsement by W3C and its Members.

This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 2 November 2021 W3C Process Document.

1. Introduction

WebGPU Shading Language (WGSL) is the shader language for [WebGPU]. That is, an application using the WebGPU API uses WGSL to express the programs, known as shaders, that run on the GPU.

@fragment
fn main() -> @location(0) vec4<f32> {
    return vec4<f32>(0.4, 0.4, 0.8, 1.0);
}

1.1. Technical Overview

WebGPU issues a unit of work to the GPU in the form of a GPU command. WGSL is concerned with two kinds of GPU commands:

Both kinds of pipelines use shaders written in WGSL.

A shader is the portion of a WGSL program that executes a shader stage in a pipeline. A shader comprises:

Note: A WGSL program does not require an entry point; however, such a program cannot be executed by the API because an entry point is required to create a WebGPU § GPUProgrammableStage.

When executing a shader stage, the implementation:

A WGSL program is organized into:

WGSL is an imperative language: behavior is specified as a sequence of statements to execute. Statements can:

WGSL is statically typed: each value computed by a particular expression is in a specific type, determined only by examining the program source.

WGSL has types to describe booleans, numbers, vectors, matrices, and aggregations of these in the form of arrays and structures. Additional types describe memory.

WGSL does not have implicit conversions or promotions from concrete types, but does provide implicit conversions and promotions from abstract types. Converting a value from one concrete numeric or boolean type to another requires an explicit conversion, construction, or reinterpretation of bits; however, WGSL does provide some limited facility to promote scalar types to vector types. This also applies to composite types.

WGSL has texture and sampler types. Together with their associated built-in functions, these support functionality commonly used for graphics rendering, and commonly provided by GPUs.

The work of a shader stage is partitioned into one or more invocations, each of which executes the entry point, but under slightly different conditions. Invocations in a shader stage share access to certain variables:

However, the invocations act on different sets of shader stage inputs, including built-in inputs that provide an identifying value to distinguish an invocation from its peers. Each invocation has its own independent memory space in the form of variables in the private and function address spaces.

Invocations within a shader stage execute concurrently, and may often execute in parallel. The shader author is responsible for ensuring the dynamic behavior of the invocations in a shader stage:

WGSL sometimes permits several possible behaviors for a given feature. This is a portability hazard, as different implementations may exhibit different behaviors. The design of WGSL aims to minimize such cases, but is constrained by feasibility and by the goal of achieving high performance across a broad range of devices.

1.2. Mathematical Terms and Notation

Angles:

The floor expression is defined over real numbers x extended with +∞ and −∞:

The ceiling expression is defined over real numbers x extended with +∞ and −∞:

The truncate function is defined over real numbers x extended with +∞ and −∞:

The roundUp function is defined for positive integers k and n as:

The transpose of a c-column r-row matrix A is the r-column c-row matrix Aᵀ formed by copying the rows of A as the columns of Aᵀ:

The transpose of a column vector is defined by interpreting the column vector as a 1-row matrix. Similarly, the transpose of a row vector is defined by interpreting the row vector as a 1-column matrix.
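As an informal illustration of the notation in this section, the following Python sketch (Python is used only for illustration and is not part of WGSL) evaluates floor, ceiling and truncate with standard library calls, and transposes a matrix stored as a list of columns. The `round_up` helper assumes the usual definition roundUp(k, n) = ⌈n ÷ k⌉ × k, that is, the smallest multiple of k that is at least n. The helper names are hypothetical.

```python
import math

def round_up(k: int, n: int) -> int:
    """Assumed definition: ceil(n / k) * k, for positive integers k and n."""
    if k <= 0 or n <= 0:
        raise ValueError("round_up is only defined for positive integers")
    return -(-n // k) * k  # integer ceiling division avoids float rounding

def transpose(columns):
    """Transpose a matrix given as a list of c columns, each with r entries."""
    return [list(row) for row in zip(*columns)]

print(math.floor(-1.5), math.ceil(-1.5), math.trunc(-1.5))  # -2 -1 -1
print(round_up(4, 5))                                        # 8
print(transpose([[1, 2], [3, 4], [5, 6]]))                   # [[1, 3, 5], [2, 4, 6]]
```

Note that truncate rounds toward zero, so it agrees with floor for positive inputs and with ceiling for negative inputs.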

2. Shader Lifecycle

There are four key events in the lifecycle of a WGSL program and the shaders it may contain. The first two correspond to the WebGPU API methods used to prepare a WGSL program for execution. The last two are the start and end of execution of a shader.

The events are:

  1. Shader module creation

    • This occurs when the WebGPU createShaderModule method is called. The source text for a WGSL program is provided at this time.

  2. Pipeline creation

  3. Shader execution start

  4. Shader execution end

    • This occurs when all work in the shader completes:

      • all its invocations terminate, and

      • all accesses to resources complete, and

      • outputs, if any, are passed to downstream pipeline stages.

The events are ordered due to:

2.1. Processing Errors

A WebGPU implementation may fail to process a shader for two reasons:

A processing error may occur during three phases in the shader lifecycle:

Note: For example, a race condition may not be detectable.

Each requirement will be checked at the earliest opportunity. That is:

When unclear from context, this specification indicates whether failure to meet a particular requirement results in a shader-creation, pipeline-creation, or dynamic error.

The WebGPU specification describes the consequences of each kind of error.

3. Textual Structure

A WGSL program is Unicode text using the UTF-8 encoding, with no byte order mark (BOM).

WGSL program text consists of a sequence of Unicode code points, grouped into contiguous non-empty sets forming:

The program text must not include a null code point (U+0000).
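The encoding requirements above can be checked mechanically. The sketch below is a minimal Python illustration, assuming the program arrives as raw bytes (an assumption made for the example; an API may hand over already-decoded text). The function name `decode_program` is hypothetical.

```python
def decode_program(data: bytes) -> str:
    """Decode WGSL program bytes, rejecting a BOM and null code points."""
    # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    text = data.decode("utf-8")
    # The plain "utf-8" codec keeps a leading BOM as U+FEFF, so it is detectable.
    if text.startswith("\ufeff"):
        raise ValueError("program text must not start with a byte order mark")
    if "\x00" in text:
        raise ValueError("program text must not include U+0000")
    return text

print(decode_program(b"fn f() {}"))  # fn f() {}
```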

3.1. Parsing

To parse a WGSL program:

  1. Remove comments:

    • Replace the first comment with a space code point (U+0020).

    • Repeat until no comments remain.

  2. Parse the whole text, attempting to match the translation_unit grammar rule. Parsing uses a LALR(1) parser (one token of lookahead) [DeRemer1969], with the following customization:

    • Tokenization is interleaved with parsing, and is context-aware. When the parser requests the next token:

      • Consume and ignore an initial sequence of blankspace code points.

      • A token candidate is any WGSL token formed from a non-empty prefix of the remaining unconsumed code points.

      • The token returned is the longest token candidate that is also a valid lookahead token for the current parser state. [VanWyk2007]
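The context-aware token selection described above can be sketched as follows. This is a simplified Python illustration, not a real WGSL tokenizer: the token set is a hypothetical toy subset, and the parser state is modelled as a plain set of acceptable token spellings rather than LALR(1) lookahead sets.

```python
# Blankspace code points, taken from the _blankspace rule.
BLANKSPACE = " \t\n\v\f\r\u0085\u200e\u200f\u2028\u2029"

# Hypothetical toy token set; real WGSL has many more tokens.
TOKENS = {"<", "<<", "<=", "<<=", ">", ">>", ">=", ">>=", "="}

def next_token(remaining, valid_lookahead):
    """Return the longest token candidate the parser can accept, or None."""
    text = remaining.lstrip(BLANKSPACE)
    # Every non-empty prefix that spells a token is a token candidate.
    candidates = [text[:k] for k in range(1, len(text) + 1) if text[:k] in TOKENS]
    # Keep only candidates that are valid lookahead tokens in this state.
    acceptable = [c for c in candidates if c in valid_lookahead]
    return max(acceptable, key=len) if acceptable else None

print(next_token(">> x", {">"}))        # >
print(next_token(">> x", {">", ">>"}))  # >>
```

The first call models a parser state in which only a closing `>` is acceptable, so the shorter candidate is returned even though `>>` is also a token.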

A shader-creation error results if:

3.2. Blankspace and line breaks

Blankspace is any combination of one or more code points from the Unicode Pattern_White_Space property. The following is the set of code points in Pattern_White_Space:

_blankspace :

| /[\u0020\u0009\u000a\u000b\u000c\u000d\u0085\u200e\u200f\u2028\u2029]/uy

A line break is a contiguous sequence of blankspace code points indicating the end of a line. It is defined as the blankspace signalling a "mandatory break" as defined by UAX14 Section 6.1 Non-tailorable Line Breaking Rules LB4 and LB5. That is, a line break is any of:

Note: Diagnostics that report source text locations in terms of line numbers should use line breaks to count lines.
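As an illustration of counting lines for diagnostics, the Python sketch below assumes that the line breaks are: a carriage return followed by a line feed (counted once), or any single line feed, vertical tab, form feed, carriage return, next line (U+0085), line separator (U+2028) or paragraph separator (U+2029). The function name is hypothetical.

```python
import re

# Assumed line breaks: CR LF counts as one break; otherwise each of
# LF, VT, FF, CR, U+0085, U+2028, U+2029 counts as one break.
LINE_BREAK = re.compile("\r\n|[\n\v\f\r\u0085\u2028\u2029]")

def line_number(source: str, offset: int) -> int:
    """Return the 1-based line number of the code point at `offset`."""
    return 1 + len(LINE_BREAK.findall(source[:offset]))

src = "a\r\nb\u2028c"
print(line_number(src, src.index("c")))  # 3
```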

3.3. Comments

A comment is a span of text that does not influence the validity or meaning of a WGSL program, except that a comment can separate tokens. Shader authors can use comments to document their programs.

A line-ending comment is a kind of comment consisting of the two code points // (U+002F followed by U+002F) and the code points that follow, up until but not including:

A block comment is a kind of comment consisting of:

Note: Block comments can be nested. Since a block comment requires matching start and end text sequences, and allows arbitrary nesting, a block comment cannot be recognized with a regular expression. This is a consequence of the Pumping Lemma for Regular Languages.

EXAMPLE: Comments
const f = 1.5; // This is a line-ending comment.
const g = 2.5; /* This is a block comment
                that spans lines.
                /* Block comments can nest.
                 */
                But all block comments must terminate.
               */
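Because block comments nest, removing comments needs a depth counter rather than a regular expression. The following Python sketch shows one possible way to replace each comment with a single space, as in the first parsing step. It assumes the same set of line-break code points as UAX14 mandatory breaks (line feed, vertical tab, form feed, carriage return, U+0085, U+2028, U+2029); the function name is hypothetical.

```python
# Code points assumed to end a line-ending comment.
LINE_BREAKS = "\n\v\f\r\u0085\u2028\u2029"

def strip_comments(src: str) -> str:
    """Replace every comment in `src` with a single space (U+0020)."""
    out = []
    i, n = 0, len(src)
    while i < n:
        if src.startswith("//", i):
            # Line-ending comment: skip up to, but not including, the line break.
            while i < n and src[i] not in LINE_BREAKS:
                i += 1
            out.append(" ")
        elif src.startswith("/*", i):
            # Block comment: track nesting depth until the matching "*/".
            depth, i = 1, i + 2
            while i < n and depth > 0:
                if src.startswith("/*", i):
                    depth, i = depth + 1, i + 2
                elif src.startswith("*/", i):
                    depth, i = depth - 1, i + 2
                else:
                    i += 1
            if depth > 0:
                raise ValueError("unterminated block comment")
            out.append(" ")
        else:
            out.append(src[i])
            i += 1
    return "".join(out)

print(repr(strip_comments("a /* x /* y */ z */ b")))  # 'a   b'
```

The line break that ends a line-ending comment is kept, so line numbers in later diagnostics are unaffected.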

3.4. Tokens

A token is a contiguous sequence of code points forming one of:

3.5. Literals

A literal is one of:

literal :

| int_literal

| float_literal

| bool_literal

3.5.1. Boolean Literals

EXAMPLE: boolean literals
const a = true;
const b = false;

bool_literal :

| true

| false

3.5.2. Numeric Literals

The form of a numeric literal is defined via pattern-matching.

An integer literal is:

int_literal :

| decimal_int_literal

| hex_int_literal

decimal_int_literal :

| /0[iu]?/

| /[1-9][0-9]*[iu]?/

EXAMPLE: decimal integer literals
const a = 1u;
const b = 123;
const c = 0;
const d = 0i;

hex_int_literal :

| /0[xX][0-9a-fA-F]+[iu]?/

EXAMPLE: hexadecimal integer literals
const a = 0x123;
const b = 0X123u;
const c = 0x3f;

A floating point literal is either a decimal floating point literal or a hexadecimal floating point literal.

float_literal :

| decimal_float_literal

| hex_float_literal

decimal_float_literal :

| /0[fh]/

| /[1-9][0-9]*[fh]/

| /[0-9]*\.[0-9]+([eE][+-]?[0-9]+)?[fh]?/

| /[0-9]+\.[0-9]*([eE][+-]?[0-9]+)?[fh]?/

| /[0-9]+[eE][+-]?[0-9]+[fh]?/

EXAMPLE: decimal floating point literals
const a = 0.e+4f;
const b = 01.;
const c = .01;
const d = 12.34;
const f = .0f;
const g = 0h;
const h = 1e-3;

hex_float_literal :

| /0[xX][0-9a-fA-F]*\.[0-9a-fA-F]+([pP][+-]?[0-9]+[fh]?)?/

| /0[xX][0-9a-fA-F]+\.[0-9a-fA-F]*([pP][+-]?[0-9]+[fh]?)?/

| /0[xX][0-9a-fA-F]+[pP][+-]?[0-9]+[fh]?/

EXAMPLE: hexadecimal floating point literals
const a = 0xa.fp+2;
const b = 0x1P+4f;
const c = 0X.3;
const d = 0x3p+2h;
const e = 0X1.fp-4;
const f = 0x3.2p+2h;

When a numeric literal has a suffix, the literal denotes a value in a specific concrete scalar type. Otherwise, the literal denotes a value in one of the abstract numeric types defined below.

Mapping numeric literals to types
Numeric Literal         Suffix  Type           Examples
integer literal         i       i32            42i
integer literal         u       u32            42u
integer literal         (none)  AbstractInt    124
floating point literal  f       f32            42f 1e5f 1.2f 0x1.0p10f
floating point literal  h       f16            42h 1e5h 1.2h 0x1.0p10h
floating point literal  (none)  AbstractFloat  1e5 1.2 0x1.0p10

A shader-creation error results if:

Note: The hexadecimal float value 0x1.00000001p0 requires 33 mantissa bits to be represented exactly, but f32 only has 23 explicit mantissa bits.

Note: If you want to use an f suffix to force a hexadecimal float literal to be of type f32, the literal must also use a binary exponent. For example, write 0x1p0f. In comparison, 0x1f is a hexadecimal integer literal.
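To show how the literal patterns and the suffix table fit together, the Python sketch below copies the regular expressions from the grammar rules above and maps a literal's spelling to the name of its type. It is an illustration only: it does not check value ranges or the other conditions that produce a shader-creation error, and the function name `literal_type` is hypothetical.

```python
import re

INT = re.compile(r"0[iu]?|[1-9][0-9]*[iu]?|0[xX][0-9a-fA-F]+[iu]?")
DEC_FLOAT = re.compile(
    r"0[fh]|[1-9][0-9]*[fh]"
    r"|[0-9]*\.[0-9]+([eE][+-]?[0-9]+)?[fh]?"
    r"|[0-9]+\.[0-9]*([eE][+-]?[0-9]+)?[fh]?"
    r"|[0-9]+[eE][+-]?[0-9]+[fh]?"
)
HEX_FLOAT = re.compile(
    r"0[xX][0-9a-fA-F]*\.[0-9a-fA-F]+([pP][+-]?[0-9]+[fh]?)?"
    r"|0[xX][0-9a-fA-F]+\.[0-9a-fA-F]*([pP][+-]?[0-9]+[fh]?)?"
    r"|0[xX][0-9a-fA-F]+[pP][+-]?[0-9]+[fh]?"
)
SUFFIX_TYPE = {"i": "i32", "u": "u32", "f": "f32", "h": "f16"}

def _suffix(text, allowed):
    """Return the last code point if it is one of the allowed suffixes."""
    return text[-1] if text[-1] in allowed else ""

def literal_type(text):
    """Return the type name for a numeric literal, or None if it is not one."""
    if INT.fullmatch(text):
        return SUFFIX_TYPE.get(_suffix(text, "iu"), "AbstractInt")
    if DEC_FLOAT.fullmatch(text):
        return SUFFIX_TYPE.get(_suffix(text, "fh"), "AbstractFloat")
    if HEX_FLOAT.fullmatch(text):
        # Without a binary exponent, a trailing "f" is a hex digit, not a suffix.
        allowed = "fh" if "p" in text.lower() else ""
        return SUFFIX_TYPE.get(_suffix(text, allowed), "AbstractFloat")
    return None

print(literal_type("0x1f"))    # AbstractInt
print(literal_type("0x1p0f"))  # f32
```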

3.6. Keywords

A keyword is a token which refers to a predefined language concept. See § 15.1 Keyword Summary for the list of WGSL keywords.

3.7. Identifiers

An identifier is a kind of token used as a name. See § 4 Declaration and Scope.

WGSL uses two grammar nonterminals to separate use cases:

ident :

| ident_pattern_token

member_ident :

| ident_pattern_token

The form of an identifier is based on the Unicode Standard Annex #31 for Unicode Version 14.0.0, with the following elaborations.

Identifiers use the following profile described in terms of UAX31 Grammar:

<Identifier> := <Start> <Continue>* (<Medial> <Continue>+)*

<Start> := XID_Start + U+005F
<Continue> := <Start> + XID_Continue
<Medial> :=

This means identifiers with non-ASCII code points like these are valid: Δέλτα, réflexion, Кызыл, 𐰓𐰏𐰇, 朝焼け, سلام, 검정, שָׁלוֹם, गुलाबी, փիրուզ.

With the following exceptions:

ident_pattern_token :

| /([_\p{XID_Start}][\p{XID_Continue}]+)|([\p{XID_Start}])/uy

The Unicode Character Database for Unicode Version 14.0.0 includes a non-normative listing of all valid code points of both XID_Start and XID_Continue.

Note: The return types of some built-in functions are structure types whose names cannot be used in WGSL source. Those structure types are described as if they were predeclared with a name starting with two underscores. The result value can be saved into a newly declared let or var using type inference, or can immediately have one of its members extracted by name. See example usages in the description of frexp and modf.
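The ident_pattern_token pattern can be approximated in Python, whose `str.isidentifier()` method accepts an underscore or XID_Start code point followed by zero or more XID_Continue code points. The sketch below is an approximation under two assumptions: the Unicode version of the Python runtime may differ from 14.0.0, and only the pattern is checked, not the additional exceptions (such as keywords, or names starting with two underscores). The function name is hypothetical.

```python
def matches_ident_pattern(text: str) -> bool:
    """Approximate the ident_pattern_token regular expression."""
    # The WGSL pattern differs from str.isidentifier() in one respect:
    # an underscore must be followed by at least one more code point.
    return text.isidentifier() and text != "_"

for name in ["a", "_x", "Δέλτα", "朝焼け", "_", "1abc", "a-b"]:
    print(name, matches_ident_pattern(name))
```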

3.7.1. Identifier Comparison

Two WGSL identifiers are the same if and only if they consist of the same sequence of code points.

Note: This specification does not permit Unicode normalization of values for the purposes of comparison. Values that are visually and semantically identical but use different Unicode character sequences will not match. Content authors are advised to use the same encoding sequence consistently or to avoid potentially troublesome characters when choosing values. For more information, see [CHARMOD-NORM].

Note: A user agent should issue developer-visible warnings when the meaning of a WGSL program would change if all instances of an identifier are replaced with one of that identifier’s homographs. (A homoglyph is a sequence of code points that may appear the same to a reader as another sequence of code points. Examples of mappings to detect homoglyphs are the transformations, mappings, and matching algorithms mentioned in the previous paragraph. Two sequences of code points are homographs if one can be transformed into the other by repeatedly replacing a subsequence with its homoglyph.)
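The effect of comparing identifiers by code point sequence, without normalization, can be seen in the short Python sketch below. The two spellings render identically but use a precomposed and a decomposed form of "é", so they are different identifiers; the helper name is hypothetical.

```python
import unicodedata

def same_identifier(a: str, b: str) -> bool:
    """Identifiers are the same only if their code point sequences are equal."""
    return a == b

composed = "r\u00e9flexion"      # é as the single code point U+00E9
decomposed = "re\u0301flexion"   # e followed by combining acute accent U+0301

print(same_identifier(composed, decomposed))                   # False
print(unicodedata.normalize("NFC", decomposed) == composed)    # True
```

The second line shows that the two spellings would match only after normalization, which this specification does not apply.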

3.8. Context-Dependent Names

A context-dependent name is a token used to name a concept, but only in specific grammatical contexts. The spelling of the token may be the same as an identifier, but the token does not resolve to a declared object.

Section § 15.4 Context-Dependent Name Tokens lists all such tokens.

3.9. Attributes

An attribute modifies an object. WGSL provides a unified syntax for applying attributes. Attributes are used for a variety of purposes such as specifying the interface with the API.

Generally speaking, from the language’s point-of-view, attributes can be ignored for the purposes of type and semantic checking. Additionally, the attribute name is a context-dependent name, and some attribute parameters are also context-dependent names.

An attribute must not be specified more than once per object or type.

Attributes defined in WGSL
Attribute Valid Values Description
align Must be a const-expression that resolves to an i32 or u32.
Must be positive.
Must only be applied to a member of a structure type.

Must be a power of 2, and must satisfy the required-alignment for the member type:

If align(n) is applied to a member of S with type T, and S is the store type or contained in the store type for a variable in address space C, then n must satisfy: n = k × RequiredAlignOf(T,C) for some positive integer k.

See § 5.3.6 Memory Layout

binding Must be a const-expression that resolves to an i32 or u32.
Must be non-negative.
Must only be applied to a resource variable.

Specifies the binding number of the resource in a bind group. See § 10.3.2 Resource Interface.

builtin Name of a built-in value Must only be applied to an entry point function parameter, entry point return type, or member of a structure.

Declares a built-in value with the given context-dependent name. See § 16 Built-in Values.

const None Must only be applied to function declarations.

Specifies that the function can be used as a const-function. This attribute must not be applied to a user-defined function.

Note: This attribute is used as a notational convention to describe which built-in functions can be used in const-expressions.

group Must be a const-expression that resolves to an i32 or u32.
Must be non-negative.
Must only be applied to a resource variable.

Specifies the binding group of the resource. See § 10.3.2 Resource Interface.

id Must be a const-expression that resolves to an i32 or u32.
Must be non-negative.
Must only be applied to an override-declaration of scalar type.

Specifies a numeric identifier as an alternate name for a pipeline-overridable constant.

interpolate One or two parameters.

The first parameter must be an interpolation type.

The second parameter, if present, must specify the interpolation sampling.

Must only be applied to a declaration that has a location attribute applied.

Specifies how the user-defined IO must be interpolated. The attribute is only significant on user-defined vertex outputs and fragment inputs. See § 10.3.1.4 Interpolation.

Both parameters are context-dependent names.

invariant None Must only be applied to the position built-in value.

When applied to the position built-in output value of a vertex shader, the computation of the result is invariant across different programs and different invocations of the same entry point. That is, if the data and control flow match for two position outputs in different entry points, then the result values are guaranteed to be the same. There is no effect on a position built-in input value.

Note: This attribute maps to the precise qualifier in HLSL, and the invariant qualifier in GLSL.

location Must be a const-expression that resolves to an i32 or u32.
Must be non-negative.
Must only be applied to an entry point function parameter, entry point return type, or member of a structure type. Must only be applied to declarations of objects with numeric scalar or numeric vector type. Must not be used with the compute shader stage.

Specifies a part of the user-defined IO of an entry point. See § 10.3.1.3 Input-output Locations.

size Must be a const-expression that resolves to an i32 or u32.
Must be positive.
Must only be applied to a member of a structure type. The member type must have creation-fixed footprint.

The number of bytes reserved in the struct for this member.

This number must be at least the byte-size of the type of the member:

If size(n) is applied to a member with type T, then SizeOf(T) ≤ n.

See § 5.3.6 Memory Layout

workgroup_size One, two or three parameters.

Each parameter must be a const-expression or an override-expression. All parameters must be the same type, either i32 or u32.

A pipeline-creation error results if any specified parameter is 0 or exceeds an upper bound specified by the WebGPU API, or if the product of the parameter values exceeds the upper bound specified by the WebGPU API (see WebGPU § limits).

Must be applied to a compute shader entry point function. Must not be applied to any other object.

Specifies the x, y, and z dimensions of the workgroup grid for the compute shader.

The first parameter specifies the x dimension. The second parameter, if provided, specifies the y dimension, otherwise is assumed to be 1. The third parameter, if provided, specifies the z dimension, otherwise is assumed to be 1.
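Several of the numeric constraints in the table above can be expressed as simple checks. The Python sketch below illustrates the align, size and workgroup_size rules. All function names are hypothetical, and the default limit values are illustrative placeholders: the actual upper bounds come from the WebGPU API.

```python
def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0

def check_align(n: int, required_align: int) -> bool:
    """align(n): n is a power of 2 and equals k * RequiredAlignOf(T, C), k >= 1."""
    return is_power_of_two(n) and n % required_align == 0

def check_size(n: int, size_of_type: int) -> bool:
    """size(n): n is positive and SizeOf(T) <= n."""
    return n > 0 and size_of_type <= n

def resolve_workgroup_size(params, max_dims=(256, 256, 64), max_invocations=256):
    """Fill omitted y and z dimensions with 1, then apply the limit checks."""
    if not 1 <= len(params) <= 3:
        raise ValueError("workgroup_size takes one, two or three parameters")
    dims = tuple(params) + (1,) * (3 - len(params))
    for value, limit in zip(dims, max_dims):
        # A value below 1 is treated like 0 in this sketch.
        if value < 1 or value > limit:
            raise ValueError("pipeline-creation error: dimension out of range")
    if dims[0] * dims[1] * dims[2] > max_invocations:
        raise ValueError("pipeline-creation error: product exceeds the limit")
    return dims

print(check_align(16, 4), check_align(12, 4))  # True False
print(resolve_workgroup_size([8, 8]))          # (8, 8, 1)
```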

The shader stage attributes below designate a function as an entry point for a particular shader stage. These attributes must only be applied to function declarations, and at most one may be present on a given function. They take no parameters.

Shader Stage Attributes
Attribute Description
vertex
Declares the function to be an entry point for the vertex shader stage of a render pipeline.
fragment
Declares the function to be an entry point for the fragment shader stage of a render pipeline.
compute
Declares the function to be an entry point for the compute shader stage of a compute pipeline.

attribute :

| attr 'align' paren_left expression attrib_end

| attr 'binding' paren_left expression attrib_end

| attr 'builtin' paren_left builtin_value_name attrib_end

| attr 'const'

| attr 'group' paren_left expression attrib_end

| attr 'id' paren_left expression attrib_end

| attr 'interpolate' paren_left interpolation_type_name attrib_end

| attr 'interpolate' paren_left interpolation_type_name comma interpolation_sample_name attrib_end

| attr 'invariant'

| attr 'location' paren_left expression attrib_end

| attr 'size' paren_left expression attrib_end

| attr 'workgroup_size' paren_left expression attrib_end

| attr 'workgroup_size' paren_left expression comma expression attrib_end

| attr 'workgroup_size' paren_left expression comma expression comma expression attrib_end

| attr 'vertex'

| attr 'fragment'

| attr 'compute'

attrib_end :

| comma ? paren_right

3.10. Directives

A directive is a token sequence which modifies how a WGSL program is processed by a WebGPU implementation.

Directives are optional. If present, all directives must appear before any declarations or static assertions.

See § 11.1 Enable Directive.

global_directive :

| enable_directive

4. Declaration and Scope

A declaration associates an identifier with one of the following kinds of objects:

In other words, a declaration introduces a name for an object.

A declaration is at module scope if the declaration appears outside the text of any other declaration.

A function declaration appears at module-scope. A function declaration contains declarations for formal parameters, if it has any, and it may contain variable and value declarations inside its body. Those contained declarations are therefore not at module-scope.

Note: The only kind of declaration that contains another declaration is a function declaration.

Certain objects are provided by the WebGPU implementation, and are treated as if they have been declared before the start of the WGSL program source. We say such objects are predeclared. For example, WGSL predeclares built-in functions, and built-in types such as i32 and f32.

The scope of a declaration is the set of program source locations where a declared identifier potentially denotes its associated object. We say the identifier is in scope (of the declaration) at those source locations.

Where a declaration appears determines its scope:

Two declarations in the same WGSL source program must not simultaneously:

Note: A predeclared object does not have a declaration in the WGSL source. So a declaration at module-scope or inside a function can have the same name as a built-in function.

TODO(#2941): Once built-in types are not named by keywords or reserved words, then user declarations can also have the same name as a built-in type.

Identifiers are used as follows, distinguished by grammatical context:

When an ident token appears as a name denoting an object declared elsewhere, it must be in scope for some declaration. The object denoted by the identifier token is determined as follows:

When the above algorithm is used to map an identifier to a declaration, we say the identifier resolves to that declaration. Similarly, we also say the identifier resolves to the declared object.

It is a shader-creation error if any module scope declaration is recursive. That is, no cycles can exist among the declarations:

Consider the directed graph where:

This graph must not have a cycle.

Note: The function body is part of the function declaration, thus functions must not be recursive, either directly or indirectly.
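The no-cycle requirement can be checked with a depth-first search. The Python sketch below assumes the graph has one node per module-scope declaration and an edge from a declaration to each declaration that its definition mentions; that reading of the graph, and the function name `find_cycle`, are assumptions made for this illustration.

```python
def find_cycle(uses):
    """Return a list of names forming a cycle, or None if there is none.

    `uses` maps each declaration name to the names its definition mentions.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}

    def visit(node, path):
        state[node] = GREY
        path.append(node)
        for dep in uses.get(node, ()):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path: a cycle was found.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep, path)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for name in uses:
        if state.get(name, WHITE) == WHITE:
            cycle = visit(name, [])
            if cycle:
                return cycle
    return None

print(find_cycle({"f": ["g"], "g": ["f"]}))  # ['f', 'g', 'f']
print(find_cycle({"a": ["b"], "b": []}))     # None
```

A function that calls itself appears as a one-node cycle, which is why direct recursion is rejected as well as indirect recursion.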

Note: Use of a non-module scope identifier must follow the declaration of that identifier in the text.

EXAMPLE: Valid and invalid declarations
// Valid, user-defined variables can have the same name as a built-in function.
var<private> modf: f32 = 0.0;

// Valid, foo_1 is in scope for the entire program.
var<private> foo: f32 = 0.0; // foo_1

// Valid, bar_1 is in scope for the entire program.
var<private> bar: u32 = 0u; // bar_1

// Valid, my_func_1 is in scope for the entire program.
// Valid, foo_2 is in scope until the end of the function.
fn my_func(foo: f32) { // my_func_1, foo_2
  // Any reference to 'foo' resolves to the function parameter.

  // Invalid, modf resolves to the module-scope variable.
  let res = modf(foo);

  // Invalid, the scope of foo_2 ends at the end of the function.
  var foo: f32; // foo_3

  // Valid, bar_2 is in scope until the end of the function.
  var bar: u32; // bar_2
  // References to 'bar' resolve to bar_2
  {
    // Valid, foo_4 is in scope until the end of the compound statement.
    var foo : f32; // foo_4

    // Valid, bar_3 is in scope until the end of the compound statement.
    var bar: u32; // bar_3
    // References to 'bar' resolve to bar_3

    // Invalid, bar_4 has the same end scope as bar_3.
    var bar: i32; // bar_4

    // Valid, i_1 is in scope until the end of the for loop
    for ( var i: i32 = 0; i < 10; i++ ) { // i_1
      // Invalid, i_2 has the same end scope as i_1.
      var i: i32 = 1; // i_2.
    }
  }

  // Invalid, bar_5 has the same end scope as bar_2.
  var bar: u32; // bar_5

  // Valid, later_def, a module scope declaration, is in scope for the entire program.
  var early_use : i32 = later_def;
}

// Invalid, bar_6 has the same scope as bar_1.
var<private> bar: u32 = 1u; // bar_6

// Invalid, my_func_2 has the same end scope as my_func_1.
fn my_func() { } // my_func_2

// Valid, my_foo_1 is in scope for the entire program.
fn my_foo( //my_foo_1
  // Valid, my_foo_2 is in scope until the end of the function.
  my_foo: i32 // my_foo_2
) { }

var<private> later_def : i32 = 1;

5. Types

Programs calculate values.

In WGSL, a type is a set of values, and each value belongs to exactly one type. A value’s type determines the syntax and semantics of operations that can be performed on that value.

For example, the mathematical number 1 corresponds to these distinct values in WGSL:

WGSL treats these as different because their machine representation and operations differ.

A type is either predeclared, or created in WGSL source via a declaration.

We distinguish between the concept of a type and the syntax in WGSL to denote that type. In many cases the spelling of a type in this specification is the same as its WGSL syntax. For example:

Some WGSL types are only used for analyzing a source program and for determining the program’s runtime behavior. This specification will describe such types, but they do not appear in WGSL source text.

Note: WGSL reference types are not written in WGSL programs. See § 5.4.1 Reference and Pointer Types.

5.1. Type Checking

A WGSL value is computed by evaluating an expression. An expression is a segment of source text parsed as one of the WGSL grammar rules whose name ends with "_expression". An expression E can contain subexpressions which are expressions properly contained in the outer expression E. A top-level expression is an expression that is not itself a subexpression. See § 7.18 Expression Grammar Summary.

The particular value produced by an expression evaluation depends on:

The values that may result from evaluating a particular expression will always belong to a specific WGSL type, known as the static type of the expression. The rules of WGSL are designed so that the static type of an expression depends only on the expression’s static context.

A type assertion is a mapping from some WGSL source expression to a WGSL type. The notation

e : T

is a type assertion meaning T is the static type of WGSL expression e.

Note: A type assertion is a statement of fact about the text of a program. It is not a runtime check.

Statements often use expressions, and may place requirements on the static types of those expressions. For example:

Type checking a successfully parsed WGSL program is the process of mapping each expression to its static type, and verifying that type requirements of each statement are satisfied. If type checking fails, a special case of a shader-creation error, called a type error, results.

Type checking can be performed by recursively applying type rules to syntactic phrases, where a syntactic phrase is either an expression or a statement. A type rule describes how the static context for a syntactic phrase determines the static type for expressions contained within that phrase. A type rule has two parts:

Type rules may have type parameters in their preconditions and conclusions. When a type rule’s conclusion or preconditions contain type parameters, we say it is parameterized. When they do not, we say the rule is fully elaborated. We can make a fully elaborated type rule from a parameterized one by substituting a type for each of its type parameters, using the same type for all occurrences of a given parameter in the rule. An assignment of types to a rule’s type parameters is called a substitution.

For example, here is the type rule for logical negation (an expression of the form !e):

Precondition Conclusion
e: T
T is bool or vecN<bool>
!e: T

This is a parameterized rule, because it contains the type parameter T, which can represent any one of four types bool, vec2<bool>, vec3<bool>, or vec4<bool>. Applying the substitution that maps T to vec3<bool> produces the fully elaborated type rule:

Precondition Conclusion
e: vec3<bool>
!e: vec3<bool>

Each fully elaborated rule we can produce from a parameterized rule by applying some substitution that meets the rule’s other conditions is called an overload of the parameterized rule. For example, the boolean negation rule has four overloads, because there are four possible ways to assign a type to its type parameter T.

Note: In other words, a parameterized type rule provides the pattern for a collection of fully elaborated type rules, each one produced by applying a different substitution to the parameterized rule.
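The relationship between a parameterized rule and its overloads can be sketched in Python by enumerating the substitutions for the logical negation rule. This is an illustration only; the string representation of types and rules, and the function names, are hypothetical.

```python
# The four types that the type parameter T may stand for in the negation rule.
BOOL_TYPES = ["bool", "vec2<bool>", "vec3<bool>", "vec4<bool>"]

def negation_overloads():
    """Apply each substitution for T to the rule 'e: T' implies '!e: T'."""
    return [(f"e: {t}", f"!e: {t}") for t in BOOL_TYPES]

def type_of_negation(operand_type):
    """Static type of !e given the static type of e, or None if no overload applies."""
    return operand_type if operand_type in BOOL_TYPES else None

for precondition, conclusion in negation_overloads():
    print(precondition, "=>", conclusion)
print(type_of_negation("vec3<bool>"))  # vec3<bool>
print(type_of_negation("u32"))         # None
```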

A type rule applies to a syntactic phrase when:

A parameterized type rule applies to an expression if there exists a substitution producing a fully elaborated type rule that applies to the expression.

Consider the expression, 1u+2u. It has two literal subexpressions: 1u and 2u, both of type u32. The top-level expression is an addition. Referring to the rules in § 7.9 Arithmetic Expressions, the type rule for addition applies to the expression, because:

When analyzing a syntactic phrase, three cases may occur:

Continuing the example above, only one type rule applies to the expression 1u+2u, and so type checking accepts the conclusion of that type rule, which is that 1u+2u is of type u32.

A WGSL source program is well-typed when:

Otherwise there is a type error and the source program is not a valid WGSL program.

WGSL is a statically typed language because type checking a WGSL program will either succeed or discover a type error, while only having to inspect the program source text.

5.1.1. Type Rule Tables

The WGSL type rules for expressions are organized into type rule tables, with one row per type rule.

The semantics of an expression is the effect of evaluating that expression, and is primarily the production of a result value. The Description column of the type rule that applies to an expression will specify the expression’s semantics. The semantics usually depends on the values of the type rule parameters, including the assumed values of any subexpressions. Sometimes the semantics of an expression includes effects other than producing a result value, such as the non-result-value effects of its subexpressions.

TODO: example: non-result-value effect is any side effect of a function call subexpression.

5.1.2. Conversion Rank

When a type assertion e:T is used as a type rule precondition, it is satisfied when:

The rule is codified by the ConversionRank function over pairs of types, defined in the table below. The ConversionRank function expresses the preference and feasibility of automatically converting a value of one type (Src) to another type (Dest). Lower ranks are more desirable.

A feasible automatic conversion converts a value from type Src to type Dest, and is allowed when ConversionRank(Src,Dest) is finite. Such conversions are value-preserving, subject to limitations described in § 13.6 Floating Point Evaluation.

Note: Automatic conversions only occur in two kinds of situations. First, when converting a const-expression to its corresponding typed numeric value that can be used on the GPU. Second, when a load from a reference-to-memory occurs, yielding the value stored in that memory.

Note: A conversion of infinite rank is infeasible, i.e. not allowed.

Note: When no conversion is performed, the conversion rank is zero.

ConversionRank from one type to another
Src Dest ConversionRank(Src,Dest) Description
T T 0 Identity. No conversion performed.
ref<AS,T,AM> (for address space AS, and where access mode AM is read or read_write) T 0 Apply the Load Rule to load a value from a memory reference.
AbstractFloat f32 1 See § 13.6.2 Floating Point Conversion
AbstractFloat f16 2 See § 13.6.2 Floating Point Conversion
AbstractInt i32 3 Identity if the value is in i32. Produces a shader-creation error otherwise.
AbstractInt u32 4 Identity if the value is in u32. Produces a shader-creation error otherwise.
AbstractInt AbstractFloat 5 See § 13.6.2 Floating Point Conversion
AbstractInt f32 6 Behaves as AbstractInt to AbstractFloat, and then AbstractFloat to f32
AbstractInt f16 7 Behaves as AbstractInt to AbstractFloat, and then AbstractFloat to f16
vecN<S> vecN<T> ConversionRank(S,T) Inherit conversion rank from component type.
matCxR<S> matCxR<T> ConversionRank(S,T) Inherit conversion rank from component type.
array<S,N> array<T,N> ConversionRank(S,T) Inherit conversion rank from component type. Note: Only fixed-size arrays may have an abstract component type.
S T (where above cases don’t apply) infinity There are no automatic conversions between other types.
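The table can be read as a small recursive function. The Python sketch below is one possible encoding, under several assumptions: types are spelled as canonical strings, only the scalar, vector, and matrix rows are modelled (the reference load rule and arrays are omitted), and inputs are assumed to be valid WGSL types. The name conversion_rank is hypothetical.

```python
import math
import re

# Scalar rows copied from the ConversionRank table.
_SCALAR_RANK = {
    ("AbstractFloat", "f32"): 1,
    ("AbstractFloat", "f16"): 2,
    ("AbstractInt", "i32"): 3,
    ("AbstractInt", "u32"): 4,
    ("AbstractInt", "AbstractFloat"): 5,
    ("AbstractInt", "f32"): 6,
    ("AbstractInt", "f16"): 7,
}

# Matches vecN<T> and matCxR<T>: group 1 is the shape, group 2 the component type.
_COMPOSITE = re.compile(r"^(vec[234]|mat[234]x[234])<(.+)>$")

def conversion_rank(src: str, dest: str) -> float:
    """Return the rank of converting src to dest, or infinity if infeasible."""
    if src == dest:
        return 0  # Identity: no conversion performed.
    s, d = _COMPOSITE.match(src), _COMPOSITE.match(dest)
    if s and d and s.group(1) == d.group(1):
        # Same shape: inherit the rank from the component types.
        return conversion_rank(s.group(2), d.group(2))
    return _SCALAR_RANK.get((src, dest), math.inf)

print(conversion_rank("vec3<AbstractInt>", "vec3<f32>"))
```

A conversion is feasible exactly when the returned rank is finite, so a caller can test feasibility with math.isfinite.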

The type T is the concretization of type S if:

The concretization of a value e of type T is the value resulting from applying, to e, the feasible conversion that maps T to the concretization of T.

5.1.3. Overload Resolution

When more than one type rule applies to a syntactic phrase, a tie-breaking procedure is used to determine which one should take effect. This procedure is called overload resolution, and assumes type checking has already succeeded in finding static types for subexpressions.

Consider a syntactic phrase P, and all type rules that apply to P. The overload resolution algorithm calls these type rules overload candidates. For each candidate:

Overload resolution for P proceeds as follows, with the goal of finding a single most preferable overload candidate:

  1. For each candidate C, enumerate conversion ranks for subexpressions in the syntactic phrase. The candidate’s preconditions have been met, and so for the i’th subexpression in P:

    • Its static type has been computed.

    • There is a feasible automatic conversion from the expression’s static type to the type required by the corresponding type assertion in the preconditions. Let C.R(i) be the ConversionRank of that conversion.

  2. Eliminate any candidate where one of its subexpressions resolves to an abstract type after feasible automatic conversions, but another of the candidate’s subexpressions is not a const-expression.

  3. Rank candidates: Given two overload candidates C1 and C2, C1 is preferred over C2 if:

    • For each expression position i in P, C1.R(i) ≤ C2.R(i).

      • That is, each expression conversion required to apply C1 to P is at least as preferable as the corresponding expression conversion required to apply C2 to P.

    • There is at least one expression position i where C1.R(i) < C2.R(i).

      • That is, there is at least one expression conversion required to apply C1 that is strictly more preferable than the corresponding conversion required to apply C2.

  4. If there is a single candidate C which is preferred over all the others, then overload resolution succeeds, yielding the candidate type rule C. Otherwise, overload resolution fails.
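Steps 3 and 4 amount to a pairwise comparison of rank lists. The following Python sketch is a simplified model, not a normative algorithm: it assumes step 1 has already produced the list C.R(i) for each candidate and that step 2 has already removed ineligible candidates. The candidate names and the choice of three candidates are hypothetical, with ranks taken from the ConversionRank table for three AbstractInt arguments.

```python
def preferred(r1, r2):
    """True when rank list r1 is preferred over rank list r2 (step 3)."""
    no_worse = all(a <= b for a, b in zip(r1, r2))
    strictly_better = any(a < b for a, b in zip(r1, r2))
    return no_worse and strictly_better

def resolve(candidates):
    """Return the single most preferred candidate name, or None (step 4).

    candidates maps a candidate name to its list of conversion ranks.
    """
    winners = [
        name
        for name, ranks in candidates.items()
        if all(
            preferred(ranks, other_ranks)
            for other, other_ranks in candidates.items()
            if other != name
        )
    ]
    return winners[0] if len(winners) == 1 else None

# Three AbstractInt arguments, as in a call such as clamp(1, -5, 5).
clamp_candidates = {
    "clamp(i32,i32,i32)": [3, 3, 3],
    "clamp(u32,u32,u32)": [4, 4, 4],
    "clamp(f32,f32,f32)": [6, 6, 6],
}
winner = resolve(clamp_candidates)

# Neither candidate is at least as good in every position, so resolution fails.
tie = resolve({"A": [0, 3], "B": [3, 0]})
```

Note that preference is a partial order: two candidates can each be better in a different argument position, in which case neither is preferred and overload resolution fails.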

TODO: Examples

5.2. Plain Types

Plain types are types for the machine representation of boolean values, numbers, vectors, matrices, or aggregations of such values.

A plain type is either a scalar type, an atomic type, or a composite type.

Note: Plain types in WGSL are similar to Plain-Old-Data types in C++, but also include atomic types and abstract numeric types.

5.2.1. Abstract Numeric Types

These types cannot be spelled in WGSL source. They are only used by type checking.

Certain expressions are evaluated at shader-creation time, and with a numeric range and precision that may be larger than directly implemented by the GPU.

WGSL defines two abstract numeric types for these evaluations:

An evaluation of an expression in one of these types must not overflow or produce infinite, NaN, undefined, or indeterminate results.

A type is abstract if it is an abstract numeric type or contains an abstract numeric type. A type is concrete if it is not abstract.

A numeric literal without a suffix denotes a value in an abstract numeric type:

Example: The expression log2(32) is analyzed as follows:

Example: The expression 1 + 2.5 is analyzed as follows:

Example: let x = 1 + 2.5;

Example: 1u + 2.5 results in a shader-creation error:

EXAMPLE: Type inference for literals
// Explicitly-typed unsigned integer literal.
var u32_1 = 1u; // variable holds a u32

// Explicitly-typed signed integer literal.
var i32_1 = 1i; // variable holds an i32

// Explicitly-typed floating point literal.
var f32_1 = 1f; // variable holds an f32

// Explicitly-typed unsigned integer literal cannot be negated.
var u32_neg = -1u; // invalid: unary minus does not support u32

// When a concrete type is required, but no part of the statement or
// expression forces a particular concrete type, an integer literal is
// interpreted as an i32 value:
//   Initializer for a let-declaration must be constructible (or pointer).
//   The most preferred automatic conversion from AbstractInt to a constructible type
//   is AbstractInt to i32, with conversion rank 3.  So '1' is inferred as i32.
let some_i32 = 1; // like let some_i32: i32 = 1i;

// Inferred from declaration type.
var i32_from_type : i32 = 1; // variable holds i32.  AbstractInt to i32, conversion rank 3
var u32_from_type : u32 = 1; // variable holds u32.  AbstractInt to u32, conversion rank 4

// Unsuffixed integer literal can convert to floating point when needed:
//   Automatically convert AbstractInt to f32, with conversion rank 6.
var f32_promotion : f32 = 1; // variable holds f32

// Invalid: no feasible conversion from floating point to integer
var i32_demotion : i32 = 1.0; // Invalid

// Inferred from expression.
var u32_from_expr = 1 + u32_1; // variable holds u32
var i32_from_expr = 1 + i32_1; // variable holds i32

// Values must be representable.
let u32_too_large   : u32 = 1234567890123456890; // invalid, overflow
let i32_too_large   : i32 = 1234567890123456890; // invalid, overflow
let u32_large : u32 = 2147483649; // valid
let i32_large : i32 = 2147483649; // invalid, overflow
let f32_out_of_range1 = 0x1p500; // invalid, out of range
let f32_hex_lost_bits = 0x1.0000000001p0; // invalid, not exactly representable in f32

// Minimum integer: unary negation over AbstractInt, then infer i32.
// Most preferred conversion from AbstractInt to a constructible type (with lowest
// conversion rank) is AbstractInt to i32.
let i32_min = -2147483648;  // has type i32

// Invalid.  Select AbstractInt to i32 as above, but the value is out of
// range, producing shader-creation error.
let i32_too_large_2 = 2147483648; // Invalid.

// Subexpressions can resolve to AbstractInt and AbstractFloat.
// The following examples are all valid and the value of the variable is 6u.
// var u32_expr1 = (1 + (1 + (1 + (1 + 1)))) + 1u;
// var u32_expr2 = 1u + (1 + (1 + (1 + (1 + 1))));
// var u32_expr3 = (1 + (1 + (1 + (1u + 1)))) + 1;
// var u32_expr4 = 1 + (1 + (1 + (1 + (1u + 1))));

// Inference based on built-in function parameters.

// Most-preferred candidate is clamp(i32,i32,i32)->i32
let i32_clamp = clamp(1, -5, 5);
// Most preferred candidate is clamp(u32,u32,u32).
// Literals use automatic conversion AbstractInt to u32.
let u32_clamp = clamp(5, 0, u32_from_expr);
// Most preferred candidate is clamp(f32,f32,f32)->f32
// literals use automatic conversion AbstractInt to f32.
let f32_clamp = clamp(0, f32_1, 1);

// TODO: When AbstractFloat gains support for addition, then these will become valid,
// via promotion.
// let f32_promotion1 = 1.0 + 2 + 3 + 4; // TODO: like let f32_promotion1:f32 = 10f;
// let f32_promotion2 = 2 + 1.0 + 3 + 4; // TODO: like let f32_promotion2:f32 = 10f;
// let f32_promotion3 = 1f + ((2 + 3) + 4); // TODO: like let f32_promotion3:f32 = 10f;
// let f32_promotion4 = ((2 + (3 + 1f)) + 4); // TODO: like let f32_promotion4:f32 = 10f;

// Type rule violations.

// Invalid, the initializer can only resolve to f32:
// No feasible automatic conversion from AbstractFloat to u32.
let mismatch : u32 = 1.0;

// Invalid. There is no overload of clamp that allows mixed sign parameters.
let ambiguous_clamp = clamp(1u, 0, 1i);

// Inference completes at the statement level.

// Initializer for a let-declaration must be constructible (or pointer).
// The most preferred automatic conversion from AbstractInt to a constructible type
// is AbstractInt to i32, with conversion rank 3.  So '1' is inferred as i32.
let some_i32 = 1; // like let some_i32: i32 = 1i;

let some_f32 : f32 = some_i32; // Type error: i32 cannot be assigned to f32

// Another overflow case
let overflow_u32 = (1 - 2) + 1u; // invalid, -1 is out of range of u32

// Ideal value out of range of 32-bits, but brought back into range
let out_and_in_again = (0x1ffffffff / 8);

// Similar, but invalid
let out_of_range = (0x1ffffffff / 8u); // requires computation is done in 32-bits,
                     // making 0x1ffffffff out of range.
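
The representability cases in the example above reduce to range checks on the ideal integer value. The Python sketch below models only that check, using the standard value ranges of 32-bit signed and unsigned integers; the helper name representable is an assumption of this sketch, and Python's unbounded integers stand in for the abstract evaluation.

```python
# Inclusive value ranges of the two concrete 32-bit integer types.
RANGES = {
    "i32": (-2**31, 2**31 - 1),
    "u32": (0, 2**32 - 1),
}

def representable(value: int, type_name: str) -> bool:
    """True if the integer value lies within the range of the named type."""
    lo, hi = RANGES[type_name]
    return lo <= value <= hi

# Cases mirroring declarations in the example.
checks = {
    "i32_min": representable(-2147483648, "i32"),          # valid
    "i32_too_large_2": representable(2147483648, "i32"),   # invalid
    "u32_large": representable(2147483649, "u32"),         # valid
    "i32_large": representable(2147483649, "i32"),         # invalid
    "overflow_u32": representable(1 - 2, "u32"),           # invalid: -1
    # The operands are non-negative, so floor division matches the
    # truncating integer division used for the abstract evaluation.
    "out_and_in_again": representable(0x1ffffffff // 8, "i32"),  # valid
    "out_of_range": representable(0x1ffffffff, "u32"),     # invalid operand
}

for name, ok in checks.items():
    print(name, "valid" if ok else "invalid")
```

The last two entries show why the position of a suffixed literal matters: dividing first keeps the final value in range, while forcing the large operand into u32 before dividing does not.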

5.2.2. Boolean Type

The bool type contains the values true and false.

Boolean literal type rules
Precondition Conclusion Description
true: bool The true value.
false: bool The false value.

5.2.3. Integer Types

The u32 type is the set of 32-bit unsigned integers.

The i32 type is the set of 32-bit signed integers. It uses a two’s complement representation, with the sign bit in the most significant bit position.

5.2.4. Floating Point Type

The f32 type is the set of 32-bit floating point values of the IEEE-754 binary32 (single precision) format. See § 13.6 Floating Point Evaluation for details.

The f16 type is the set of 16-bit floating point values of the IEEE-754 binary16 (half precision) format. It is a shader-creation error if the f16 type is used unless the program contains the enable f16; directive to enable the f16 extension. See § 13.6 Floating Point Evaluation for details.

5.2.5. Scalar Types

The scalar types are bool, AbstractInt, AbstractFloat, i32, u32, f32, and f16.

The numeric scalar types are AbstractInt, AbstractFloat, i32, u32, f32, and f16.

The integer scalar types are AbstractInt, i32, and u32.

5.2.6. Vector Types

A vector is a grouped sequence of 2, 3, or 4 scalar components.

Type Description
vecN<T> Vector of N components of type T. N must be in {2, 3, 4} and T must be one of the scalar types. We say T is the component type of the vector.

A vector is a numeric vector if its component type is a numeric scalar.

Key use cases of a vector include:

Many operations on vectors act component-wise, i.e. the result vector is formed by operating on each component independently.

EXAMPLE: Vector
vec2<f32>  // is a vector of two f32s.
EXAMPLE: Component-wise addition
let x : vec3<f32> = a + b; // a and b are vec3<f32>
// x[0] = a[0] + b[0]
// x[1] = a[1] + b[1]
// x[2] = a[2] + b[2]

5.2.7. Matrix Types

A matrix is a grouped sequence of 2, 3, or 4 floating point vectors.

Type Description
matCxR<T> Matrix of C columns and R rows of type T, where C and R are both in {2, 3, 4}, and T must be f32, f16, or AbstractFloat. Equivalently, it can be viewed as C column vectors of type vecR<T>.

The key use case for a matrix is to embody a linear transformation. In this interpretation, the vectors of a matrix are treated as column vectors.

The product operator (*) is used to either:

See § 7.9 Arithmetic Expressions.

EXAMPLE: Matrix
mat2x3<f32>  // This is a 2 column, 3 row matrix of 32-bit floats.
             // Equivalently, it is 2 column vectors of type vec3<f32>.

5.2.8. Atomic Types

An atomic type encapsulates a concrete integer scalar type such that:

Type Description
atomic<T> Atomic of type T. T must be either u32 or i32.

An expression must not evaluate to an atomic type.

Atomic types may only be instantiated by variables in the workgroup address space or by storage buffer variables with a read_write access mode. The memory scope of operations on the type is determined by the address space it is instantiated in. Atomic types in the workgroup address space have a memory scope of Workgroup, while those in the storage address space have a memory scope of QueueFamily.

An atomic modification is any operation on an atomic object which sets the content of the object. The operation counts as a modification even if the new value is the same as the object’s existing value.

In WGSL, atomic modifications are mutually ordered, for each object. That is, during execution of a shader stage, for each atomic object A, all agents observe the same order of modification operations applied to A. The ordering for distinct atomic objects may not be related in any way; no causality is implied. Note that variables in workgroup space are shared within a workgroup, but are not shared between different workgroups.

5.2.9. Array Types

An array is an indexable grouping of element values.

Type Description
array<E,N> A fixed-size array with N elements of type E.
N is called the element count of the array.
array<E> A runtime-sized array of elements of type E. These may only appear in specific contexts.

The first element in an array is at index 0, and each successive element is at the next integer index. See § 7.7.3 Array Access Expression.

An expression must not evaluate to a runtime-sized array type.

The element count expression N of a fixed-size array is subject to the following constraints:

Note: The element count value is fully determined at pipeline creation time.

An array element type must be one of:

Note: The element type must be a plain type.

Two array types are the same if and only if all of the following are true:

EXAMPLE: Example fixed-size array types, non-overridable element count
// array<f32,8> and array<i32,8> are different types:
// different element types
var<private> a: array<f32,8>;
var<private> b: array<i32,8>;
var<private> c: array<i32,8u>;  // array<i32,8> and array<i32,8u> are the same type

const width = 8;
const height = 8;

// array<i32,8>, array<i32,8u>, and array<i32,width> are the same type.
// Their element counts evaluate to 8.
var<private> d: array<i32,width>;

// array<i32,height> and array<i32,width> are the same type.
var<private> e: array<i32,width>;
var<private> f: array<i32,height>;

Note: The only valid use of an array type sized by an overridable constant is as a memory view in the workgroup address space. This includes the store type of a workgroup variable.

EXAMPLE: Workgroup variables sized by overridable constants
override blockSize = 16;

var<workgroup> odds: array<i32,blockSize>;
var<workgroup> evens: array<i32,blockSize>; // Same type

// None of the following have the same type as 'odds' and 'evens'.

// Different type: Not the identifier 'blockSize'
var<workgroup> evens_0: array<i32,16>;
// Different type: Uses arithmetic to express the element count.
var<workgroup> evens_1: array<i32,(blockSize * 2 / 2)>;
// Different type: Uses parentheses, not just an identifier.
var<workgroup> evens_2: array<i32,(blockSize)>;

// An invalid example, because the overridable element count may only occur
// at the outer level.
// var<workgroup> both: array<array<i32,blockSize>,2>;

// An invalid example, because the overridable element count is only
// valid for workgroup variables.
// var<private> bad_address_space: array<i32,blockSize>;
array_type_specifier :

| array less_than type_specifier ( comma element_count_expression ) ? greater_than

element_count_expression :

| additive_expression

| bitwise_expression

5.2.10. Structure Types

A structure is a named grouping of named member values.

Type Description
struct AStructName {
  M1 : T1,
  ...
  MN : TN,
}
A declaration of a structure type named by identifier AStructName and having N members, where member i is named by identifier Mi and is of type Ti.

N must be at least 1.

Two members of the same structure type must not have the same name.

Structure types are declared at module scope. Elsewhere in the program source, a structure type is denoted by its identifier name. See § 4 Declaration and Scope.

Two structure types are the same if and only if they have the same name.

A structure member type must be one of:

Note: All structure types are concrete.

Note: Each member type must be a plain type.

Some consequences of the restrictions on structure member and array element types are:

EXAMPLE: Structure
// A structure with three members.
struct Data {
  a: i32,
  b: vec2<f32>,
  c: array<i32,10>, // last comma is optional
}

// Declare a variable storing a value of type Data.
var<private> some_data: Data;
struct_decl :

| struct ident struct_body_decl

struct_body_decl :

| brace_left struct_member ( comma struct_member ) * comma ? brace_right

struct_member :

| attribute * member_ident colon type_specifier

The following attributes can be applied to structure members:

Attributes builtin, location, interpolate, and invariant are IO attributes. An IO attribute on a member of a structure S has effect only when S is used as the type of a formal parameter or return type of an entry point. See § 10.3.1 Inter-stage Input and Output Interface.

Attributes align and size are layout attributes, and may be required if the structure type is used to define a uniform buffer or a storage buffer. See § 5.3.6 Memory Layout.

EXAMPLE: Structure declaration
struct my_struct {
  a: f32,
  b: vec4<f32>
}
EXAMPLE: Structure used to declare a buffer
// Runtime Array
type RTArr = array<vec4<f32>>;
struct S {
  a: f32,
  b: f32,
  data: RTArr
}
@group(0) @binding(0) var<storage> buffer: S;

5.2.11. Composite Types

A type is composite if it has internal structure expressed as a composition of other types. The internal parts do not overlap, and are called components. A composite value may be decomposed into its components. See § 7.7 Composite Value Decomposition Expressions.

The composite types are:

For a composite type T, the nesting depth of T, written NestDepth(T) is:

5.2.12. Constructible Types

Many kinds of values can be created, loaded, stored, passed into functions, and returned from functions. We call these constructible.

A type is constructible if it is one of:

Note: All constructible types have a creation-fixed footprint.

Note: Atomic types and runtime-sized array types are not constructible. Composite types containing atomics and runtime-sized arrays are not constructible.

5.2.13. Fixed-Footprint Types

The memory footprint of a variable is the number of memory locations used to store the contents of the variable. The memory footprint of a variable depends on its store type and becomes finalized at some point in the shader lifecycle. Most variables are sized very early, at shader creation time. Some variables may be sized later, at pipeline creation time, and others as late as the start of shader execution.

A type has a creation-fixed footprint if its concretization has a size that is fully determined at shader creation time.

A type has a fixed footprint if its size is fully determined at pipeline creation time.

Note: Pipeline creation depends on shader creation, so a type with creation-fixed footprint also has fixed footprint.

The types with creation-fixed footprint are:

Note: A constructible type has a creation-fixed footprint.

The plain types with fixed footprint are any of:

Note: The only valid use of a fixed-size array with an element count that is an override-expression that is not a const-expression is as a memory view in the workgroup address space. This includes the store type of a workgroup variable.

Note: A fixed-footprint type may contain an atomic type, either directly or indirectly, while a constructible type cannot.

Note: Fixed-footprint types exclude runtime-sized arrays, and any structure that contains a runtime-sized array.

5.3. Memory

In WGSL, a value of storable type may be stored in memory, for later retrieval. This section describes the structure of memory, and how WGSL types are used to describe the contents of memory.

5.3.1. Memory Locations

Memory consists of a set of distinct memory locations. Each memory location is 8 bits in size. An operation affecting memory interacts with a set of one or more memory locations.

Two sets of memory locations overlap if the intersection of their sets of memory locations is non-empty. Each variable declaration has a set of memory locations that does not overlap with the sets of memory locations of any other variable declaration. Memory operations on structures and arrays will not access padding memory locations.

5.3.2. Memory Access Mode

A memory access is an operation that acts on memory locations.

A single operation can read, write, or both read and write.

Particular memory locations may support only certain kinds of accesses, expressed as the memory’s access mode:

read

Supports read accesses, but not writes.

write

Supports write accesses, but not reads.

read_write

Supports both read and write accesses.

When a token matches the access_mode grammar nonterminal, it is considered a context-dependent name. In particular, the token does not resolve to any declared object.

5.3.3. Storable Types

The value contained in a variable must be of a storable type. A storable type may have an explicit representation defined by WGSL, as described in § 5.3.6.4 Internal Layout of Values, or it may be opaque, such as for textures and samplers.

A type is storable if it is both concrete and one of:

Note: That is, the storable types are the concrete plain types, texture types, and sampler types.

5.3.4. Host-shareable Types

Host-shareable types are used to describe the contents of buffers which are shared between the host and the GPU, or copied between host and GPU without format translation. When used for this purpose, the type may additionally have layout attributes applied as described in § 5.3.6 Memory Layout. We will see in § 6.3 var Declarations that the store type of uniform buffer and storage buffer variables must be host-shareable.

A type is host-shareable if it is both concrete and one of:

Note: Many types are host-shareable, but not IO-shareable, including atomic types, runtime-sized arrays, and any composite types containing them.

Note: Both IO-shareable and host-shareable types have specified sizes, but the sizes are counted differently. IO-shareable types are sized by a location-count metric, see § 10.3.1.3 Input-output Locations. Host-shareable types are sized by a byte-count metric, see § 5.3.6 Memory Layout.

5.3.5. Address Spaces

Memory locations are partitioned into address spaces. Each address space has unique properties determining mutability, visibility, the values it may contain, and how to use variables with it. See § 6 Variable and Value Declarations for more details.

Address Spaces
Address space Sharing among invocations Notes
function Same invocation only
private Same invocation only
workgroup Invocations in the same compute shader workgroup The element count of an outermost array may be a pipeline-overridable constant.
uniform Invocations in the same shader stage For uniform buffer variables
storage Invocations in the same shader stage For storage buffer variables
handle Invocations in the same shader stage For sampler and texture variables.

When a token matches the address_space grammar nonterminal, it is considered a context-dependent name. In particular, the token does not resolve to any declared object.

Note: The token handle is reserved: it is never used in a WGSL program.

5.3.6. Memory Layout

Uniform buffer and storage buffer variables are used to share bulk data organized as a sequence of bytes in memory. Buffers are shared between the CPU and the GPU, or between different shader stages in a pipeline, or between different pipelines.

Because buffer data are shared without reformatting or translation, it is a dynamic error if buffer producers and consumers do not agree on the memory layout, which is the description of how the bytes in a buffer are organized into typed WGSL values.

The store type of a buffer variable must be host-shareable, with fully elaborated memory layout, as described below.

Each buffer variable must be declared in either the uniform or storage address spaces.

The memory layout of a type is significant only when evaluating an expression with:

An 8-bit byte is the most basic unit of host-shareable memory. The terms defined in this section express counts of 8-bit bytes.

We will use the following notation:

5.3.6.1. Alignment and Size

Each host-shareable data type T has an alignment and size.

The alignment of a type is a constraint on where values of that type may be placed in memory, expressed as an integer: a type’s alignment must evenly divide the byte address of the starting memory location of a value of that type. Alignments enable use of more efficient hardware instructions for accessing the values, or satisfy more restrictive hardware requirements on certain address spaces. (See address space layout constraints).

Note: Each alignment value is always a power of two, by construction.

The byte-size of a type or structure member is the number of contiguous bytes reserved in host-shareable memory for the purpose of storing a value of the type or structure member. The size may include non-addressable padding at the end of the type. Consequently, loads and stores of a value might access fewer memory locations than the value’s size.

Alignment and size for host-shareable types are defined recursively in the following table:

Alignment and size for host-shareable types
Host-shareable type T AlignOf(T) SizeOf(T)
i32, u32, or f32 4 4
f16 2 2
atomic<T> 4 4
vec2<T>, T is i32, u32, or f32 8 8
vec2<f16> 4 4
vec3<T>, T is i32, u32, or f32 16 12
vec3<f16> 8 6
vec4<T>, T is i32, u32, or f32 16 16
vec4<f16> 8 8
matCxR (col-major, general form) AlignOf(vecR) SizeOf(array<vecR, C>)
mat2x2<f32> 8 16
mat2x2<f16> 4 8
mat3x2<f32> 8 24
mat3x2<f16> 4 12
mat4x2<f32> 8 32
mat4x2<f16> 4 16
mat2x3<f32> 16 32
mat2x3<f16> 8 16
mat3x3<f32> 16 48
mat3x3<f16> 8 24
mat4x3<f32> 16 64
mat4x3<f16> 8 32
mat2x4<f32> 16 32
mat2x4<f16> 8 16
mat3x4<f32> 16 48
mat3x4<f16> 8 24
mat4x4<f32> 16 64
mat4x4<f16> 8 32
struct S with members M1...MN max(AlignOfMember(S,1), ... , AlignOfMember(S,N)) roundUp(AlignOf(S), justPastLastMember), where justPastLastMember = OffsetOfMember(S,N) + SizeOfMember(S,N)
array<E, N> AlignOf(E) N × roundUp(AlignOf(E), SizeOf(E))
array<E> AlignOf(E) Nruntime × roundUp(AlignOf(E), SizeOf(E)), where Nruntime is the runtime-determined number of elements of T
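
The recursive rows of the table can be checked with a few lines of arithmetic. The Python sketch below assumes that roundUp(k, n) denotes the smallest multiple of k that is greater than or equal to n, and it copies a handful of (AlignOf, SizeOf) pairs from the table. The function names are hypothetical and only arrays and matrices are modelled.

```python
def round_up(k: int, n: int) -> int:
    """Smallest multiple of k that is >= n (assumed meaning of roundUp(k, n))."""
    return -(-n // k) * k

# (AlignOf, SizeOf) pairs copied from the table for a few element types.
LAYOUT = {
    "f32": (4, 4),
    "f16": (2, 2),
    "vec2<f32>": (8, 8),
    "vec3<f32>": (16, 12),
    "vec4<f32>": (16, 16),
    "vec3<f16>": (8, 6),
}

def array_layout(elem: str, count: int):
    """Return (alignment, element stride, size) of array<elem, count>."""
    align, size = LAYOUT[elem]
    stride = round_up(align, size)
    return align, stride, count * stride

def matrix_layout(cols: int, rows: int, comp: str):
    """Return (alignment, size) of matCxR<comp>, laid out as C column vectors."""
    column = f"vec{rows}<{comp}>"
    align, _, size = array_layout(column, cols)
    return align, size

print(array_layout("vec3<f32>", 8))   # alignment, stride, size
print(matrix_layout(3, 3, "f32"))     # alignment, size
```

The element stride is what makes a vec3<f32> array larger than the sum of its element sizes: each 12-byte element occupies a 16-byte slot.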
5.3.6.2. Structure Member Layout

The internal layout of a structure is computed from the sizes and alignments of its members. By default, the members are arranged tightly, in order, without overlap, while satisfying member alignment requirements.

This default internal layout can be overridden by using layout attributes, which are:

The i’th member of structure type S has a size and alignment, denoted by SizeOfMember(S, i) and AlignOfMember(S, i), respectively. The member sizes and alignments are used to calculate each member’s byte offset from the start of the structure, as described in § 5.3.6.4 Internal Layout of Values.

SizeOfMember(S, i) is k if the i’th member of S has attribute size(k). Otherwise, it is SizeOf(T) where T is the type of the member.

AlignOfMember(S, i) is k if the i’th member of S has attribute align(k). Otherwise, it is AlignOf(T) where T is the type of the member.

If a structure member has the size attribute applied, the value must be at least as large as the size of the member’s type:

SizeOfMember(S, i) ≥ SizeOf(T)
Where T is the type of the i’th member of S.

The first structure member always has a zero byte offset from the start of the structure:

OffsetOfMember(S, 1) = 0

Each subsequent member is placed at the lowest offset that satisfies the member type alignment, and which avoids overlap with the previous member. For each member index i > 1:

OffsetOfMember(S, i) = roundUp(AlignOfMember(S, i), OffsetOfMember(S, i-1) + SizeOfMember(S, i-1))
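
The offset rules can be traced mechanically. The Python sketch below is an informal model that assumes roundUp(k, n) is the smallest multiple of k that is at least n, and that each member is given as a (name, alignment, size) triple with any align or size attributes already applied. The function name struct_layout is hypothetical; the member values correspond to the structures A and B used in this section's examples with implicit sizes and alignments.

```python
def round_up(k: int, n: int) -> int:
    """Smallest multiple of k that is >= n (assumed meaning of roundUp(k, n))."""
    return -(-n // k) * k

def struct_layout(members):
    """Compute member offsets, structure alignment, and structure size.

    members is a list of (name, AlignOfMember, SizeOfMember) triples.
    """
    offsets = {}
    end = 0  # Offset just past the previous member.
    for index, (name, align, size) in enumerate(members):
        # The first member is at offset 0; later members are placed at the
        # lowest aligned offset that does not overlap the previous member.
        offset = 0 if index == 0 else round_up(align, end)
        offsets[name] = offset
        end = offset + size
    struct_align = max(align for _, align, _ in members)
    return offsets, struct_align, round_up(struct_align, end)

# struct A { u: f32, v: f32, w: vec2<f32>, x: f32 }
layout_a = struct_layout([("u", 4, 4), ("v", 4, 4), ("w", 8, 8), ("x", 4, 4)])

# struct B, where e has type A (align 8, size 24) and g is array<A, 3> (size 72).
layout_b = struct_layout([
    ("a", 8, 8), ("b", 16, 12), ("c", 4, 4), ("d", 4, 4),
    ("e", 8, 24), ("f", 16, 12), ("g", 8, 72), ("h", 4, 4),
])

print(layout_a)
print(layout_b)
```

Passing a larger size or alignment for a member models the effect of a size or align attribute, for example ("x", 4, 16) for a member declared with @size(16).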

EXAMPLE: Layout of structures using implicit member sizes and alignments
struct A {                                     //             align(8)  size(24)
    u: f32,                                    // offset(0)   align(4)  size(4)
    v: f32,                                    // offset(4)   align(4)  size(4)
    w: vec2<f32>,                              // offset(8)   align(8)  size(8)
    x: f32                                     // offset(16)  align(4)  size(4)
    // -- implicit struct size padding --      // offset(20)            size(4)
}

struct B {                                     //             align(16) size(160)
    a: vec2<f32>,                              // offset(0)   align(8)  size(8)
    // -- implicit member alignment padding -- // offset(8)             size(8)
    b: vec3<f32>,                              // offset(16)  align(16) size(12)
    c: f32,                                    // offset(28)  align(4)  size(4)
    d: f32,                                    // offset(32)  align(4)  size(4)
    // -- implicit member alignment padding -- // offset(36)            size(4)
    e: A,                                      // offset(40)  align(8)  size(24)
    f: vec3<f32>,                              // offset(64)  align(16) size(12)
    // -- implicit member alignment padding -- // offset(76)            size(4)
    g: array<A, 3>,    // element stride 24       offset(80)  align(8)  size(72)
    h: i32                                     // offset(152) align(4)  size(4)
    // -- implicit struct size padding --      // offset(156)           size(4)
}

@group(0) @binding(0)
var<storage,read_write> storage_buffer: B;
EXAMPLE: Layout of structures with explicit member sizes and alignments
struct A {                                     //             align(8)  size(32)
    u: f32,                                    // offset(0)   align(4)  size(4)
    v: f32,                                    // offset(4)   align(4)  size(4)
    w: vec2<f32>,                              // offset(8)   align(8)  size(8)
    @size(16) x: f32                           // offset(16)  align(4)  size(16)
}

struct B {                                     //             align(16) size(208)
    a: vec2<f32>,                              // offset(0)   align(8)  size(8)
    // -- implicit member alignment padding -- // offset(8)             size(8)
    b: vec3<f32>,                              // offset(16)  align(16) size(12)
    c: f32,                                    // offset(28)  align(4)  size(4)
    d: f32,                                    // offset(32)  align(4)  size(4)
    // -- implicit member alignment padding -- // offset(36)            size(12)
    @align(16) e: A,                           // offset(48)  align(16) size(32)
    f: vec3<f32>,                              // offset(80)  align(16) size(12)
    // -- implicit member alignment padding -- // offset(92)            size(4)
    g: array<A, 3>,    // element stride 32       offset(96)  align(8)  size(96)
    h: i32                                     // offset(192) align(4)  size(4)
    // -- implicit struct size padding --      // offset(196)           size(12)
}

@group(0) @binding(0)
var<uniform> uniform_buffer: B;
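Note: The offsets, sizes, and alignments in the two structure examples above can be reproduced mechanically. The following Python sketch is informative only. It assumes that the first member is at offset 0, that each later member follows the OffsetOfMember formula, that AlignOf(S) is the maximum member alignment, and that SizeOf(S) is the end of the last member rounded up to AlignOf(S). The helper names round_up, struct_layout, and layout_b are hypothetical. An align or size attribute is modeled by passing the attribute's value in place of the type's default.

```python
def round_up(k, n):
    """Return the smallest multiple of k that is >= n, as roundUp(k, n) does."""
    return -(-n // k) * k


def struct_layout(members):
    """Lay out members given as (name, align, size) tuples in declaration order.

    Returns (offsets, struct_align, struct_size). An @align or @size attribute
    is modeled by passing its value instead of the type's default.
    """
    offsets = {}
    end = 0  # first byte past the previous member
    struct_align = 1
    for name, align, size in members:
        offsets[name] = round_up(align, end)
        end = offsets[name] + size
        struct_align = max(struct_align, align)
    return offsets, struct_align, round_up(struct_align, end)


def layout_b(a_align, a_size, e_align):
    """Lay out struct B from the examples for a given layout of struct A."""
    stride = round_up(a_align, a_size)  # element stride of array<A, 3>
    return struct_layout([
        ("a", 8, 8), ("b", 16, 12), ("c", 4, 4), ("d", 4, 4),
        ("e", e_align, a_size), ("f", 16, 12),
        ("g", a_align, 3 * stride), ("h", 4, 4),
    ])


# Implicit layout: every member uses its type's default alignment and size.
_, a_align, a_size = struct_layout(
    [("u", 4, 4), ("v", 4, 4), ("w", 8, 8), ("x", 4, 4)])
implicit_b = layout_b(a_align, a_size, e_align=a_align)

# Explicit layout: '@size(16) x: f32' in A and '@align(16) e: A' in B.
_, a2_align, a2_size = struct_layout(
    [("u", 4, 4), ("v", 4, 4), ("w", 8, 8), ("x", 4, 16)])
explicit_b = layout_b(a2_align, a2_size, e_align=16)

print(implicit_b)
print(explicit_b)
```

The printed offsets and sizes can be compared line by line with the comments in the two examples.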

5.3.6.3. Array Layout Examples

EXAMPLE: Fixed-size array layout examples
// Array where:
//   - alignment is 4 = AlignOf(f32)
//   - element stride is 4 = roundUp(AlignOf(f32),SizeOf(f32)) = roundUp(4,4)
//   - size is 32 = stride * number_of_elements = 4 * 8
var small_stride: array<f32, 8>;

// Array where:
//   - alignment is 16 = AlignOf(vec3<f32>)
//   - element stride is 16 = roundUp(AlignOf(vec3<f32>), SizeOf(vec3<f32>))
//                          = roundUp(16,12)
//   - size is 128 = stride * number_of_elements = 16 * 8
var bigger_stride: array<vec3<f32>, 8>;
EXAMPLE: Runtime-sized array layout examples
// Array where:
//   - alignment is 4 = AlignOf(f32)
//   - element stride is 4 = roundUp(AlignOf(f32),SizeOf(f32)) = 4
// If B is the effective buffer binding size for the binding on the
// draw or dispatch command, the number of elements is:
//   N_runtime = floor(B / element stride) = floor(B / 4)
@group(0) @binding(0)
var<storage> weights: array<f32>;

// Array where:
//   - alignment is 16 = AlignOf(vec3<f32>)
//   - element stride is 16 = roundUp(AlignOf(vec3<f32>), SizeOf(vec3<f32>))
//                          = roundUp(16,12)
// If B is the effective buffer binding size for the binding on the
// draw or dispatch command, the number of elements is:
//   N_runtime = floor(B / element stride) = floor(B / 16)
@group(0) @binding(1)
var<storage> directions: array<vec3<f32>>;
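Note: The stride, size, and element-count arithmetic in the comments above can be checked with the following informative Python sketch. The function names are hypothetical. The 100-byte binding size is an arbitrary value chosen for illustration.

```python
def round_up(k, n):
    """Return the smallest multiple of k that is >= n, as roundUp(k, n) does."""
    return -(-n // k) * k


def element_stride(elem_align, elem_size):
    """Element stride = roundUp(AlignOf(E), SizeOf(E))."""
    return round_up(elem_align, elem_size)


def fixed_array_size(elem_align, elem_size, count):
    """Size of a fixed-size array = stride * number_of_elements."""
    return element_stride(elem_align, elem_size) * count


def runtime_element_count(binding_size, elem_align, elem_size):
    """N_runtime = floor(B / element stride)."""
    return binding_size // element_stride(elem_align, elem_size)


print(fixed_array_size(4, 4, 8))            # array<f32, 8>
print(fixed_array_size(16, 12, 8))          # array<vec3<f32>, 8>
print(runtime_element_count(100, 4, 4))     # array<f32> in a 100-byte binding
print(runtime_element_count(100, 16, 12))   # array<vec3<f32>> in a 100-byte binding
```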

5.3.6.4. Internal Layout of Values

This section describes how the internals of a value are placed in the byte locations of a buffer, given an assumed placement of the overall value. These layouts depend on the value’s type, and the align and size attributes on structure members.

The buffer byte offset at which a value is placed must satisfy the type alignment requirement: If a value of type T is placed at buffer offset k, then k = c × AlignOf(T), for some non-negative integer c.

The data will appear identically regardless of the address space.

When a value V of type u32 or i32 is placed at byte offset k of a host-shared buffer, then:

Note: Recall that i32 uses two's-complement representation, so the sign bit is in bit position 31.

A value V of type f32 is represented in IEEE-754 binary32 format. It has one sign bit, 8 exponent bits, and 23 fraction bits. When V is placed at byte offset k of a host-shared buffer, then:

A value V of type f16 is represented in IEEE-754 binary16 format. It has one sign bit, 5 exponent bits, and 10 fraction bits. When V is placed at byte offset k of a host-shared buffer, then:

Note: The above rules imply that numeric values in host-shared buffers are stored in little-endian format.
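Note: The little-endian placement described in the note above can be observed with Python's standard struct module. The following sketch is informative only. The bytearray stands in for a host-shared buffer, the helper name place_scalar is hypothetical, and the sketch relies on the alignment of each of these scalar types being equal to its size.

```python
import struct

# Little-endian struct format codes for the WGSL scalar types.
FORMATS = {"u32": "<I", "i32": "<i", "f32": "<f", "f16": "<e"}


def place_scalar(buffer, k, wgsl_type, value):
    """Write value into buffer at byte offset k, least significant byte first."""
    fmt = FORMATS[wgsl_type]
    size = struct.calcsize(fmt)
    # For these scalar types the alignment equals the size.
    if k % size != 0:
        raise ValueError(f"offset {k} is not a multiple of AlignOf({wgsl_type})")
    struct.pack_into(fmt, buffer, k, value)


buf = bytearray(16)
place_scalar(buf, 0, "u32", 0x11223344)  # byte 0 receives bits 0 through 7
place_scalar(buf, 4, "i32", -2)          # two's complement: 0xFFFFFFFE
place_scalar(buf, 8, "f32", 1.0)         # binary32 bit pattern 0x3F800000
place_scalar(buf, 12, "f16", 1.0)        # binary16 bit pattern 0x3C00
print(buf.hex())
```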

When a value V of atomic type atomic<T> is placed in a host-shared buffer, it has the same internal layout as a value of the underlying type T.

When a value V of vector type vecN<T> is placed at byte offset k of a host-shared buffer, then:

When a value V of matrix type matCxR<T> is placed at byte offset k of a host-shared buffer, then:

When a value of array type A is placed at byte offset k of a host-shared memory buffer, then:

When a value of structure type S is placed at byte offset k of a host-shared memory buffer, then:

5.3.6.5. Address Space Layout Constraints

The storage and uniform address spaces have different buffer layout constraints which are described in this section.

All structure and array types directly or indirectly referenced by a variable must obey the constraints of the variable’s address space. A violation of an address space constraint results in a shader-creation error.

In this section we define RequiredAlignOf(S, C) as the byte offset alignment requirement of values of host-shareable type S when used in address space C.

Alignment requirements of a host-shareable type for storage and uniform address spaces
Host-shareable type S | RequiredAlignOf(S, storage) | RequiredAlignOf(S, uniform)
i32, u32, f32, or f16 | AlignOf(S) | AlignOf(S)
atomic<T> | AlignOf(S) | AlignOf(S)
vecN<T> | AlignOf(S) | AlignOf(S)
matCxR<T> | AlignOf(S) | AlignOf(S)
array<T, N> | AlignOf(S) | roundUp(16, AlignOf(S))
array<T> | AlignOf(S) | roundUp(16, AlignOf(S))
struct S | AlignOf(S) | roundUp(16, AlignOf(S))

Structure members of type T must have a byte offset from the start of the structure that is a multiple of the RequiredAlignOf(T, C) for the address space C:

OffsetOfMember(S, M) = k × RequiredAlignOf(T, C)
Where k is a non-negative integer (k is 0 for a member at offset 0) and M is a member of structure S with type T

Arrays of element type T must have an element stride that is a multiple of the RequiredAlignOf(T, C) for the address space C:

StrideOf(array<T, N>) = k × RequiredAlignOf(T, C)
StrideOf(array<T>) = k × RequiredAlignOf(T, C)
Where k is a positive integer

Note: RequiredAlignOf(T, C) does not impose any additional restrictions on the values permitted for an align attribute, nor does it affect the rules of AlignOf(T). Data is laid out with the rules defined in previous sections and then the resulting layout is validated against the RequiredAlignOf(T, C) rules.
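Note: The two-step process described above (lay out first, then validate) can be sketched as follows. This Python sketch is informative only. The function names and the 'kind' strings are hypothetical. The sample values are taken from member e of struct B in the structure layout examples of the earlier section, where AlignOf(A) is 8.

```python
def round_up(k, n):
    """Return the smallest multiple of k that is >= n, as roundUp(k, n) does."""
    return -(-n // k) * k


def required_align_of(kind, align_of, address_space):
    """RequiredAlignOf(S, C) per the table above.

    kind is one of 'scalar', 'atomic', 'vector', 'matrix', 'array', 'struct'.
    align_of is AlignOf(S), computed by the ordinary layout rules.
    """
    if address_space not in ("storage", "uniform"):
        raise ValueError("only the storage and uniform address spaces are covered")
    if address_space == "uniform" and kind in ("array", "struct"):
        return round_up(16, align_of)
    return align_of


def member_offset_is_valid(offset, kind, align_of, address_space):
    """Check that an already-computed member offset meets RequiredAlignOf."""
    return offset % required_align_of(kind, align_of, address_space) == 0


# Implicit layout places 'e: A' at offset 40.
print(member_offset_is_valid(40, "struct", 8, "storage"))
print(member_offset_is_valid(40, "struct", 8, "uniform"))
# '@align(16) e: A' moves the member to offset 48.
print(member_offset_is_valid(48, "struct", 16, "uniform"))
```

Under these assumptions the offset 40 passes for the storage address space but not for the uniform address space, which is consistent with the explicit example using an align attribute for its uniform buffer.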

The uniform address space also requires that:

Note: The following examples show how to use align and size attributes on structure members to satisfy layout requirements for uniform buffers. In particular, these techniques can be used to mechanically transform a GLSL buffer with std140 layout to WGSL.

EXAMPLE: Satisfying offset requirements for uniform address space
struct S {
  x: f32
}
struct Invalid {
  a: S,
  b: f32 // invalid: offset between a and b is 4 bytes, but must be at least 16
}
@group(0) @binding(0) var<uniform> invalid: Invalid;

struct Valid {
  a: S,
  @align(16) b: f32 // valid: offset between a and b is 16 bytes
}
@group(0) @binding(1) var<uniform> valid: Valid;
EXAMPLE: Satisfying stride requirements for uniform address space
struct small_stride {
  a: array<f32,8> // stride 4
}
// Invalid, stride must be a multiple of 16
@group(0) @binding(0) var<uniform> invalid: small_stride;

struct wrapped_f32 {
  @size(16) elem: f32
}
struct big_stride {
  a: array<wrapped_f32,8> // stride 16
}
@group(0) @binding(1) var<uniform> valid: big_stride;     // Valid
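Note: The failures and fixes in the two examples above can be checked numerically. The following Python sketch is informative only. It takes its two rules from the example comments: in the uniform address space an array element stride must be a multiple of 16, and a member that follows a structure-typed member must start at least 16 bytes after the start of that member. The sketch generalizes the second rule to roundUp(16, SizeOf(S)). That generalization is an assumption; it equals 16 for the 4-byte struct S used here. The function names are hypothetical.

```python
def round_up(k, n):
    """Return the smallest multiple of k that is >= n, as roundUp(k, n) does."""
    return -(-n // k) * k


def uniform_stride_is_valid(elem_align, elem_size):
    """An array element stride in the uniform address space must be a multiple of 16."""
    return round_up(elem_align, elem_size) % 16 == 0


def gap_after_struct_is_valid(struct_offset, struct_size, next_offset):
    """Assumed rule: the next member starts at least roundUp(16, SizeOf(S)) later."""
    return next_offset - struct_offset >= round_up(16, struct_size)


# struct Invalid: 'a: S' at offset 0 with SizeOf(S) = 4, 'b: f32' at offset 4.
print(gap_after_struct_is_valid(0, 4, 4))
# struct Valid: '@align(16) b: f32' moves b to offset 16.
print(gap_after_struct_is_valid(0, 4, 16))
# array<f32, 8> has stride roundUp(4, 4) = 4.
print(uniform_stride_is_valid(4, 4))
# array<wrapped_f32, 8> has stride roundUp(4, 16) = 16, because of '@size(16)'.
print(uniform_stride_is_valid(4, 16))
```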

5.4. Memory Views

In addition to calculating with plain values, a WGSL program will also often read values from memory or write values to memory, via memory access operations. Each memory access is performed via a memory view.

A memory view comprises:

The access mode of a memory view must be supported by the address space. See § 5.3.5 Address Spaces.

5.4.1. Reference and Pointer Types

WGSL has two kinds of types for representing memory views: reference types and pointer types.

Constraint | Type | Description
AS is an address space, T is a storable type, AM is an access mode | ref<AS,T,AM> | The reference type identified with the set of memory views for memory locations in AS holding values of type T, supporting memory accesses described by mode AM. Here, T is the store type. Reference types are not written in WGSL program source; instead they are used to analyze a WGSL program.
AS is an address space, T is a storable type, AM is an access mode | ptr<AS,T,AM> | The pointer type identified with the set of memory views for memory locations in AS holding values of type T, supporting memory accesses described by mode AM. Here, T is the store type. Pointer types may appear in WGSL program source.

Two pointer types are the same if and only if they have the same address space, store type, and access mode.

When analyzing a WGSL program, reference and pointer types are fully parameterized by an address space, a storable type, and an access mode. In code examples in this specification, the comments show this fully parameterized form.

However, in WGSL source text:

EXAMPLE: Pointer type
fn my_function(
  /* 'ptr<function,i32,read_write>' is the type of a pointer value that references
     memory for keeping an 'i32' value, using memory locations in the 'function'
     address space.  Here 'i32' is the store type.
     The implied access mode is 'read_write'. See below for access mode defaults. */
  ptr_int: ptr<function,i32>,

  // 'ptr<private,array<f32,50>,read_write>' is the type of a pointer value that
  // refers to memory for keeping an array of 50 elements of type 'f32', using
  // memory locations in the 'private' address space.
  // Here the store type is 'array<f32,50>'.
  // The implied access mode is 'read_write'. See below for access mode defaults.
  ptr_array: ptr<private, array<f32, 50>>
) { }

Reference types and pointer types are both sets of memory views: a particular memory view is associated with a unique reference value and also a unique pointer value:

Each pointer value p of type ptr<AS,T,AM> corresponds to a unique reference value r of type ref<AS,T,AM>, and vice versa, where p and r describe the same memory view.

5.4.2. Access Mode Defaults

The access mode for a memory view is often determined by context:

The storage address space supports both read and read_write access modes. Each other address space supports only one access mode. The default access mode for each address space is described in the following table.

Access Mode Defaults.
Address Space Default Access Mode
function read_write
private read_write
workgroup read_write
uniform read
storage read
handle read
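Note: The table above can be read as a lookup used to complete a pointer type whose access mode is omitted in source text. The following Python sketch is informative only. The function name full_pointer_type is hypothetical, and the sketch assumes that an omitted access mode always resolves to the table's default for the address space, as the comments in the pointer type example describe.

```python
# Default access mode for each address space, copied from the table above.
DEFAULT_ACCESS_MODE = {
    "function": "read_write",
    "private": "read_write",
    "workgroup": "read_write",
    "uniform": "read",
    "storage": "read",
    "handle": "read",
}


def full_pointer_type(address_space, store_type, access_mode=None):
    """Return the fully parameterized form of a pointer type as a string."""
    if access_mode is None:
        access_mode = DEFAULT_ACCESS_MODE[address_space]
    return f"ptr<{address_space},{store_type},{access_mode}>"


print(full_pointer_type("function", "i32"))
print(full_pointer_type("storage", "Particle"))
print(full_pointer_type("storage", "Particle", "read_write"))
```

Because the default for the storage address space is read, a pointer into a read_write storage buffer has to spell out the access mode.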

When writing a variable declaration or a pointer type in WGSL source:

5.4.3. Originating Variable

In WGSL a reference value always corresponds to the memory view for some or all of the memory locations for some variable. This defines the originating variable for the reference value.

A pointer value always corresponds to a reference value, and so the originating variable of a pointer is the same as the originating variable of the corresponding reference.

Note: The originating variable is a dynamic concept. The originating variable for a formal parameter of a function depends on the call sites for the function. Different call sites may supply pointers into different originating variables.

5.4.4. Invalid Memory Reference

If a reference or pointer access is out of bounds, an invalid memory reference is produced. Loads from an invalid reference return one of:

Stores to an invalid reference may do one of:

If a read-modify-write atomic operation on an invalid memory reference accesses memory, it is a dynamic error for its load and its store to use different memory locations.

5.4.5. Use Cases for References and Pointers

References and pointers are distinguished by how they are used:

Defining references in this way enables simple idiomatic use of variables:

EXAMPLE: Reference types enable simple use of variables
@compute @workgroup_size(1)
fn main() {
  // 'i' has reference type ref<function,i32,read_write>
  // The memory locations for 'i' store the i32 value 0.
  var i: i32 = 0;

  // 'i + 1' can only match a type rule where the 'i' subexpression is of type i32.
  // So the expression 'i + 1' has type i32, and at evaluation, the 'i' subexpression
  // evaluates to the i32 value stored in the memory locations for 'i' at the time
  // of evaluation.
  let one: i32 = i + 1;

  // Update the value in the locations referenced by 'i' so they hold the value 2.
  i = one + 1;

  // Update the value in the locations referenced by 'i' so they hold the value 5.
  // The evaluation of the right-hand-side occurs before the assignment takes effect.
  i = i + 3;
}
EXAMPLE: Returning a reference returns the value loaded via the reference
var<private> age: i32;
fn get_age() -> i32 {
  // The type of the expression in the return statement must be 'i32' since it
  // must match the declared return type of the function.
  // The 'age' expression is of type ref<private,i32,read_write>.
  // Apply the Load Rule, since the store type of the reference matches the
  // required type of the expression, and no other type rule applies.
  // The evaluation of 'age' in this context is the i32 value loaded from the
  // memory locations referenced by 'age' at the time the return statement is
  // executed.
  return age;
}

fn caller() {
  age = 21;
  // The copy_age constant will get the i32 value 21.
  let copy_age: i32 = get_age();
}

Defining pointers in this way enables two key use cases:

Note: The following examples use WGSL features explained later in this specification.

EXAMPLE: Using a pointer as a short name for part of a variable
struct Particle {
  position: vec3<f32>,
  velocity: vec3<f32>
}
struct System {
  active_index: i32,
  timestep: f32,
  particles: array<Particle,100>
}
@group(0) @binding(0) var<storage,read_write> system: System;

@compute @workgroup_size(1)
fn main() {
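  // Form a pointer to a specific Particle in storage memory.
  // The access mode is spelled out because 'system' is read_write, while
  // the default access mode for the storage address space is read.
  let active_particle: ptr<storage,Particle,read_write> =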
      &system.particles[system.active_index];

  let delta_position: vec3<f32> = (*active_particle).velocity * system.timestep;
  let current_position: vec3<f32>  = (*active_particle).position;
  (*active_particle).position = delta_position + current_position;
}
EXAMPLE: Using a pointer as a formal parameter
fn add_one(x: ptr<function,i32>) {
  /* Update the locations for 'x' to contain the next higher integer value,
     (or to wrap around to the largest negative i32 value).
     On the left-hand side, unary '*' converts the pointer to a reference that
     can then be assigned to. It has a read_write access mode, by default.
     /* On the right-hand side:
        - Unary '*' converts the pointer to a reference, with a read_write
          access mode.
        - The only matching type rule is for addition (+) and requires '*x' to
          have type i32, which is the store type for '*x'.  So the Load Rule
          applies and '*x' evaluates to the value stored in the memory for '*x'
          at the time of evaluation, which is the i32 value for 0.
        - Add 1 to 0, to produce a final value of 1 for the right-hand side. */
     Store 1 into the memory for '*x'. */
  *x = *x + 1;
}

@compute @workgroup_size(1)
fn main() {
  var i: i32 = 0;

  // Modify the contents of 'i' so it will contain 1.
  // Use unary '&' to get a pointer value for 'i'.
  // This is a clear signal that the called function has access to the memory
  // for 'i', and may modify it.
  add_one(&i);
  let one: i32 = i;  // 'one' has value 1.
}

5.4.6. Forming Reference and Pointer Values

A reference value is formed in one of the following ways:

In all cases, the access mode of the result is the same as the access mode of the original reference.

EXAMPLE: Component reference from a composite reference
struct S {
    age: i32,
    weight: f32
}
var<private> person: S;
// Elsewhere, 'person' denotes the reference to the memory underlying the variable,
// and will have type ref<private,S,read_write>.

fn f() {
    var uv: vec2<f32>;
    // For the remainder of this function body, 'uv' denotes the reference
    // to the memory underlying the variable, and will have type
    // ref<function,vec2<f32>,read_write>.

    // Evaluate the left-hand side of the assignment:
    //   Evaluate 'uv.x' to yield a reference:
    //   1. First evaluate 'uv', yielding a reference to the memory for
    //      the 'uv' variable. The result has type ref<function,vec2<f32>,read_write>.
    //   2. Then apply the '.x' vector access phrase, yielding a reference to
    //      the memory for the first component of the vector pointed at by the
    //      reference value from the previous step.
    //      The result has type ref<function,f32,read_write>.
    // Evaluating the right-hand side of the assignment yields the f32 value 1.0.
    // Store the f32 value 1.0 into the memory locations referenced by uv.x.
    uv.x = 1.0;

    // Evaluate the left-hand side of the assignment:
    //   Evaluate 'uv[1]' to yield a reference:
    //   1. First evaluate 'uv', yielding a reference to the memory for
    //      the 'uv' variable. The result has type ref<function,vec2<f32>,read_write>.
    //   2. Then apply the '[1]' array index phrase, yielding a reference to
    //      the memory for second component of the vector referenced from
    //      the previous step.  The result has type ref<function,f32,read_write>.
    // Evaluating the right-hand side of the assignment yields the f32 value 2.0.
    // Store the f32 value 2.0 into the memory locations referenced by uv[1].
    uv[1] = 2.0;

    var m: mat3x2<f32>;
    // When evaluating 'm[2]':
    // 1. First evaluate 'm', yielding a reference to the memory for
    //    the 'm' variable. The result has type ref<function,mat3x2<f32>,read_write>.
    // 2. Then apply the '[2]' array index phrase, yielding a reference to
    //    the memory for the third column vector pointed at by the reference
    //    value from the previous step.
    //    Therefore the 'm[2]' expression has type ref<function,vec2<f32>,read_write>.
    // The 'let' declaration is for type vec2<f32>, so the declaration
    // statement requires the initializer to be of type vec2<f32>.
    // The Load Rule applies (because no other type rule can apply), and
    // the evaluation of the initializer yields the vec2<f32> value loaded
    // from the memory locations referenced by 'm[2]' at the time the declaration
    // is executed.
    let p_m_col2: vec2<f32> = m[2];

    var A: array<i32,5>;
    // When evaluating 'A[4]'
    // 1. First evaluate 'A', yielding a reference to the memory for
    //    the 'A' variable. The result has type ref<function,array<i32,5>,read_write>.
    // 2. Then apply the '[4]' array index phrase, yielding a reference to
    //    the memory for the fifth element of the array referenced by
    //    the reference value from the previous step.
    //    The result value has type ref<function,i32,read_write>.
    // The let-declaration requires the right-hand-side to be of type i32.
    // The Load Rule applies (because no other type rule can apply), and
    // the evaluation of the initializer yields the i32 value loaded from
    // the memory locations referenced by 'A[4]' at the time the declaration
    // is executed.
    let A_4_value: i32 = A[4];

    // When evaluating 'person.weight'
    // 1. First evaluate 'person', yielding a reference to the memory for
    //    the 'person' variable declared at module scope.
    //    The result has type ref<private,S,read_write>.
    // 2. Then apply the '.weight' member access phrase, yielding a reference to
    //    the memory for the second member of the memory referenced by
    //    the reference value from the previous step.
    //    The result has type ref<private,f32,read_write>.
    // The let-declaration requires the right-hand-side to be of type f32.
    // The Load Rule applies (because no other type rule can apply), and
    // the evaluation of the initializer yields the f32 value loaded from
    // the memory locations referenced by 'person.weight' at the time the
    // declaration is executed.
    let person_weight: f32 = person.weight;
}

A pointer value is formed in one of the following ways:

In all cases, the access mode of the result is the same as the access mode of the original pointer.

EXAMPLE: Pointer from a variable
// Declare a variable in the private address space, for storing an f32 value.
var<private> x: f32;

fn f() {
    // Declare a variable in the function address space, for storing an i32 value.
    var y: i32;

    // The name 'x' resolves to the module-scope variable 'x',
    // and has reference type ref<private,f32,read_write>.
    // Applying the unary '&' operator converts the reference to a pointer.
    // The access mode is the same as the access mode of the original variable, so
    // the fully specified type is ptr<private,f32,read_write>.  But read_write
    // is the default access mode for the private address space, so read_write does
    // not have to be spelled out in this case.
    let x_ptr: ptr<private,f32> = &x;

    // The name 'y' resolves to the function-scope variable 'y',
    // and has reference type ref<function,i32,read_write>.
    // Applying the unary '&' operator converts the reference to a pointer.
    // The access mode defaults to 'read_write'.
    let y_ptr: ptr<function,i32> = &y;

    // A new variable, distinct from the variable declared at module scope.
    var x: u32;

    // Here, the name 'x' resolves to the function-scope variable 'x' declared in
    // the previous statement, and has type ref<function,u32,read_write>.
    // Applying the unary '&' operator converts the reference to a pointer.
    // The access mode defaults to 'read_write'.
    let inner_x_ptr: ptr<function,u32> = &x;
}

5.4.7. Comparison with References and Pointers in Other Languages

This section is informative, not normative.

References and pointers in WGSL are more restricted than in other languages. In particular:

Note: From the above rules, it is not possible to form a "dangling" pointer, i.e. a pointer that does not reference the memory for a valid (or "live") originating variable.

5.5. Texture and Sampler Types

A texel is a scalar or vector used as the smallest independently accessible element of a texture. The word texel is short for texture element.

A texture is a collection of texels supporting special operations useful for rendering. In WGSL, those operations are invoked via texture builtin functions. See § 17.5 Texture Built-in Functions for a complete list.

A WGSL texture corresponds to a WebGPU GPUTexture.

A texture is either arrayed, or non-arrayed:

A texture has the following features:

texel format

The data in each texel. See § 5.5.1 Texel Formats.

dimensionality

The number of dimensions in the grid coordinates, and how the coordinates are interpreted. The number of dimensions is 1, 2, or 3. Most textures use cartesian coordinates. Cube textures have six square faces, and are sampled with a three dimensional coordinate interpreted as a direction vector from the origin toward the cube centered on the origin.

size

The extent of grid coordinates along each dimension.

mip level count

The mip level count is at least 1 for sampled textures, and equal to 1 for storage textures.
Mip level 0 contains a full size version of the texture. Each successive mip level contains a filtered version of the previous mip level at half the size (within rounding) of the previous mip level.
When sampling a texture, an explicit or implicitly-computed level-of-detail is used to select the mip levels from which to read texel data. These are then combined via filtering to produce the sampled value.

arrayed

whether the texture is arrayed.

array size

the number of homogeneous grids, if the texture is arrayed
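Note: The mip level sizes described above can be tabulated as follows. The text above only states that each level is half the size of the previous level "within rounding". The following informative Python sketch assumes that the rounding is downward with a minimum extent of one texel, that is, max(1, extent >> level). That convention is an assumption and is not stated in this section. The function names are hypothetical.

```python
def mip_extent(extent, level):
    """Extent along one dimension at a mip level (assumed: floor, minimum 1)."""
    return max(1, extent >> level)


def mip_chain(width, height, mip_level_count):
    """List the (width, height) of each mip level of a 2D texture."""
    return [(mip_extent(width, level), mip_extent(height, level))
            for level in range(mip_level_count)]


# A 10x4 texture with 5 mip levels.
print(mip_chain(10, 4, 5))
```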

A texture’s representation is typically optimized for rendering operations. To achieve this, many details are hidden from the programmer, including data layouts, data types, and internal operations that cannot be expressed directly in the shader language.

As a consequence, a shader does not have direct access to the texel memory within a texture variable. Instead, access is mediated through an opaque handle:

In this way, the set of supported operations for a texture type is determined by the availability of texture builtin functions having a formal parameter with that texture type.

Note: The handle stored by a texture variable cannot be changed by the shader. That is, the variable is read-only, even if the underlying texture to which it provides access may be mutable (e.g. a write-only storage texture).

The texture types are the set of types defined in:

A sampler is an opaque handle that controls how texels are accessed from a sampled texture.

A WGSL sampler maps to a WebGPU GPUSampler.

Texel access is controlled via several properties of the sampler:

addressing mode

Controls how texture boundaries and out-of-bounds coordinates are resolved. The addressing mode for each texture dimension can be set independently. See WebGPU GPUAddressMode.

filter mode

Controls which texels are accessed to produce the final result. Filtering can either use the nearest texel or interpolate between multiple texels. Multiple filter modes can be set independently. See WebGPU GPUFilterMode.

LOD clamp

Controls the min and max levels of details that are accessed.

comparison

Controls the type of comparison done for a comparison sampler. See WebGPU GPUCompareFunction.

max anisotropy

Controls the maximum anisotropy value used by the sampler.

Samplers cannot be created in WGSL programs. Their state (e.g. the properties listed above) is immutable within a shader and can only be set by the WebGPU API.

It is a pipeline-creation error if a filtering sampler (i.e. any sampler using interpolative filtering) is used with a texture that has a non-filterable format.

Note: The handle stored by a sampler variable cannot be changed by the shader.

5.5.1. Texel Formats

In WGSL, certain texture types are parameterized by texel format.

A texel format is characterized by:

channels

Each channel contains a scalar. A texel format has up to four channels: r, g, b, and a, normally corresponding to the concepts of red, green, blue, and alpha channels.

channel format

The number of bits in the channel, and how those bits are interpreted.

Each texel format in WGSL corresponds to a WebGPU GPUTextureFormat with the same name.

Only certain texel formats are used in WGSL source code. The channel formats used to define those texel formats are listed in the Channel Formats table. The last column specifies the conversion from the stored channel bits to the value used in the shader. This is also known as the channel transfer function, or CTF.

Note: The channel transfer function for 8unorm maps {0,...,255} to the floating point interval [0.0, 1.0].

Note: The channel transfer function for 8snorm maps {-128,...,127} to the floating point interval [-1.0, 1.0].

Channel Formats
Channel format | Number of stored bits | Interpretation of stored bits | Shader type | Shader value (Channel Transfer Function)
8unorm | 8 | unsigned integer v ∈ {0,...,255} | f32 | v ÷ 255
8snorm | 8 | signed integer v ∈ {-128,...,127} | f32 | max(-1, v ÷ 127)
8uint | 8 | unsigned integer v ∈ {0,...,255} | u32 | v
8sint | 8 | signed integer v ∈ {-128,...,127} | i32 | v
16uint | 16 | unsigned integer v ∈ {0,...,65535} | u32 | v
16sint | 16 | signed integer v ∈ {-32768,...,32767} | i32 | v
16float | 16 | IEEE-754 binary16 16-bit floating point value v, with 1 sign bit, 5 exponent bits, 10 mantissa bits | f32 | v
32uint | 32 | 32-bit unsigned integer value v | u32 | v
32sint | 32 | 32-bit signed integer value v | i32 | v
32float | 32 | IEEE-754 binary32 32-bit floating point value v | f32 | v
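Note: The two normalized channel transfer functions in the table above can be written out as follows. This Python sketch is informative only, and the function names are hypothetical. Python floats are double precision, so the results only approximate the f32 values a shader would observe.

```python
def ctf_8unorm(v):
    """Map a stored unsigned 8-bit value in {0,...,255} to [0.0, 1.0]."""
    if not 0 <= v <= 255:
        raise ValueError("8unorm stores values in {0,...,255}")
    return v / 255


def ctf_8snorm(v):
    """Map a stored signed 8-bit value in {-128,...,127} to [-1.0, 1.0]."""
    if not -128 <= v <= 127:
        raise ValueError("8snorm stores values in {-128,...,127}")
    # Both -128 and -127 map to -1.0 because of the max().
    return max(-1.0, v / 127)


print(ctf_8unorm(0), ctf_8unorm(255))
print(ctf_8snorm(-128), ctf_8snorm(-127), ctf_8snorm(127))
```

The max() in the 8snorm function is what keeps the result inside [-1.0, 1.0] for the stored value -128.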

The texel formats listed in the Texel Formats for Storage Textures table correspond to the WebGPU plain color formats which support the WebGPU STORAGE usage. These texel formats are used to parameterize the storage texture types defined in § 5.5.5 Storage Texture Types.

When the texel format does not have all four channels, then:

The last column in the table below uses the format-specific channel transfer function from the channel formats table.

Texel Formats for Storage Textures
Texel format | Channel format | Channels in memory order | Corresponding shader value
rgba8unorm | 8unorm | r, g, b, a | vec4<f32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba8snorm | 8snorm | r, g, b, a | vec4<f32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba8uint | 8uint | r, g, b, a | vec4<u32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba8sint | 8sint | r, g, b, a | vec4<i32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba16uint | 16uint | r, g, b, a | vec4<u32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba16sint | 16sint | r, g, b, a | vec4<i32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba16float | 16float | r, g, b, a | vec4<f32>(CTF(r), CTF(g), CTF(b), CTF(a))
r32uint | 32uint | r | vec4<u32>(CTF(r), 0u, 0u, 1u)
r32sint | 32sint | r | vec4<i32>(CTF(r), 0, 0, 1)
r32float | 32float | r | vec4<f32>(CTF(r), 0.0, 0.0, 1.0)
rg32uint | 32uint | r, g | vec4<u32>(CTF(r), CTF(g), 0u, 1u)
rg32sint | 32sint | r, g | vec4<i32>(CTF(r), CTF(g), 0, 1)
rg32float | 32float | r, g | vec4<f32>(CTF(r), CTF(g), 0.0, 1.0)
rgba32uint | 32uint | r, g, b, a | vec4<u32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba32sint | 32sint | r, g, b, a | vec4<i32>(CTF(r), CTF(g), CTF(b), CTF(a))
rgba32float | 32float | r, g, b, a | vec4<f32>(CTF(r), CTF(g), CTF(b), CTF(a))
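Note: The last column of the table above follows a simple pattern for formats with fewer than four channels. The following Python sketch is informative only. It assumes, from the rows for the r32 and rg32 formats, that a missing g or b channel reads as zero and a missing a channel reads as one, in the component type of the shader value. The function name shader_value is hypothetical, and its inputs are channel values that have already been passed through the channel transfer function.

```python
def shader_value(channels, zero=0, one=1):
    """Expand converted channel values (r, then g, b, a if present) to four components.

    Pass zero=0.0 and one=1.0 for formats whose shader type is f32.
    """
    if not 1 <= len(channels) <= 4:
        raise ValueError("a texel format has between one and four channels")
    defaults = (zero, zero, zero, one)
    return tuple(channels) + defaults[len(channels):]


print(shader_value([7]))                   # like r32uint
print(shader_value([7, 9]))                # like rg32sint
print(shader_value([0.5], 0.0, 1.0))       # like r32float
print(shader_value([1, 2, 3, 4]))          # like rgba32uint
```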

5.5.2. Sampled Texture Types

texture_1d<type>
texture_2d<type>
texture_2d_array<type>
texture_3d<type>
texture_cube<type>
texture_cube_array<type>

5.5.3. Multisampled Texture Types

texture_multisampled_2d<type>

5.5.4. External Sampled Texture Types

texture_external

texture_external is an opaque 2d float-sampled texture type similar to texture_2d<f32> but potentially with a different representation. It can be read using textureLoad or textureSampleLevel built-in functions, which handle these different representations opaquely.

See WebGPU § GPUExternalTexture.

5.5.5. Storage Texture Types

A storage texture supports accessing a single texel without the use of a sampler.

A storage texture type must be parameterized by one of the texel formats for storage textures. The texel format determines the conversion function as specified in § 5.5.1 Texel Formats.

For a write-only storage texture the inverse of the conversion function is used to convert the shader value to the stored texel.

See § 17.5 Texture Built-in Functions.

TODO(dneto): Move description of the conversion to the builtin function that actually does the reading.

texture_storage_1d<texel_format,access>
texture_storage_2d<texel_format,access>
texture_storage_2d_array<texel_format,access>
texture_storage_3d<texel_format,access>

5.5.6. Depth Texture Types

texture_depth_2d
texture_depth_2d_array
texture_depth_cube
texture_depth_cube_array
texture_depth_multisampled_2d

5.5.7. Sampler Type

A sampler mediates access to a sampled texture or a depth texture, by performing a combination of:

The sampler types are:

Type Description
sampler Sampler. Mediates access to a sampled texture.
sampler_comparison Comparison sampler. Mediates access to a depth texture.

Samplers are parameterized when created in the WebGPU API. They cannot be modified by a WGSL program.

Samplers can only be used by the texture builtin functions.

sampler
sampler_comparison

5.5.8. Texture and Sampler Types Grammar

texture_and_sampler_types :

| sampler_type

| depth_texture_type

| sampled_texture_type less_than type_specifier greater_than

| multisampled_texture_type less_than type_specifier greater_than

| storage_texture_type less_than texel_format comma access_mode greater_than

sampler_type :

| sampler

| sampler_comparison

sampled_texture_type :

| texture_1d

| texture_2d

| texture_2d_array

| texture_3d

| texture_cube

| texture_cube_array

multisampled_texture_type :

| texture_multisampled_2d

storage_texture_type :

| texture_storage_1d

| texture_storage_2d

| texture_storage_2d_array

| texture_storage_3d

depth_texture_type :

| texture_depth_2d

| texture_depth_2d_array

| texture_depth_cube

| texture_depth_cube_array

| texture_depth_multisampled_2d

5.6. Type Aliases

A type alias declares a new name for an existing type. The declaration must appear at module scope, and its scope is the entire program.

When type T is defined as a type alias for a structure type S, all properties of the members of S, including attributes, carry over to the members of T.

type_alias_decl :

| type ident equal type_specifier

EXAMPLE: Type Alias
type Arr = array<i32, 5>;

type RTArr = array<vec4<f32>>;

type single = f32;     // Declare an alias for f32
const pi_approx: single = 3.1415;
fn two_pi() -> single {
  return single(2) * pi_approx;
}
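
The structure case can be sketched in the same way. Light, PointLight, and brighter are hypothetical names chosen for illustration; the sketch assumes only that an alias and the type it names can be used interchangeably.

```wgsl
struct Light {
  position: vec3<f32>,
  intensity: f32
}

// PointLight names the same type as Light, members included.
type PointLight = Light;

fn brighter(a: PointLight, b: Light) -> PointLight {
  // Members are accessed through the alias exactly as through Light.
  if a.intensity > b.intensity {
    return a;
  }
  return b;
}
```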

5.7. Type Specifier Grammar

type_specifier :

| ident

| type_specifier_without_ident

type_specifier_without_ident :

| bool

| float32

| float16

| int32

| uint32

| vec_prefix less_than type_specifier greater_than

| mat_prefix less_than type_specifier greater_than

| pointer less_than address_space comma type_specifier ( comma access_mode ) ? greater_than

| array_type_specifier

| atomic less_than type_specifier greater_than

| texture_and_sampler_types

vec_prefix :

| vec2

| vec3

| vec4

mat_prefix :

| mat2x2

| mat2x3

| mat2x4

| mat3x2

| mat3x3

| mat3x4

| mat4x2

| mat4x3

| mat4x4

When the type is named by an identifier, the use of the identifier must be in scope of a type alias or a structure type declaration for that name. See § 4 Declaration and Scope.

6. Variable and Value Declarations

Variable and value declarations provide names for data values.

A value declaration creates a name for a value, and that value is immutable once it has been declared. The four kinds of value declarations are const, override, let, and formal parameter declarations, further described below (see § 6.2 Value Declarations).

A variable declaration creates a name for memory locations for storing a value; the value stored there may be updated, if the variable has a read_write access mode. There is one kind of variable declaration, var, but it has options for address space and access modes in various combinations, described below (see § 6.3 var Declarations).

Note: A value declaration does not have associated memory locations. For example, no WGSL expression can form a pointer to the value.

A declaration appearing outside of any function definition is at module scope. Its name is in scope for the entire program.

A declaration appearing within a function definition is in function scope. The name is available for use in the statement immediately after its declaration until the end of the brace-delimited list of statements immediately enclosing the declaration. A function-scope declaration is a dynamic context.
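
The function-scope rule can be illustrated with a short sketch. The function name scoped and the identifiers total and step are hypothetical.

```wgsl
fn scoped() -> i32 {
  var total: i32 = 0;   // In scope from the next statement to the end of the body.
  {
    let step = 2;       // In scope only until the closing brace of this block.
    total = total + step;
  }
  // 'step' is not in scope here; using it would be an error.
  return total;
}
```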

Variable and value declarations have a similar overall syntax:

// Specific value declarations.
             const    name [: type]  = initializer ;
[attribute]  override name [: type] [= initializer];
             let      name [: type]  = initializer ;

// General variable form.
[attribute]* var[<address_space[, access_mode]>] name [: type] [= initializer];

// Specific variable declarations.
// Function scope.
             var[<function>] name [: type] [= initializer];

// Module scope.
             var<private>    name [: type] [= initializer];
             var<workgroup>  name : type;
[attribute]+ var<uniform>    name : type;
[attribute]+ var             name : texture_type;
[attribute]+ var             name : sampler_type;
[attribute]+ var<storage[, access_mode]> name : type;

Each such declaration must have an explicitly specified type or an initializer. Both a type and an initializer may be specified. Each such declaration determines the type for the associated data value, known as the effective-value-type for the declaration. The effective-value-type of the declaration is:

Each kind of value or variable declaration may place additional constraints on the form of the initializer expression, if present, and on the effective-value-type.

Variable and Value Declaration Feature Summary.
Declaration | Mutability | Scope | Effective-value-type [1] | Initializer Support | Initializer Expression [2] | Part of Resource Interface
const | Immutable | Module or function | Constructible (concrete or abstract) | Required | const-expression | No
override | Immutable | Module | Concrete scalar | Optional [3] | const-expression or override-expression | No [4]
let | Immutable | Function | Concrete constructible or pointer type | Required | const-expression, override-expression, or runtime expression | No
var<storage, read>, var<storage> | Immutable | Module | Concrete host-shareable | Disallowed | | Yes. storage buffer
var<storage, read_write> [5] | Mutable | Module | Concrete host-shareable | Disallowed | | Yes. storage buffer
var<uniform> | Immutable | Module | Concrete constructible host-shareable | Disallowed | | Yes. uniform buffer
var | Immutable [6] | Module | Texture | Disallowed | | Yes. texture resource
var | Immutable | Module | Sampler | Disallowed | | Yes. sampler resource
var<workgroup> [5] | Mutable | Module | Concrete plain type with a fixed footprint [7] | Disallowed [8] | | No
var<private> | Mutable | Module | Concrete constructible | Optional [8] | const-expression or override-expression | No
var<function>, var | Mutable | Function | Concrete constructible | Optional [8] | const-expression, override-expression, or runtime expression | No

Bracketed numbers refer to the notes below.
  1. Only const-declarations can be abstract types, and only when the type is not explicitly specified.

  2. The type of the expression must be feasibly converted to the effective-value-type.

  3. If an initializer is not specified, a value must be provided at pipeline-creation time.

  4. Override-declarations are part of the shader interface, but are not bound resources.

  5. Atomic types can only appear in mutable storage buffers or workgroup variables.

  6. The data in storage textures with a write access mode is mutable, but can only be modified via textureStore built-in function. The variable itself cannot be modified.

  7. The element count of the outermost array may be an override-expression.

  8. If there is no initializer, the variable is default initialized.
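
Note 5 can be illustrated with a sketch in which atomic types appear in exactly the two permitted places. The names total, partial, and count_items, the binding numbers, and the workgroup size are assumptions made for the example.

```wgsl
@group(0) @binding(0)
var<storage, read_write> total: atomic<u32>;  // Mutable storage buffer.

var<workgroup> partial: atomic<u32>;          // No initializer: default initialized.

@compute @workgroup_size(64)
fn count_items(@builtin(local_invocation_index) lid: u32) {
  atomicAdd(&partial, 1u);
  workgroupBarrier();
  if lid == 0u {
    // One invocation publishes the per-workgroup count.
    atomicAdd(&total, atomicLoad(&partial));
  }
}
```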

6.1. Variables vs Values

Variable declarations are the only mutable data in a WGSL program. Value declarations are always immutable. Variables can be the basis of reference and pointer values because variables have associated memory locations, whereas a value declaration cannot be the basis of a pointer or reference value.

Using variables is generally more expensive than using value declarations, because using a variable requires extra operations to read or write to the memory locations associated with the variable.

Generally speaking, an author should prefer using declarations in the following order, with the most preferred option listed first:

This will generally result in the best overall performance of a shader.
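
As a rough illustration, the sketch below uses a value declaration wherever the value is never updated and reserves var for the one name that is. The identifiers base_scale, gain, and apply are hypothetical.

```wgsl
const base_scale = 2.0;          // Value fixed at shader-creation time.
override gain: f32 = 1.0;        // Value fixed at pipeline-creation time.

fn apply(x: f32) -> f32 {
  let scaled = x * base_scale;   // Value fixed when this statement executes.
  var acc = scaled;              // A variable, because it is updated below.
  acc += gain;
  return acc;
}
```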

6.2. Value Declarations

When an identifier resolves to a value declaration, the identifier denotes that value.

WGSL provides multiple kinds of value declarations. The value for each kind of declaration is fixed at a different point in the shader lifecycle. The different kinds of value declarations and when their values are fixed are:

Note: Formal parameters are described in § 9 Functions.

6.2.1. const Declarations

A const-declaration specifies a name for a data value that is fixed at shader-creation time. Each const-declaration requires an initializer. A const-declaration can be declared in module or function scope. The initializer expression must be a const-expression. The type of a const-declaration must be a concrete or abstract constructible type. const-declarations are the only declarations where the effective-value-type may be abstract.

Note: Since abstract numeric types cannot be spelled in WGSL, they can only be used via type inference.

EXAMPLE: const-declarations at module scope
const a = 4;                  // AbstractInt with a value of 4.
const b : i32 = 4;            // i32 with a value of 4.
const c : u32 = 4;            // u32 with a value of 4.
const d : f32 = 4;            // f32 with a value of 4.
const e = vec3(a, a, a);      // vec3 of AbstractInt with a value of (4, 4, 4).
const f = 2.0;                // AbstractFloat with a value of 2.
const g = mat2x2(a, f, a, f); // mat2x2 of AbstractFloat with a value of:
                              // ((4.0, 2.0), (4.0, 2.0)).
                              // The AbstractInt a converts to AbstractFloat.
                              // An AbstractFloat cannot convert to AbstractInt.
const h = array(a, f, a, f);  // array of AbstractFloat with 4 components:
                              // (4.0, 2.0, 4.0, 2.0).

6.2.2. override Declarations

An override-declaration specifies a name for a pipeline-overridable constant value. The value of a pipeline-overridable constant is fixed at pipeline-creation time. The value is one provided by the WebGPU pipeline-creation method, if specified, and otherwise is the value of its concretized initializer expression. The effective-value-type of an override-declaration must be a concrete scalar type.

An initializer expression is optional. If present, it must be an override-expression and represents the pipeline-overridable constant default value. If no initializer is specified, it is a pipeline-creation error if a value is not provided at pipeline-creation time.

If the declaration has an id attribute applied, the literal operand is known as the pipeline constant ID, and must be a unique integer between 0 and 65535 inclusive. That is, two override-declarations must not use the same pipeline constant ID.

The application can specify its own value for an override-declaration at pipeline-creation time. The pipeline creation API accepts a mapping from overridable constants to a value of the constant’s type. The constant is identified by a pipeline-overridable constant identifier string, which is the base-10 representation of the pipeline constant ID if specified, and otherwise the declared name of the constant.

EXAMPLE: Module constants, pipeline-overridable
@id(0)    override has_point_light: bool = true;  // Algorithmic control
@id(1200) override specular_param: f32 = 2.3;     // Numeric control
@id(1300) override gain: f32;                     // Must be overridden
          override width: f32 = 0.0;              // Specified at the API level using
                                                  // the name "width".
          override depth: f32;                    // Specified at the API level using
                                                  // the name "depth".
                                                  // Must be overridden.
          override height = 2 * depth;            // The default value
                                                  // (if not set at the API level),
                                                  // depends on another
                                                  // overridable constant.
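
A further sketch, under the assumption that the application may or may not override the value at pipeline-creation time, shows an override-declaration feeding both a workgroup size and the element count of a workgroup array. The names block_width, scratch, and main are hypothetical.

```wgsl
// No @id attribute, so the API would identify this constant by the
// name "block_width".
override block_width: u32 = 64u;

// The element count of the outermost array is an override-expression.
var<workgroup> scratch: array<f32, block_width>;

@compute @workgroup_size(block_width)
fn main(@builtin(local_invocation_index) idx: u32) {
  scratch[idx] = f32(idx);
  workgroupBarrier();
}
```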

6.2.3. let Declarations

A let-declaration specifies a name for a value that is fixed each time the statement is executed at runtime. A let-declaration must only be declared in function scope, and as such, is a dynamic context. A let-declaration must have an initializer expression. The value is the concretized value of the initializer. The effective-value-type of a let-declaration must be either a concrete constructible type or a pointer type.

EXAMPLE: let-declared constants at function scope
// 'blockSize' denotes the i32 value 1024.
let blockSize: i32 = 1024;

// 'row_size' denotes the u32 value 16u.  The type is inferred.
let row_size = 16u;
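
A let-declaration can also hold a pointer, and it captures a value at the time the statement executes. The following sketch uses hypothetical names (bump, counter, p, snapshot) to show both points.

```wgsl
fn bump() -> i32 {
  var counter: i32 = 1;
  let p = &counter;        // 'p' has a pointer type: ptr<function, i32>.
  *p = *p + 1;             // Writes through the pointer; counter is now 2.
  let snapshot = counter;  // 'snapshot' is fixed at 2 from here on.
  counter = 10;            // Does not affect 'snapshot'.
  return snapshot;
}
```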

6.3. var Declarations

A variable is a named reference to memory that can contain a value of a particular storable type.

Two types are associated with a variable: its store type (the type of value that may be placed in the referenced memory) and its reference type (the type of the variable itself). If a variable has store type T, address space AS, and access mode AM, then its reference type is ref<AS,T,AM>. The store type of a variable is always concrete.

A variable declaration:

When an identifier resolves to a variable declaration, the identifier is an expression denoting the reference memory view for the variable’s memory, and its type is the variable’s reference type. See § 7.13 Variable Identifier Expression.

Variables in the private, storage, uniform, workgroup, and handle address spaces must only be declared in module scope, while variables in the function address space must only be declared in function scope. The address space must be specified for all address spaces except handle and function. The handle address space must not be specified. Specifying the function address space is optional.

The access mode always has a default value, and except for variables in the storage address space, must not be specified in the WGSL source. See § 5.4.2 Access Mode Defaults.

A variable in the uniform address space is a uniform buffer variable. Its store type must be a host-shareable constructible type, and must satisfy the address space layout constraints.

A variable in the storage address space is a storage buffer variable. Its store type must be a host-shareable type and must satisfy the address space layout constraints. The variable may be declared with a read or read_write access mode; the default is read.

A texture resource is a variable whose effective-value-type is a texture type. It is declared at module scope. It holds an opaque handle which is used to access the underlying grid of texels in a texture. The handle itself is in the handle address space and is always read-only. In many cases the underlying texels are read-only. For a write-only storage texture, the underlying texels are write-only.

A sampler resource is a variable whose effective-value-type is a sampler type. It is declared at module scope, exists in the handle address space, and is immutable.

As described in § 10.3.2 Resource Interface, uniform buffers, storage buffers, textures, and samplers form the resource interface of a shader.

The lifetime of a variable is the period during shader execution for which the memory locations are associated with the variable. The lifetime of a module scope variable is the entire execution of the shader stage. There is an independent version of a variable in the private and function address spaces for each invocation. Function-scope variables are a dynamic context. The lifetime of a function-scope variable is determined by its scope:

Two variables with overlapping lifetimes will not have overlapping memory locations. When a variable’s lifetime ends, its memory may be used for another variable.

Note: WGSL ensures the contents of a variable are only observable during the variable’s lifetime.

When a variable in the private, function, or workgroup address spaces is created, it will have an initial value. If no initializer is specified the initial value is the default initial value. The initial values are computed as follows:

Variables in other address spaces are resources set by bindings in the draw command or dispatch command.

Consider the following snippet of WGSL:

EXAMPLE: Variable initial values
var i: i32;         // Initial value is 0.  Not recommended style.
loop {
  var twice: i32 = 2 * i;   // Re-evaluated each iteration.
  i++;
  if i == 5 { break; }
}

The loop body will execute five times. Variable i will take on values 0, 1, 2, 3, 4, 5, and variable twice will take on values 0, 2, 4, 6, 8.

Consider the following snippet of WGSL:

EXAMPLE: Reading a variable multiple times
var x: f32 = 1.0;
let y = x * x + x + 1;
Because x is a variable, all accesses to it turn into load and store operations. However, it is expected that either the browser or the driver optimizes this intermediate representation such that the redundant loads are eliminated.

EXAMPLE: Module scope variable declarations
var<private> decibels: f32;
var<workgroup> worklist: array<i32,10>;

struct Params {
  specular: f32,
  count: i32
}

// Uniform buffer. Always read-only, and has more restrictive layout rules.
@group(0) @binding(2)
var<uniform> param: Params;    // A uniform buffer

// A storage buffer, for reading and writing
@group(0) @binding(0)
var<storage,read_write> pbuf: array<vec2<f32>>;

// Textures and samplers are always in "handle" space.
@group(0) @binding(1)
var filter_params: sampler;

EXAMPLE: Access modes for buffers
// Storage buffers
@group(0) @binding(0)
var<storage,read> buf1: Buffer;       // Can read, cannot write.
@group(0) @binding(0)
var<storage> buf2: Buffer;            // Can read, cannot write.
@group(0) @binding(1)
var<storage,read_write> buf3: Buffer; // Can both read and write.

struct ParamsTable {weight: f32}

// Uniform buffer. Always read-only, and has more restrictive layout rules.
@group(0) @binding(2)
var<uniform> params: ParamsTable;     // Can read, cannot write.

EXAMPLE: Function scope variables and constants
fn f() {
   var<function> count: u32;  // A variable in function address space.
   var delta: i32;            // Another variable in the function address space.
   var sum: f32 = 0.0;        // A function address space variable with initializer.
   var pi = 3.14159;          // Infer the f32 store type from the initializer.
}

6.4. Variable and Value Declaration Grammar Summary

variable_statement :

| variable_decl

| variable_decl equal expression

| let optionally_typed_ident equal expression

| const optionally_typed_ident equal expression

variable_decl :

| var variable_qualifier ? optionally_typed_ident

optionally_typed_ident :

| ident ( colon type_specifier ) ?

variable_qualifier :

| less_than address_space ( comma access_mode ) ? greater_than

global_variable_decl :

| attribute * variable_decl ( equal expression ) ?

global_constant_decl :

| const optionally_typed_ident equal expression

| attribute * override optionally_typed_ident ( equal expression ) ?

7. Expressions

Expressions specify how values are computed.

The different kinds of value expressions provide a tradeoff between when they are evaluated and how expressive they can be. The sooner the evaluation, the more constrained the operations, but also the more places the value can be used. This tradeoff leads to different flexibility with each kind of value declaration. const-expressions and override-expressions are evaluated prior to execution on the GPU, so only the result of the computation of the expression is necessary in the final GPU code. Additionally, because const-expressions are evaluated at shader-creation time they can be used in more situations than override-expressions, for example, to size arrays in function scope variables. A runtime expression is an expression that is neither a const-expression nor an override-expression. A runtime expression is computed on the GPU during shader execution. While runtime expressions can be used by fewer grammar elements, they can be computed from a larger class of expressions, for example, other runtime values.
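
The tradeoff can be sketched as follows. The identifiers n, m, and f are hypothetical, and the commented-out line is shown only to indicate a use that is assumed to be rejected.

```wgsl
const n = 4;              // const-expression: known at shader-creation time.
override m: u32 = 4u;     // override-expression: known at pipeline-creation time.

fn f(x: f32) -> f32 {
  var scratch: array<f32, n>;    // Valid: sized by a const-expression.
  // var bad: array<f32, m>;     // Not valid: a function-scope array cannot be
  //                             // sized by an override-expression.
  let y = x * f32(m);            // Runtime expression: computed on the GPU.
  scratch[0] = y;
  return scratch[0];
}
```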

7.1. Early Evaluation Expressions

WGSL defines two types of expressions that can be evaluated before runtime:

7.1.1. const Expressions

Expressions that are evaluated at shader-creation time are called const-expressions. In order for an expression to be evaluated at shader-creation time all identifiers in the expression must resolve to:

The type of a const expression must resolve to a type with a creation-fixed footprint.

Note: Abstract types can be the inferred type of a const-expression.

Example: (42) is analyzed as follows:

Example: -5 is analyzed as follows:

Example: -2147483648 is analyzed as follows:

Example: const minint = -2147483648; is analyzed as follows:

Example: let minint = -2147483648; is analyzed as follows:

7.1.2. override Expressions

Expressions that are evaluated at pipeline creation time are called override-expressions. In order for an expression to be evaluated at pipeline creation time all identifiers in the expression must resolve to:

Note: All const-expressions are also override-expressions.

Note: An override-expression may not be usable as the initializer for an override-declaration, because such initializers must resolve to a concrete scalar type.

Example: override x = 42; is analyzed as follows:

Example: let y = x + 1; is analyzed as follows:

Example: vec3(x,x,x) is analyzed as follows:

7.2. Indeterminate values

In limited cases, an evaluation of a runtime expression can occur using unsupported values for its subexpressions.

In such a case, the result of that evaluation is an indeterminate value of the expression’s static type, meaning some arbitrary implementation-chosen value of the static type.

A distinct value may be produced for each unique dynamic context in which the expression is evaluated. For example, if the evaluation occurs once per iteration of a loop, a distinct value may be computed for each loop iteration.

Note: If the type is a floating point type and the implementation supports NaN values, then the indeterminate value produced at runtime may be a NaN value.

EXAMPLE: Indeterminate value example
fn fun() {
   var extracted_values: array<i32,2>;
   const v = vec2<i32>(0,1);

   for (var i: i32 = 0; i < 2 ; i++) {
      // A runtime-expression used to index a vector, but outside the
      // indexing bounds of the vector, produces an indeterminate value
      // of the vector component type.
      let extract = v[i+5];

      // Now 'extract' is any value of type i32.

      // Save it for later.
      extracted_values[i] = extract;

      if extract == extract {
         // This is always executed
      }
      if extract < 2 {
         // This might be executed, but might not be executed.
         // Even though the original vector components are 0 and 1,
         // the extracted value might not be either of those values.
      }
   }
   if extracted_values[0] == extracted_values[1] {
      // This might be executed, but might not be executed.
   }
}

fn float_fun(runtime_index: u32) {
   const v = vec2<f32>(0,1); // A vector of floating point values

   // As in the previous example, 'float_extract' is an indeterminate value.
   // Since it is a floating point type, it may be a NaN.
   let float_extract: f32 = v[runtime_index+5];

   if float_extract == float_extract {
      // This *might not* be executed, because:
      //  -  'float_extract' may be NaN, and
      //  -  a NaN is never equal to any other floating point number,
      //     even another NaN.
   }
}

7.3. Literal Value Expressions

Scalar literal type rules
Precondition | Conclusion | Notes
(none) | true: bool | true boolean value.
(none) | false: bool | false boolean value.
e is an integer literal with no suffix | e: AbstractInt | Abstract integer literal value.
e is a floating point literal with no suffix | e: AbstractFloat | Abstract float literal value.
e is an integer literal with i suffix | e: i32 | 32-bit signed integer literal value.
e is an integer literal with u suffix | e: u32 | 32-bit unsigned integer literal value.
e is a floating point literal with f suffix | e: f32 | 32-bit floating point literal value.
e is a floating point literal with h suffix | e: f16 | 16-bit floating point literal value.
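
The literal rules can be seen in a few declarations. The names are hypothetical, and the f16 case is left out on the assumption that it needs the f16 extension to be enabled.

```wgsl
const no_suffix_int = 1;        // AbstractInt
const signed_int = 1i;          // i32
const unsigned_int = 1u;        // u32
const no_suffix_float = 1.0;    // AbstractFloat
const single_float = 1.0f;      // f32
const truth = true;             // bool
```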

7.4. Parenthesized Expressions

Parenthesized expression type rules
Precondition Conclusion Description
e : T ( e ) : T Evaluates to e.
Use parentheses to isolate an expression from the surrounding text.

7.5. Type Constructor Expressions

A type constructor expression explicitly creates a value of a given concrete constructible type.

There are three kinds of constructor expressions:

In the following sections, when a type name precedes a parenthesized argument list, any alias for that type can be used instead, with the same effect.

EXAMPLE: Type constructor expressions using type aliases
type my_vec3f = vec3<f32>;
type my_vec4f = vec4<f32>;

// Computes vec3<f32>(0.0f, 1.0f, 0.0f)
const threeD_e2 = my_vec3f(0.0, 1.0, 0.0);

// Same as writing vec4<f32>(threeD_e2, 0.0)
// Computes vec4<f32>(0.0f, 1.0f, 0.0f, 0.0f)
const fourD_e2 = my_vec4f(threeD_e2, 0.0);

// Same as writing vec3<f32>()
// Computes vec3<f32>(0.0f, 0.0f, 0.0f)
const threeD_zero = my_vec3f();

7.5.1. Construction From Components

The expressions defined in this section create a constructible value by:

The scalar forms given here are redundant, but provide symmetry with scalar conversion expressions, and can be used to enhance readability.

The vector and matrix forms construct vector and matrix values from various combinations of components and subvectors with matching component types. There are overloads for constructing vectors and matrices that specify the dimensions of the target type without having to specify the component type; the component type is inferred from the constructor arguments.

Scalar constructor type rules
Precondition Conclusion Notes
e: bool bool(e): bool Identity.
e: i32 i32(e): i32 Identity.
e: u32 u32(e): u32 Identity.
e: f32 f32(e): f32 Identity.
e: f16 f16(e): f16 Identity.
Vector constructor type rules
Precondition Conclusion Notes
e: T vecN<T>(e): vecN<T> Evaluates e once. Results in the N-component vector where each component has the value of e.
vecN(e): vecN<T>
e1: T
e2: T
vec2<T>(e1,e2): vec2<T>
vec2(e1,e2): vec2<T>
e: vec2<T> vec2<T>(e): vec2<T> Identity. The result is e.
vec2(e): vec2<T>
e1: T
e2: T
e3: T
vec3<T>(e1,e2,e3): vec3<T>
vec3(e1,e2,e3): vec3<T>
e1: T
e2: vec2<T>
vec3<T>(e1,e2): vec3<T>
vec3<T>(e2,e1): vec3<T>
vec3(e1,e2): vec3<T>
vec3(e2,e1): vec3<T>
e: vec3<T> vec3<T>(e): vec3<T> Identity. The result is e.
vec3(e): vec3<T>
e1: T
e2: T
e3: T
e4: T
vec4<T>(e1,e2,e3,e4): vec4<T>
vec4(e1,e2,e3,e4): vec4<T>
e1: T
e2: T
e3: vec2<T>
vec4<T>(e1,e2,e3): vec4<T>
vec4<T>(e1,e3,e2): vec4<T>
vec4<T>(e3,e1,e2): vec4<T>
vec4(e1,e2,e3): vec4<T>
vec4(e1,e3,e2): vec4<T>
vec4(e3,e1,e2): vec4<T>
e1: vec2<T>
e2: vec2<T>
vec4<T>(e1,e2): vec4<T>
vec4(e1,e2): vec4<T>
e1: T
e2: vec3<T>
vec4<T>(e1,e2): vec4<T>
vec4<T>(e2,e1): vec4<T>
vec4(e1,e2): vec4<T>
vec4(e2,e1): vec4<T>
e: vec4<T> vec4<T>(e): vec4<T> Identity. The result is e.
vec4(e): vec4<T>
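
A few of the vector constructor forms, as a sketch with hypothetical names:

```wgsl
const splat = vec3<f32>(1.0);        // One scalar: (1.0, 1.0, 1.0).
const xy = vec2<f32>(1.0, 2.0);      // Two scalars.
const xyz = vec3<f32>(xy, 3.0);      // A vec2 followed by a scalar.
const pairs = vec4<f32>(xy, xy);     // Two vec2 values: (1.0, 2.0, 1.0, 2.0).
const inferred = vec4(xy, xy);       // Component type f32 is inferred.
const ints = vec3(1i, 2i, 3i);       // Inferred as vec3<i32>.
```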
Matrix constructor type rules
Precondition Conclusion Notes
e: mat2x2<T> mat2x2<T>(e): mat2x2<T>
mat2x2(e): mat2x2<T>
Identity type conversion. The result is e.
e: mat2x3<T> mat2x3<T>(e): mat2x3<T>
mat2x3(e): mat2x3<T>
e: mat2x4<T> mat2x4<T>(e): mat2x4<T>
mat2x4(e): mat2x4<T>
e: mat3x2<T> mat3x2<T>(e): mat3x2<T>
mat3x2(e): mat3x2<T>
e: mat3x3<T> mat3x3<T>(e): mat3x3<T>
mat3x3(e): mat3x3<T>
e: mat3x4<T> mat3x4<T>(e): mat3x4<T>
mat3x4(e): mat3x4<T>
e: mat4x2<T> mat4x2<T>(e): mat4x2<T>
mat4x2(e): mat4x2<T>
e: mat4x3<T> mat4x3<T>(e): mat4x3<T>
mat4x3(e): mat4x3<T>
e: mat4x4<T> mat4x4<T>(e): mat4x4<T>
mat4x4(e): mat4x4<T>
e1: T
...
eN: T
mat2x2<T>(e1,e2,e3,e4): mat2x2<T>
mat3x2<T>(e1,...,e6): mat3x2<T>
mat2x3<T>(e1,...,e6): mat2x3<T>
mat4x2<T>(e1,...,e8): mat4x2<T>
mat2x4<T>(e1,...,e8): mat2x4<T>
mat3x3<T>(e1,...,e9): mat3x3<T>
mat4x3<T>(e1,...,e12): mat4x3<T>
mat3x4<T>(e1,...,e12): mat3x4<T>
mat4x4<T>(e1,...,e16): mat4x4<T>
Column-major construction by elements.
mat2x2(e1,e2,e3,e4): mat2x2<T>
mat3x2(e1,...,e6): mat3x2<T>
mat2x3(e1,...,e6): mat2x3<T>
mat4x2(e1,...,e8): mat4x2<T>
mat2x4(e1,...,e8): mat2x4<T>
mat3x3(e1,...,e9): mat3x3<T>
mat4x3(e1,...,e12): mat4x3<T>
mat3x4(e1,...,e12): mat3x4<T>
mat4x4(e1,...,e16): mat4x4<T>
e1: vec2<T>
e2: vec2<T>
e3: vec2<T>
e4: vec2<T>
mat2x2<T>(e1,e2): mat2x2<T>
mat3x2<T>(e1,e2,e3): mat3x2<T>
mat4x2<T>(e1,e2,e3,e4): mat4x2<T>
Column by column construction.
mat2x2(e1,e2): mat2x2<T>
mat3x2(e1,e2,e3): mat3x2<T>
mat4x2(e1,e2,e3,e4): mat4x2<T>
e1: vec3<T>
e2: vec3<T>
e3: vec3<T>
e4: vec3<T>
mat2x3<T>(e1,e2): mat2x3<T>
mat3x3<T>(e1,e2,e3): mat3x3<T>
mat4x3<T>(e1,e2,e3,e4): mat4x3<T>
Column by column construction.
mat2x3(e1,e2): mat2x3<T>
mat3x3(e1,e2,e3): mat3x3<T>
mat4x3(e1,e2,e3,e4): mat4x3<T>
e1: vec4<T>
e2: vec4<T>
e3: vec4<T>
e4: vec4<T>
mat2x4<T>(e1,e2): mat2x4<T>
mat3x4<T>(e1,e2,e3): mat3x4<T>
mat4x4<T>(e1,e2,e3,e4): mat4x4<T>
Column by column construction.
mat2x4(e1,e2): mat2x4<T>
mat3x4(e1,e2,e3): mat3x4<T>
mat4x4(e1,e2,e3,e4): mat4x4<T>
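
The element-wise and column-by-column matrix forms produce the same value when the elements are listed in column-major order, as this sketch with hypothetical names suggests:

```wgsl
// Four scalars in column-major order: column 0 is (1.0, 2.0),
// column 1 is (3.0, 4.0).
const by_elements = mat2x2<f32>(1.0, 2.0, 3.0, 4.0);

// The same matrix, built column by column.
const by_columns = mat2x2<f32>(vec2<f32>(1.0, 2.0), vec2<f32>(3.0, 4.0));

// Indexing a matrix yields a column: here vec2<f32>(3.0, 4.0).
const second_column = by_elements[1];
```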
Array constructor type rules
Precondition Conclusion Notes
e1: T
...
eN: T,
T is concrete and constructible
array<T,N>(e1,...,eN) : array<T,N> Construction of an array from elements.

Note: array<T,N> is constructible because its element count is equal to the number of arguments to the constructor, and hence fully determined at shader-creation time.

e1: T
...
eN: T,
T is constructible
array(e1,...,eN) : array<T,N> Construction of an array from elements. The component type is inferred from the elements' types.
Structure constructor type rules
Precondition Conclusion Notes
e1: T1
...
eN: TN,
S is a constructible structure type with members having types T1 ... TN.
The expression is in the scope of declaration of S.
S(e1,...,eN): S Construction of a structure from members.
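
Array and structure construction might look like the following sketch. Particle, weights, ids, and make_particle are hypothetical names.

```wgsl
struct Particle {
  pos: vec2<f32>,
  mass: f32
}

// Element type and count are written explicitly.
const weights = array<f32, 3>(0.25, 0.5, 0.25);

// Element type u32 and count 3 are inferred from the arguments.
const ids = array(1u, 2u, 3u);

fn make_particle() -> Particle {
  // Arguments are given in member declaration order.
  return Particle(vec2<f32>(0.0, 1.0), weights[1]);
}
```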

7.5.2. Zero Value Expressions

Each concrete, constructible T has a unique zero value written in WGSL as the type followed by an empty pair of parentheses: T ().

The zero values are as follows:

Note: WGSL does not have a zero value expression for atomic types, runtime-sized arrays, or other types that are not constructible.

Scalar zero value type rules
Precondition | Conclusion | Notes
(none) | bool(): bool | false. Zero value.
(none) | i32(): i32 | 0. Zero value.
(none) | u32(): u32 | 0u. Zero value.
(none) | f32(): f32 | 0.0. Zero value.
(none) | f16(): f16 | 0.0. Zero value.
Vector zero type rules, where T is a scalar type
Precondition Conclusion Notes
vec2<T>(): vec2<T> Zero value
vec3<T>(): vec3<T> Zero value
vec4<T>(): vec4<T> Zero value
EXAMPLE: Zero-valued vectors
vec2<f32>()                 // The zero-valued vector of two f32 components.
vec2<f32>(0.0, 0.0)         // The same value, written explicitly.

vec3<i32>()                 // The zero-valued vector of three i32 components.
vec3<i32>(0, 0, 0)          // The same value, written explicitly.
Matrix zero type rules
Precondition Conclusion Notes
T is f32 or f16 mat2x2<T>(): mat2x2<T>
mat3x2<T>(): mat3x2<T>
mat4x2<T>(): mat4x2<T>
Zero value
mat2x3<T>(): mat2x3<T>
mat3x3<T>(): mat3x3<T>
mat4x3<T>(): mat4x3<T>
Zero value
mat2x4<T>(): mat2x4<T>
mat3x4<T>(): mat3x4<T>
mat4x4<T>(): mat4x4<T>
Zero value
Array zero type rules
Precondition Conclusion Notes
T is constructible array<T,N>(): array<T,N> Zero-valued array
EXAMPLE: Zero-valued arrays
array<bool, 2>()               // The zero-valued array of two booleans.
array<bool, 2>(false, false)   // The same value, written explicitly.
Structure zero type rules
Precondition Conclusion Notes
S is a constructible structure type.
The expression is in the scope of declaration of S.
S(): S Zero-valued structure: a structure of type S where each member is the zero value for its member type.
EXAMPLE: Zero-valued structures
struct Student {
  grade: i32,
  GPA: f32,
  attendance: array<bool,4>
}

fn func() {
  var s: Student;

  // The zero value for Student
  s = Student();

  // The same value, written explicitly.
  s = Student(0, 0.0, array<bool,4>(false, false, false, false));

  // The same value, written with zero-valued members.
  s = Student(i32(), f32(), array<bool,4>());
}

7.5.3. Conversion Expressions

WGSL does not implicitly convert or promote a numeric or boolean value to another type. Instead use a conversion expression as defined in the tables below.

For details on conversion to and from floating point types, see § 13.6.2 Floating Point Conversion.

Scalar conversion type rules
Precondition Conclusion Notes
e: u32 bool(e): bool Coercion to boolean.
The result is false if e is 0, and true otherwise.
e: i32 bool(e): bool Coercion to boolean.
The result is false if e is 0, and true otherwise.
e: f32 bool(e): bool Coercion to boolean.
The result is false if e is 0.0 or -0.0, and true otherwise. In particular NaN and infinity values map to true.
e: f16 bool(e): bool Coercion to boolean.
The result is false if e is 0.0 or -0.0, and true otherwise. In particular NaN and infinity values map to true.
e: bool i32(e): i32 Conversion of a boolean value to a signed integer.
The result is 1 if e is true and 0 otherwise.
e: u32 i32(e): i32 Reinterpretation of bits.
The result is the unique value in i32 that has the same bit pattern as e.
e: f32 i32(e): i32 Value conversion, rounding toward zero.
e: f16 i32(e): i32 Value conversion, rounding toward zero.
e: bool u32(e): u32 Conversion of a boolean value to an unsigned integer.
The result is 1u if e is true and 0u otherwise.
e: i32 u32(e): u32 Reinterpretation of bits.
The result is the unique value in u32 that has the same bit pattern as e.
e: AbstractInt u32(e): u32 Value conversion.

Identity if the value of e can be represented in u32. Otherwise produces a shader-creation error.

Note: This overload exists so expressions such as u32(4*1000*1000*1000) can create a u32 value that would otherwise overflow the i32 type. If this overload did not exist, overload resolution would select the u32(i32) overload, the AbstractInt expression would automatically convert to i32, and this would cause a shader-creation error due to overflow.

e: f32 u32(e): u32 Value conversion, rounding toward zero.
e: f16 u32(e): u32 Value conversion, rounding toward zero.
e: bool f32(e): f32 Conversion of a boolean value to floating point.
The result is 1.0 if e is true and 0.0 otherwise.
e: i32 f32(e): f32 Value conversion, including invalid cases.
e: u32 f32(e): f32 Value conversion, including invalid cases.
e: f16 f32(e): f32 Exact value conversion.
e: bool f16(e): f16 Conversion of a boolean value to floating point.
The result is 1.0 if e is true and 0.0 otherwise.
e: i32 f16(e): f16 Value conversion, including invalid cases.
e: u32 f16(e): f16 Value conversion, including invalid cases.
e: f32 f16(e): f16 Lossy value conversion.

Details of conversion to and from floating point are explained in § 13.6.2 Floating Point Conversion.
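The scalar rules above can be illustrated with a minimal sketch. The function and variable names are hypothetical and not part of the specification; the comments restate the results given in the table.

```wgsl
// Illustrative only: scalar_conversion_sketch is a hypothetical name.
fn scalar_conversion_sketch() {
  let zero_is_false: bool = bool(0u);     // false
  let neg_zero: bool = bool(-0.0f);       // false: -0.0 also maps to false
  let toward_zero_a: i32 = i32(3.9f);     // 3: rounds toward zero
  let toward_zero_b: i32 = i32(-3.9f);    // -3: rounds toward zero
  let same_bits: i32 = i32(0xFFFFFFFFu);  // -1: the bit pattern is reinterpreted
  let from_bool: f32 = f32(true);         // 1.0
  let big: u32 = u32(4*1000*1000*1000);   // selects the AbstractInt overload
}
```

Without an explicit conversion, mixing these types in one expression is a type error, because WGSL performs no implicit numeric or boolean conversion.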

Vector conversion type rules
Precondition Conclusion Notes
e: vecN<u32> vecN<bool>(e): vecN<bool> Component-wise coercion of an unsigned integer vector to a boolean vector.
e: vecN<i32> vecN<bool>(e): vecN<bool> Component-wise coercion of a signed integer vector to a boolean vector.
e: vecN<f32> vecN<bool>(e): vecN<bool> Component-wise coercion of a binary32 floating point vector to a boolean vector.
e: vecN<f16> vecN<bool>(e): vecN<bool> Component-wise coercion of a binary16 floating point vector to a boolean vector.
e: vecN<bool> vecN<i32>(e): vecN<i32> Component-wise conversion of a boolean vector to signed.
Component i of the result is i32(e[i])
e: vecN<u32> vecN<i32>(e): vecN<i32> Component-wise reinterpretation of bits.
Component i of the result is i32(e[i])
e: vecN<f32> vecN<i32>(e): vecN<i32> Component-wise value conversion to signed integer, including invalid cases.
e: vecN<f16> vecN<i32>(e): vecN<i32> Component-wise value conversion to signed integer, including invalid cases.
e: vecN<bool> vecN<u32>(e): vecN<u32> Component-wise conversion of a boolean vector to unsigned.
Component i of the result is u32(e[i])
e: vecN<AbstractInt> or vecN<i32> vecN<u32>(e): vecN<u32> Component-wise reinterpretation of bits.
e: vecN<f32> vecN<u32>(e): vecN<u32> Component-wise value conversion to unsigned integer, including invalid cases.
e: vecN<f16> vecN<u32>(e): vecN<u32> Component-wise value conversion to unsigned integer, including invalid cases.
e: vecN<bool> vecN<f32>(e): vecN<f32> Component-wise conversion of a boolean vector to floating point.
Component i of the result is f32(e[i])
e: vecN<i32> vecN<f32>(e): vecN<f32> Component-wise value conversion to binary32 floating point, including invalid cases.
e: vecN<u32> vecN<f32>(e): vecN<f32> Component-wise value conversion to binary32 floating point, including invalid cases.
e: vecN<f16> vecN<f32>(e): vecN<f32> Component-wise exact value conversion to binary32 floating point.
e: vecN<bool> vecN<f16>(e): vecN<f16> Component-wise conversion of a boolean vector to binary16 floating point.
Component i of the result is f16(e[i])
e: vecN<i32> vecN<f16>(e): vecN<f16> Component-wise value conversion to binary16 floating point, including invalid cases.
e: vecN<u32> vecN<f16>(e): vecN<f16> Component-wise value conversion to binary16 floating point, including invalid cases.
e: vecN<f32> vecN<f16>(e): vecN<f16> Component-wise lossy value conversion to binary16 floating point.
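
As a minimal sketch of the vector rules above, each component is converted as if by the matching scalar conversion. All names are hypothetical and chosen only for illustration.

```wgsl
// Illustrative only: vector_conversion_sketch is a hypothetical name.
fn vector_conversion_sketch() {
  let flags = vec3<bool>(true, false, true);
  let as_int: vec3<i32> = vec3<i32>(flags);    // (1, 0, 1)
  let as_float: vec3<f32> = vec3<f32>(flags);  // (1.0, 0.0, 1.0)

  let counts = vec2<u32>(7u, 0u);
  let nonzero: vec2<bool> = vec2<bool>(counts); // (true, false)

  // Floating point to integer conversion rounds each component toward zero.
  let truncated: vec2<i32> = vec2<i32>(vec2<f32>(1.75, -1.75)); // (1, -1)
}
```
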
Matrix conversion type rules
Precondition Conclusion Notes
e: matCxR<f16> matCxR<f32>(e): matCxR<f32> Component-wise exact value conversion to binary32 floating point.
e: matCxR<f32> matCxR<f16>(e): matCxR<f16> Component-wise lossy value conversion to binary16 floating point.

7.6. Reinterpretation of Representation Expressions

A bitcast expression is used to reinterpret the bit representation of a value in one type as a value in another type.

Bitcast type rules
Precondition Conclusion Notes
e: T
T is a concrete numeric scalar or concrete numeric vector type
bitcast<T>(e): T Identity transform. Component-wise when T is a vector.
The result is e.
e: T1
T1 is i32, u32, or f32
T2 is not T1 and is i32, u32, or f32
bitcast<T2>(e): T2 Reinterpretation of bits as T2.
The result is the reinterpretation of the bits in e as a T2 value.
e: vecN<T1>
T1 is i32, u32, or f32
T2 is not T1 and is i32, u32, or f32
bitcast<vecN<T2>>(e): vecN<T2> Component-wise reinterpretation of bits as T2.
The result is the reinterpretation of the bits in e as a vecN<T2> value.
e: vec2<f16>
T is i32, u32, or f32
bitcast<T>(e): T Reinterpretation of bits as T.
The result is the reinterpretation of the 32 bits in e as a T value, following the internal layout rules.
e: T
T is i32, u32, or f32
bitcast<vec2<f16>>(e): vec2<f16> Reinterpretation of bits as vec2<f16>.
The result is the reinterpretation of the 32 bits in e as a vec2<f16> value, following the internal layout rules.
e: vec4<f16>
T is i32, u32, or f32
bitcast<vec2<T>>(e): vec2<T> Reinterpretation of bits as vec2<T>.
The result is the reinterpretation of the 64 bits in e as a vec2<T> value, following the internal layout rules.
e: vec2<T>
T is i32, u32, or f32
bitcast<vec4<f16>>(e): vec4<f16> Reinterpretation of bits as vec4<f16>.
The result is the reinterpretation of the 64 bits in e as a vec4<f16> value, following the internal layout rules.

The internal layout rules are described in § 5.3.6.4 Internal Layout of Values.
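
The following sketch shows a few bitcasts between 32-bit types. It assumes the IEEE-754 binary32 encoding of f32, in which 1.0 has the bit pattern 0x3f800000; the function and variable names are hypothetical.

```wgsl
// Illustrative only: bitcast_sketch is a hypothetical name.
fn bitcast_sketch() {
  let one_bits: u32 = bitcast<u32>(1.0f);          // 0x3f800000u
  let minus_one: i32 = bitcast<i32>(0xFFFFFFFFu);  // -1
  let unchanged: f32 = bitcast<f32>(2.5f);         // identity: 2.5

  // Component-wise for vectors of the same component count.
  let v: vec2<u32> = bitcast<vec2<u32>>(vec2<f32>(1.0, 0.0)); // (0x3f800000u, 0u)
}
```

Unlike a conversion such as u32(1.0f), which yields 1u, a bitcast preserves the bits rather than the numeric value.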

7.7. Composite Value Decomposition Expressions

7.7.1. Vector Access Expression

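Components of a vector can be accessed either by array subscripting (e.g. a[2]), or by a sequence of convenience names, each mapping to a component of the source vector: x, y, z, w or r, g, b, a for the first, second, third, and fourth components, respectively.

The convenience names are accessed using the . notation (e.g. color.bgra).

The convenience letterings must not be mixed. For example, .rybw is not valid.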

A convenience letter must not access a component past the end of the vector.

The convenience letterings can be applied in any order, including duplicating letters as needed. The provided number of letters must be between 1 and 4. That is, using convenience letters can only produce a valid vector type.

The result type depends on the number of letters provided. For a vec4<f32>:

Accessor Result type
r f32
rg vec2<f32>
rgb vec3<f32>
rgba vec4<f32>
var a: vec3<f32> = vec3<f32>(1., 2., 3.);
var b: f32 = a.y;          // b = 2.0
var c: vec2<f32> = a.bb;   // c = (3.0, 3.0)
var d: vec3<f32> = a.zyx;  // d = (3.0, 2.0, 1.0)
var e: f32 = a[1];         // e = 2.0
7.7.1.1. Vector Single Component Selection
Vector decomposition: single component selection
Precondition Conclusion Description
e: vecN<T>
e.x: T
e.r: T
Select the first component of e
e: vecN<T>
e.y: T
e.g: T
Select the second component of e
e: vecN<T>
N is 3 or 4
e.z: T
e.b: T
Select the third component of e
e: vec4<T> e.w: T
e.a: T
Select the fourth component of e
e: vecN<T>
i: i32 or u32
T is concrete
e[i]: T Select the ith component of vector
The first component is at index i=0.

If i is outside the range [0,N-1]:

e: vecN<T>
i: i32 or u32
T is abstract
i is a const-expression
e[i]: T Select the ith component of vector
The first component is at index i=0.

It is a shader-creation error if i is outside the range [0,N-1].

Note: When an abstract vector value e is indexed by an expression that is not a const-expression, then the vector is concretized before the index is applied.

7.7.1.2. Vector Multiple Component Selection
Vector decomposition: multiple component selection
Precondition Conclusion Description
e: vecN<T>
I is the letter x, y, z, or w
J is the letter x, y, z, or w
e.IJ: vec2<T>
Computes the two-component vector with first component e.I, and second component e.J.
Letter z is valid only when N is 3 or 4.
Letter w is valid only when N is 4.
e: vecN<T>
I is the letter r, g, b, or a
J is the letter r, g, b, or a
e.IJ: vec2<T>
Computes the two-component vector with first component e.I, and second component e.J.
Letter b is valid only when N is 3 or 4.
Letter a is valid only when N is 4.
e: vecN<T>
I is the letter x, y, z, or w
J is the letter x, y, z, or w
K is the letter x, y, z, or w
e.IJK: vec3<T>
Computes the three-component vector with first component e.I, second component e.J, and third component e.K.
Letter z is valid only when N is 3 or 4.
Letter w is valid only when N is 4.
e: vecN<T>
I is the letter r, g, b, or a
J is the letter r, g, b, or a
K is the letter r, g, b, or a
e.IJK: vec3<T>
Computes the three-component vector with first component e.I, second component e.J, and third component e.K.
Letter b is only valid when N is 3 or 4.
Letter a is only valid when N is 4.
e: vecN<T>
I is the letter x, y, z, or w
J is the letter x, y, z, or w
K is the letter x, y, z, or w
L is the letter x, y, z, or w
e.IJKL: vec4<T>
Computes the four-component vector with first component e.I, second component e.J, third component e.K, and fourth component e.L.
Letter z is valid only when N is 3 or 4.
Letter w is valid only when N is 4.
e: vecN<T>
I is the letter r, g, b, or a
J is the letter r, g, b, or a
K is the letter r, g, b, or a
L is the letter r, g, b, or a
e.IJKL: vec4<T>
Computes the four-component vector with first component e.I, second component e.J, third component e.K, and fourth component e.L.
Letter b is only valid when N is 3 or 4.
Letter a is only valid when N is 4.
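
A minimal sketch of multiple component selection, using hypothetical names. Letters may be repeated and reordered, and the result may have more components than the source vector, provided each letter names a component that exists.

```wgsl
// Illustrative only: swizzle_sketch is a hypothetical name.
fn swizzle_sketch() {
  let v = vec4<f32>(1.0, 2.0, 3.0, 4.0);
  let a: vec2<f32> = v.xw;    // (1.0, 4.0)
  let b: vec3<f32> = v.bgr;   // (3.0, 2.0, 1.0)
  let c: vec4<f32> = v.yyyy;  // (2.0, 2.0, 2.0, 2.0)

  let p = vec2<f32>(5.0, 6.0);
  let d: vec3<f32> = p.xyx;   // (5.0, 6.0, 5.0)
  // p.xyz would be invalid: letter z requires N to be 3 or 4.
}
```
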
7.7.1.3. Component Reference from Vector Reference

A write access to a component of a vector may access all of the memory locations associated with that vector.

Note: This means accesses to different components of a vector by different invocations must be synchronized if at least one access is a write access. See § 17.9 Synchronization Built-in Functions.

Getting a reference to a component from a reference to a vector
Precondition Conclusion Description
r: ref<AS,vecN<T>,AM>
r.x: ref<AS,T,AM>
r.r: ref<AS,T,AM>
Compute a reference to the first component of the vector referenced by the reference r.
The originating variable of the resulting reference is the same as the originating variable of r.
r: ref<AS,vecN<T>,AM>
r.y: ref<AS,T,AM>
r.g: ref<AS,T,AM>
Compute a reference to the second component of the vector referenced by the reference r.
The originating variable of the resulting reference is the same as the originating variable of r.
r: ref<AS,vecN<T>,AM>
N is 3 or 4
r.z: ref<AS,T,AM>
r.b: ref<AS,T,AM>
Compute a reference to the third component of the vector referenced by the reference r.
The originating variable of the resulting reference is the same as the originating variable of r.
r: ref<AS,vec4<T>,AM>
r.w: ref<AS,T,AM>
r.a: ref<AS,T,AM>
Compute a reference to the fourth component of the vector referenced by the reference r.
The originating variable of the resulting reference is the same as the originating variable of r.
r: ref<AS,vecN<T>,AM>
i: i32 or u32
r[i] : ref<AS,T,AM> Compute a reference to the ith component of the vector referenced by the reference r.

If i is outside the range [0,N-1]:

The originating variable of the resulting reference is the same as the originating variable of r.
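
As a sketch of how component references are used, the assignments below each write through a reference to one component of a function-scope vector variable. The names are hypothetical.

```wgsl
// Illustrative only: component_write_sketch is a hypothetical name.
fn component_write_sketch() -> vec4<f32> {
  var color = vec4<f32>(0.0, 0.0, 0.0, 1.0);
  color.r = 0.5;     // reference to the first component
  color[1] = 0.25;   // reference to the second component, by index
  var i = 2;
  color[i] = 0.75;   // the index may be computed at runtime
  return color;      // (0.5, 0.25, 0.75, 1.0)
}
```
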

7.7.2. Matrix Access Expression

Column vector extraction
Precondition Conclusion Description
e: matCxR<T>
i: i32 or u32
T is concrete
e[i]: vecR<T> The result is the ith column vector of e.

If i is outside the range [0,C-1]:

e: matCxR<T>
i: i32 or u32
T is abstract
i is a const-expression
e[i]: vecR<T> The result is the ith column vector of e.

It is a shader-creation error if i is outside the range [0,C-1].

Note: When an abstract matrix value e is indexed by an expression that is not a const-expression, then the matrix is concretized before the index is applied.

Getting a reference to a column vector from a reference to a matrix
Precondition Conclusion Description
r: ref<AS,matCxR<T>,AM>
i: i32 or u32
r[i] : ref<AS,vecR<T>,AM> Compute a reference to the ith column vector of the matrix referenced by the reference r.

If i is outside the range [0,C-1]:

The originating variable of the resulting reference is the same as the originating variable of r.
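
A minimal sketch of column access, with hypothetical names. A mat2x3<f32> has two columns of three rows, so indexing yields a vec3<f32>, and a second index selects a component of that column.

```wgsl
// Illustrative only: matrix_column_sketch is a hypothetical name.
fn matrix_column_sketch() {
  // Columns are (1, 2, 3) and (4, 5, 6).
  var m = mat2x3<f32>(1.0, 2.0, 3.0, 4.0, 5.0, 6.0);
  let col1: vec3<f32> = m[1];   // (4.0, 5.0, 6.0)
  let elem: f32 = m[1][2];      // 6.0: third component of the second column

  m[0] = vec3<f32>(0.0, 0.0, 0.0);  // write a whole column through a reference
  m[1].y = 9.0;                     // write one component of a column
}
```
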

7.7.3. Array Access Expression

Array element extraction
Precondition Conclusion Description
e: array<T,N>
i: i32 or u32
T is concrete
e[i] : T The result is the value of the ith element of the array value e.

If i is outside the range [0,N-1]:

Getting a reference to an array element from a reference to an array
Precondition Conclusion Description
r: ref<AS,array<T,N>,AM>
i: i32 or u32
r[i] : ref<AS,T,AM> Compute a reference to the ith element of the array referenced by the reference r.

If i is outside the range [0,N-1]:

The originating variable of the resulting reference is the same as the originating variable of r.

r: ref<AS,array<T>,AM>
i: i32 or u32
r[i] : ref<AS,T,AM> Compute a reference to the ith element of the runtime-sized array referenced by the reference r.

If at runtime the array has N elements, and i is outside the range [0,N-1], then the expression evaluates to an invalid memory reference.

If i is a signed integer, and i is less than 0:

The originating variable of the resulting reference is the same as the originating variable of r.
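
The sketch below contrasts a fixed-size array with a runtime-sized array. The storage buffer binding and all names are assumptions made for illustration only.

```wgsl
// Hypothetical binding: a storage buffer holding a runtime-sized array.
@group(0) @binding(0) var<storage, read_write> values: array<f32>;

// Illustrative only: array_access_sketch is a hypothetical name.
fn array_access_sketch(i: u32) -> f32 {
  var local = array<i32, 3>(10, 20, 30);
  local[2] = 40;            // reference to the third element
  let first = local[0];     // 10

  // If i is not less than the runtime element count, values[i] is an
  // invalid memory reference.
  values[i] = f32(first);
  return values[0];
}
```
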

7.7.4. Structure Access Expression

Structure member extraction
Precondition Conclusion Description
S is a structure type
M is the identifier name of a member of S, having type T
e: S
e.M: T The result is the value of the member with name M from the structure value e.
Getting a reference to a structure member from a reference to a structure
Precondition Conclusion Description
S is a structure type
M is the identifier name of a member of S, having type T
r: ref<AS,S,AM>
r.M: ref<AS,T,AM> Given a reference to a structure, the result is a reference to the structure member with identifier name M.
The originating variable of the resulting reference is the same as the originating variable of r.

7.8. Logical Expressions

Unary logical operations
Precondition Conclusion Notes
e: T
T is bool or vecN<bool>
!e: T Logical negation. The result is true when e is false and false when e is true. Component-wise when T is a vector.
Binary logical expressions
Precondition Conclusion Notes
e1: bool
e2: bool
e1 || e2: bool Short-circuiting "or". Yields true if either e1 or e2 is true; evaluates e2 only if e1 is false.
e1: bool
e2: bool
e1 && e2: bool Short-circuiting "and". Yields true if both e1 and e2 are true; evaluates e2 only if e1 is true.
e1: T
e2: T
T is bool or vecN<bool>
e1 | e2: T Logical "or". Component-wise when T is a vector. Evaluates both e1 and e2.
e1: T
e2: T
T is bool or vecN<bool>
e1 & e2: T Logical "and". Component-wise when T is a vector. Evaluates both e1 and e2.
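
A minimal sketch contrasting the short-circuiting operators, which apply only to scalars, with the component-wise operators. The function and variable names are hypothetical.

```wgsl
// Illustrative only: logical_sketch is a hypothetical name.
fn logical_sketch(v: vec3<f32>, i: i32) -> bool {
  // (i < 3) is evaluated only when (i >= 0) is true.
  let in_range: bool = (i >= 0) && (i < 3);

  let positive: vec3<bool> = v > vec3<f32>(0.0);
  let small: vec3<bool> = v < vec3<f32>(1.0);
  let inside: vec3<bool> = positive & small;      // component-wise "and"
  let outside: vec3<bool> = !(positive | small);  // component-wise "or", then negation

  return in_range && all(inside);
}
```
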

7.9. Arithmetic Expressions

Unary arithmetic expressions
Precondition Conclusion Notes
e: T
T is AbstractInt, AbstractFloat, i32, f32, f16, vecN<AbstractInt>, vecN<AbstractFloat>, vecN<i32>, vecN<f32>, or vecN<f16>
-e: T Negation. Component-wise when T is a vector. If T is an integer scalar type and e evaluates to the largest negative value, then the result is e.
Binary arithmetic expressions
Precondition Conclusion Notes
e1 : T
e2 : T
S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
e1 + e2 : T Addition. Component-wise when T is a vector. If T is a concrete integer scalar type, then the result is modulo 2^32.
e1 : T
e2 : T
S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
e1 - e2 : T Subtraction. Component-wise when T is a vector. If T is a concrete integer scalar type, then the result is modulo 2^32.
e1 : T
e2 : T
S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
e1 * e2 : T Multiplication. Component-wise when T is a vector. If T is a concrete integer scalar type, then the result is modulo 2^32.
e1 : T
e2 : T
S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
e1 / e2 : T Division. Component-wise when T is a vector.

If T is a signed integer scalar type, evaluates to:

Note: The need to ensure truncation behavior may require an implementation to perform more operations than when computing an unsigned division. Use unsigned division when both operands are known to have the same sign.

If T is an unsigned integer scalar type, evaluates to:

e1 : T
e2 : T
S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
e1 % e2 : T Remainder. Component-wise when T is a vector.

If T is a signed integer scalar type, evaluates e1 and e2 once, and evaluates to:

Note: When non-zero, the result has the same sign as e1.

Note: The need to ensure consistent behavior may require an implementation to perform more operations than when computing an unsigned remainder.

If T is an unsigned integer scalar type, evaluates to:

If T is a floating point type, the result is equal to:
e1 - e2 * trunc(e1 / e2)
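
As a sketch of the scalar rules above, the values below are held in variables so that they are computed at runtime. The names are hypothetical, and the floating point result is the mathematically exact one.

```wgsl
// Illustrative only: arithmetic_sketch is a hypothetical name.
fn arithmetic_sketch() {
  var a: i32 = -7;
  let quotient = a / 2;     // -3: signed division truncates toward zero
  let remainder = a % 2;    // -1: a non-zero result has the same sign as e1

  var big: u32 = 0xFFFFFFFFu;
  let wrapped = big + 1u;   // 0u: the sum is taken modulo 2^32

  var x: f32 = 5.5;
  let frem = x % 2.0;       // 1.5, from 5.5 - 2.0 * trunc(5.5 / 2.0)
}
```
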

Binary arithmetic expressions with mixed scalar and vector operands
Preconditions Conclusions Semantics
S is one of AbstractInt, AbstractFloat, f32, f16, i32, u32
V is vecN<S>
es: S
ev: V
ev + es: V ev + V(es)
es + ev: V V(es) + ev
ev - es: V ev - V(es)
es - ev: V V(es) - ev
ev * es: V ev * V(es)
es * ev: V V(es) * ev
ev / es: V ev / V(es)
es / ev: V V(es) / ev
ev % es: V ev % V(es)
es % ev: V V(es) % ev
Matrix arithmetic
Preconditions Conclusions Semantics
e1, e2: matCxR<T>
T is AbstractFloat, f32, or f16
e1 + e2: matCxR<T>
Matrix addition: column i of the result is e1[i] + e2[i]
e1 - e2: matCxR<T> Matrix subtraction: column i of the result is e1[i] - e2[i]
m: matCxR<T>
s: T
T is AbstractFloat, f32, or f16
m * s: matCxR<T>
Component-wise scaling: (m * s)[i][j] is m[i][j] * s
s * m: matCxR<T>
Component-wise scaling: (s * m)[i][j] is m[i][j] * s
m: matCxR<T>
v: vecC<T>
T is AbstractFloat, f32, or f16
m * v: vecR<T>
Linear algebra matrix-column-vector product: Component i of the result is dot(transpose(m)[i],v)
m: matCxR<T>
v: vecR<T>
T is AbstractFloat, f32, or f16
v * m: vecC<T>
Linear algebra row-vector-matrix product:
transpose(transpose(m) * transpose(v))
e1: matKxR<T>
e2: matCxK<T>
T is AbstractFloat, f32, or f16
e1 * e2: matCxR<T>
Linear algebra matrix product.
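
A minimal sketch of mixed scalar and vector operands and of the matrix products, using hypothetical names. The matrix m has C = 2 columns and R = 3 rows, so it multiplies a vec2 on the right and a vec3 on the left.

```wgsl
// Illustrative only: linear_algebra_sketch is a hypothetical name.
fn linear_algebra_sketch() {
  let v = vec3<f32>(1.0, 2.0, 3.0);
  let scaled: vec3<f32> = v * 2.0;     // (2.0, 4.0, 6.0), same as v * vec3<f32>(2.0)
  let shifted: vec3<f32> = 1.0 + v;    // (2.0, 3.0, 4.0)

  // Columns are (1, 0, 0) and (0, 1, 0).
  let m = mat2x3<f32>(1.0, 0.0, 0.0, 0.0, 1.0, 0.0);
  let mv: vec3<f32> = m * vec2<f32>(3.0, 4.0);  // (3.0, 4.0, 0.0)
  let vm: vec2<f32> = v * m;                    // (1.0, 2.0)

  // mat3x2 times mat2x3 gives mat2x2; here it is the 2x2 identity.
  let mm: mat2x2<f32> = transpose(m) * m;
}
```
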

7.10. Comparison Expressions

Comparisons
Precondition Conclusion Notes
e1: T
e2: T
S is AbstractInt, AbstractFloat, bool, i32, u32, f32, or f16
T is S or vecN<S>
TB is vecN<bool> if T is a vector,
otherwise TB is bool
e1 == e2: TB Equality. Component-wise when T is a vector.
e1: T
e2: T
S is AbstractInt, AbstractFloat, bool, i32, u32, f32, or f16
T is S or vecN<S>
TB is vecN<bool> if T is a vector,
otherwise TB is bool
e1 != e2: TB Inequality. Component-wise when T is a vector.
e1: T
e2: T
S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
TB is vecN<bool> if T is a vector,
otherwise TB is bool
e1 < e2: TB Less than. Component-wise when T is a vector.
e1: T
e2: T
S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
TB is vecN<bool> if T is a vector,
otherwise TB is bool
e1 <= e2: TB Less than or equal. Component-wise when T is a vector.
e1: T
e2: T
S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
TB is vecN<bool> if T is a vector,
otherwise TB is bool
e1 > e2: TB Greater than. Component-wise when T is a vector.
e1: T
e2: T
S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
TB is vecN<bool> if T is a vector,
otherwise TB is bool
e1 >= e2: TB Greater than or equal. Component-wise when T is a vector.

7.11. Bit Expressions

Unary bitwise operations
Precondition Conclusion Notes
e: T
S is AbstractInt, i32, or u32
T is S or vecN<S>
~e : T Bitwise complement on e. Each bit in the result is the opposite of the corresponding bit in e. Component-wise when T is a vector.
Binary bitwise operations
Precondition Conclusion Notes
e1: T
e2: T
S is AbstractInt, i32, or u32
T is S or vecN<S>
e1 | e2: T Bitwise-or. Component-wise when T is a vector.
e1: T
e2: T
S is AbstractInt, i32, or u32
T is S or vecN<S>
e1 & e2: T Bitwise-and. Component-wise when T is a vector.
e1: T
e2: T
S is AbstractInt, i32, or u32
T is S or vecN<S>
e1 ^ e2: T Bitwise-exclusive-or. Component-wise when T is a vector.
Bit shift expressions
Precondition Conclusion Notes
e1: T
e2: TS
S is i32 or u32
T is S or vecN<S>
TS is u32 when T is S, otherwise TS is vecN<u32>
e1 << e2: T Shift left (shifted value is concrete):

Shift e1 left, inserting zero bits at the least significant positions, and discarding the most significant bits.

The number of bits to shift is the value of e2, modulo the bit width of e1.
If e2 is greater than or equal to the bit width of e1, then:

When both e1 and e2 are known before shader execution start, the result must not overflow:

Component-wise when T is a vector.

e1: T
e2: TS
T is AbstractInt or vecN<AbstractInt>
TS is u32 when T is AbstractInt, otherwise TS is vecN<u32>
e1 << e2: T Shift left (shifted value abstract):

Shift e1 left, inserting zero bits at the least significant positions, and discarding the most significant bits.

The number of bits to shift is the value of e2.

The e2+1 most significant bits of e1 must have the same bit value. Otherwise overflow would occur.

Note: This condition means all the discarded bits must be the same as the sign bit of the original value, and the same as the sign bit of the final value.

Component-wise when T is a vector.

e1: T
e2: TS
S is i32 or u32
T is S or vecN<S>
TS is u32 when T is S, otherwise TS is vecN<u32>
e1 >> e2: T Shift right (shifted value is concrete).

Shift e1 right, discarding the least significant bits.

If S is an unsigned type, insert zero bits at the most significant positions.

If S is a signed type:

  • If e1 is negative, each inserted bit is 1, and so the result is also negative.

  • Otherwise, each inserted bit is 0.

The number of bits to shift is the value of e2, modulo the bit width of e1.

If e2 is greater than or equal to the bit width of e1, then:

Component-wise when T is a vector.

e1: T
e2: TS
T is AbstractInt or vecN<AbstractInt>
TS is u32 when T is AbstractInt, otherwise TS is vecN<u32>
e1 >> e2: T Shift right (abstract).

Shift e1 right, discarding the least significant bits.

If e1 is negative, each inserted bit is 1, and so the result is also negative. Otherwise, each inserted bit is 0.

The number of bits to shift is the value of e2.

Component-wise when T is a vector.
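
The following sketch exercises the bitwise and shift operators on concrete integers. The names are hypothetical, and every shift amount is smaller than the bit width of the shifted value.

```wgsl
// Illustrative only: bit_sketch is a hypothetical name.
fn bit_sketch(x: u32) {
  let low_byte = x & 0xFFu;         // keep the 8 least significant bits
  let top_set = x | 0x80000000u;    // set the most significant bit
  let flipped = ~x;                 // every bit inverted
  let toggled = x ^ 0xFFFFFFFFu;    // same value as ~x

  let a: u32 = 1u << 4u;            // 16u
  let b: u32 = 0xF0u >> 4u;         // 15u: zero bits are inserted for unsigned types

  var n: i32 = -8;
  let c: i32 = n >> 1u;             // -4: inserted bits are 1 because n is negative
}
```
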

7.12. Function Call Expression

A function call expression executes a function call where the called function has a return type. If the called function does not return a value, a function call statement should be used instead. See § 8.5 Function Call Statement.

7.13. Variable Identifier Expression

Getting a reference from a variable name
Precondition Conclusion Description
v is an identifier resolving to an in-scope variable declared in address space AS with store type T and access mode AM v: ref<AS,T,AM> Result is a reference to the memory for the named variable v.

7.14. Formal Parameter Expression

Getting the value of an identifier declared as a formal parameter to a function
Precondition Conclusion Description
a is an identifier resolving to an in-scope formal parameter declaration with type T a: T Result is the value supplied for the corresponding function call operand at the call site invoking this instance of the function.

7.15. Address-Of Expression

The address-of operator converts a reference to its corresponding pointer.

Getting a pointer from a reference
Precondition Conclusion Description
r: ref<AS,T,AM> &r: ptr<AS,T,AM> Result is the pointer value corresponding to the same memory view as the reference value r.

If r is an invalid memory reference, then the resulting pointer is also an invalid memory reference.

It is a shader-creation error if AS is the handle address space.

It is a shader-creation error if r is a reference to a vector component.

7.16. Indirection Expression

The indirection operator converts a pointer to its corresponding reference.

Getting a reference from a pointer
Precondition Conclusion Description
p: ptr<AS,T,AM> *p: ref<AS,T,AM> Result is the reference value corresponding to the same memory view as the pointer value p.

If p is an invalid memory reference, then the resulting reference is also an invalid memory reference.
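
A minimal sketch of the address-of and indirection operators working together. The function names are hypothetical. The pointer is formed from a reference to a whole variable, not to a vector component.

```wgsl
// Illustrative only: both function names are hypothetical.
fn add_one(p: ptr<function, i32>) {
  // *p is a reference: it is read on the right and written on the left.
  *p = *p + 1;
}

fn pointer_sketch() -> i32 {
  var count = 41;
  add_one(&count);   // &count converts the reference to a pointer
  let p = &count;
  return *p;         // 42
}
```
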

7.17. Constant Identifier Expression

Getting the value of a let-declared identifier
Precondition Conclusion Description
c is an identifier resolving to an in-scope override-declaration with type T c: T If pipeline creation specified a value for the constant ID, then the result is that value. This value may be different for different pipeline instances.
Otherwise, the result is the value computed for the initializer expression. Pipeline-overridable constants appear at module-scope, so evaluation occurs before the shader begins execution.
Note: Pipeline creation fails if no initial value was specified in the API call and the override-declaration has no initializer expression.
c is an identifier resolving to an in-scope let declaration with type T c: T Result is the value computed for the initializer expression.
For a let declaration at module scope, evaluation occurs before the shader begins execution.
For a let declaration inside a function, evaluation occurs each time control reaches the declaration.
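
As a sketch of the two cases above, the override below has a hypothetical constant ID and initializer, and the let is declared inside a function. All names are assumptions made for illustration.

```wgsl
// Hypothetical pipeline-overridable constant with constant ID 0.
// If pipeline creation supplies no value, the initializer 2.0 is used.
@id(0) override scale: f32 = 2.0;

// Illustrative only: constant_identifier_sketch is a hypothetical name.
fn constant_identifier_sketch(x: f32) -> f32 {
  // Evaluated each time control reaches this declaration.
  let scaled = x * scale;
  return scaled;
}
```
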

7.18. Expression Grammar Summary

When an identifier is used as a callable item, it is one of:

Declaration and scope rules ensure those names are always distinct.

primary_expression :

| ident

| callable argument_expression_list

| literal

| paren_expression

| bitcast less_than type_specifier greater_than paren_expression

callable :

| ident

| type_specifier_without_ident

| vec_prefix

| mat_prefix

| array

paren_expression :

| paren_left expression paren_right

argument_expression_list :

| paren_left ( expression ( comma expression ) * comma ? ) ? paren_right

component_or_swizzle_specifier :

| bracket_left expression bracket_right component_or_swizzle_specifier ?

| period member_ident component_or_swizzle_specifier ?

| period swizzle_name component_or_swizzle_specifier ?

unary_expression :

| singular_expression

| minus unary_expression

| bang unary_expression

| tilde unary_expression

| star unary_expression

| and unary_expression

singular_expression :

| primary_expression component_or_swizzle_specifier ?

lhs_expression :

| ( star | and ) * core_lhs_expression component_or_swizzle_specifier ?

core_lhs_expression :

| ident

| paren_left lhs_expression paren_right

multiplicative_expression :

| unary_expression

| multiplicative_expression multiplicative_operator unary_expression

multiplicative_operator :

| star

| forward_slash

| modulo

additive_expression :

| multiplicative_expression

| additive_expression additive_operator multiplicative_expression

additive_operator :

| plus

| minus

shift_expression :

| additive_expression

| unary_expression shift_left unary_expression

| unary_expression shift_right unary_expression

relational_expression :

| shift_expression

| shift_expression less_than shift_expression

| shift_expression greater_than shift_expression

| shift_expression less_than_equal shift_expression

| shift_expression greater_than_equal shift_expression

| shift_expression equal_equal shift_expression

| shift_expression not_equal shift_expression

short_circuit_and_expression :

| relational_expression

| short_circuit_and_expression and_and relational_expression

short_circuit_or_expression :

| relational_expression

| short_circuit_or_expression or_or relational_expression

binary_or_expression :

| unary_expression

| binary_or_expression or unary_expression

binary_and_expression :

| unary_expression

| binary_and_expression and unary_expression

binary_xor_expression :

| unary_expression

| binary_xor_expression xor unary_expression

bitwise_expression :

| binary_and_expression and unary_expression

| binary_or_expression or unary_expression

| binary_xor_expression xor unary_expression

expression :

| relational_expression

| short_circuit_or_expression or_or relational_expression

| short_circuit_and_expression and_and relational_expression

| bitwise_expression

8. Statements

A statement is a program fragment that controls execution. Statements are generally executed in sequential order; however, control flow statements may cause a program to execute in non-sequential order.

8.1. Compound Statement

A compound statement is a brace-enclosed sequence of zero or more statements. When a declaration is one of those statements, its identifier is in scope from the start of the next statement until the end of the compound statement.

compound_statement :

| brace_left statement * brace_right
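
A minimal sketch of the scoping rule for declarations inside a compound statement, using hypothetical names.

```wgsl
// Illustrative only: scope_sketch is a hypothetical name.
fn scope_sketch() -> i32 {
  var total = 0;
  {
    let inner = 5;           // in scope from the next statement...
    total = total + inner;
  }                          // ...until the end of this compound statement
  // 'inner' is not in scope here.
  return total;              // 5
}
```
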

The continuing_compound_statement is a special form of compound statement that forms the body of a continuing statement, and allows an optional break-if statement at the end.

8.2. Assignment Statement

An assignment evaluates an expression, and optionally stores it in memory (thus updating the contents of a variable).

assignment_statement :

| lhs_expression ( equal | compound_assignment_operator ) expression

| underscore equal expression

The text to the left of the operator token is the left-hand side, and the expression to the right of the operator token is the right-hand side.

8.2.1. Simple Assignment

An assignment is a simple assignment when the left-hand side is an expression, and the operator is the equal token. In this case the value of the right-hand side is written to the memory referenced by the left-hand side.

Precondition Statement Description
e: T,
T is a concrete constructible type,
r: ref<AS,T,AM>,
AS is a writable address space,
access mode AM is write or read_write
r = e Evaluates e, evaluates r, then writes the value computed for e into the memory locations referenced by r. Note: if the reference is an invalid memory reference, the write may not execute, or may write to a different memory location than expected.

In the simplest case, the left hand side is the name of a variable. See § 5.4.6 Forming Reference and Pointer Values for other cases.

EXAMPLE: Assignments
struct S {
    age: i32,
    weight: f32
}
var<private> person: S;

fn f() {
    var a: i32 = 20;
    a = 30;           // Replace the contents of 'a' with 30.

    person.age = 31;  // Write 31 into the age field of the person variable.

    var uv: vec2<f32>;
    uv.y = 1.25;      // Place 1.25 into the second component of uv.

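    // A pointer must not be formed from a reference to a vector component,
    // so take a pointer to the whole vector instead.
    let uv_ptr: ptr<function,vec2<f32>> = &uv;
    (*uv_ptr).x = 2.5; // Place 2.5 into the first component of uv.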

    var friend: S;
    // Copy the contents of the 'person' variable into the 'friend' variable.
    friend = person;
}

8.2.2. Phony Assignment

An assignment is a phony assignment when the left-hand side is an underscore token. In this case the right-hand side is evaluated, and then ignored.

Precondition Statement Description
e: T,
T is constructible, a pointer type, a texture type, or a sampler type
_ = e Evaluates e.

Note: The resulting value is not stored. The _ token is not an identifier, and therefore cannot be used in an expression.

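A phony-assignment is useful for discarding a function result that is not needed, and for referencing a resource variable so that it occupies its binding without otherwise being used.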

EXAMPLE: Using phony-assignment to throw away an un-needed function result
var<private> counter: i32;

fn increment_and_yield_previous() -> i32 {
  let previous = counter;
  counter = counter + 1;
  return previous;
}

fn user() {
  // Increment the counter, but don’t use the result.
  _ = increment_and_yield_previous();
}
EXAMPLE: Using phony-assignment to occupy bindings without using them
struct BufferContents {
    counter: atomic<u32>,
    data: array<vec4<f32>>
}
@group(0) @binding(0) var<storage, read_write> buf: BufferContents;
@group(0) @binding(1) var t: texture_2d<f32>;
@group(0) @binding(2) var s: sampler;

@fragment
fn shade_it() -> @location(0) vec4<f32> {
  // Declare that buf, t, and s are part of the shader interface, without
  // using them for anything.
  _ = &buf;
  _ = t;
  _ = s;
  return vec4<f32>();
}

8.2.3. Compound Assignment

An assignment is a compound assignment when the left-hand side is an expression, and the operator is one of the compound_assignment_operators.

compound_assignment_operator :

| plus_equal

| minus_equal

| times_equal

| division_equal

| modulo_equal

| and_equal

| or_equal

| xor_equal

| shift_right_equal

| shift_left_equal

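The type requirements, semantics, and behavior of each statement are defined as if the compound assignment expands as in the following table, except that the reference expression e1 is evaluated only once, and the reference type for e1 must have the read_write access mode.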

Statement Expansion
e1 += e2 e1 = e1 + (e2)
e1 -= e2 e1 = e1 - (e2)
e1 *= e2 e1 = e1 * (e2)
e1 /= e2 e1 = e1 / (e2)
e1 %= e2 e1 = e1 % (e2)
e1 &= e2 e1 = e1 & (e2)
e1 |= e2 e1 = e1 | (e2)
e1 ^= e2 e1 = e1 ^ (e2)
e1 >>= e2 e1 = e1 >> (e2)
e1 <<= e2 e1 = e1 << (e2)

Note: The syntax does not allow a compound assignment to also be a phony assignment.

Note: Even though the reference e1 is evaluated once, its underlying memory is accessed twice: first a read access gets the old value, and then a write access stores the updated value.

EXAMPLE: Compound assignment
var<private> next_item: i32 = 0;

fn advance_item() -> i32 {
   next_item += 1;   // Adds 1 to next_item.
   return next_item - 1;
}

fn bump_item() {
  var data: array<f32,10>;
  next_item = 0;
  // Adds 5.0 to data[0], calling advance_item() only once.
  data[advance_item()] += 5.0;
  // next_item will be 1 here.
}

fn precedence_example() {
  var value = 1;
  // The right-hand side of a compound assignment is its own expression.
  value *= 2 + 3; // Same as value = value * (2 + 3);
  // 'value' now holds 5.
}

Note: A compound assignment can be rewritten as different WGSL code that uses a simple assignment instead. The idea is to use a pointer to hold the result of evaluating the reference once.

For example, when e1 is not a reference to a component inside a vector, then e1+=e2 can be rewritten as {let p = &(e1); *p = *p + (e2);}, where the identifier p is chosen to be different from all other identifiers in the program.

When e1 is a reference to a component inside a vector, the above technique needs to be modified because WGSL does not allow taking the address in that case. For example, if ev is a reference to a vector, the statement ev[c] += e2 can be rewritten as {let p = &(ev); let c0 = c; (*p)[c0] = (*p)[c0] + (e2);}, where identifiers c0 and p are chosen to be different from all other identifiers in the program.
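
Note: The two rewrites described above can be sketched as follows. This is an illustrative, non-normative sketch. The variable total and the functions scale_compound, scale_expanded, scale_lane_compound, and scale_lane_expanded are hypothetical names chosen for this example only.

EXAMPLE: Compound assignment and a possible expansion to simple assignment
var<private> total: f32 = 1.0;

fn scale_compound(factor: f32) {
  total *= factor;   // Compound assignment form.
}

fn scale_expanded(factor: f32) {
  // Same effect with a simple assignment: the reference is
  // evaluated once and held in the pointer p.
  let p = &(total);
  *p = *p * (factor);
}

fn scale_lane_compound(c: i32, factor: f32) -> vec4<f32> {
  var lanes = vec4<f32>(1.0, 2.0, 3.0, 4.0);
  lanes[c] *= factor;   // Compound assignment to a vector component.
  return lanes;
}

fn scale_lane_expanded(c: i32, factor: f32) -> vec4<f32> {
  var lanes = vec4<f32>(1.0, 2.0, 3.0, 4.0);
  // The pointer refers to the whole vector, because the address
  // of a vector component cannot be taken.
  let p = &(lanes);
  let c0 = c;           // The index expression is evaluated once.
  (*p)[c0] = (*p)[c0] * (factor);
  return lanes;
}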

8.3. Increment and Decrement Statements

An increment statement adds 1 to the contents of a variable. A decrement statement subtracts 1 from the contents of a variable.

increment_statement :

| lhs_expression plus_plus

decrement_statement :

| lhs_expression minus_minus

The expression must evaluate to a reference with a concrete integer scalar store type and read_write access mode.

Precondition Statement Description
r : ref<AS,T,read_write>,
T is a concrete integer scalar
r++ Adds 1 to the contents of memory referenced by r.
Same as r += T(1)
r : ref<AS,T,read_write>,
T is a concrete integer scalar
r-- Subtracts 1 from the contents of memory referenced by r.
Same as r -= T(1)
EXAMPLE: Increment and decrement
fn f() {
    var a: i32 = 20;
    a++;
    // Now a contains 21
    a--;
    // Now a contains 20
}

8.4. Control Flow

Control flow statements may cause the program to execute in non-sequential order.

8.4.1. If Statement

An if statement conditionally executes at most one compound statement based on the evaluation of condition expressions.

An if statement has an if clause, followed by zero or more else if clauses, followed by an optional else clause.

if_statement :

| if_clause else_if_clause * else_clause ?

if_clause :

| if expression compound_statement

else_if_clause :

| else if expression compound_statement

else_clause :

| else compound_statement

Type rule precondition: The expression in each if and else if clause must be of bool type.

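An if statement is executed as follows:

  1. The condition of the if clause is evaluated. If the result is true, control transfers to the compound statement of the if clause.

  2. Otherwise, the condition of the next else if clause in textual order (if one exists) is evaluated and, if the result is true, control transfers to the associated compound statement. This is repeated for each else if clause until a condition evaluates to true.

  3. If no condition evaluates to true, control transfers to the compound statement of the else clause (if it exists).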
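
Note: As an illustrative, non-normative sketch, the following function uses an if clause, one else if clause, and an else clause. At most one of the three compound statements executes for any given input. The function name classify is hypothetical.

EXAMPLE: If statement with else if and else clauses
fn classify(x: i32) -> i32 {
  var result: i32;
  if x < 0 {
    result = -1;   // Executed only when the first condition is true.
  } else if x == 0 {
    result = 0;    // The second condition is evaluated only when the first is false.
  } else {
    result = 1;    // Executed when no condition evaluates to true.
  }
  return result;
}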

8.4.2. Switch Statement

A switch statement transfers control to one of a set of case clauses, or to the default clause, depending on the evaluation of a selector expression.

switch_statement :

| switch expression brace_left switch_body + brace_right

switch_body :

| case_clause

| default_alone_clause

case_clause :

| case case_selectors colon ? compound_statement

default_alone_clause :

| default colon ? compound_statement

case_selectors :

| case_selector ( comma case_selector ) * comma ?

case_selector :

| default

| expression

A case clause is the case token followed by a comma-separated list of case selectors and a body in the form of a compound statement.

A default-alone clause is the default token followed by a body in the form of a compound statement.

A default clause is either a case clause in which default is one of the selectors, or a default-alone clause.

Each switch statement must have exactly one default clause.

The default token must not appear more than once in a single case_selector list.

Type rule precondition: For a single switch statement, the selector expression and all case selector expressions must be of the same concrete integer scalar type.

The expressions in the case_selectors must be const-expressions.

Two different case selector expressions in the same switch statement must not have the same value.

If the selector value equals the value of an expression in a case_selector list, then control is transferred to the body of that case clause. If the selector value does not equal any of the case selector values, then control is transferred to the body of the default clause.

When control reaches the end of the body of a clause, control transfers to the first statement after the switch statement.

When one of the statements in the body of a clause is a declaration, it follows the normal scope and lifetime rules of a declaration in a compound statement. That is, the body is a sequence of statements, and if one of those is a declaration then the scope of that declaration extends from the start of the next statement in the sequence until the end of the body. The declaration executes when it is reached, creating a new instance of the variable or value, and initializes it.

EXAMPLE: WGSL Switch
var a : i32;
let x : i32 = generateValue();
switch x {
  case 0: {      // The colon is optional
    a = 1;
  }
  default {      // The default need not appear last
    a = 2;
  }
  case 1, 2, {   // Multiple selector values can be used
    a = 3;
  }
  case 3, {      // The trailing comma is optional
    a = 4;
  }
  case 4 {
    a = 5;
  }
}
EXAMPLE: WGSL Switch with default combined
const c = 2;
var a : i32;
let x : i32 = generateValue();
switch x {
  case 0: {
    a = 1;
  }
  case 1, c {       // Const-expression can be used in case selectors
    a = 3;
  }
  case 3, default { // The default keyword can be used with other clauses
    a = 4;
  }
}

8.4.3. Loop Statement

loop_statement :

| loop brace_left statement * continuing_statement ? brace_right

A loop statement repeatedly executes a loop body; the loop body is specified as a compound statement. Each execution of the loop body is called an iteration.

This repetition can be interrupted by a break or return statement.

Optionally, the last statement in the loop body may be a continuing statement.

When one of the statements in the loop body is a declaration, it follows the normal scope and lifetime rules of a declaration in a compound statement. That is, the loop body is a sequence of statements, and if one of those is a declaration then the scope of that declaration extends from the start of the next statement in the sequence until the end of the loop body. The declaration executes each time it is reached, so each new iteration creates a new instance of the variable or value, and re-initializes it.
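
Note: The re-initialization rule above can be sketched as follows. This is an illustrative, non-normative example. The names count_iterations, total, and local are hypothetical.

EXAMPLE: A declaration in a loop body is re-initialized each iteration
fn count_iterations() -> i32 {
  var total: i32 = 0;
  loop {
    var local: i32 = 0;   // A new instance is created on every iteration.
    local++;              // 'local' is 1 here on every iteration.
    total += local;
    if total >= 3 { break; }
  }
  // 'local' is not in scope here.
  return total;           // 'total' is 3 here.
}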

Note: The loop statement is one of the biggest differences from other shader languages.

This design directly expresses loop idioms commonly found in compiled code. In particular, placing the loop update statements at the end of the loop body allows them to naturally use values defined in the loop body.

EXAMPLE: GLSL Loop
int a = 2;
for (int i = 0; i < 4; i++) {
  a *= 2;
}
EXAMPLE: WGSL Loop
var a: i32 = 2;
var i: i32 = 0;      // The initialization is placed before the loop.
loop {
  if i >= 4 { break; }

  a = a * 2;

  i++;
}
EXAMPLE: GLSL Loop with continue
int a = 2;
const int step = 1;
for (int i = 0; i < 4; i += step) {
  if (i % 2 == 0) continue;
  a *= 2;
}
EXAMPLE: WGSL Loop with continue
var a: i32 = 2;
var i: i32 = 0;
loop {
  if i >= 4 { break; }

  let step: i32 = 1;

  i = i + step;
  if i % 2 == 0 { continue; }

  a = a * 2;
}
EXAMPLE: WGSL Loop with continue and continuing
var a: i32 = 2;
var i: i32 = 0;
loop {
  if i >= 4 { break; }

  let step: i32 = 1;

  if i % 2 == 0 { continue; }

  a = a * 2;

  continuing {   // The loop update is placed at the end of the loop body.
    i = i + step;
  }
}

8.4.4. For Statement

for_statement :

| for paren_left for_header paren_right compound_statement

for_header :

| for_init ? semicolon expression ? semicolon for_update ?

for_init :

| variable_statement

| variable_updating_statement

| func_call_statement

for_update :

| variable_updating_statement

| func_call_statement

The for statement takes the form for (initializer; condition; update_part) { body } and is syntactic sugar on top of a loop statement with the same body. Additionally:

The initializer of a for loop is executed once prior to executing the loop. When a declaration appears in the initializer, its identifier is in scope until the end of the body. Unlike declarations in the body, the declaration is not re-initialized each iteration.

The condition, body and update_part execute in that order to form a loop iteration. The body is a special form of compound statement. The identifier of a declaration in the body is in scope from the start of the next statement until the end of the body. The declaration is executed each time it is reached, so each new iteration creates a new instance of the variable or constant, and re-initializes it.

EXAMPLE: For to Loop transformation
for(var i: i32 = 0; i < 4; i++) {
  if a == 0 {
    continue;
  }
  a = a + 2;
}

Converts to:

EXAMPLE: For to Loop transformation
{ // Introduce new scope for loop variable i
  var i: i32 = 0;
  var a: i32 = 0;
  loop {
    if !(i < 4) {
      break;
    }

    if a == 0 {
      continue;
    }
    a = a + 2;

    continuing {
      i++;
    }
  }
}

8.4.5. While Statement

while_statement :

| while expression compound_statement

The while statement is a kind of loop parameterized by a condition. At the start of each loop iteration, a boolean condition is evaluated. If the condition is false, the while loop ends execution. Otherwise, the rest of the iteration is executed.

Type rule precondition: The condition must be of bool type.

A while loop can be viewed as syntactic sugar over either a loop or for statement. The following statement forms are equivalent:

  while e { s1 }

  loop { if !e { break; } s1 }

  for (; e;) { s1 }
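
Note: As an illustrative, non-normative sketch, the two functions below compute the same sum, once with a while statement and once with a loop statement that breaks when the condition is false. The function names sum_while and sum_loop are hypothetical.

EXAMPLE: A while statement and a corresponding loop statement
fn sum_while(n: i32) -> i32 {
  var i: i32 = 0;
  var sum: i32 = 0;
  while i < n {
    sum += i;
    i++;
  }
  return sum;
}

fn sum_loop(n: i32) -> i32 {
  var i: i32 = 0;
  var sum: i32 = 0;
  loop {
    if !(i < n) { break; }   // The condition is checked at the start of each iteration.
    sum += i;
    i++;
  }
  return sum;
}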

8.4.6. Break Statement

break_statement :

| break

A break statement transfers control to immediately after the body of the nearest-enclosing loop or switch statement, thus ending execution of the loop or switch statement.

A break statement must only be used within loop, for, while, and switch statements.

A break statement must not be placed such that it would exit from a loop’s continuing statement. Use a break-if statement instead.

EXAMPLE: WGSL Invalid loop break from a continuing clause
var a: i32 = 2;
var i: i32 = 0;
loop {
  let step: i32 = 1;

  if i % 2 == 0 { continue; }

  a = a * 2;

  continuing {
    i = i + step;
    if i >= 4 { break; } // Invalid.  Use break-if instead.
  }
}

8.4.7. Break-If Statement

break_if_statement :

| break if expression semicolon

A break-if statement evaluates a boolean condition. If the condition is true, control is transferred to immediately after the body of the nearest-enclosing loop statement, ending execution of that loop.

Type rule precondition: The condition must be of bool type.

Note: A break-if statement may only appear as the last statement in the body of a continuing statement.

EXAMPLE: WGSL Valid loop break-if from a continuing clause
var a: i32 = 2;
var i: i32 = 0;
loop {
  let step: i32 = 1;

  if i % 2 == 0 { continue; }

  a = a * 2;

  continuing {
    i = i + step;
    break if i >= 4;
  }
}

8.4.8. Continue Statement

continue_statement :

| continue

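A continue statement transfers control in the nearest-enclosing loop: forward to the continuing statement at the end of the body of that loop, if one exists; otherwise backward to the first statement in the loop body, starting the next iteration.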

A continue statement must only be used in a loop, for or while statement. A continue statement must not be placed such that it would transfer control to an enclosing continuing statement. (It is a forward branch when branching to a continuing statement.)

A continue statement must not be placed such that it would transfer control past a declaration used in the targeted continuing statement.

Note: A continue can only be used in a continuing statement if it is used for transferring control flow within another loop nested in the continuing statement. That is, a continue cannot be used to transfer control to the start of the currently executing continuing statement.
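
Note: The case described in the note above can be sketched as follows. This is an illustrative, non-normative example in which the continue statement targets a for loop nested inside the continuing statement, not the outer loop. The names nested_continue and skipped are hypothetical.

EXAMPLE: A continue inside a loop nested in a continuing statement
fn nested_continue() {
  var i: i32 = 0;
  var skipped: i32 = 0;
  loop {
    if i >= 4 { break; }

    continuing {
      for (var j: i32 = 0; j < 2; j++) {
        if j == 0 {
          skipped++;
          continue;   // Transfers control within the nested for loop only.
        }
      }
      i++;
    }
  }
}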

EXAMPLE: Invalid continue bypasses declaration
var i: i32 = 0;
loop {
  if i >= 4 { break; }
  if i % 2 == 0 { continue; } // Invalid: bypasses the declaration of 'step',
                              //   which is used in the continuing statement.

  let step: i32 = 2;

  continuing {
    i = i + step;
  }
}

8.4.9. Continuing Statement

continuing_statement :

| continuing continuing_compound_statement

continuing_compound_statement :

| brace_left statement * break_if_statement ? brace_right

A continuing statement specifies a compound statement to be executed at the end of a loop iteration. The construct is optional.

The compound statement must not contain a return at any compound statement nesting level.

8.4.10. Return Statement

return_statement :

| return expression ?

A return statement ends execution of the current function. If the function is an entry point, then the current shader invocation is terminated. Otherwise, evaluation continues with the next expression or statement after the evaluation of the call site of the current function invocation.

If the function does not have a return type, then the return statement is optional. If the return statement is provided for such a function, it must not supply a value. Otherwise the expression must be present, and is called the return value. In this case the call site of this function invocation evaluates to the return value. The type of the return value must match the return type of the function.
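
Note: The rules above can be sketched as follows. This is an illustrative, non-normative example. The names clamp_to_unit, record_hit, and hits are hypothetical.

EXAMPLE: Return statements with and without a return value
// A function with a return type: every return statement supplies a value
// whose type matches the return type.
fn clamp_to_unit(x: f32) -> f32 {
  if x < 0.0 { return 0.0; }
  if x > 1.0 { return 1.0; }
  return x;
}

var<private> hits: i32;

// A function without a return type: a return statement is optional,
// and must not supply a value.
fn record_hit(valid: bool) {
  if !valid {
    return;
  }
  hits++;
}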

8.4.11. Discard Statement

A discard statement converts the invocation into a helper invocation and throws away the fragment. The discard statement must only be used in a fragment shader stage.

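More precisely, executing a discard statement will convert the current invocation into a helper invocation, and prevent the current fragment from being processed further downstream in the render pipeline.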

Only statements executed prior to the discard statement will have observable effects.

Note: A discard statement may be executed by any function in a fragment stage and the effect is the same: the fragment will be thrown away.

EXAMPLE: Using the discard statement to throw away a fragment
@group(0) @binding(0)
var<storage, read_write> will_emit_color : u32;

fn discard_if_shallow(pos: vec4<f32>) {
  if pos.z < 0.001 {
    // If this is executed, then the will_emit_color variable will
    // never be set to 1 because helper invocations will not write
    // to shared memory.
    discard;
  }
  will_emit_color = 1;
}

@fragment
fn main(@builtin(position) coord_in: vec4<f32>)
  -> @location(0) vec4<f32>
{
  discard_if_shallow(coord_in);

  // Set the value to 1 and emit red, but only if the helper function
  // did not execute the discard statement.
  will_emit_color = 1;
  return vec4<f32>(1.0, 0.0, 0.0, 1.0);
}

8.5. Function Call Statement

func_call_statement :

| ident argument_expression_list

A function call statement executes a function call.

Note: If the function returns a value, that value is ignored.
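
Note: As an illustrative, non-normative sketch, the following function call statement invokes a function for its side effect and ignores the returned value. The names log_count, note_event, and caller are hypothetical.

EXAMPLE: A function call statement that ignores the returned value
var<private> log_count: i32;

fn note_event() -> i32 {
  log_count++;
  return log_count;
}

fn caller() {
  note_event();   // The i32 result is ignored.
}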

8.6. Static Assertion Statement

A static assertion statement produces a shader-creation error if the expression evaluates to false. The expression must be a const-expression. The statement can satisfy static access conditions in a shader, but otherwise has no effect on the compiled shader. This statement can be used at module scope and within functions.

static_assert_statement :

| static_assert expression

EXAMPLE: Static assertion examples
const x = 1;
const y = 2;
static_assert x < y; // valid at module-scope.
static_assert(y != 0); // parentheses are optional.

fn foo() {
  const z = x + y - 2;
  static_assert z > 0; // valid in functions.
  let a  = 3;
  static_assert a != 0; // invalid, the expression must be a const-expression.
}

8.7. Statements Grammar Summary

The statement rule matches statements that can be used in most places inside a function body.

statement :

| semicolon

| return_statement semicolon

| if_statement

| switch_statement

| loop_statement

| for_statement

| while_statement

| func_call_statement semicolon

| variable_statement semicolon

| break_statement semicolon

| continue_statement semicolon

| discard semicolon

| variable_updating_statement semicolon

| compound_statement

| static_assert_statement semicolon

variable_updating_statement :

| assignment_statement

| increment_statement

| decrement_statement

Additionally, certain statements may only be used in very specific contexts:

8.8. Statements Behavior Analysis

8.8.1. Rules

Some statements affecting control-flow are only valid in some contexts. For example, continue is invalid outside of a loop, for, or while. Additionally, the uniformity analysis (see § 13.2 Uniformity) needs to know when control flow can exit a statement in multiple different ways.

Both goals are achieved by a system for summarizing execution behaviors of statements and expressions. Behavior analysis maps each statement and expression to the set of possible ways execution proceeds after evaluation of the statement or expression completes. As with type analysis for values and expressions, behavior analysis proceeds bottom up: first determine behaviors for certain basic statements, and then determine behavior for higher level constructs by applying combining rules.

A behavior is a set, whose elements may be Return, Break, Continue, or Next.

Each of those corresponds to a way to exit a compound statement: either through a keyword, or by falling to the next statement ("Next").

We write "s: B" to say that s respects the rules regarding behaviors, and has behavior B.

For each function:

We assign a behavior to each function: it is its body’s behavior (treating the body as a regular statement), with any "Return" replaced by "Next". As a consequence of the rules above, a function behavior is always one of {}, or {Next}.

Behavior analysis must be able to determine a non-empty behavior for each statement and function.

Rules for analyzing and validating the behaviors of statements
Statement | Preconditions | Resulting behavior
empty statement | | {Next}
{s} | s: B | B
s1 s2 (Note: s1 often ends in a semicolon.) | s1: B1; Next in B1; s2: B2 | (B1∖{Next}) ∪ B2
s1 s2 | s1: B1; Next not in B1; s2: B2 | B1
var x:T; | | {Next}
let x = e; | | {Next}
var x = e; | | {Next}
x = e; | | {Next}
_ = e; | | {Next}
f(e1, ..., en); | f has behavior B | B
return; | | {Return}
return e; | | {Return}
discard; | | {Next}
break; | | {Break}
break if e; | | {Break, Next}
continue; | | {Continue}
if e s1 else s2 | s1: B1; s2: B2 | B1 ∪ B2
loop {s1 continuing {s2}} | s1: B1; s2: B2; None of {Continue, Return} are in B2; Break is not in (B1 ∪ B2) | (B1 ∪ B2)∖{Continue, Next}
loop {s1 continuing {s2}} | s1: B1; s2: B2; None of {Continue, Return} are in B2; Break is in (B1 ∪ B2) | (B1 ∪ B2 ∪ {Next})∖{Break, Continue}
switch e {case c1: s1 ... case cn: sn} | s1: B1; ...; sn: Bn; Break is not in (B1 ∪ ... ∪ Bn) | B1 ∪ ... ∪ Bn
switch e {case c1: s1 ... case cn: sn} | s1: B1; ...; sn: Bn; Break is in (B1 ∪ ... ∪ Bn) | (B1 ∪ ... ∪ Bn ∪ {Next})∖{Break}

Note: The empty statement case occurs when a loop has an empty body, or when a for loop lacks an initialization or update statement.

For the purpose of this analysis:

Each built-in function has a behavior of {Next}. And each operator application not listed in the table above has the same behavior as if it were a function call with the same operands and with a function’s behavior of {Next}.

The behavior of a function must satisfy the rules given above.

Note: It is unnecessary to analyze the behavior of expressions because they will always be {Next} or a previously analyzed function will have produced an error.

8.8.2. Notes

This section is informative, non-normative.

Here is the full list of ways that these rules can cause a program to be rejected (this is just restating information already listed above):

This analysis can be run in linear time, by analyzing the call-graph bottom-up (since the behavior of a function call can depend on the function’s code).

8.8.3. Examples

Here are some examples showing this analysis in action:

EXAMPLE: Trivially dead code is allowed
fn simple() -> i32 {
  var a: i32;
  return 0;  // Behavior: {Return}
  a = 1;     // Valid, statically unreachable code.
             //   Statement behavior: {Next}
             //   Overall behavior (due to sequential statements): {Return}
  return 2;  // Valid, statically unreachable code. Behavior: {Return}
} // Body behavior: {Return}. Function behavior: {Next}
EXAMPLE: Compound statements are supported
fn nested() -> i32 {
  var a: i32;
  {             // The start of a compound statement.
    a = 2;      // Behavior: {Next}
    return 1;   // Behavior: {Return}
  }             // The compound statement as a whole has behavior {Return}
  a = 1;        // Valid, statically unreachable code.
                //   Statement behavior: {Next}
                //   Overall behavior (due to sequential statements): {Return}
  return 2;     // Valid, statically unreachable code. Behavior: {Return}
}
EXAMPLE: if/then behaves as if there is an empty else
fn if_example() {
  var a: i32 = 0;
  loop {
    if a == 5 {
      break;      // Behavior: {Break}
    }             // Behavior of the whole if compound statement: {Break, Next},
                  //   as the if has an implicit empty else
    a = a + 1;    // Valid, as the previous statement had "Next" in its behavior
  }
}
EXAMPLE: if/then/else has the behavior of both sides
fn if_example() {
  var a: i32 = 0;
  loop {
    if a == 5 {
      break;      // Behavior: {Break}
    } else {
      continue;   // Behavior: {Continue}
    }             // Behavior of the whole if compound statement: {Break, Continue}
    a = a + 1;    // Valid, statically unreachable code.
                  //   Statement behavior: {Next}
                  //   Overall behavior: {Break, Continue}
  }
}
EXAMPLE: if/else if/else behaves like a nested if/else
fn if_example() {
  var a: i32 = 0;
  loop {
    // if e1 s1 else if e2 s2 else s3
    // is identical to
    // if e1 s1 else { if e2 s2 else s3 }
    if a == 5 {
      break;      // Behavior: {Break}
    } else if a == 42 {
      continue;   // Behavior: {Continue}
    } else {
      return;     // Behavior {Return}
    }             // Behavior of the whole if compound statement:
                  //   {Break, Continue, Return}
  }               // Behavior of the whole loop compound statement {Next, Return}
}                 // Behavior of the whole function {Next}
EXAMPLE: Break in switch becomes Next
fn switch_example() {
  var a: i32 = 0;
  switch a {
    default: {
      break;   // Behavior: {Break}
    }
  }            // Behavior: {Next}, as switch replaces Break by Next
  a = 5;       // Valid, as the previous statement had Next in its behavior
}
EXAMPLE: Obviously infinite loops
fn invalid_infinite_loop() {
  loop { }     // Behavior: { }.  Invalid because it’s empty.
}
EXAMPLE: Discard will not terminate a loop
fn invalid_infinite_loop() {
  loop {
    discard; // Behavior: { Next }.
  }          // Invalid, behavior of the whole loop is { }.
}
EXAMPLE: A conditional continue with continuing statement
fn conditional_continue() {
  var a: i32;
  loop {
    if a == 5 { break; } // Behavior: {Break, Next}
    if a % 2 == 1 {      // Valid, as the previous statement has Next in its behavior
      continue;          // Behavior: {Continue}
    }                    // Behavior: {Continue, Next}
    a = a * 2;           // Valid, as the previous statement has Next in its behavior
    continuing {         // Valid as the continuing statement has behavior {Next}
                         //  which does not include any of:
                         //  {Break, Continue, Return}
      a = a + 1;
    }
  }                      // The loop as a whole has behavior {Next},
                         //  as it absorbs "Continue" and "Next",
                         //  then replaces "Break" with "Next"
}
EXAMPLE: A redundant continue with continuing statement
fn redundant_continue_with_continuing() {
  var a: i32;
  loop {
    if a == 5 { break; }
    continue;   // Valid. This is redundant, branching to the next statement.
    continuing {
      a = a + 1;
    }
  }
}
EXAMPLE: A continue at the end of a loop body
fn continue_end_of_loop_body() {
  for (var i: i32 = 0; i < 5; i++ ) {
    continue;   // Valid. This is redundant,
                //   branching to the end of the loop body.
  }             // Behavior: {Next},
                //   as loops absorb "Continue",
                //   and "for" loops always add "Next"
}
for loops desugar to loop with a conditional break. As shown in a previous example, the conditional break has behavior {Break, Next}, which leads to adding "Next" to the loop’s behavior.
EXAMPLE: return required in functions that have a return type
fn missing_return () -> i32 {
  var a: i32 = 0;
  if a == 42 {
    return a;       // Behavior: {Return}
  }                 // Behavior: {Next, Return}
}                   // Error: Next is invalid in the body of a
                    //   function with a return type
EXAMPLE: continue must be in a loop
fn continue_out_of_loop () {
  var a: i32 = 0;
  if a > 0  {
    continue;       // Behavior: {Continue}
  }                 // Behavior: {Next, Continue}
}                   // Error: Continue is invalid in the body of a function
The same example would also be invalid for the same reason if continue was replaced by break.

9. Functions

A function performs computational work when invoked.

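A function is invoked in one of the following ways: by evaluating a function call expression; by executing a function call statement (see § 8.5 Function Call Statement); or, for an entry point function, by the WebGPU implementation to perform the work of a shader stage in a pipeline.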

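There are two kinds of functions: built-in functions, which are provided by the WGSL implementation and are always available to a WGSL program, and user-defined functions, which are declared in a WGSL program.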

9.1. Declaring a User-defined Function

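A function declaration creates a user-defined function, by specifying an optional set of attributes, the name of the function, the formal parameter list (an ordered sequence of zero or more formal parameter declarations, separated by commas and surrounded by parentheses), an optional return type (which may itself have attributes), and the function body, which is the compound statement executed when the function is called.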

A function declaration must only occur at module scope. A function name is in scope for the entire program.

A formal parameter declaration specifies an identifier name and a type for a value that must be provided when invoking the function. A formal parameter may have attributes. See § 9.2 Function Calls. The identifier is in scope until the end of the function. Two formal parameters for a given function must not have the same name.

Note: Some built-in functions may allow parameters to be abstract numeric types; however, this functionality is not currently supported for user-declared functions.

The return type, if specified, must be constructible.

WGSL defines the following attributes that can be applied to function declarations:

WGSL defines the following attributes that can be applied to function parameters and return types:

function_decl :

| attribute * function_header compound_statement

function_header :

| fn ident paren_left param_list ? paren_right ( arrow attribute * type_specifier ) ?

param_list :

| param ( comma param ) * comma ?

param :

| attribute * ident colon type_specifier

EXAMPLE: Simple functions
// Declare the add_two function.
// It has two formal parameters, i and b.
// It has a return type of i32.
// It has a body with a return statement.
fn add_two(i: i32, b: f32) -> i32 {
  return i + 2;  // A formal parameter is available for use in the body.
}

// A compute shader entry point function, 'main'.
// It has no specified return type.
// It invokes the add_two function, and captures
// the resulting value in the named value 'six'.
@compute @workgroup_size(1)
fn main() {
   let six: i32 = add_two(4, 5.0);
}

9.2. Function Calls

A function call is a statement or expression which invokes a function.

The function containing the function call is the calling function, or caller. The function being invoked is the called function, or callee.

The function call:

The function call must supply the same number of argument values as there are formal parameters in the called function. Each argument value must evaluate to the same type as the corresponding formal parameter, by position.

In summary, when calling a function:

  1. Execution of the calling function is suspended.

  2. The called function executes until it returns.

  3. Execution of the calling function resumes.

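A called function returns as follows: a built-in function returns when its work has completed; a user-defined function with a return type returns when it executes a return statement; and a user-defined function with no return type returns when it executes a return statement or when execution reaches the end of its body.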

In detail, when a function call is executed the following steps occur:

  1. Function call argument values are evaluated. The relative order of evaluation is left-to-right.

  2. Execution of the calling function is suspended. All function scope variables and constants maintain their current values.

  3. If the called function is user-defined, memory is allocated for each function scope variable in the called function.

  4. Values for the formal parameters of the called function are determined by matching the function call argument values by position. For example, the first formal parameter of the called function will have the value of the first argument at the call site.

  5. Control is transferred to the called function. If the called function is user-defined, execution proceeds starting from the first statement in the body.

  6. The called function is executed, until it returns.

  7. Control is transferred back to the calling function, and the calling function’s execution resumes. If the called function returns a value, that value is supplied for the value of the function call expression.
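
Note: Step 1 above can be sketched as follows. This is an illustrative, non-normative example in which each argument expression has a side effect, so the left-to-right evaluation order is observable. The names ticket, take_ticket, difference, and caller are hypothetical.

EXAMPLE: Function call arguments are evaluated left to right
var<private> ticket: i32 = 0;

fn take_ticket() -> i32 {
  ticket++;
  return ticket;
}

fn difference(a: i32, b: i32) -> i32 {
  return a - b;
}

fn caller() -> i32 {
  ticket = 0;
  // The first argument is evaluated first and yields 1.
  // The second argument is evaluated next and yields 2.
  // The arguments are matched to 'a' and 'b' by position.
  return difference(take_ticket(), take_ticket());   // Computes 1 - 2.
}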

The location of a function call is referred to as a call site. Call sites are a dynamic context. As such, the same textual location may represent multiple call sites.

Note: It is possible that a function call in a fragment shader never returns if all of the invocations in a quad are discarded. In such a case, control will not be transferred back to the calling function.

9.3. const Functions

A function declared with a const attribute can be evaluated at shader-creation time. These functions are called const-functions. Calls to these functions can be part of const-expressions.

It is a shader-creation error if the function contains any expressions that are not const-expressions, or any declarations that are not const-declarations.

Note: The const attribute cannot be applied to user-declared functions.

EXAMPLE: const-functions
const first_one = firstLeadingBit(1234 + 4567); // Evaluates to 12
                                                // first_one has the type i32, because
                                                // firstLeadingBit cannot operate on
                                                // AbstractInt

@id(1) override x : i32;
override y = firstLeadingBit(x); // const-expressions can be
                                 // used in override-expressions.
                                 // firstLeadingBit(x) is not a
                                 // const-expression in this context.

fn foo() {
  var a : array<i32, firstLeadingBit(257)>; // const-functions can be used in
                                            // const-expressions if all their
                                            // parameters are const-expressions.
}

9.4. Restrictions on Functions

Note: Recursion is disallowed because cycles are not permitted among any kinds of declarations.
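The following sketch, with hypothetical function names, contrasts a self-recursive function, which forms a declaration cycle and is therefore rejected, with an equivalent loop-based formulation.

```wgsl
// Invalid: factorial refers to itself, forming a cycle among declarations.
fn factorial(n : u32) -> u32 {
  if n <= 1u {
    return 1u;
  }
  return n * factorial(n - 1u);
}

// Valid: the same computation expressed with a loop.
fn factorial_loop(n : u32) -> u32 {
  var result = 1u;
  for (var i = 2u; i <= n; i++) {
    result *= i;
  }
  return result;
}
```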

9.4.1. Alias Analysis

Memory locations can be accessed during the execution of a function using memory views. Within a function, each memory view has a particular root identifier. The root identifier can be an originating variable or a formal parameter of pointer type.

Locally derived expressions of reference or pointer type may introduce new names for a particular root identifier, but each expression has a statically determinable root identifier. While the originating variable of a root identifier is a dynamic concept that depends on the call sites for the function, WGSL programs can be statically analyzed to determine the set of all possible originating variables for each root identifier.

Two root identifiers alias when they have the same originating variable. Execution of a WGSL function must not potentially access memory through aliased root identifiers, where one access is a write and the other is a read or a write. This is determined by analyzing the program from the leaves of the call graph upwards (i.e. topological order). For each function the analysis records the following sets:

At each call site of a function, it is a shader-creation error if any of the following occur:

EXAMPLE: Alias analysis
var x : i32 = 0;

fn f1(p1 : ptr<function, i32>, p2 : ptr<function, i32>) {
  *p1 = *p2;
}

fn f2(p1 : ptr<function, i32>, p2 : ptr<function, i32>) {
  f1(p1, p2);
}

fn f3() {
  var a : i32 = 0;
  f2(&a, &a);  // Invalid. Cannot pass two pointer parameters
               // with the same root identifier when one or
               // more are written (even by a subfunction).
}

fn f4(p1 : ptr<function, i32>, p2 : ptr<function, i32>) -> i32 {
  return *p1 + *p2;
}

fn f5() {
  var a : i32 = 0;
  let b = f4(&a, &a); // Valid. p1 and p2 in f4 are both only read.
}

fn f6(p : ptr<private, i32>) {
  x = *p;
}

fn f7(p : ptr<private, i32>) -> i32 {
  return x + *p;
}

fn f8() {
  f6(&x);         // Invalid. x is written as a global variable and
                  // read as a parameter.
  let b = f7(&x); // Valid. x is only read as both a parameter and
                  // a variable.
}

10. Entry Points

An entry point is a user-defined function that performs the work for a particular shader stage.

10.1. Shader Stages

WebGPU issues work to the GPU in the form of draw or dispatch commands. These commands execute a pipeline in the context of a set of shader stage inputs, outputs, and attached resources.

A pipeline describes the work to be performed on the GPU, as a sequence of stages, some of which are programmable. In WebGPU, a pipeline is created before scheduling a draw or dispatch command for execution. There are two kinds of pipelines: GPUComputePipeline, and GPURenderPipeline.

A dispatch command uses a GPUComputePipeline to run a compute shader stage over a logical grid of points with a controllable amount of parallelism, while reading and possibly updating buffer and image resources.

A draw command uses a GPURenderPipeline to run a multi-stage process with two programmable stages among other fixed-function stages:

The WebGPU specification describes pipelines in greater detail.

WGSL defines three shader stages, corresponding to the programmable parts of pipelines:

Each shader stage has its own set of features and constraints, described elsewhere.

10.2. Entry Point Declaration

To create an entry point, declare a user-defined function with a shader stage attribute.

When configuring a pipeline in the WebGPU API, the entry point’s function name maps to the entryPoint attribute of the WebGPU § GPUProgrammableStage object.

The entry point’s formal parameters denote the stage’s shader stage inputs. The entry point’s return value, if specified, denotes the stage’s shader stage outputs.

The type of each formal parameter, and the entry point’s return type, must be one of:

A structure type can be used to group user-defined inputs with each other and optionally with built-in inputs. A structure type can be used as the return type to group user-defined outputs with each other and optionally with built-in outputs.

Note: The bool case is forbidden for user-defined inputs and outputs. It is only permitted for the front_facing builtin value.

Note: Compute entry points never have a return type.

EXAMPLE: Entry Point
@vertex
fn vert_main() -> @builtin(position) vec4<f32> {
  return vec4<f32>(0.0, 0.0, 0.0, 1.0);
}

@fragment
fn frag_main(@builtin(position) coord_in: vec4<f32>) -> @location(0) vec4<f32> {
  return vec4<f32>(coord_in.x, coord_in.y, 0.0, 1.0);
}

@compute @workgroup_size(1)
fn comp_main() { }

The set of functions in a shader stage is the union of:

The union is applied repeatedly until it stabilizes. It will stabilize in a finite number of steps.
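As a rough illustration, assuming the union consists of the entry point itself plus every function called, directly or indirectly, from a function already in the set, the sketch below marks which functions belong to the stage. All function names are hypothetical.

```wgsl
fn leaf() -> f32 {
  return 1.0;
}

fn helper() -> f32 {
  return leaf() * 2.0;   // pulls leaf into any stage that contains helper
}

fn unused() -> f32 {
  return 0.0;            // never called from frag_main, directly or indirectly
}

@fragment
fn frag_main() -> @location(0) vec4<f32> {
  // Functions in this shader stage: frag_main, helper, and leaf (via helper).
  // unused is not part of the stage.
  return vec4<f32>(helper());
}
```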

10.2.1. Function Attributes for Entry Points

WGSL defines the following attributes that can be applied to entry point declarations:

Issue: Can we query upper bounds on workgroup size dimensions? Is it independent of the shader, or a property to be queried after creating the shader module?

EXAMPLE: workgroup_size Attribute
@compute @workgroup_size(8,4,1)
fn sorter() { }

@compute @workgroup_size(8u)
fn reverser() { }

// Using a pipeline-overridable constant.
@id(42) override block_width = 12u;
@compute @workgroup_size(block_width)
fn shuffler() { }

// Error: workgroup_size must be specified on compute shader
@compute
fn bad_shader() { }

10.3. Shader Interface

The shader interface is the set of objects through which the shader accesses data external to the shader stage, either for reading or writing. The interface includes:

When an identifier in a function body resolves to a module-scope variable or value declaration, then we say that variable or value is statically accessed by the function. Note that being statically accessed is independent of whether an execution of the shader will actually evaluate the expression referring to the variable, or even execute the statement that may enclose the expression.
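The sketch below illustrates that static access does not depend on execution. The variable names scale and out_data, and the binding numbers, are assumptions made for this example.

```wgsl
@group(0) @binding(0) var<uniform> scale : f32;
@group(0) @binding(1) var<storage, read_write> out_data : array<f32>;

@compute @workgroup_size(1)
fn comp_main() {
  if false {
    // This statement never executes, yet both scale and out_data are
    // statically accessed by comp_main because their identifiers
    // appear in its body.
    out_data[0] = scale;
  }
}
```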

More precisely, the interface of a shader stage consists of:

10.3.1. Inter-stage Input and Output Interface

A shader stage input is a datum provided to the shader stage from upstream in the pipeline. Each datum is either a built-in input value, or a user-defined input.

A shader stage output is a datum the shader provides for further processing downstream in the pipeline. Each datum is either a built-in output value, or a user-defined output.

IO attributes are used to establish an object as a shader stage input or a shader stage output, or to further describe the properties of an input or output. The IO attributes are:

10.3.1.1. Built-in Inputs and Outputs

A built-in input value provides access to system-generated control information. The set of built-in inputs is listed in § 16 Built-in Values. An entry point must not contain duplicated built-in inputs.

A built-in input for stage S with name X and type TX is accessed via a formal parameter to an entry point for shader stage S, in one of two ways:

  1. The parameter has attribute builtin(X) and is of type TX.

  2. The parameter has structure type, where one of the structure members has attribute builtin(X) and is of type TX.

Conversely, when a parameter or member of a parameter for an entry point has a builtin attribute, the corresponding builtin must be an input for the entry point’s shader stage.

A built-in output value is used by the shader to convey control information to later processing steps in the pipeline. The set of built-in outputs is listed in § 16 Built-in Values. An entry point must not contain duplicated built-in outputs.

A built-in output for stage S with name Y and type TY is set via the return value for an entry point for shader stage S, in one of two ways:

  1. The entry point return type has attribute builtin(Y) and is of type TY.

  2. The entry point return type has structure type, where one of the structure members has attribute builtin(Y) and is of type TY.

Conversely, when the return type or member of a return type for an entry point has a builtin attribute, the corresponding builtin must be an output for the entry point’s shader stage.

Note: The position built-in is both an output of a vertex shader, and an input to the fragment shader.

Collectively, built-in input and built-in output values are known as built-in values.
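The two ways of accessing a built-in input, and of setting a built-in output through a structure member, can be sketched as follows. The structure names, member names, and entry point names are hypothetical.

```wgsl
// Built-in input supplied as a structure member (the second way).
struct ComputeInputs {
  @builtin(local_invocation_id) local_id : vec3<u32>,
}

@compute @workgroup_size(4)
fn comp_main(
  // Built-in input supplied directly on a parameter (the first way).
  @builtin(global_invocation_id) global_id : vec3<u32>,
  inputs : ComputeInputs
) { }

// Built-in output set through a member of the return type.
struct VertexOutputs {
  @builtin(position) clip_pos : vec4<f32>,
}

@vertex
fn vert_main(@builtin(vertex_index) vi : u32) -> VertexOutputs {
  var result : VertexOutputs;
  result.clip_pos = vec4<f32>(f32(vi), 0.0, 0.0, 1.0);
  return result;
}
```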

10.3.1.2. User-defined Inputs and Outputs

User-defined data can be passed as input to the start of a pipeline, passed between stages of a pipeline or output from the end of a pipeline.

Each user-defined input datum and user-defined output datum must:

A compute shader must not have user-defined inputs or outputs.

10.3.1.3. Input-output Locations

Each input-output location can store a value up to 16 bytes in size. The byte size of a type is defined using the SizeOf column in § 5.3.6.1 Alignment and Size. For example, a four-component vector of floating-point values occupies a single location.

IO locations are specified via the location attribute.

Each user-defined input and output must have an explicitly specified IO location. Each structure member in the entry point IO must be one of either a built-in value (see § 10.3.1.1 Built-in Inputs and Outputs), or assigned a location.

Locations must not overlap within each of the following sets:

Note: Location numbering is distinct between inputs and outputs: Location numbers for an entry point’s shader stage inputs do not conflict with location numbers for the entry point’s shader stage outputs.

Note: No additional rule is required to prevent location overlap within an entry point’s outputs. When the output is a structure, the first rule above prevents overlap. Otherwise, the output is a scalar or a vector, and can have only a single location assigned to it.

Note: The number of available locations for an entry point is defined by the WebGPU API.

EXAMPLE: Applying location attributes
struct A {
  @location(0) x: f32,
  // Despite locations being 16 bytes, x and y cannot share a location
  @location(1) y: f32
}

// in1 occupies locations 0 and 1.
// in2 occupies location 2.
// The return value occupies location 0.
@fragment
fn fragShader(in1: A, @location(2) in2: f32) -> @location(0) vec4<f32> {
  // ...
}

User-defined IO can be mixed with built-in values in the same structure. For example,

EXAMPLE: Mixing builtins and user-defined IO
// Mixed builtins and user-defined inputs.
struct MyInputs {
  @location(0) x: vec4<f32>,
  @builtin(front_facing) y: bool,
  @location(1) @interpolate(flat) z: u32
}

struct MyOutputs {
  @builtin(frag_depth) x: f32,
  @location(0) y: vec4<f32>
}

@fragment
fn fragShader(in1: MyInputs) -> MyOutputs {
  // ...
}

EXAMPLE: Invalid location assignments
struct A {
  @location(0) x: f32,
  // Invalid, x and y cannot share a location.
  @location(0) y: f32
}

struct B {
  @location(0) x: f32
}

struct C {
  // Invalid, structures with user-defined IO cannot be nested.
  b: B
}

struct D {
  x: vec4<f32>
}

@fragment
// Invalid, location cannot be applied to a structure type.
fn fragShader1(@location(0) in1: D) {
  // ...
}

@fragment
// Invalid, in1 and in2 cannot share a location.
fn fragShader2(@location(0) in1: f32, @location(0) in2: f32) {
  // ...
}

@fragment
// Invalid, location cannot be applied to a structure.
fn fragShader3(@location(0) in1: vec4<f32>) -> @location(0) D {
  // ...
}

10.3.1.4. Interpolation

Authors can control how user-defined IO data is interpolated through the use of the interpolate attribute. WGSL offers two aspects of interpolation to control: the type of interpolation, and the sampling of the interpolation.

The interpolation type must be one of:

The interpolation sampling must be one of:

For user-defined IO of scalar or vector floating-point type:

User-defined vertex outputs and fragment inputs of scalar or vector integer type must always be specified as @interpolate(flat).

Interpolation attributes must match between vertex outputs and fragment inputs with the same location assignment within the same pipeline.
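A minimal sketch of the interpolate attribute follows. It assumes the interpolation type names linear and flat and the sampling name centroid; the structure and entry point names are hypothetical. Sharing one structure between the vertex output and the fragment input is one simple way to keep the attributes matched per location.

```wgsl
struct Varyings {
  @builtin(position) clip_pos : vec4<f32>,
  // No interpolate attribute: the default interpolation applies.
  @location(0) color : vec4<f32>,
  // Explicit interpolation type and sampling.
  @location(1) @interpolate(linear, centroid) uv : vec2<f32>,
  // Integer data must be flat.
  @location(2) @interpolate(flat) material : u32,
}

@vertex
fn vert_main(@builtin(vertex_index) vi : u32) -> Varyings {
  var result : Varyings;
  result.clip_pos = vec4<f32>(0.0, 0.0, 0.0, 1.0);
  result.color = vec4<f32>(1.0);
  result.uv = vec2<f32>(0.5);
  result.material = vi;
  return result;
}

@fragment
fn frag_main(frag_in : Varyings) -> @location(0) vec4<f32> {
  return frag_in.color * f32(frag_in.material);
}
```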

10.3.2. Resource Interface

A resource is an object which provides access to data external to a shader stage, and which is not an override-declaration and not a shader stage input or output. Resources are shared by all invocations of the shader.

There are four kinds of resources:

The resource interface of a shader is the set of module-scope resource variables statically accessed by functions in the shader stage.

Each resource variable must be declared with both group and binding attributes. Together with the shader’s stage, these identify the binding address of the resource on the shader’s pipeline. See WebGPU § GPUPipelineLayout.

Bindings must not alias within a shader stage: two different variables in the resource interface of a given shader must not have the same group and binding values, when considered as a pair of values.
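By our reading of this rule, two variables may share a group and binding pair as long as no single shader has both in its resource interface. The sketch below uses hypothetical variable and entry point names; the invalid case is left commented out.

```wgsl
@group(0) @binding(0) var<uniform> tint_a : vec4<f32>;
@group(0) @binding(0) var<uniform> tint_b : vec4<f32>;  // same pair as tint_a

@fragment
fn frag_a() -> @location(0) vec4<f32> {
  return tint_a;   // resource interface: tint_a only
}

@fragment
fn frag_b() -> @location(0) vec4<f32> {
  return tint_b;   // resource interface: tint_b only
}

// Invalid: both variables would be in the resource interface of one shader.
// @fragment
// fn frag_both() -> @location(0) vec4<f32> {
//   return tint_a + tint_b;
// }
```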

10.3.3. Resource Layout Compatibility

WebGPU requires that a shader’s resource interface match the layout of the pipeline using the shader.

It is a pipeline-creation error if a WGSL variable in a resource interface is bound to an incompatible WebGPU resource type or binding type, where compatibility is defined by the following table.

WebGPU binding type compatibility
WGSL resource | WebGPU Resource type | WebGPU Binding member | WebGPU Binding type
uniform buffer | GPUBufferBinding | buffer | GPUBufferBindingType uniform
storage buffer with read_write access | GPUBufferBinding | buffer | GPUBufferBindingType storage
storage buffer with read access | GPUBufferBinding | buffer | GPUBufferBindingType read-only-storage
sampler | GPUSampler | sampler | GPUSamplerBindingType filtering, non-filtering
sampler_comparison | GPUSampler | sampler | GPUSamplerBindingType comparison
sampled texture | GPUTextureView | texture | GPUTextureSampleType float, unfilterable-float, sint, uint, depth
write-only storage texture | GPUTextureView | storageTexture | GPUStorageTextureAccess write-only
external sampled texture | GPUExternalTexture | externalTexture | (not applicable)

See the WebGPU API specification for interface validation requirements.
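For orientation, the declarations below pair one WGSL resource variable with each row of the compatibility table. Variable names and binding numbers are hypothetical, and the trailing comments give the binding member and binding type we would expect the pipeline layout to use; the WebGPU specification remains authoritative.

```wgsl
struct Params {
  scale : f32,
}

@group(0) @binding(0) var<uniform> params : Params;                    // buffer: uniform
@group(0) @binding(1) var<storage, read_write> results : array<f32>;  // buffer: storage
@group(0) @binding(2) var<storage, read> samples : array<f32>;        // buffer: read-only-storage

@group(1) @binding(0) var color_sampler : sampler;               // sampler: filtering or non-filtering
@group(1) @binding(1) var shadow_sampler : sampler_comparison;   // sampler: comparison
@group(1) @binding(2) var color_tex : texture_2d<f32>;           // texture: float or unfilterable-float
@group(1) @binding(3) var shadow_tex : texture_depth_2d;         // texture: depth
@group(1) @binding(4) var out_tex : texture_storage_2d<rgba8unorm, write>;  // storageTexture: write-only
@group(1) @binding(5) var video_tex : texture_external;          // externalTexture
```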

11. Language Extensions

The WGSL language is expected to evolve over time.

An extension is a named grouping for a coherent set of modifications to a particular version of the WGSL specification, consisting of any combination of:

Hypothetically, extensions could be used to:

11.1. Enable Directive

An enable directive indicates that the functionality described by a particular named extension may be used. The grammar rules imply that all enable directives must appear before any declarations or static assertions.

The directive uses a context-dependent name to name the extension.

In particular, an extension name may be spelled the same as a keyword or reserved word, but is not interpreted as any of those.

The valid extensions are listed in § 11.2 Extensions list.

enable_directive :

| enable extension_name semicolon

Note: The grammar rule includes the terminating semicolon token, ensuring the additional functionality is usable only after that semicolon. Therefore any WGSL implementation can parse the entire enable directive. When an implementation encounters an enable directive for an unsupported extension, the implementation can issue a clear diagnostic.

EXAMPLE: Using hypothetical extensions
// Enable a hypothetical extension for arbitrary precision floating point types.
enable arbitrary_precision_float;
enable arbitrary_precision_float; // A redundant enable directive is ok.

// Enable a hypothetical extension to control the rounding mode.
enable rounding_mode;

// Assuming arbitrary_precision_float enables use of:
//    - a type f<E,M>
//    - as a type in function return, formal parameters and let-declarations
//    - as a type constructor from AbstractFloat
//    - operands to division operator: /
// Assuming @rounding_mode attribute is enabled by the rounding_mode enable directive.
@rounding_mode(round_to_even)
fn halve_it(x : f<8, 7>) -> f<8, 7> {
  let two = f<8, 7>(2);
  return x / two; // uses round to even rounding mode.
}

11.2. Extensions list

Extension identifier
WGSL extension name WebGPU extension name Description
f16 "shader-f16" The keyword f16 and any floating point literal with an h suffix are valid if and only if this extension is enabled. Otherwise, using the f16 keyword or any floating point literal with an h suffix results in a shader-creation error.
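A short sketch of the f16 extension in use; the function name halve is hypothetical. Without the enable directive, both the f16 type and the 0.5h literal would cause a shader-creation error.

```wgsl
enable f16;

fn halve(x : f16) -> f16 {
  // 0.5h is a floating point literal with the h suffix, of type f16.
  return x * 0.5h;
}
```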

12. WGSL Program

A WGSL program is a sequence of optional directives followed by module scope declarations.

translation_unit :

| global_directive * global_decl *

global_decl :

| semicolon

| global_variable_decl semicolon

| global_constant_decl semicolon

| type_alias_decl semicolon

| struct_decl

| function_decl

| static_assert_statement semicolon
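The skeleton below shows one declaration of each kind named in the grammar, in a valid order. All identifiers are hypothetical, and the keyword spellings type and static_assert are assumed to be those in use for this draft.

```wgsl
enable f16;                        // global_directive (must come first)

const scale = 2.0;                 // global_constant_decl
var<private> counter : u32;        // global_variable_decl
type Color = vec4<f32>;            // type_alias_decl

struct Light {                     // struct_decl (no trailing semicolon needed)
  intensity : f32,
}

static_assert scale > 1.0;         // static_assert_statement

fn brighten(c : Color) -> Color {  // function_decl
  return c * scale;
}
```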

12.1. Limits

A WGSL implementation will support shaders that satisfy the following limits. A WGSL implementation may support shaders that go beyond the specified limits.

Note: A WGSL implementation should issue an error if it does not support a shader that goes beyond the specified limits.

Quantifiable shader complexity limits
Limit Minimum supported value
Maximum number of members in a structure type 16383
Maximum nesting depth of a composite type 255
Maximum number of parameters for a function 255
Maximum number of case selector values in a switch statement 16383
Maximum byte-size of an array type instantiated in the function or private address spaces (for the purposes of this limit, bool has a size of 1 byte) 65535
Maximum byte-size of an array type instantiated in the workgroup address space (for the purposes of this limit, bool has a size of 1 byte) 16384
Maximum number of elements in const-expression of array type 65535

13. Execution

§ 1.1 Technical Overview describes how a shader is invoked and partitioned into invocations. This section describes further constraints on how invocations execute, individually and collectively.

13.1. Program Order Within an Invocation

Each statement in a WGSL program may be executed zero or more times during execution. For a given invocation, each execution of a given statement represents a unique dynamic statement instance.

When a statement includes an expression, the statement’s semantics determines:

Expression nesting defines data dependencies which must be satisfied to complete evaluation. That is, a nested expression must be evaluated before the enclosing expression can be evaluated. The order of evaluation for operands of an expression is left-to-right in WGSL. For example, foo() + bar() must evaluate foo() before bar(). See § 7 Expressions.
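The effect of left-to-right operand evaluation becomes visible when the operands have side effects. In this sketch the names counter, next, and demo are hypothetical, and demo is assumed to be called while counter still holds its initial value of 0.

```wgsl
var<private> counter : i32 = 0;

fn next() -> i32 {
  counter += 1;
  return counter;
}

fn demo() -> i32 {
  // The left operand is evaluated first and yields 1; the right operand
  // then yields 2. The result is 1 - 2 = -1, never 2 - 1.
  return next() - next();
}
```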

Statements in a WGSL program are executed in control flow order. See § 8 Statements and § 9.2 Function Calls.

13.2. Uniformity

Collective operations (e.g. barriers and derivatives) require coordination among different invocations running concurrently on the GPU. To ensure correct and portable behavior, WGSL requires that these operations can be statically analyzed to not have any control dependencies such that a non-empty strict subset of invocations will execute the operation (i.e. the operation must be executed in uniform control flow). Non-uniform control dependencies arise from control flow statements whose behavior depends on non-uniform values. These non-uniform values can be traced back to certain sources that are not statically proven to be uniform. These sources include, but are not limited to:

The remainder of this section describes the static analysis that an implementation will perform to validate the WGSL program.
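As an informal illustration of the requirement, the sketch below contrasts a barrier in uniform control flow with one guarded by a condition that depends on the invocation index. The variable and entry point names are hypothetical, and the rejected call is left commented out.

```wgsl
var<workgroup> shared_data : array<u32, 64>;

@compute @workgroup_size(64)
fn comp_main(@builtin(local_invocation_index) idx : u32) {
  shared_data[idx] = idx;

  // Accepted: every invocation in the workgroup reaches this barrier.
  workgroupBarrier();

  if idx < 32u {
    // Rejected if uncommented: only a strict subset of invocations
    // would execute the barrier, because the condition depends on idx.
    // workgroupBarrier();
  }
}
```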

13.2.1. Terminology and Concepts

The following definitions are merely informative, trying to give an intuition for what the analysis in the next subsection is computing. The analysis is what actually defines these concepts, and when a program is valid or breaks the uniformity rules.

For a given group of invocations:

13.2.2. Uniformity Analysis Overview

The remaining subsections specify a static analysis that verifies that collective operations are only executed in uniform control flow.

Note: This analysis has the following desirable properties:

Each function is analyzed, verifying that there is a context where it is safe to call this function. The analysis rejects the program as invalid if there is no such context.

At the same time, it computes metadata about the function to help analyze its callers in turn. This means that the call graph must first be built, and functions must be analyzed from the leaves upwards, i.e. from functions that call no function outside the standard library toward the entry point. This way, whenever a function is analyzed, the metadata for all of its callees has already been computed. There is no risk of being trapped in a cycle, as recursion is forbidden in the language.

Note: another way of saying the same thing is that we do a topological sort of functions ordered by the "is a (possibly indirect) callee of" partial order, and analyze them in that order.

13.2.3. Analyzing the Uniformity Requirements of a Function

Each function is analyzed in two phases.

The first phase walks over the syntax of the function, building a directed graph along the way based on the rules in the following subsections. The second phase explores that graph, resulting in either rejecting the program, or computing the constraints on calling this function.

Note: apart from two special nodes RequiredToBeUniform and MayBeNonUniform, all nodes can be understood as having one of the following meanings:

An edge can be understood as an implication from the statement corresponding to its source node to the statement corresponding to its target node.

To express a uniformity requirement (e.g. the control flow at the call site of a derivative), we add an edge from RequiredToBeUniform to the corresponding node. One way to understand this is that RequiredToBeUniform corresponds to the proposition True, so that RequiredToBeUniform -> X is the same as saying that X is true.

Reciprocally, to express that we cannot ensure the uniformity of something (e.g. a variable which holds the thread id), we add an edge from the corresponding node to MayBeNonUniform. One way to understand this is that MayBeNonUniform corresponds to the proposition False, so that X -> MayBeNonUniform is the same as saying that X is false.

A consequence of this interpretation is that every node reachable from RequiredToBeUniform corresponds to something which is required to be uniform for the program to be valid, and every node from which MayBeNonUniform is reachable corresponds to something whose uniformity we cannot guarantee. It follows that we have a uniformity violation (and thus reject the program) if there is any path from RequiredToBeUniform to MayBeNonUniform.
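A sketch of a program that contains such a path is given below; it is intentionally invalid and the entry point name is hypothetical. Informally, the derivative call requires the control flow inside the if to be uniform, that control flow depends on the condition, the condition depends on the user-defined input v, and the uniformity of v cannot be guaranteed.

```wgsl
// Invalid: shown only to illustrate a uniformity violation.
@fragment
fn frag_main(@location(0) v : f32) -> @location(0) vec4<f32> {
  var d = 0.0;
  if v > 0.5 {     // condition depends on v, which may be non-uniform
    d = dpdx(v);   // derivative: control flow here is required to be uniform
  }
  return vec4<f32>(d);
}
```

Moving the dpdx call above the if statement would remove the path, since the control flow at the start of the entry point does not depend on v.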

For each function, two tags are computed:

Additionally, for each formal parameter of a function, a parameter tag is computed and, if the parameter is a function address space pointer, a pointer parameter tag is also computed. The parameter tag describes the uniformity requirement of the parameter value. The pointer parameter tag describes whether the value stored in the memory pointed to by the parameter becomes non-uniform during the execution of the function call.

Call site tag values
Call Site Tag Description
CallSiteRequiredToBeUniform The function must only be called from uniform control flow.
CallSiteNoRestriction The function may be called from non-uniform control flow.

Function tag values
Function Tag Description
ReturnValueMayBeNonUniform The return value of the function may be non-uniform.
NoRestriction The function does not introduce non-uniformity.

Parameter tag values
Parameter Tag Description
ParameterRequiredToBeUniform The parameter must be a uniform value.
ParameterRequiredToBeUniformForReturnValue The parameter must be a uniform value in order for the return value to be a uniform value.
ParameterNoRestriction The parameter value has no uniformity requirement.

Pointer parameter tag values
Pointer Parameter Tag Description
PointerParameterMayBeNonUniform The value stored in the memory pointed to by the pointer parameter may be non-uniform after the function call.
PointerParameterNoRestriction The uniformity of the value stored in the memory pointed to by the pointer parameter is unaffected by the function call.

The following algorithm describes how to compute these tags for a given function:

Note: The entire graph can be destroyed at this point. The tags listed above are all that we need to remember to analyze callers of this function.
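To give a feel for the tags, the hypothetical functions below are annotated with the tags we would expect the analysis to compute, based on the descriptions in the tables above. The annotations are informal and not normative.

```wgsl
var<private> seed : u32;

// Call site tag: CallSiteRequiredToBeUniform (it executes a barrier).
// Function tag: NoRestriction.
fn sync() {
  workgroupBarrier();
}

// Call site tag: CallSiteRequiredToBeUniform.
// Parameter tag for c: ParameterRequiredToBeUniform, since c decides
// whether the barrier executes.
fn barrier_if(c : bool) {
  if c {
    workgroupBarrier();
  }
}

// Call site tag: CallSiteNoRestriction.
// Parameter tag for x: ParameterRequiredToBeUniformForReturnValue, since
// the result is uniform only if x is.
fn add_one(x : u32) -> u32 {
  return x + 1u;
}

// Function tag: ReturnValueMayBeNonUniform, since a private variable
// may hold a different value in each invocation.
fn read_seed() -> u32 {
  return seed;
}
```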

13.2.4. Function-scope Variable Value Analysis

The value of each function-scope variable at a particular statement can be analyzed in terms of the assignments that reach it and, potentially, its initial value.

An assignment is a full assignment if:

Otherwise, an assignment is a partial assignment.

A full reference is an expression of reference type that is one of:

A full pointer is an expression of pointer type that is one of:

Note: For the purposes of this analysis, we don’t need the case where a formal parameter of pointer type may be a full pointer.

A full reference, and similarly a full pointer, is a memory view for all the memory locations for the corresponding originating variable x.

A reference that is not a full reference is a partial reference. As such, a partial reference is a memory view for a strict subset of the memory locations for the corresponding originating variable.

An assignment through a full reference is a full assignment. An assignment through a partial reference is a partial assignment.

When the uniformity rules in subsequent sections refer to the value for a function-scope variable used as an RValue, it means the value of the variable prior to evaluation of the RValue expression. When the uniformity rules in subsequent sections refer to the value for a function-scope variable used as an LValue, it means the value of the variable after execution of the statement the expression appears in.

Multiple assignments to a variable might reach a use of that variable due to control-flow statements or partial assignments. The analysis joins multiple assignments reaching out of control-flow statements by unioning the set of assignments that reach each control-flow exit.
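The distinction between full and partial assignments, and the joining of assignments at the end of a control-flow statement, can be sketched as follows. The function name demo is hypothetical and the comments are informal.

```wgsl
fn demo(cond : bool) -> f32 {
  var v : vec2<f32>;          // zero-initialized: the initial value of v
  v = vec2<f32>(1.0, 2.0);    // full assignment: depends only on the right-hand side
  v.x = 3.0;                  // partial assignment: depends on 3.0 and on the previous value of v
  if cond {
    v = vec2<f32>(0.0);       // full assignment
  } else {
    v.y = 4.0;                // partial assignment
  }
  // Both branches have behavior {Next}, so the assignments made in
  // either branch reach this use of v.
  return v.x + v.y;
}
```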

The following table describes the rules for joining assignments. In the uniformity graph, each join is an edge from the result node to the node representing the source of the value. It is written in terms of an arbitrary variable x. It uses the following notations:

Rules for joining multiple assignments to a function-scope variable.
Statement | Result | Edges from the Result
var x; | Vin(next) | V(0)
var x = e; | Vin(next) | V(e) (Note: This is a full assignment to x.)
x = e; | Vin(next) | V(e) (Note: This is a full assignment to x.)
r = e; where r is a full reference to variable x | Vin(next) | V(e) (Note: This is a full assignment to x.)
r = e; where r is a partial reference to variable x | Vout(s) | V(e), V(prev) (Note: This is a partial assignment to x. Partial assignments include the previous value since only a subset of components are updated.)
s1 s2 where Next is in behavior of s1 (Note: s1 often ends in a semicolon.) | Vin(s2) | Vout(s1)
if e s1 else s2 where Next is in the behaviors of both s1 and s2 | Vin(next) | Vout(s1), Vout(s2)
if e s1 else s2 where Next is in the behavior of s1, but not s2 | Vin(next) | Vout(s1)
if e s1 else s2 where Next is in the behavior of s2, but not s1 | Vin(next) | Vout(s2)
loop { s1 continuing { s2 } } | Vin(s1) | Vout(prev), Vout(s2)
loop { s1 continuing { s2 } } | Vin(s2) | Vout(s1), Vout(si) for all si in s1 whose behavior is {Continue} and transfer control to s2
loop { s1 continuing { s2 } } | Vin(next) | Vout(s2), Vout(si) for all si in s1 whose behavior is {Break} and transfer control to next
switch e { case _: s1 case _: s2 ... case _: s3 } | Vin(si) | Vout(prev)
switch e { case _: s1 case _: s2 ... case _: s3 } | Vin(next) | Vout(si), for all si whose behavior includes Next or Break, and Vout(sj) for all statements inside sj whose behavior is {Break} and transfer control to next

For all other statements (except function calls), Vin(next) is equivalent to Vout(prev).

Note: The same desugarings apply as in statement behavior analysis.

13.2.4.1. Pointers to Function-scope Variables

Each pointer parameter in the function address space is desugared as a local variable declaration whose initial value is equivalent to dereferencing the parameter.

Whenever a let-declaration's effective-value-type is a function address space pointer, the initializer expression is recorded and any identifier that resolves to the declaration is substituted with that initializer (wrapped in a parenthesized expression) before applying the rules in the previous section. That is, function address space pointers are viewed as aliases to a local variable declaration. The alias may produce either full or partial assignments depending on the initializer substitutions.

EXAMPLE: pointers in the value analysis
fn foo(p : ptr<function, array<f32, 4>>) -> f32 {
  let p1 = p;
  let p2 = &((*p1)[1]);
  *p2 = 5;
  return (*p1)[0];
}

// This is the equivalent version of foo for the analysis.
fn foo_for_analysis(p : ptr<function, array<f32, 4>>) -> f32 {
  var p_var = *p;         // Introduce variable for p.
  let p1 = &p_var;        // Use the variable for p1
  let p2 = &(p_var[1]);   // Substitute p1’s initializer
  *(&(p_var[1])) = 5;     // Substitute p2’s initializer
  return (*(&p_var))[0];  // Substitute p1’s initializer
}

13.2.5. Uniformity Rules for Statements

The rules for analyzing statements take as arguments both the statement itself and the node corresponding to control flow at the beginning of it (denoted "CF" below) and return both of the following:

In the table below, (CF1, S) => CF2 means "run the analysis on S starting with control flow CF1, apply the required changes to the graph, and name the resulting control flow CF2". Similarly, (CF1, E) => (CF2, V) means "run the analysis on expression E, starting with control flow CF1, apply the required changes to the graph, and name the resulting control flow node CF2 and the resulting value node V" (see next section for the analysis of expressions).

We have a similar set of rules for expressions in left-value positions, that we denote by LValue: (CF, E) => (CF, L). Instead of computing the node which corresponds to the uniformity of the value, it computes the node which corresponds to the uniformity of the variable we are addressing.

When several edges have to be created we use X -> {Y, Z} as a short-hand for X -> Y, X -> Z.

Uniformity rules for statements
Statement | New nodes | Recursive analyses | Resulting control flow node | New edges
{s} | (none) | (CF, s) => CF' | CF' | (none)
s1 s2, with Next in behavior of s1 (Note: s1 often ends in a semicolon.) | (none) | (CF, s1) => CF1; (CF1, s2) => CF2 | CF2 | (none)
s1 s2, without Next in behavior of s1 (Note: s1 often ends in a semicolon.) | (none) | (CF, s1) => CF1 (Note: s2 is statically unreachable and not recursively analyzed. s2 does not contribute to the uniformity analysis.) | CF1 | (none)
if e s1 else s2 with behavior {Next} | (none) | (CF, e) => (CF', V); (V, s1) => CF1; (V, s2) => CF2 | CF | (none)
if e s1 else s2 with another behavior | CFend | (CF, e) => (CF', V); (V, s1) => CF1; (V, s2) => CF2 | CFend | CFend -> {CF1, CF2}
loop {s1 continuing {s2}} with behavior {Next} | CF' | (CF', s1) => CF1; (CF1, s2) => CF2 | CF | CF' -> {CF2, CF}
loop {s1 continuing {s2}} with another behavior | CF' | (CF', s1) => CF1; (CF1, s2) => CF2 | CF' | CF' -> {CF2, CF}
loop {s1} with behavior {Next} | CF' | (CF', s1) => CF1 | CF | CF' -> {CF1, CF}
loop {s1} with another behavior | CF' | (CF', s1) => CF1 | CF' | CF' -> {CF1, CF}
switch e case _: s_1 .. case _: s_n with behavior {Next} | (none) | (CF, e) => (CF', V); (V, s_1) => CF_1; ...; (V, s_n) => CF_n | CF | (none)
switch e case _: s_1 .. case _: s_n with another behavior | CFend | (CF, e) => (CF', V); (V, s_1) => CF_1; ...; (V, s_n) => CF_n | CFend | CFend -> {CF_1, ..., CF_n}
var x: T; | (none) | (none) | CF | (none) (Note: If x is a function address space variable, CF is used as the zero value initializer in the value analysis.)
break; | (none) | (none) | CF | (none)
continue; | (none) | (none) | CF | (none)
break if e; | (none) | (CF, e) => (CF', V) | CF' | (none)
return; | (none) | (none) | CF | For each function address space pointer parameter i, Value_return_i -> Vin(prev) (see § 13.2.4 Function-scope Variable Value Analysis)
return e; | (none) | (CF, e) => (CF', V) | CF' | Value_return -> V; for each function address space pointer parameter i, Value_return_i -> Vin(prev) (see § 13.2.4 Function-scope Variable Value Analysis)
e2 = e1; | (none) | (CF, e1) => (CF1, V1); LValue: (CF1, e2) => (CF2, L2) | CF2 | L2 -> V1 (Note: L2 is the result value from the value analysis.)
_ = e | (none) | (CF, e) => (CF', V) | CF' | (none)
let x = e; | (none) | (CF, e) => (CF', V) | CF' | (none)
var x = e; | (none) | (CF, e) => (CF', V) | CF' | (none) (Note: If x is a function address space variable, V is used as the result value in the value analysis.)

Analysis of for and while loops follows from their respective desugaring translations to loop statements.

In switch, a default-alone clause block is treated exactly like a case clause with regard to uniformity.

To maximize performance, implementations often try to minimize the amount of non-uniform control flow. However, the points at which invocations can be said to be uniform vary depending on a number of factors. WGSL’s static analysis conservatively assumes a return to uniform control flow occurring at the end of if, switch, and loop statements if the behavior for the statement is {Next}. This is modeled in the preceding table as the resulting control flow node being the same as the input control flow node.

13.2.6. Uniformity Rules for Function Calls

The most complex rule is for function calls:

Note: Refer to § 13.2.4 Function-scope Variable Value Analysis for the definition of Vout(call).

Most built-in functions have tags of:

Here is the list of exceptions:

Note: A WGSL implementation will ensure that if control flow prior to a function call is uniform, it will also be uniform after the function call.

13.2.7. Uniformity Rules for Expressions

The rules for analyzing expressions take as argument both the expression itself and the node corresponding to control flow at the beginning of it (which we’ll note "CF" below) and return the following:

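Uniformity rules for expressions (in normal rvalue position)

Each entry gives the expression, followed by its new nodes, recursive analyses, resulting control flow node and value node, and new edges. Fields that do not apply are omitted.

Expression: e1 || e2, with behavior {Next}
Recursive analyses: (CF, e1) => (CF1, V1); (V1, e2) => (CF2, V2)
Resulting control flow node, value node: CF, V2

Expression: e1 && e2, with behavior {Next}
Recursive analyses: (CF, e1) => (CF1, V1); (V1, e2) => (CF2, V2)
Resulting control flow node, value node: CF, V2

Expression: e1 || e2, with behavior other than {Next}
Recursive analyses: (CF, e1) => (CF1, V1); (V1, e2) => (CF2, V2)
Resulting control flow node, value node: CF2, V2

Expression: e1 && e2, with behavior other than {Next}
Recursive analyses: (CF, e1) => (CF1, V1); (V1, e2) => (CF2, V2)
Resulting control flow node, value node: CF2, V2

Expression: Literal
Resulting control flow node, value node: CF, CF

Expression: reference to function-scope variable "x"
New nodes: Result
Recursive analyses: X is the node corresponding to the value of "x" at the input to the statement containing this expression
Resulting control flow node, value node: CF, Result
New edges: Result -> {CF, X}
Note: X is equivalent to Vout(prev) for "x" (see § 13.2.4 Function-scope Variable Value Analysis)

Expression: reference to const-declaration, override-declaration, let-declaration, or non-built-in parameter "x"
New nodes: Result
Recursive analyses: X is the node corresponding to "x"
Resulting control flow node, value node: CF, Result
New edges: Result -> {CF, X}

Expression: reference to uniform built-in value "x"
Resulting control flow node, value node: CF, CF

Expression: reference to non-uniform built-in value "x"
Resulting control flow node, value node: CF, MayBeNonUniform

Expression: reference to read-only module-scope variable "x"
Resulting control flow node, value node: CF, CF

Expression: reference to non-read-only module-scope variable "x"
Resulting control flow node, value node: CF, MayBeNonUniform

Expression: op e, where op is a unary operator
Recursive analyses: (CF, e) => (CF', V)
Resulting control flow node, value node: CF', V

Expression: e.field
Recursive analyses: (CF, e) => (CF', V)
Resulting control flow node, value node: CF', V

Expression: e1 op e2, where op is a non-short-circuiting binary operator
New nodes: Result
Recursive analyses: (CF, e1) => (CF1, V1); (CF1, e2) => (CF2, V2)
Resulting control flow node, value node: CF2, Result
New edges: Result -> {V1, V2}

Expression: e1[e2]
New nodes: Result
Recursive analyses: (CF, e1) => (CF1, V1); (CF1, e2) => (CF2, V2)
Resulting control flow node, value node: CF2, Result
New edges: Result -> {V1, V2}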

The following built-in input variables are considered uniform:

All other ones (see § 16 Built-in Values) are considered non-uniform.

Note: An author should avoid grouping the uniform built-in values together with other non-uniform inputs because the analysis does not analyze the components of a composite type separately.

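Uniformity rules for expressions in lvalue positions

Each entry gives the expression, followed by its new nodes, recursive analyses, resulting control flow node and variable node, and new edges. Fields that do not apply are omitted.

Expression: reference to function-scope variable "x"
New nodes: Result
Recursive analyses: X is the node corresponding to the value of "x" at the output of the statement containing this expression.
Resulting control flow node, variable node: CF, Result
New edges: Result -> {CF, X}
Note: X is equivalent to Vin(next) for "x" (see § 13.2.4 Function-scope Variable Value Analysis)

Expression: reference to const-declaration, override-declaration, let-declaration, or parameter "x"
Recursive analyses: X is the node corresponding to "x"
Resulting control flow node, variable node: CF, X

Expression: reference to module-scope variable "x"
Resulting control flow node, variable node: CF, MayBeNonUniform

Expression: e.field
Recursive analyses: LValue: (CF, e) => (CF1, L1)
Resulting control flow node, variable node: CF1, L1

Expression: e1[e2]
Recursive analyses: LValue: (CF, e1) => (CF1, L1); (CF1, e2) => (CF2, V2)
Resulting control flow node, variable node: CF2, L1
New edges: L1 -> V2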

13.2.8. Annotating the Uniformity of Every Point in the Control-flow

This entire subsection is non-normative.

If implementers want to provide developers with a diagnostic mode that shows for each point in the control-flow of the entire shader whether it is uniform or not (and thus whether it would be valid to call a function that requires uniformity there), we suggest the following:

Any node which is not visited by these reachability analyses can be proven to be uniform by the analysis (and so it would be safe to call a derivative or similar function there).

Note: The bottom-up analysis is still required, as it lets us know what edges to add to the graphs when encountering calls.

13.2.9. Examples

The graphs in the subsequent examples use the following conventions for nodes:

13.2.9.1. Invalid textureSample Function Call

This example shows an invalid use of a textureSample built-in function call. The function call is made within an if statement whose condition depends on a non-uniform value (i.e. the built-in value position). The invalid dependency chain is highlighted in red.

EXAMPLE: WGSL invalid textureSample
@group(0) @binding(0) var t : texture_2d<f32>;
@group(0) @binding(1) var s : sampler;

@fragment
fn main(@builtin(position) pos : vec4<f32>) {
  if (pos.x < 0.5) {
    // Invalid textureSample function call.
    _ = textureSample(t, s, pos.xy);
  }
}
Uniformity graph

The example also shows that uniformity of the control flow after the if statement is the same as the uniformity prior to the if statement (CF_return being connected to CF_start). That is, the control flow is once again uniform after the if statement (because it is guaranteed to start as uniform control flow at the beginning of the entry point). If the textureSample function call had been moved outside the if statement the program would have been valid. Likewise, if the condition of the if statement were a uniform value (e.g. each invocation read the same value from a uniform buffer), the program would also have been valid.
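The validity check in this example comes down to graph reachability: the program is rejected when MayBeNonUniform can be reached from RequiredToBeUniform. The following Python sketch models that check on a simplified, hand-written approximation of the graph. The node names cond_value and pos are hypothetical labels chosen for illustration, and the edge sets are not the exact graphs produced by the rules above.

```python
from collections import deque

def reachable(graph, start):
    """Return the set of nodes reachable from start by following edges."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for successor in graph.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen

def is_valid(graph):
    """A graph is accepted when no path leads to MayBeNonUniform."""
    return "MayBeNonUniform" not in reachable(graph, "RequiredToBeUniform")

# Simplified graph for the invalid shader (hypothetical node names).
invalid_graph = {
    # textureSample is called in control flow governed by the if condition.
    "RequiredToBeUniform": ["cond_value"],
    # The condition pos.x < 0.5 depends on pos and on the incoming control flow.
    "cond_value": ["pos", "CF_start"],
    # position is a non-uniform built-in value.
    "pos": ["MayBeNonUniform"],
    # Control flow after the if statement depends only on the starting control flow.
    "CF_return": ["CF_start"],
}

# Same shader with the textureSample call moved outside the if statement:
# the call site now depends only on the starting control flow.
valid_graph = dict(invalid_graph)
valid_graph["RequiredToBeUniform"] = ["CF_start"]

print("invalid shader accepted:", is_valid(invalid_graph))
print("valid shader accepted:", is_valid(valid_graph))
```

In the second graph the non-uniform value pos is still present, but nothing that is required to be uniform depends on it, so the check passes.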

13.2.9.2. Function-scope Variable Uniformity

This example shows both a valid and an invalid barrier function call that depend on the value of a function-scope variable. The workgroupBarrier is invalid because the value of x is derived from the mutable module-scope variable a. The storageBarrier is valid because the value of x is derived from the immutable module-scope variable b. This example highlights the value analysis' ability to separate different periods of uniformity in a function-scope variable’s lifetime. This example also clearly shows that control flow becomes uniform again after the end of the first if statement. We know this because that section of the graph is independent from the second if statement.

EXAMPLE: WGSL using function variable
@group(0) @binding(0) var<storage, read_write> a : i32;
@group(0) @binding(1) var<uniform> b : i32;

@compute @workgroup_size(16,1,1)
fn main() {
  var x : i32;
  x = a;
  if x > 0 {
    // Invalid barrier function call.
    workgroupBarrier();
  }
  x = b;
  if x < 0 {
    // Valid barrier function call.
    storageBarrier();
  }
}
Uniformity graph

Note: The subgraphs are only included in the example for ease of understanding.

13.2.9.3. Composite Value Analysis Limitations

One limitation of the uniformity analysis is that it does not track the components of a composite value independently. That is, any non-uniform component value will cause the analysis to treat the entire composite value as non-uniform. This example illustrates this issue and a potential workaround that shader authors can employ to avoid this limitation.

EXAMPLE: Invalid composite value WGSL
struct Inputs {
  // workgroup_id is a uniform built-in value.
  @builtin(workgroup_id) wgid : vec3<u32>,
  // local_invocation_index is a non-uniform built-in value.
  @builtin(local_invocation_index) lid : u32
}

@compute @workgroup_size(16,1,1)
fn main(inputs : Inputs) {
  // This comparison is always uniform,
  // but the analysis cannot determine that.
  if inputs.wgid.x == 1 {
    workgroupBarrier();
  }
}
Invalid uniformity graph

The easiest way to work around this limitation of the analysis is to split the composite up so that values that are known to be uniform are separate from values that are known to be non-uniform. In the alternative WGSL below, splitting the two built-in values into separate parameters satisfies the uniformity analysis. This can be seen by the lack of a path from RequiredToBeUniform to MayBeNonUniform in the graph.

EXAMPLE: Valid alternative WGSL
@compute @workgroup_size(16,1,1)
fn main(@builtin(workgroup_id) wgid : vec3<u32>,
        @builtin(local_invocation_index) lid : u32) {
  // The uniformity analysis can now correctly determine this comparison is
  // always uniform.
  if wgid.x == 1 {
    // Valid barrier function call.
    workgroupBarrier();
  }
}
Valid alternative uniformity graph
13.2.9.4. Uniformity in a Loop

In this example, there is an invalid workgroupBarrier function call in a loop. The non-uniform built-in value local_invocation_index is the ultimate cause despite the fact that it appears after the barrier in the loop. This occurs because on later iterations some of the invocations in the workgroup will have exited the loop prematurely while others attempt to execute the barrier. The analysis models the inter-iteration dependencies as an edge, where the control flow at the start of the loop body (CF_loop_body) depends on the control flow at the end of the loop body (CF_after_if).

EXAMPLE: Loop uniformity WGSL
@compute @workgroup_size(16,1,1)
fn main(@builtin(local_invocation_index) lid : u32) {
  for (var i = 0u; i < 10; i++) {
    workgroupBarrier();
    if (lid + i) > 7 {
      break;
    }
  }
}
Uniformity graph
13.2.9.5. User-defined Function Calls

This example is a modification of the first example, but uses a user-defined function call. The analysis tags both parameters of scale as ParameterRequiredToBeUniformForReturnValue. This leads to the path in main between the return value of the scale function call and the position built-in value. That path is a subpath of the overall invalid path from RequiredToBeUniform to MayBeNonUniform.

EXAMPLE: User-defined function call uniformity WGSL
fn scale(in1 : f32, in2 : f32) -> f32 {
  let v = in1 / in2;
  return v;
}

@group(0) @binding(0) var t : texture_2d<f32>;
@group(0) @binding(1) var s : sampler;

@fragment
fn main(@builtin(position) pos : vec4<f32>) {
  let tmp = scale(pos.x, 0.5);
  if tmp > 1.0 {
    _ = textureSample(t, s, pos.xy);
  }
}
Uniformity graph for scale
Uniformity graph for main

Note: The subgraphs are only included in the example for ease of understanding.

13.3. Compute Shaders and Workgroups

A workgroup is a set of invocations which concurrently execute a compute shader stage entry point, and share access to shader variables in the workgroup address space.

The workgroup grid for a compute shader is the set of points with integer coordinates (i,j,k) with:

0 ≤ i < workgroup_size_x
0 ≤ j < workgroup_size_y
0 ≤ k < workgroup_size_z

where (workgroup_size_x, workgroup_size_y, workgroup_size_z) is the value specified for the workgroup_size attribute of the entry point.

There is exactly one invocation in a workgroup for each point in the workgroup grid.

An invocation’s local invocation ID is the coordinate triple for the invocation’s corresponding workgroup grid point.

When an invocation has local invocation ID (i,j,k), then its local invocation index is

i + (j * workgroup_size_x) + (k * workgroup_size_x * workgroup_size_y)

Note that if a workgroup has W invocations, then each invocation I in the workgroup has a unique local invocation index L(I) such that 0 ≤ L(I) < W, and that entire range is covered.
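The uniqueness and coverage property can be checked numerically. The following Python sketch evaluates the local invocation index formula for every point of a workgroup grid. The function names and the 4x2x3 workgroup size are illustrative assumptions, not part of WGSL.

```python
def local_invocation_index(i, j, k, size_x, size_y):
    """Linearize a local invocation ID (i, j, k) using the formula above."""
    return i + (j * size_x) + (k * size_x * size_y)

def all_local_indices(size_x, size_y, size_z):
    """Compute the index of every point in the workgroup grid."""
    return [
        local_invocation_index(i, j, k, size_x, size_y)
        for k in range(size_z)
        for j in range(size_y)
        for i in range(size_x)
    ]

# Hypothetical workgroup_size(4, 2, 3), so W = 24 invocations.
indices = all_local_indices(4, 2, 3)
covers_range = sorted(indices) == list(range(4 * 2 * 3))
print("every index in [0, W) appears exactly once:", covers_range)
```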

A compute shader begins execution when a WebGPU implementation removes a dispatch command from a queue and begins the specified work on the GPU. The dispatch command specifies a dispatch size, which is an integer triple (group_count_x, group_count_y, group_count_z) indicating the number of workgroups to be executed, as described in the following.

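The compute shader grid for a particular dispatch is the set of points with integer coordinates (CSi,CSj,CSk) with:

0 ≤ CSi < workgroup_size_x × group_count_x
0 ≤ CSj < workgroup_size_y × group_count_y
0 ≤ CSk < workgroup_size_z × group_count_z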

where workgroup_size_x, workgroup_size_y, and workgroup_size_z are as above for the compute shader entry point.

The work to be performed by a compute shader dispatch is to execute exactly one invocation of the entry point for each point in the compute shader grid.

An invocation’s global invocation ID is the coordinate triple for the invocation’s corresponding compute shader grid point.

The invocations are organized into workgroups, so that each invocation (CSi, CSj, CSk) is identified with the workgroup grid point

( CSi mod workgroup_size_x , CSj mod workgroup_size_y , CSk mod workgroup_size_z )

in workgroup ID

( ⌊ CSi ÷ workgroup_size_x ⌋, ⌊ CSj ÷ workgroup_size_y ⌋, ⌊ CSk ÷ workgroup_size_z ⌋).
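The mapping from a global invocation ID to a workgroup ID and a local invocation ID can be sketched as component-wise integer division and remainder. The Python below is a minimal model of that arithmetic; the function names and the sample sizes are assumptions made for illustration.

```python
def split_global_id(global_id, workgroup_size):
    """Return (workgroup_id, local_id) for a compute shader grid point."""
    workgroup_id = tuple(g // s for g, s in zip(global_id, workgroup_size))
    local_id = tuple(g % s for g, s in zip(global_id, workgroup_size))
    return workgroup_id, local_id

def join_ids(workgroup_id, local_id, workgroup_size):
    """Inverse mapping: rebuild the global invocation ID."""
    return tuple(
        w * s + l for w, l, s in zip(workgroup_id, local_id, workgroup_size)
    )

# Hypothetical workgroup_size(16, 1, 1) and grid point (35, 0, 0).
size = (16, 1, 1)
wg_id, local_id = split_global_id((35, 0, 0), size)
print("workgroup ID:", wg_id, "local invocation ID:", local_id)
```

Splitting and then joining returns the original grid point, which is why each invocation belongs to exactly one workgroup.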

WebGPU provides no guarantees about:

13.4. Fragment Shaders and Helper Invocations

Invocations in the fragment shader stage are divided into 2x2 grids of invocations with neighbouring positions in the X and Y dimensions. Each of these grids is referred to as a quad. Quads can collaborate in some collective operations (see § 13.5.2 Derivatives).

Ordinarily, fragment processing creates one invocation of a fragment shader for each RasterizationPoint produced by rasterization. Sometimes there may be insufficient RasterizationPoints to fully populate a quad, for example at the edge of a graphics primitive. When a quad has only 1, 2, or 3 invocations corresponding to RasterizationPoints, fragment processing will create a helper invocation for each unpopulated position in the quad.

Helper invocations do not have observable effects, except that they help compute derivatives. As such, helper invocations are subject to the following restrictions:

If all of the invocations in a quad become helper invocations (e.g. due to executing a discard statement), execution of the quad may be terminated; however, such termination is not considered to produce non-uniform control flow.

13.5. Collective Operations

13.5.1. Barriers

A barrier is a synchronization built-in function that orders memory operations in a program. A control barrier is executed by all invocations in the same workgroup as if it were executed concurrently. As such, control barriers must only be executed in uniform control flow in a compute shader.

13.5.2. Derivatives

A partial derivative is the rate of change of a value along an axis. Fragment shader invocations within the same quad collaborate to compute approximate partial derivatives.

Partial derivatives of the fragment coordinate are computed implicitly as part of operation of the following built-in functions:

For these, the derivatives help determine the mip levels of texels to be sampled, or in the case of textureSampleCompare, sampled and compared against a reference value.

Partial derivatives of invocation-specified values are computed by the built-in functions described in § 17.4 Derivative Built-in Functions:

Because neighbouring invocations collaborate to compute derivatives, these functions must only be invoked in uniform control flow in a fragment shader.

13.6. Floating Point Evaluation

WGSL follows the IEEE-754 standard for floating point computation with the following exceptions:

13.6.1. Floating Point Accuracy

Let x be the exact real-valued or infinite result of an operation when computed with unbounded precision. The correctly rounded result of the operation for floating point type T is:

That is, the result may be rounded up or down: WGSL does not specify a rounding mode.

Note: Floating point types include positive and negative infinity, so the correctly rounded result may be finite or infinite.

The units in the last place, ULP, for a floating point number x is the minimum distance between two non-equal floating point numbers a and b such that a ≤ x ≤ b (i.e. ulp(x) = min_{a,b} |b - a|).
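As a rough illustration of this definition for f32, the Python sketch below finds the two neighbours of a value by stepping its binary32 bit pattern and takes the smaller gap. It assumes x is positive, finite, non-zero, and exactly representable in binary32; the helper names are hypothetical.

```python
import struct

def f32_to_bits(x):
    """Bit pattern of x as an IEEE-754 binary32 value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_f32(bits):
    """binary32 value with the given bit pattern, as a Python float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def ulp_f32(x):
    """Smallest gap between distinct f32 values a <= x <= b.

    Assumes x is positive, finite, non-zero, and exactly representable
    in binary32, so x itself is one end of the closest pair.
    """
    bits = f32_to_bits(x)
    gap_below = x - bits_to_f32(bits - 1)
    gap_above = bits_to_f32(bits + 1) - x
    return min(gap_below, gap_above)

print("ulp(1.0) =", ulp_f32(1.0))
print("ulp(1.5) =", ulp_f32(1.5))
```

At a power of two the gap below is half the gap above, so under this definition the smaller of the two is the ULP.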

In the following tables, the accuracy of an operation is provided among five possibilities:

For any accuracy values specified over a range, the accuracy is undefined for results outside that range.

If an allowable return value for any operation is greater in magnitude than the largest representable finite floating-point value, then that operation may additionally return either the infinity with the same sign or the largest finite value with the same sign.

Accuracy of expressions
Expression Accuracy for f32 Accuracy for f16
x + y Correctly rounded
x - y Correctly rounded
x * y Correctly rounded
x / y 2.5 ULP for |y| in the range [2^-126, 2^126] 2.5 ULP for |y| in the range [2^-14, 2^14]
x % y Derived from x - y * trunc(x/y)
-x Correctly rounded
x == y Correct result
x != y Correct result
x < y Correct result
x <= y Correct result
x > y Correct result
x >= y Correct result
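The formulation listed for x % y truncates the quotient toward zero, so the result takes the sign of x. The Python sketch below evaluates that formula in double precision (not f32) and compares it with math.fmod. The function name is an assumption made for illustration.

```python
import math

def truncated_remainder(x, y):
    """Evaluate x - y * trunc(x / y), the formulation listed for x % y."""
    return x - y * math.trunc(x / y)

positive_case = truncated_remainder(7.5, 2.0)
negative_case = truncated_remainder(-7.5, 2.0)
print("7.5 % 2.0 under the formula:", positive_case)
print("-7.5 % 2.0 under the formula:", negative_case)
```

Python's own % operator floors the quotient instead, so -7.5 % 2.0 in Python gives a different, non-negative result.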
Accuracy of built-in functions
Built-in Function Accuracy for f32 Accuracy for f16
abs(x) Correctly rounded
acos(x) Inherited from atan2(sqrt(1.0 - x * x), x)
acosh(x) Inherited from log(x + sqrt(x * x - 1.0))
asin(x) Inherited from atan2(x, sqrt(1.0 - x * x))
asinh(x) Inherited from log(x + sqrt(x * x + 1.0))
atan(x) 4096 ULP 5 ULP
atan2(y, x) When y is finite and normal, inherited from atan(y / x)
atanh(x) Inherited from log( (1.0 + x) / (1.0 - x) ) * 0.5
ceil(x) Correctly rounded
clamp(x,low,high) Correctly rounded.
Note: The infinitely precise result is computed using either the min-max formulation, or the median-of-3-values formulation. These may differ when low > high.
cos(x) Absolute error at most 2^-11 when x is in the interval [-π, π] Absolute error at most 2^-7 when x is in the interval [-π, π]
cosh(x) Inherited from (exp(x) + exp(-x)) * 0.5
cross(x, y) Inherited from (x[i] * y[j] - x[j] * y[i])
degrees(x) Inherited from x * 57.295779513082322865
distance(x, y) Inherited from length(x - y)
dot(x, y) Inherited from sum of x[i] * y[i]
exp(x) 3 + 2 * |x| ULP 1 + 2 * |x| ULP
exp2(x) 3 + 2 * |x| ULP 1 + 2 * |x| ULP
faceForward(x, y, z) Inherited from select(-x, x, dot(z, y) < 0.0)
floor(x) Correctly rounded
fma(x, y, z) Inherited from x * y + z
fract(x) Inherited from x - floor(x)
frexp(x) Correctly rounded
inverseSqrt(x) 2 ULP
ldexp(x, y) Correctly rounded
length(x) Inherited from sqrt(dot(x, x)) in the vector case, and sqrt(x*x) in the scalar case.
log(x) f32: Absolute error at most 2^-21 when x is in the interval [0.5, 2.0]; 3 ULP when x is outside the interval [0.5, 2.0]. f16: Absolute error at most 2^-7 when x is in the interval [0.5, 2.0]; 3 ULP when x is outside the interval [0.5, 2.0].
log2(x) f32: Absolute error at most 2^-21 when x is in the interval [0.5, 2.0]; 3 ULP when x is outside the interval [0.5, 2.0]. f16: Absolute error at most 2^-7 when x is in the interval [0.5, 2.0]; 3 ULP when x is outside the interval [0.5, 2.0].
max(x, y) Correctly rounded
min(x, y) Correctly rounded
mix(x, y, z) Inherited from x * (1.0 - z) + y * z
modf(x) Correctly rounded
normalize(x) Inherited from x / length(x)
pack4x8snorm(x) Correctly rounded intermediate value. Correct result.
pack4x8unorm(x) Correctly rounded intermediate value. Correct result.
pack2x16snorm(x) Correctly rounded intermediate value. Correct result.
pack2x16unorm(x) Correctly rounded intermediate value. Correct result.
pack2x16float(x) Correctly rounded intermediate value. Correct result.
pow(x, y) Inherited from exp2(y * log2(x))
quantizeToF16(x) Correctly rounded
radians(x) Inherited from x * 0.017453292519943295474
reflect(x, y) Inherited from x - 2.0 * dot(x, y) * y
refract(x, y, z) Inherited from z * x - (z * dot(y, x) + sqrt(k)) * y,
where k = 1.0 - z * z * (1.0 - dot(y, x) * dot(y, x))
If k < 0.0 the result is precisely 0.0
round(x) Correctly rounded
sign(x) Correctly rounded
sin(x) Absolute error at most 2^-11 when x is in the interval [-π, π] Absolute error at most 2^-7 when x is in the interval [-π, π]
sinh(x) Inherited from (exp(x) - exp(-x)) * 0.5
saturate(x) Correctly rounded
smoothstep(low, high, x) Inherited from t * t * (3.0 - 2.0 * t),
where t = clamp((x - low) / (high - low), 0.0, 1.0)
sqrt(x) Inherited from 1.0 / inverseSqrt(x)
step(edge, x) Correctly rounded
tan(x) Inherited from sin(x) / cos(x)
tanh(x) Inherited from sinh(x) / cosh(x)
trunc(x) Correctly rounded
unpack4x8snorm(x) Correctly rounded
unpack4x8unorm(x) Correctly rounded
unpack2x16snorm(x) Correctly rounded
unpack2x16unorm(x) Correctly rounded
unpack2x16float(x) Correctly rounded
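The note on clamp in the table above says the min-max and median-of-3-values formulations may differ when low > high. The Python sketch below shows one such case. Both function names are hypothetical, and the inputs are chosen only to make the difference visible.

```python
def clamp_min_max(x, low, high):
    """Min-max formulation: min(max(x, low), high)."""
    return min(max(x, low), high)

def clamp_median(x, low, high):
    """Median-of-3-values formulation: the middle of x, low, high."""
    return sorted([x, low, high])[1]

# With low <= high the two formulations agree.
ordered = (clamp_min_max(5.0, 0.0, 10.0), clamp_median(5.0, 0.0, 10.0))

# With low > high they can disagree.
swapped = (clamp_min_max(5.0, 10.0, 0.0), clamp_median(5.0, 10.0, 0.0))

print("low <= high:", ordered)
print("low > high:", swapped)
```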

Reassociation is the reordering of operations in an expression such that the answer is the same if computed exactly. For example:

However, the result may not be the same when computed in floating point. The reassociated result may be inaccurate due to approximation, or may trigger an overflow or NaN when computing intermediate results.

An implementation may reassociate operations.
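The effect of reassociation on floating point results can be observed directly. The Python sketch below uses binary64 floats rather than f32 or f16, so the specific values are only illustrative; the variable names are assumptions. It shows one case where regrouping an addition changes the rounded result, and one where it changes whether an intermediate result overflows.

```python
import math

# Regrouping changes the rounded result.
a, b, c = 0.1, 0.2, 0.3
grouped_left = (a + b) + c
grouped_right = a + (b + c)

# Regrouping changes whether an intermediate result overflows.
big = 1e308
overflowing = (big + big) - big   # big + big exceeds the finite range first
non_overflowing = big + (big - big)

print("results differ:", grouped_left != grouped_right)
print("left grouping overflowed:", math.isinf(overflowing))
print("right grouping stayed finite:", math.isfinite(non_overflowing))
```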

An implementation may fuse operations if the transformed expression is at least as accurate as the original formulation. For example, some fused multiply-add implementations can be more accurate than performing a multiply followed by an addition.

13.6.2. Floating Point Conversion

In this section, a floating point type may be any of:

Note: Recall that the f32 WGSL type corresponds to the IEEE-754 binary32 format, and the f16 WGSL type corresponds to the IEEE-754 binary16 format.

When converting a floating point scalar value to an integer scalar type:

Note: In other words, floating point to integer conversion rounds toward zero, then saturates.

Note: The result in the overflow case may not yield the value with the maximum magnitude in the target type, because that value may not be exactly representable in the original floating point type. For example, the maximum value in u32 is 4294967295, but 4294967295.0 is not exactly representable in f32. For any real number x with 4294967040 ≤ x ≤ 4294967295, the f32 value nearest to x is either larger than 4294967295 or rounds down to 4294967040. Therefore the maximum u32 value resulting from a floating point conversion is 4294967040u.
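The behavior described in the two notes above can be modeled as truncation followed by saturation to the largest u32 value that f32 can represent exactly. The Python sketch below is such a model for finite inputs only. The function names are hypothetical, and rounding the input to binary32 relies on the struct module's round-to-nearest conversion.

```python
import math
import struct

# Largest u32 value exactly representable in f32 (2**32 - 256).
MAX_U32_IN_F32 = 4294967040

def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_to_u32(x):
    """Model f32 -> u32 conversion: round toward zero, then saturate.

    Assumes x is finite; NaN handling is not modeled here.
    """
    truncated = math.trunc(to_f32(x))
    if truncated < 0:
        return 0
    if truncated > MAX_U32_IN_F32:
        return MAX_U32_IN_F32
    return truncated

print("3.99 ->", f32_to_u32(3.99))
print("-2.5 ->", f32_to_u32(-2.5))
print("4294967295.0 ->", f32_to_u32(4294967295.0))
```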

When converting a value to a floating point type:

Note: An integer value may lie between two adjacent representable floating point values. In particular, the f32 type uses 23 explicit fractional bits. Additionally, when the floating point value is in the normal range (the exponent is neither extreme value), then the mantissa is the set of fractional bits together with an extra 1-bit at the most significant position at bit position 23. Then, for example, integers 2^28 and 1+2^28 both map to the same floating point value: the difference in the least significant 1 bit is not representable by the floating point format. This kind of collision occurs for pairs of adjacent integers with a magnitude of at least 2^25.
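The collision in this note can be reproduced by rounding integers to binary32. The Python sketch below does so with the struct module, which rounds to nearest; WGSL itself does not specify a rounding mode, so this is only one possible outcome. The helper name is an assumption.

```python
import struct

def int_to_f32(value):
    """Convert an integer to the nearest binary32 value, as a Python float."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]

first = int_to_f32(2 ** 28)
second = int_to_f32(2 ** 28 + 1)
print("2^28 and 1+2^28 collide:", first == second)

# A small integer still converts exactly.
small = int_to_f32(2 ** 23 + 1)
print("2^23 + 1 is preserved:", small == 2 ** 23 + 1)
```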

Note: The original value is always within range of the destination type when the original type is one of i32 or u32 and the destination type is f32.

Note: The original value is always within range of the destination type when the source type is a floating point type with fewer exponent and mantissa bits than the target floating point type.

Check behavior of the f32 to f16 conversion for numbers just beyond the max normal f16 values. I’ve written what an NVIDIA GPU does. See https://github.com/google/amber/pull/918 for an executable test case.

14. Memory Model

In general, WGSL follows the Vulkan Memory Model. The remainder of this section describes how WGSL programs map to the Vulkan Memory Model.

Note: The Vulkan Memory Model is a textual version of a formal Alloy model.

14.1. Memory Operation

In WGSL, a read access is equivalent to a memory read operation in the Vulkan Memory Model. In WGSL, a write access is equivalent to a memory write operation in the Vulkan Memory Model.

A read access occurs when an invocation executes one of the following:

A write access occurs when an invocation executes one of the following:

Atomic read-modify-write built-in functions perform a single memory operation that is both a read access and a write access.

Read and write accesses do not occur under any other circumstances. Read and write accesses are collectively known as memory operations in the Vulkan Memory Model.

A memory operation accesses exactly the set of locations associated with the particular memory view used in the operation. For example, a memory read that accesses a u32 from a struct containing multiple members only reads the memory locations associated with that u32 member.

EXAMPLE: Accessing memory locations
struct S {
  a : f32,
  b : u32,
  c : f32
}

@group(0) @binding(0)
var<storage> v : S;

fn foo() {
  let x = v.b; // Does not access memory locations for v.a or v.c.
}

14.2. Memory Model Reference

Each module-scope variable in WGSL forms a unique memory model reference for the lifetime of a given entry point. Each function-scope variable in WGSL forms a unique memory model reference for the lifetime of the variable.

14.3. Scoped Operations

When an invocation performs a scoped operation, it will affect one or two sets of invocations. These sets are the memory scope and the execution scope. The memory scope specifies the set of invocations that will see any updates to memory contents affected by the operation. For synchronization built-in functions, this also means that all affected memory operations program ordered before the function are visible to affected operations program ordered after the function. The execution scope specifies the set of invocations which may participate in an operation (see § 13.5 Collective Operations).

Atomic built-in functions map to atomic operations whose memory scope is:

Synchronization built-in functions map to control barriers whose execution and memory scopes are Workgroup.

Implicit and explicit derivatives have an implicit quad execution scope.

Note: If the Vulkan memory model is not enabled in generated shaders, Device scope should be used instead of QueueFamily.

14.4. Memory Semantics

All Atomic built-in functions use Relaxed memory semantics and, thus, no address space semantics.

workgroupBarrier uses AcquireRelease memory semantics and WorkgroupMemory semantics. storageBarrier uses AcquireRelease memory semantics and UniformMemory semantics.

Note: A combined workgroupBarrier and storageBarrier uses AcquireRelease ordering semantics and both WorkgroupMemory and UniformMemory memory semantics.

Note: No atomic or synchronization built-in functions use MakeAvailable or MakeVisible semantics.

14.5. Private vs Non-private

All non-atomic read accesses in the storage or workgroup address spaces are considered non-private and correspond to read operations with NonPrivatePointer | MakePointerVisible memory operands with the Workgroup scope.

All non-atomic write accesses in the storage or workgroup address spaces are considered non-private and correspond to write operations with NonPrivatePointer | MakePointerAvailable memory operands with the Workgroup scope.

https://github.com/gpuweb/gpuweb/issues/1621

15. Keyword and Token Summary

15.1. Keyword Summary

15.1.1. Type-defining Keywords

array :

| 'array'

atomic :

| 'atomic'

bool :

| 'bool'

float32 :

| 'f32'

float16 :

| 'f16'

int32 :

| 'i32'

mat2x2 :

| 'mat2x2'

mat2x3 :

| 'mat2x3'

mat2x4 :

| 'mat2x4'

mat3x2 :

| 'mat3x2'

mat3x3 :

| 'mat3x3'

mat3x4 :

| 'mat3x4'

mat4x2 :

| 'mat4x2'

mat4x3 :

| 'mat4x3'

mat4x4 :

| 'mat4x4'

pointer :

| 'ptr'

sampler :

| 'sampler'

sampler_comparison :

| 'sampler_comparison'

texture_1d :

| 'texture_1d'

texture_2d :

| 'texture_2d'

texture_2d_array :

| 'texture_2d_array'

texture_3d :

| 'texture_3d'

texture_cube :

| 'texture_cube'

texture_cube_array :

| 'texture_cube_array'

texture_multisampled_2d :

| 'texture_multisampled_2d'

texture_storage_1d :

| 'texture_storage_1d'

texture_storage_2d :

| 'texture_storage_2d'

texture_storage_2d_array :

| 'texture_storage_2d_array'

texture_storage_3d :

| 'texture_storage_3d'

texture_depth_2d :

| 'texture_depth_2d'

texture_depth_2d_array :

| 'texture_depth_2d_array'

texture_depth_cube :

| 'texture_depth_cube'

texture_depth_cube_array :

| 'texture_depth_cube_array'

texture_depth_multisampled_2d :

| 'texture_depth_multisampled_2d'

uint32 :

| 'u32'

vec2 :

| 'vec2'

vec3 :

| 'vec3'

vec4 :

| 'vec4'

15.1.2. Other Keywords

bitcast :

| 'bitcast'

break :

| 'break'

case :

| 'case'

const :

| 'const'

continue :

| 'continue'

continuing :

| 'continuing'

default :

| 'default'

discard :

| 'discard'

else :

| 'else'

enable :

| 'enable'

false :

| 'false'

fn :

| 'fn'

for :

| 'for'

if :

| 'if'

let :

| 'let'

loop :

| 'loop'

override :

| 'override'

return :

| 'return'

static_assert :

| 'static_assert'

struct :

| 'struct'

switch :

| 'switch'

true :

| 'true'

type :

| 'type'

var :

| 'var'

while :

| 'while'

15.2. Reserved Words

A reserved word is a token which is reserved for future use. A WGSL program must not contain a reserved word.

The following are reserved words:

_reserved :

| 'CompileShader'

| 'ComputeShader'

| 'DomainShader'

| 'GeometryShader'

| 'Hullshader'

| 'NULL'

| 'Self'

| 'abstract'

| 'active'

| 'alignas'

| 'alignof'

| 'as'

| 'asm'

| 'asm_fragment'

| 'async'

| 'attribute'

| 'auto'

| 'await'

| 'become'

| 'binding_array'

| 'cast'

| 'catch'

| 'class'

| 'co_await'

| 'co_return'

| 'co_yield'

| 'coherent'

| 'column_major'

| 'common'

| 'compile'

| 'compile_fragment'

| 'concept'

| 'const_cast'

| 'consteval'

| 'constexpr'

| 'constinit'

| 'crate'

| 'debugger'

| 'decltype'

| 'delete'

| 'demote'

| 'demote_to_helper'

| 'do'

| 'dynamic_cast'

| 'enum'

| 'explicit'

| 'export'

| 'extends'

| 'extern'

| 'external'

| 'fallthrough'

| 'filter'

| 'final'

| 'finally'

| 'friend'

| 'from'

| 'fxgroup'

| 'get'

| 'goto'

| 'groupshared'

| 'handle'

| 'highp'

| 'impl'

| 'implements'

| 'import'

| 'inline'

| 'inout'

| 'instanceof'

| 'interface'

| 'layout'

| 'line'

| 'lineadj'

| 'lowp'

| 'macro'

| 'macro_rules'

| 'match'

| 'mediump'

| 'meta'

| 'mod'

| 'module'

| 'move'

| 'mut'

| 'mutable'

| 'namespace'

| 'new'

| 'nil'

| 'noexcept'

| 'noinline'

| 'nointerpolation'

| 'noperspective'

| 'null'

| 'nullptr'

| 'of'

| 'operator'

| 'package'

| 'packoffset'

| 'partition'

| 'pass'

| 'patch'

| 'pixelfragment'

| 'point'

| 'precise'

| 'precision'

| 'premerge'

| 'priv'

| 'protected'

| 'pub'

| 'public'

| 'readonly'

| 'ref'

| 'regardless'

| 'register'

| 'reinterpret_cast'

| 'requires'

| 'resource'

| 'restrict'

| 'self'

| 'set'

| 'shared'

| 'signed'

| 'sizeof'

| 'smooth'

| 'snorm'

| 'static'

| 'static_assert'

| 'static_cast'

| 'std'

| 'subroutine'

| 'super'

| 'target'

| 'template'

| 'this'

| 'thread_local'

| 'throw'

| 'trait'

| 'try'

| 'typedef'

| 'typeid'

| 'typename'

| 'typeof'

| 'union'

| 'unless'

| 'unorm'

| 'unsafe'

| 'unsized'

| 'use'

| 'using'

| 'varying'

| 'virtual'

| 'volatile'

| 'wgsl'

| 'where'

| 'with'

| 'writeonly'

| 'yield'
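A tool that rejects reserved words could work along the following lines. This Python sketch is a simplified illustration, not a WGSL tokenizer: it checks only a small subset of the list above, matches ASCII identifier-like words with a regular expression, and does not skip comments. All names in it are hypothetical.

```python
import re

# A small subset of the reserved words listed above, for illustration only.
RESERVED_SUBSET = {"class", "enum", "goto", "namespace", "typedef", "yield"}

# Simplified ASCII-only pattern; real WGSL identifiers allow more code points.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def find_reserved_words(source):
    """Return the reserved words (from the subset) that appear in source."""
    return [
        match.group(0)
        for match in WORD.finditer(source)
        if match.group(0) in RESERVED_SUBSET
    ]

bad = find_reserved_words("fn main() { var goto : i32; }")
good = find_reserved_words("fn main() { var target_index : i32; }")
print("reserved words found:", bad)
print("reserved words found:", good)
```

Because whole words are matched, an identifier that merely contains a reserved word as a substring is not flagged.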

15.3. Syntactic Tokens

A syntactic token is a sequence of special code points, used:

and :

| '&' (Code point: U+0026)

and_and :

| '&&' (Code points: U+0026 U+0026)

arrow :

| '->' (Code points: U+002D U+003E)

attr :

| '@' (Code point: U+0040)

forward_slash :

| '/' (Code point: U+002F)

bang :

| '!' (Code point: U+0021)

bracket_left :

| '[' (Code point: U+005B)

bracket_right :

| ']' (Code point: U+005D)

brace_left :

| '{' (Code point: U+007B)

brace_right :

| '}' (Code point: U+007D)

colon :

| ':' (Code point: U+003A)

comma :

| ',' (Code point: U+002C)

equal :

| '=' (Code point: U+003D)

equal_equal :

| '==' (Code points: U+003D U+003D)

not_equal :

| '!=' (Code points: U+0021 U+003D)

greater_than :

| '>' (Code point: U+003E)

greater_than_equal :

| '>=' (Code points: U+003E U+003D)

shift_right :

| '>>' (Code points: U+003E U+003E)

less_than :

| '<' (Code point: U+003C)

less_than_equal :

| '<=' (Code points: U+003C U+003D)

shift_left :

| '<<' (Code points: U+003C U+003C)

modulo :

| '%' (Code point: U+0025)

minus :

| '-' (Code point: U+002D)

minus_minus :

| '--' (Code points: U+002D U+002D)

period :

| '.' (Code point: U+002E)

plus :

| '+' (Code point: U+002B)

plus_plus :

| '++' (Code points: U+002B U+002B)

or :

| '|' (Code point: U+007C)

or_or :

| '||' (Code points: U+007C U+007C)

paren_left :

| '(' (Code point: U+0028)

paren_right :

| ')' (Code point: U+0029)

semicolon :

| ';' (Code point: U+003B)

star :

| '*' (Code point: U+002A)

tilde :

| '~' (Code point: U+007E)

underscore :

| '_' (Code point: U+005F)

xor :

| '^' (Code point: U+005E)

plus_equal :

| '+=' (Code points: U+002B U+003D)

minus_equal :

| '-=' (Code points: U+002D U+003D)

times_equal :

| '*=' (Code points: U+002A U+003D)

division_equal :

| '/=' (Code points: U+002F U+003D)

modulo_equal :

| '%=' (Code points: U+0025 U+003D)

and_equal :

| '&=' (Code points: U+0026 U+003D)

or_equal :

| '|=' (Code points: U+007C U+003D)

xor_equal :

| '^=' (Code points: U+005E U+003D)

shift_right_equal :

| '>>=' (Code points: U+003E U+003E U+003D)

shift_left_equal :

| '<<=' (Code points: U+003C U+003C U+003D)

15.4. Context-Dependent Name Tokens

This section lists the tokens used as context-dependent names.

The attribute names are:

The interpolation type names are:

interpolation_type_name :

| 'perspective'

| 'linear'

| 'flat'

The interpolation sampling names are:

interpolation_sample_name :

| 'center'

| 'centroid'

| 'sample'

The built-in value names are:

builtin_value_name :

| 'vertex_index'

| 'instance_index'

| 'position'

| 'front_facing'

| 'frag_depth'

| 'local_invocation_id'

| 'local_invocation_index'

| 'global_invocation_id'

| 'workgroup_id'

| 'num_workgroups'

| 'sample_index'

| 'sample_mask'

The access mode names are:

access_mode :

| 'read'

| 'write'

| 'read_write'

The address space names are:

address_space :

| 'function'

| 'private'

| 'workgroup'

| 'uniform'

| 'storage'

The texel format names are:

texel_format :

| 'rgba8unorm'

| 'rgba8snorm'

| 'rgba8uint'

| 'rgba8sint'

| 'rgba16uint'

| 'rgba16sint'

| 'rgba16float'

| 'r32uint'

| 'r32sint'

| 'r32float'

| 'rg32uint'

| 'rg32sint'

| 'rg32float'

| 'rgba32uint'

| 'rgba32sint'

| 'rgba32float'

The extension names are:

extension_name :

| 'f16'

The swizzle names are used in vector access expressions:

swizzle_name :

| '/[rgba]/'

| '/[rgba][rgba]/'

| '/[rgba][rgba][rgba]/'

| '/[rgba][rgba][rgba][rgba]/'

| '/[xyzw]/'

| '/[xyzw][xyzw]/'

| '/[xyzw][xyzw][xyzw]/'

| '/[xyzw][xyzw][xyzw][xyzw]/'
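
As an illustration of the swizzle names above, the following sketch builds new values from the components of an existing vector. The function name swizzle_demo and the variable names are hypothetical, chosen only for this example.

```wgsl
fn swizzle_demo() -> vec4<f32> {
  let v = vec4<f32>(1.0, 2.0, 3.0, 4.0);
  let a = v.x;     // f32: 1.0
  let b = v.zw;    // vec2<f32>(3.0, 4.0)
  let c = v.bgr;   // vec3<f32>(3.0, 2.0, 1.0)
  let d = v.xxyy;  // vec4<f32>(1.0, 1.0, 2.0, 2.0)
  // Each swizzle name draws its letters from a single set, either rgba or
  // xyzw. A mixed name such as v.xg does not match any swizzle_name above.
  return vec4<f32>(a, b, c.r) + d;
}
```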

16. Built-in Values

The following table lists the available built-in values.

See § 10.3.1.1 Built-in Inputs and Outputs for how to declare a built-in value.

Built-in input and output values
Name Stage Input or Output Type Description
vertex_index vertex input u32 Index of the current vertex within the current API-level draw command, independent of draw instancing.

For a non-indexed draw, the first vertex has an index equal to the firstVertex argument of the draw, whether provided directly or indirectly. The index is incremented by one for each additional vertex in the draw instance.

For an indexed draw, the index is equal to the index buffer entry for the vertex, plus the baseVertex argument of the draw, whether provided directly or indirectly.

instance_index vertex input u32 Instance index of the current vertex within the current API-level draw command.

The first instance has an index equal to the firstInstance argument of the draw, whether provided directly or indirectly. The index is incremented by one for each additional instance in the draw.

position vertex output vec4<f32> Output position of the current vertex, using homogeneous coordinates. After homogeneous normalization (where each of the x, y, and z components is divided by the w component), the position is in the WebGPU normalized device coordinate space. See WebGPU § Coordinate Systems.
fragment input vec4<f32> Framebuffer position of the current fragment in framebuffer space. (The x, y, and z components have already been scaled such that w is now 1.) See WebGPU § Coordinate Systems.
front_facing fragment input bool True when the current fragment is on a front-facing primitive. False otherwise. See WebGPU § Front-facing.
frag_depth fragment output f32 Updated depth of the fragment, in the viewport depth range. See WebGPU § Coordinate Systems.
local_invocation_id compute input vec3<u32> The current invocation’s local invocation ID, i.e. its position in the workgroup grid.
local_invocation_index compute input u32 The current invocation’s local invocation index, a linearized index of the invocation’s position within the workgroup grid.
global_invocation_id compute input vec3<u32> The current invocation’s global invocation ID, i.e. its position in the compute shader grid.
workgroup_id compute input vec3<u32> The current invocation’s workgroup ID, i.e. the position of the workgroup in the compute shader grid.
num_workgroups compute input vec3<u32> The dispatch size, vec3<u32>(group_count_x, group_count_y, group_count_z), of the compute shader dispatched by the API.
sample_index fragment input u32 Sample index for the current fragment. The value is at least 0 and at most sampleCount-1, where sampleCount is the number of MSAA samples specified for the GPU render pipeline.
See WebGPU § GPURenderPipeline.
sample_mask fragment input u32 Sample coverage mask for the current fragment. It contains a bitmask indicating which samples in this fragment are covered by the primitive being rendered.
See WebGPU § Sample Masking.
fragment output u32 Sample coverage mask control for the current fragment. The last value written to this variable becomes the shader-output mask. Zero bits in the written value will cause corresponding samples in the color attachments to be discarded.
See WebGPU § Sample Masking.
EXAMPLE: Declaring built-in values
 struct VertexOutput {
   @builtin(position) my_pos: vec4<f32>
 }

 @vertex
 fn vs_main(
   @builtin(vertex_index) my_index: u32,
   @builtin(instance_index) my_inst_index: u32,
 ) -> VertexOutput {
   // Placeholder body: a function with a return type must return a value.
   return VertexOutput(vec4<f32>(0.0, 0.0, 0.0, 1.0));
 }

 struct FragmentOutput {
   @builtin(frag_depth) depth: f32,
   @builtin(sample_mask) mask_out: u32
 }

 @fragment
 fn fs_main(
   @builtin(front_facing) is_front: bool,
   @builtin(position) coord: vec4<f32>,
   @builtin(sample_index) my_sample_index: u32,
   @builtin(sample_mask) mask_in: u32,
 ) -> FragmentOutput {
   // Placeholder body: pass the incoming depth and coverage mask through.
   return FragmentOutput(coord.z, mask_in);
 }

 @compute @workgroup_size(64)
 fn cs_main(
   @builtin(local_invocation_id) local_id: vec3<u32>,
   @builtin(local_invocation_index) local_index: u32,
   @builtin(global_invocation_id) global_id: vec3<u32>,
 ) {}
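
The compute built-in values in the table are related to one another. The sketch below recomputes global_invocation_id and local_invocation_index from the other inputs and records whether they agree. The entry point name cs_ids, the buffer out_data, its binding numbers, and the 8×4×1 workgroup size are assumptions made for illustration; the sketch also assumes a dispatch whose z extent is 1.

```wgsl
// Hypothetical output buffer, one u32 per invocation.
@group(0) @binding(0) var<storage, read_write> out_data: array<u32>;

@compute @workgroup_size(8, 4, 1)
fn cs_ids(
  @builtin(local_invocation_id) local_id: vec3<u32>,
  @builtin(local_invocation_index) local_index: u32,
  @builtin(global_invocation_id) global_id: vec3<u32>,
  @builtin(workgroup_id) group_id: vec3<u32>,
  @builtin(num_workgroups) num_groups: vec3<u32>,
) {
  // Must match the @workgroup_size attribute above.
  let size = vec3<u32>(8u, 4u, 1u);

  // Position in the compute shader grid, rebuilt from workgroup and local IDs.
  let recomputed_global = group_id * size + local_id;

  // Linearized position within the workgroup grid.
  let recomputed_index =
      local_id.x + local_id.y * size.x + local_id.z * size.x * size.y;

  // Row-major slot for this invocation (assumes a dispatch with z extent 1).
  let row_width = num_groups.x * size.x;
  let slot = global_id.y * row_width + global_id.x;

  if (slot < arrayLength(&out_data)) {
    let same = all(recomputed_global == global_id) &&
               (recomputed_index == local_index);
    out_data[slot] = select(0u, 1u, same);
  }
}
```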

17. Built-in Functions

Certain functions are predeclared, provided by the implementation, and therefore always available for use in a WGSL program. These are called built-in functions.

A built-in function is a family of functions, all with the same name, but distinguished by the number, order, and types of their formal parameters. Each of these distinct function variations is an overload.

Note: Each user-defined function only has one overload.

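Each overload is described below via its signature (Overload), its type parameterizations, if any (Parameterization), and its behavior (Description).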

When calling a built-in function, all arguments to the function are evaluated before function evaluation begins. See § 9.2 Function Calls.

17.1. Logical Built-in Functions

17.1.1. all

Overload
@const fn all(e: vecN<bool>) -> bool
Description Returns true if each component of e is true.
Overload
@const fn all(e: bool) -> bool
Description Returns e.

17.1.2. any

Overload
@const fn any(e: vecN<bool>) -> bool
Description Returns true if any component of e is true.
Overload
@const fn any(e: bool) -> bool
Description Returns e.

17.1.3. select

Overload
@const fn select(f: T,
                 t: T,
                 cond: bool) -> T
Parameterization T is scalar or vector
Description Returns t when cond is true, and f otherwise.
Overload
@const fn select(f: vecN<T>,
                 t: vecN<T>,
                 cond: vecN<bool>) -> vecN<T>
Parameterization T is scalar
Description Component-wise selection. Result component i is evaluated as select(f[i], t[i], cond[i]).
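
A short sketch of both select overloads follows. Note the argument order: the value returned for a false condition comes first. The function and variable names are hypothetical.

```wgsl
fn select_demo() -> vec3<i32> {
  let s = select(10i, 20i, true);   // 20, because cond is true

  let f = vec3<i32>(1, 2, 3);
  let t = vec3<i32>(10, 20, 30);

  // Scalar condition with vector operands: the whole vector is chosen.
  let whole = select(f, t, false);  // vec3<i32>(1, 2, 3)

  // Vector condition: each component is chosen independently.
  let mixed = select(f, t, vec3<bool>(true, false, true));
  // mixed is vec3<i32>(10, 2, 30)
  return mixed;
}
```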

17.2. Array Built-in Functions

17.2.1. arrayLength

Overload
fn arrayLength(p: ptr<storage, array<E>, AM>) -> u32
Parameterization E is an element type for a runtime-sized array,
access mode AM is read or read_write
Description Returns the number of elements in the runtime-sized array.
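
A common use of arrayLength is to guard accesses to a runtime-sized array. In this sketch, the structure Particles, the variable particles, its binding numbers, and the entry point clear_particles are all hypothetical.

```wgsl
struct Particles {
  count_hint: u32,
  // A runtime-sized array is the last member of its structure.
  items: array<vec4<f32>>,
}

@group(0) @binding(0) var<storage, read_write> particles: Particles;

@compute @workgroup_size(64)
fn clear_particles(@builtin(global_invocation_id) gid: vec3<u32>) {
  // The argument is a pointer to the runtime-sized array itself.
  let n = arrayLength(&particles.items);
  if (gid.x < n) {
    particles.items[gid.x] = vec4<f32>(0.0);
  }
}
```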

17.3. Numeric Built-in Functions

17.3.1. abs

Overload
@const fn abs(e: T ) -> T
Parameterization S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
Description The absolute value of e. Component-wise when T is a vector.

If e is a floating-point type, then the result is e with a positive sign bit. If e is an unsigned integer scalar type, then the result is e. If e is a signed integer scalar type and evaluates to the largest negative value, then the result is e.

17.3.2. acos

Overload
@const fn acos(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the principal value, in radians, of the inverse cosine (cos⁻¹) of e.
That is, approximates x with 0 ≤ x ≤ π, such that cos(x) = e.

Component-wise when T is a vector.

Note: The result is not mathematically meaningful when abs(e) > 1.

17.3.3. acosh

Overload
@const fn acosh(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the inverse hyperbolic cosine (cosh⁻¹) of e, as a hyperbolic angle in radians.
That is, approximates x with 0 ≤ x ≤ ∞, such that cosh(x) = e.

The result is 0 when e < 1.

Component-wise when T is a vector.

Note: The result is not mathematically meaningful when e < 1.

17.3.4. asin

Overload
@const fn asin(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the principal value, in radians, of the inverse sine (sin⁻¹) of e.
That is, approximates x with -π/2 ≤ x ≤ π/2, such that sin(x) = e.

Component-wise when T is a vector.

Note: The result is not mathematically meaningful when abs(e) > 1.

17.3.5. asinh

Overload
@const fn asinh(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the inverse hyperbolic sine (sinh⁻¹) of e, as a hyperbolic angle in radians.
That is, approximates x such that sinh(x) = e.

Component-wise when T is a vector.

17.3.6. atan

Overload
@const fn atan(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the principal value, in radians, of the inverse tangent (tan⁻¹) of e.
That is, approximates x with -π/2 ≤ x ≤ π/2, such that tan(x) = e.

Component-wise when T is a vector.

17.3.7. atanh

Overload
@const fn atanh(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the inverse hyperbolic tangent (tanh⁻¹) of e, as a hyperbolic angle in radians.
That is, approximates x such that tanh(x) = e.

The result is 0 when abs(e) ≥ 1.

Component-wise when T is a vector.

Note: The result is not mathematically meaningful when abs(e) ≥ 1.

17.3.8. atan2

Overload
@const fn atan2(y: T,
                x: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns an angle, in radians, in the interval [-π, π] whose tangent is y÷x.

The quadrant selected by the result depends on the signs of y and x. For example, the function may be implemented as:

  • atan(y/x) when x > 0

  • atan(y/x) + π when (x < 0) and (y > 0)

  • atan(y/x) - π when (x < 0) and (y < 0)

Note: atan2 is ill-defined at the origin (x,y) = (0,0)

Component-wise when T is a vector.
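
The following sketch shows how atan2 uses the signs of both arguments to pick a quadrant, which atan(y / x) alone cannot do. The function and variable names are hypothetical, and the commented values are mathematical values that an implementation approximates.

```wgsl
fn atan2_demo() -> vec3<f32> {
  let q1 = atan2(1.0, 1.0);     // approximately  π/4
  let q2 = atan2(1.0, -1.0);    // approximately  3π/4
  let q3 = atan2(-1.0, -1.0);   // approximately -3π/4

  // The ratio y / x is 1.0 for both q1 and q3, so atan cannot tell them apart.
  let ambiguous = atan(-1.0 / -1.0);  // approximately π/4

  return vec3<f32>(q1, q2, q3);
}
```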

17.3.9. ceil

Overload
@const fn ceil(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the ceiling of e. Component-wise when T is a vector.

17.3.10. clamp

Overload
@const fn clamp(e: T,
                low: T,
                high: T) -> T
Parameterization S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
Description Restricts the value of e within a range.

If T is an integer type, then the result is min(max(e, low), high).

If T is a floating-point type, then the result is either min(max(e, low), high), or the median of the three values e, low, high.

Component-wise when T is a vector.

If low is greater than high, then:

17.3.11. cos

Overload
@const fn cos(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the cosine of e, where e is in radians. Component-wise when T is a vector.

17.3.12. cosh

Overload
@const fn cosh(arg: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the hyperbolic cosine of arg, where arg is a hyperbolic angle in radians. Approximates the pure mathematical function (e^arg + e^−arg) ÷ 2, but not necessarily computed that way.

Component-wise when T is a vector.

17.3.13. countLeadingZeros

Overload
@const fn countLeadingZeros(e: T) -> T
Parameterization T is i32, u32, vecN<i32>, or vecN<u32>
Description The number of consecutive 0 bits starting from the most significant bit of e, when T is a scalar type.
Component-wise when T is a vector.
Also known as "clz" in some languages.

17.3.14. countOneBits

Overload
@const fn countOneBits(e: T) -> T
Parameterization T is i32, u32, vecN<i32>, or vecN<u32>
Description The number of 1 bits in the representation of e.
Also known as "population count".
Component-wise when T is a vector.

17.3.15. countTrailingZeros

Overload
@const fn countTrailingZeros(e: T) -> T
Parameterization T is i32, u32, vecN<i32>, or vecN<u32>
Description The number of consecutive 0 bits starting from the least significant bit of e, when T is a scalar type.
Component-wise when T is a vector.
Also known as "ctz" in some languages.
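
The three bit-counting functions can be compared on a single value. In this sketch the function and variable names are hypothetical; the commented results follow from the descriptions above for a 32-bit value.

```wgsl
fn bit_count_demo() -> vec3<u32> {
  let e = 0x00F0u;  // only bits 4 through 7 are set

  let lead = countLeadingZeros(e);    // 24: bits 31 down to 8 are zero
  let ones = countOneBits(e);         // 4
  let trail = countTrailingZeros(e);  // 4: bits 0 through 3 are zero

  return vec3<u32>(lead, ones, trail);
}
```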

17.3.16. cross

Overload
@const fn cross(e1: vec3<T>,
                e2: vec3<T>) -> vec3<T>
Parameterization T is AbstractFloat, f32, or f16
Description Returns the cross product of e1 and e2.

17.3.17. degrees

Overload
@const fn degrees(e1: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Converts radians to degrees, approximating e1 × 180 ÷ π. Component-wise when T is a vector.

17.3.18. determinant

Overload
@const fn determinant(e: matCxC<T>) -> T
Parameterization T is AbstractFloat, f32, or f16
Description Returns the determinant of e.

17.3.19. distance

Overload
@const fn distance(e1: T,
                   e2: T) -> S
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the distance between e1 and e2 (e.g. length(e1 - e2)).

17.3.20. dot

Overload
@const fn dot(e1: vecN<T>,
              e2: vecN<T>) -> T
Parameterization T is AbstractInt, AbstractFloat, i32, u32, f32, or f16
Description Returns the dot product of e1 and e2.

17.3.21. exp

Overload
@const fn exp(e1: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the natural exponentiation of e1 (e.g. e^e1). Component-wise when T is a vector.

17.3.22. exp2

Overload
@const fn exp2(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns 2 raised to the power e (e.g. 2^e). Component-wise when T is a vector.

17.3.23. extractBits (signed)

Overload
@const fn extractBits(e: T,
                      offset: u32,
                      count: u32) -> T
Parameterization T is i32 or vecN<i32>
Description Reads bits from an integer, with sign extension.

When T is a scalar type, then:

  • w is the bit width of T
  • o = min(offset, w)
  • c = min(count, w - o)
  • The result is 0 if c is 0.
  • Otherwise, bits 0..c - 1 of the result are copied from bits o..o + c - 1 of e. Other bits of the result are the same as bit c - 1 of the result.
Component-wise when T is a vector.

If count + offset is greater than w, then:

17.3.24. extractBits (unsigned)

Overload
@const fn extractBits(e: T,
                      offset: u32,
                      count: u32) -> T
Parameterization T is u32 or vecN<u32>
Description Reads bits from an integer, without sign extension.

When T is a scalar type, then:

  • w is the bit width of T
  • o = min(offset, w)
  • c = min(count, w - o)
  • The result is 0 if c is 0.
  • Otherwise, bits 0..c - 1 of the result are copied from bits o..o + c - 1 of e. Other bits of the result are 0.
Component-wise when T is a vector.

If count + offset is greater than w, then:

17.3.25. faceForward

Overload
@const fn faceForward(e1: T,
                      e2: T,
                      e3: T) -> T
Parameterization T is vecN<AbstractFloat>, vecN<f32>, or vecN<f16>
Description Returns e1 if dot(e2, e3) is negative, and -e1 otherwise.

17.3.26. firstLeadingBit (signed)

Overload
@const fn firstLeadingBit(e: T) -> T
Parameterization T is i32 or vecN<i32>
Description For scalar T, the result is:
  • -1 if e is 0 or -1.
  • Otherwise the position of the most significant bit in e that is different from e's sign bit.

Component-wise when T is a vector.

Note: Since signed integers use two's-complement representation, the sign bit appears in the most significant bit position.

17.3.27. firstLeadingBit (unsigned)

Overload
@const fn firstLeadingBit(e: T) -> T
Parameterization T is u32 or vecN<u32>
Description For scalar T, the result is:
  • T(-1) if e is zero.
  • Otherwise the position of the most significant 1 bit in e.
Component-wise when T is a vector.

17.3.28. firstTrailingBit

Overload
@const fn firstTrailingBit(e: T) -> T
Parameterization T is i32, u32, vecN<i32>, or vecN<u32>
Description For scalar T, the result is:
  • T(-1) if e is zero.
  • Otherwise the position of the least significant 1 bit in e.
Component-wise when T is a vector.
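
The bit-search functions report bit positions rather than counts. The sketch below uses hypothetical function and variable names; the commented results follow from the descriptions above.

```wgsl
fn first_bit_demo() -> vec4<i32> {
  let high = firstLeadingBit(0x00F0u);   // 7u: most significant 1 bit
  let low = firstTrailingBit(0x00F0u);   // 4u: least significant 1 bit

  // Signed overload: -16 has bits 4 through 31 set, matching its sign bit.
  // The most significant bit that differs from the sign bit is bit 3.
  let neg = -16i;
  let signed_high = firstLeadingBit(neg);  // 3

  // No 1 bit is present, so the result is u32(-1), i.e. 0xFFFFFFFFu.
  let none = firstLeadingBit(0u);

  return vec4<i32>(i32(high), i32(low), signed_high, i32(none));
}
```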

17.3.29. floor

Overload
@const fn floor(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the floor of e. Component-wise when T is a vector.

17.3.30. fma

Overload
@const fn fma(e1: T,
              e2: T,
              e3: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns e1 * e2 + e3. Component-wise when T is a vector.

17.3.31. fract

Overload
@const fn fract(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the fractional part of e, computed as e - floor(e).
Component-wise when T is a vector.

Note: Valid results are in the closed interval [0, 1.0]. For example, if e is a very small negative number, then fract(e) may be 1.0.
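
Because fract is computed as e - floor(e), its result for a negative input is not simply the digits after the decimal point. A small sketch with hypothetical names:

```wgsl
fn fract_demo() -> vec2<f32> {
  let a = fract(1.25);   // 1.25 - 1.0 = 0.25
  let b = fract(-1.25);  // -1.25 - (-2.0) = 0.75
  return vec2<f32>(a, b);
}
```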

17.3.32. frexp

Overload
@const fn frexp(e: T) -> __frexp_result
Parameterization T is f32
Description Splits e into a significand and exponent of the form significand * 2^exponent. Returns the __frexp_result built-in structure, defined as follows:
struct __frexp_result {
  sig : f32, // significand part
  exp : i32  // exponent part
}
The magnitude of the significand is in the range of [0.5, 1.0) or 0.
EXAMPLE: frexp usage
// Infers result type
let sig_and_exp = frexp(1.5);
// Sets fraction_direct to 0.75
let fraction_direct = frexp(1.5).sig;

Note: A value cannot be explicitly declared with the type __frexp_result, but a value may infer the type.

Overload
@const fn frexp(e: T) -> __frexp_result_f16
Parameterization T is f16
Description Splits e into a significand and exponent of the form significand * 2^exponent. Returns the __frexp_result_f16 built-in structure, defined as if declared as follows:
struct __frexp_result_f16 {
  sig : f16, // significand part
  exp : i32  // exponent part
}
The magnitude of the significand is in the range of [0.5, 1.0) or 0.

Note: A value cannot be explicitly declared with the type __frexp_result_f16, but a value may infer the type.

Overload
@const fn frexp(e: T) -> __frexp_result_vecN
Parameterization T is vecN<f32>
Description Splits the components of e into a significand and exponent of the form significand * 2^exponent.

Returns the __frexp_result_vecN built-in structure, defined as follows:

struct __frexp_result_vecN {
  sig : vecN<f32>, // significand part
  exp : vecN<i32>  // exponent part
}
The magnitude of each component of the significand is in the range of [0.5, 1.0) or 0.

Note: A value cannot be explicitly declared with the type __frexp_result_vecN, but a value may infer the type.

Overload
@const fn frexp(e: T) -> __frexp_result_vecN_f16
Parameterization T is vecN<f16>
Description Splits the components of e into a significand and exponent of the form significand * 2^exponent.

Returns the __frexp_result_vecN_f16 built-in structure, defined as if declared as follows:

struct __frexp_result_vecN_f16 {
  sig : vecN<f16>, // significand part
  exp : vecN<i32>  // exponent part
}
The magnitude of each component of the significand is in the range of [0.5, 1.0) or 0.

Note: A value cannot be explicitly declared with the type __frexp_result_vecN_f16, but a value may infer the type.

17.3.33. insertBits

Overload
@const fn insertBits(e: T,
                     newbits: T,
                     offset: u32,
                     count: u32) -> T
Parameterization T is i32, u32, vecN<i32>, or vecN<u32>
Description Sets bits in an integer.

When T is a scalar type, then:

  • w is the bit width of T
  • o = min(offset, w)
  • c = min(count, w - o)
  • The result is e if c is 0.
  • Otherwise, bits o..o + c - 1 of the result are copied from bits 0..c - 1 of newbits. Other bits of the result are copied from e.
Component-wise when T is a vector.
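
The bit-field functions extractBits and insertBits can be illustrated on a value whose hexadecimal digits make the fields easy to see. The function and variable names below are hypothetical; the commented results follow from the rules listed for each function.

```wgsl
fn bitfield_demo() -> vec3<i32> {
  let packed = 0xABCDu;

  // Bits 4 through 7 of 0xABCD hold the hexadecimal digit C.
  let field = extractBits(packed, 4u, 4u);  // 0xCu, i.e. 12u

  // Signed overload: the extracted field is 1100 in binary. Its top bit
  // (bit c - 1 = 3) is 1, so all higher result bits are 1.
  let signed_field = extractBits(i32(packed), 4u, 4u);  // -4

  // Replace bits 4 through 7 with the low four bits of 0x5.
  let replaced = insertBits(packed, 0x5u, 4u, 4u);  // 0xAB5Du

  return vec3<i32>(i32(field), signed_field, i32(replaced));
}
```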

17.3.34. inverseSqrt

Overload
@const fn inverseSqrt(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the reciprocal of sqrt(e). Component-wise when T is a vector.

Note: The result is not mathematically meaningful if e ≤ 0.

17.3.35. ldexp

Overload
@const fn ldexp(e1: T,
                e2: I) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
I is AbstractInt, i32, vecN<AbstractInt>, or vecN<i32>
I is a vector if and only if T is a vector
I is concrete if and only if T is concrete
Description Returns e1 * 2^e2, except:

Here, bias is the exponent bias of the floating point format:

  • 15 for f16

  • 127 for f32

  • 1023 for AbstractFloat, when AbstractFloat is IEEE-754 binary64

If x is zero or a finite normal value for its type, then:

x = ldexp(frexp(x).sig, frexp(x).exp)

Component-wise when T is a vector.
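
The round-trip identity above can be sketched for one finite normal value. The function and variable names are hypothetical.

```wgsl
fn frexp_ldexp_demo() -> f32 {
  let x = 48.0f;

  // 48.0 = 0.75 * 2^6, so parts.sig is 0.75 and parts.exp is 6.
  let parts = frexp(x);

  // Recombines the significand and exponent; per the identity, this is x.
  return ldexp(parts.sig, parts.exp);
}
```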

17.3.36. length

Overload
@const fn length(e: T) -> S
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the length of e.
Evaluates to the absolute value of e if T is scalar.
Evaluates to sqrt(e[0]^2 + e[1]^2 + ...) if T is a vector type.

Note: The scalar case may be evaluated as sqrt(e * e), which may unnecessarily overflow or lose accuracy.

Note: The result is not mathematically meaningful if e ≤ -0.

17.3.37. log

Overload
@const fn log(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the natural logarithm of e. Component-wise when T is a vector.

Note: The result is not mathematically meaningful if e < 0.

17.3.38. log2

Overload
@const fn log2(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the base-2 logarithm of e. Component-wise when T is a vector.

Note: The result is not mathematically meaningful if e < 0.

17.3.39. max

Overload
@const fn max(e1: T,
              e2: T) -> T
Parameterization S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
Description Returns e2 if e1 is less than e2, and e1 otherwise. Component-wise when T is a vector.

If e1 and e2 are floating-point types, then:

  • If one operand is a NaN, the other is returned.

  • If both operands are NaNs, a NaN is returned.

17.3.40. min

Overload
@const fn min(e1: T,
              e2: T) -> T
Parameterization S is AbstractInt, AbstractFloat, i32, u32, f32, or f16
T is S, or vecN<S>
Description Returns e2 if e2 is less than e1, and e1 otherwise. Component-wise when T is a vector.

If e1 and e2 are floating-point types, then:

  • If one operand is a NaN, the other is returned.

  • If both operands are NaNs, a NaN is returned.

17.3.41. mix

Overload
@const fn mix(e1: T,
              e2: T,
              e3: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the linear blend of e1 and e2 (e.g. e1 * (1 - e3) + e2 * e3). Component-wise when T is a vector.
Overload
@const fn mix(e1: T2,
              e2: T2,
              e3: T) -> T2
Parameterization T is AbstractFloat, f32, or f16
T2 is vecN<T>
Description Returns the component-wise linear blend of e1 and e2, using scalar blending factor e3 for each component.
Same as mix(e1, e2, T2(e3)).
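
The sketch below exercises both mix overloads. Names are hypothetical, and the commented values are the mathematical results of e1 * (1 - e3) + e2 * e3, which an implementation approximates.

```wgsl
fn mix_demo() -> vec3<f32> {
  let a = vec3<f32>(0.0, 10.0, 100.0);
  let b = vec3<f32>(4.0, 20.0, 200.0);

  // Scalar blend: 2.0 * 0.75 + 6.0 * 0.25 = 3.0
  let s = mix(2.0, 6.0, 0.25);

  // Per-component blend factors: (0.0, 15.0, 200.0)
  let v = mix(a, b, vec3<f32>(0.0, 0.5, 1.0));

  // One scalar factor applied to every component: (2.0, 15.0, 150.0)
  let u = mix(a, b, 0.5);

  return v;
}
```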

17.3.42. modf

Overload
@const fn modf(e: T) -> __modf_result
Parameterization T is f32
Description Splits e into fractional and whole number parts.

The whole part is trunc(e), and the fractional part is e minus the whole part.

Returns the __modf_result built-in structure, defined as follows:

struct __modf_result {
  fract : f32, // fractional part
  whole : f32  // whole part
}
EXAMPLE: modf usage
// Infers result type
let fract_and_whole = modf(1.5);
// Sets fract_direct to 0.5
let fract_direct = modf(1.5).fract;

Note: A value cannot be explicitly declared with the type __modf_result, but a value may infer the type.

Overload
@const fn modf(e: T) -> __modf_result_f16
Parameterization T is f16
Description Splits e into fractional and whole number parts.

The whole part is trunc(e), and the fractional part is e minus the whole part.

Returns the __modf_result_f16 built-in structure, defined as if declared as follows:

struct __modf_result_f16 {
  fract : f16, // fractional part
  whole : f16  // whole part
}

Note: A value cannot be explicitly declared with the type __modf_result_f16, but a value may infer the type.

Overload
@const fn modf(e: T) -> __modf_result_vecN
Parameterization T is vecN<f32>
Description Splits the components of e into fractional and whole number parts.

The i'th component of the whole and fractional parts equal the whole and fractional parts of modf(e[i]).

Returns the __modf_result_vecN built-in structure, defined as follows:

struct __modf_result_vecN {
  fract : vecN<f32>, // fractional part
  whole : vecN<f32>  // whole part
}

Note: A value cannot be explicitly declared with the type __modf_result_vecN, but a value may infer the type.

Overload
@const fn modf(e: T) -> __modf_result_vecN_f16
Parameterization T is vecN<f16>
Description Splits the components of e into fractional and whole number parts.

The i'th component of the whole and fractional parts equal the whole and fractional parts of modf(e[i]).

Returns the __modf_result_vecN_f16 built-in structure, defined as if declared as follows:

struct __modf_result_vecN_f16 {
  fract : vecN<f16>, // fractional part
  whole : vecN<f16>  // whole part
}

Note: A value cannot be explicitly declared with the type __modf_result_vecN_f16, but a value may infer the type.

17.3.43. normalize

Overload
@const fn normalize(e: vecN<T> ) -> vecN<T>
Parameterization T is AbstractFloat, f32, or f16
Description Returns a unit vector in the same direction as e.

17.3.44. pow

Overload
@const fn pow(e1: T,
              e2: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns e1 raised to the power e2. Component-wise when T is a vector.

17.3.45. quantizeToF16

Overload
@const fn quantizeToF16(e: T) -> T
Parameterization T is f32 or vecN<f32>
Description Quantizes a 32-bit floating point value e as if e were converted to an IEEE 754 binary16 value, and then converted back to an IEEE 754 binary32 value.

If e is outside the finite range of binary16, then the result is any value of type f32.

The intermediate binary16 value may be flushed to zero, i.e. the final result may be zero if the intermediate binary16 value is denormalized.

See § 13.6.2 Floating Point Conversion.

Component-wise when T is a vector.

Note: The vec2<f32> case is the same as unpack2x16float(pack2x16float(e)).

17.3.46. radians

Overload
@const fn radians(e1: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Converts degrees to radians, approximating e1 × π ÷ 180. Component-wise when T is a vector.

17.3.47. reflect

Overload
@const fn reflect(e1: T,
                  e2: T) -> T
Parameterization T is vecN<AbstractFloat>, vecN<f32>, or vecN<f16>
Description For the incident vector e1 and surface orientation e2, returns the reflection direction e1 - 2 * dot(e2, e1) * e2.
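
A worked instance of the reflection formula, with hypothetical names. The surface orientation is chosen to be a unit vector so that the arithmetic is easy to follow.

```wgsl
fn reflect_demo() -> vec3<f32> {
  let incident = vec3<f32>(1.0, -1.0, 0.0);
  let normal = vec3<f32>(0.0, 1.0, 0.0);  // unit length

  // dot(normal, incident) = -1.0, so the result is
  // incident - 2.0 * (-1.0) * normal = (1.0, 1.0, 0.0).
  return reflect(incident, normal);
}
```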

17.3.48. refract

Overload
@const fn refract(e1: T,
                  e2: T,
                  e3: I) -> T
Parameterization T is vecN<I>
I is AbstractFloat, f32, or f16
Description For the incident vector e1 and surface normal e2, and the ratio of indices of refraction e3, let k = 1.0 - e3 * e3 * (1.0 - dot(e2, e1) * dot(e2, e1)). If k < 0.0, returns the refraction vector 0.0, otherwise returns the refraction vector e3 * e1 - (e3 * dot(e2, e1) + sqrt(k)) * e2.

17.3.49. reverseBits

Overload
@const fn reverseBits(e: T) -> T
Parameterization T is i32, u32, vecN<i32>, or vecN<u32>
Description Reverses the bits in e: The bit at position k of the result equals the bit at position 31 - k of e.
Component-wise when T is a vector.

17.3.50. round

Overload
@const fn round(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Result is the integer k nearest to e, as a floating point value.
When e lies halfway between integers k and k + 1, the result is k when k is even, and k + 1 when k is odd.
Component-wise when T is a vector.
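
The halfway rule rounds ties to the nearest even integer. A sketch with a hypothetical function name:

```wgsl
fn round_demo() -> vec4<f32> {
  // Ties go to the even neighbor:
  //   1.5 -> 2.0,  2.5 -> 2.0,  3.5 -> 4.0,  -2.5 -> -2.0
  return round(vec4<f32>(1.5, 2.5, 3.5, -2.5));
}
```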

17.3.51. saturate

Overload
@const fn saturate(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns clamp(e, 0.0, 1.0). Component-wise when T is a vector.

17.3.52. sign

Overload
@const fn sign(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Result is:
  • 1.0 when e > 0
  • 0.0 when e = 0
  • -1.0 when e < 0

Component-wise when T is a vector.

17.3.53. sin

Overload
@const fn sin(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the sine of e, where e is in radians. Component-wise when T is a vector.

17.3.54. sinh

Overload
@const fn sinh(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the hyperbolic sine of e, where e is a hyperbolic angle in radians. Approximates the pure mathematical function (e^arg − e^−arg) ÷ 2, where arg is the value of the parameter e and the base is Euler's number, but not necessarily computed that way.

Component-wise when T is a vector.

17.3.55. smoothstep

Overload
@const fn smoothstep(low: T,
                     high: T,
                     x: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the smooth Hermite interpolation between 0 and 1. Component-wise when T is a vector.

For scalar T, the result is t * t * (3.0 - 2.0 * t), where t = clamp((x - low) / (high - low), 0.0, 1.0).
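
The scalar formula can be traced for a few inputs. Names are hypothetical, and the commented values are mathematical results that an implementation approximates.

```wgsl
fn smoothstep_demo() -> vec4<f32> {
  let low = vec4<f32>(0.0);
  let high = vec4<f32>(4.0);
  let x = vec4<f32>(-1.0, 1.0, 2.0, 5.0);

  // t = clamp((x - low) / (high - low), 0.0, 1.0) = (0.0, 0.25, 0.5, 1.0)
  // t * t * (3.0 - 2.0 * t)                       = (0.0, 0.15625, 0.5, 1.0)
  return smoothstep(low, high, x);
}
```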

17.3.56. sqrt

Overload
@const fn sqrt(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the square root of e. Component-wise when T is a vector.

17.3.57. step

Overload
@const fn step(edge: T,
               x: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns 1.0 if edge ≤ x, and 0.0 otherwise. Component-wise when T is a vector.
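
A sketch of step as a component-wise threshold, yielding 1.0 where edge ≤ x and 0.0 elsewhere. The function name is hypothetical.

```wgsl
fn step_demo() -> vec3<f32> {
  let edge = vec3<f32>(0.5);
  let x = vec3<f32>(0.25, 0.5, 0.75);
  // 0.25 is below the edge; 0.5 and 0.75 are at or above it.
  return step(edge, x);  // vec3<f32>(0.0, 1.0, 1.0)
}
```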

17.3.58. tan

Overload
@const fn tan(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the tangent of e, where e is in radians. Component-wise when T is a vector.

17.3.59. tanh

Overload
@const fn tanh(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns the hyperbolic tangent of e, where e is a hyperbolic angle in radians. Approximates the pure mathematical function (e^arg − e^−arg) ÷ (e^arg + e^−arg), where arg is the value of the parameter e and the base is Euler's number, but not necessarily computed that way.

Component-wise when T is a vector.

17.3.60. transpose

Overload
@const fn transpose(e: matRxC<T>) -> matCxR<T>
Parameterization T is AbstractFloat, f32, or f16
Description Returns the transpose of e.

17.3.61. trunc

Overload
@const fn trunc(e: T) -> T
Parameterization S is AbstractFloat, f32, or f16
T is S or vecN<S>
Description Returns truncate(e), the nearest whole number whose absolute value is less than or equal to the absolute value of e. Component-wise when T is a vector.
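The difference between trunc and floor only shows up for negative inputs. The sketch below uses a hypothetical helper name, trunc_vs_floor, to contrast the two.

```wgsl
// Hypothetical helper contrasting rounding toward zero with rounding down.
fn trunc_vs_floor() -> vec2<f32> {
  let x: f32 = -2.7;
  // trunc rounds toward zero: -2.0.
  // floor rounds toward negative infinity: -3.0.
  return vec2<f32>(trunc(x), floor(x));
}
```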

17.4. Derivative Built-in Functions

See § 13.5.2 Derivatives.

These functions must only be used in a fragment shader stage, and must only be invoked in uniform control flow.

17.4.1. dpdx

Overload
fn dpdx(e: T) -> T
Parameterization T is f32 or vecN<f32>
Description Returns the partial derivative of e with respect to window x coordinates. The result is the same as either dpdxFine(e) or dpdxCoarse(e).

17.4.2. dpdxCoarse

Overload
fn dpdxCoarse(e: T) -> T
Parameterization T is f32 or vecN<f32>
Description Returns the partial derivative of e with respect to window x coordinates using local differences. This may result in fewer unique positions than dpdxFine(e).

17.4.3. dpdxFine

Overload
fn dpdxFine(e: T) -> T
Parameterization T is f32 or vecN<f32>
Description Returns the partial derivative of e with respect to window x coordinates.

17.4.4. dpdy

Overload
fn dpdy(e: T) -> T
Parameterization T is f32 or vecN<f32>
Description Returns the partial derivative of e with respect to window y coordinates. The result is the same as either dpdyFine(e) or dpdyCoarse(e).
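One common use of dpdx and dpdy together is to estimate a flat surface normal from an interpolated position. The following is a minimal sketch, assuming a hypothetical fragment input world_pos at location 0; the orientation of the resulting normal depends on the coordinate conventions of the application, and the result is not meaningful where both derivatives are zero.

```wgsl
@fragment
fn fs_main(@location(0) world_pos: vec3<f32>) -> @location(0) vec4<f32> {
  // Screen-space rates of change of the interpolated position.
  let dx = dpdx(world_pos);
  let dy = dpdy(world_pos);
  // The cross product of two vectors lying in the surface is
  // perpendicular to the surface.
  let n = normalize(cross(dx, dy));
  // Remap from [-1, 1] to [0, 1] so the normal can be shown as a color.
  return vec4<f32>(n * 0.5 + vec3<f32>(0.5), 1.0);
}
```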

17.4.5. dpdyCoarse

Overload
fn dpdyCoarse(e: T) -> T
Parameterization T is f32 or vecN<f32>
Description Returns the partial derivative of e with respect to window y coordinates using local differences. This may result in fewer unique positions than dpdyFine(e).

17.4.6. dpdyFine

Overload
fn dpdyFine(e: T) -> T
Parameterization T is f32 or vecN<f32>
Description Returns the partial derivative of e with respect to window y coordinates.

17.4.7. fwidth

Overload
fn fwidth(e: T) -> T
Parameterization T is f32 or vecN<f32>
Description Returns abs(dpdx(e)) + abs(dpdy(e)).

17.4.8. fwidthCoarse

Overload
fn fwidthCoarse(e: T) -> T
Parameterization T is f32 or vecN<f32>
Description Returns abs(dpdxCoarse(e)) + abs(dpdyCoarse(e)).

17.4.9. fwidthFine

Overload
fn fwidthFine(e: T) -> T
Parameterization T is f32 or vecN<f32>
Description Returns abs(dpdxFine(e)) + abs(dpdyFine(e)).
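As an illustration of how fwidth is typically used, the sketch below softens a hard edge so that it fades across roughly one pixel. The fragment input dist (a signed distance to an edge, interpolated from the vertex stage) is an assumption made for this example.

```wgsl
@fragment
fn fs_main(@location(0) dist: f32) -> @location(0) vec4<f32> {
  // Approximate change of dist across one pixel. The lower bound avoids
  // a division by zero where dist is constant.
  let w = max(fwidth(dist), 0.000001);
  // Linear ramp from 0 to 1, about one pixel wide, centered on dist == 0.
  let coverage = clamp(dist / w + 0.5, 0.0, 1.0);
  return vec4<f32>(coverage, coverage, coverage, 1.0);
}
```

The call to fwidth is placed at the top of the entry point, before any branching, so that it is invoked in uniform control flow.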

17.5. Texture Built-in Functions

Parameter values must be valid for the respective texture types.

17.5.1. textureDimensions

Returns the dimensions of a texture, or of a texture’s mip level, in texels.

Parameterization Overload
ST is i32, u32, or f32
F is a texel format
A is an access mode

T is texture_1d<ST> or texture_storage_1d<F,A>
fn textureDimensions(t: T) -> u32
ST is i32, u32, or f32

T is texture_1d<ST>

L is i32, or u32

fn textureDimensions(t: T,
                     level: L) -> u32
ST is i32, u32, or f32
F is a texel format
A is an access mode

T is texture_2d<ST>, texture_2d_array<ST>, texture_cube<ST>, texture_cube_array<ST>, texture_multisampled_2d<ST>, texture_depth_2d, texture_depth_2d_array, texture_depth_cube, texture_depth_cube_array, texture_depth_multisampled_2d, texture_storage_2d<F,A>, texture_storage_2d_array<F,A>, or texture_external
fn textureDimensions(t: T) -> vec2<u32>
ST is i32, u32, or f32

T is texture_2d<ST>, texture_2d_array<ST>, texture_cube<ST>, texture_cube_array<ST>, texture_depth_2d, texture_depth_2d_array, texture_depth_cube, or texture_depth_cube_array

L is i32, or u32

fn textureDimensions(t: T,
                     level: L) -> vec2<u32>
ST is i32, u32, or f32
F is a texel format
A is an access mode

T is texture_3d<ST> or texture_storage_3d<F,A>
fn textureDimensions(t: T) -> vec3<u32>
ST is i32, u32, or f32

T is texture_3d<ST>

L is i32, or u32

fn textureDimensions(t: T,
                     level: L) -> vec3<u32>

Parameters:

t The sampled, multisampled, depth, storage, or external texture.
level The mip level, with level 0 containing a full size version of the texture.
If omitted, the dimensions of level 0 are returned.

Returns:

The dimensions of the texture in texels.

For textures based on cubes, the results are the dimensions of each face of the cube. Cube faces are square, so the x and y components of the result are equal.

If level is outside the range [0, textureNumLevels(t)) then an indeterminate value for the return type may be returned.
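A typical use of textureDimensions is converting normalized coordinates into integer texel coordinates. The sketch below assumes a texture bound at group 0, binding 0; the helper name uv_to_texel is hypothetical, and the caller is assumed to pass a level in the valid range.

```wgsl
@group(0) @binding(0) var t: texture_2d<f32>;

// Hypothetical helper: maps normalized coordinates in [0, 1] to the
// coordinates of a texel in the given mip level.
fn uv_to_texel(uv: vec2<f32>, level: u32) -> vec2<u32> {
  let dims = textureDimensions(t, level);  // vec2<u32>
  // Clamp first so the conversion to u32 never sees a negative value.
  let scaled = clamp(uv, vec2<f32>(0.0), vec2<f32>(1.0)) * vec2<f32>(dims);
  // uv == 1.0 would land one past the last texel, so clamp to dims - 1.
  return min(vec2<u32>(scaled), dims - vec2<u32>(1u));
}
```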

17.5.2. textureGather

A texture gather operation reads from a 2D, 2D array, cube, or cube array texture, computing a four-component vector as follows:

TODO: The four texels are the "sample footprint" that should be described by the WebGPU spec. https://github.com/gpuweb/gpuweb/issues/2343

Parameterization Overload
C is i32, or u32
ST is i32, u32, or f32
fn textureGather(component: C,
                 t: texture_2d<ST>,
                 s: sampler,
                 coords: vec2<f32>) -> vec4<ST>
C is i32, or u32
ST is i32, u32, or f32
fn textureGather(component: C,
                 t: texture_2d<ST>,
                 s: sampler,
                 coords: vec2<f32>,
                 offset: vec2<i32>) -> vec4<ST>
C is i32, or u32
A is i32, or u32
ST is i32, u32, or f32
fn textureGather(component: C,
                 t: texture_2d_array<ST>,
                 s: sampler,
                 coords: vec2<f32>,
                 array_index: A) -> vec4<ST>
C is i32, or u32
A is i32, or u32
ST is i32, u32, or f32
fn textureGather(component: C,
                 t: texture_2d_array<ST>,
                 s: sampler,
                 coords: vec2<f32>,
                 array_index: A,
                 offset: vec2<i32>) -> vec4<ST>
C is i32, or u32
ST is i32, u32, or f32
fn textureGather(component: C,
                 t: texture_cube<ST>,
                 s: sampler,
                 coords: vec3<f32>) -> vec4<ST>
C is i32, or u32
A is i32, or u32
ST is i32, u32, or f32
fn textureGather(component: C,
                 t: texture_cube_array<ST>,
                 s: sampler,
                 coords: vec3<f32>,
                 array_index: A) -> vec4<ST>
fn textureGather(t: texture_depth_2d,
                 s: sampler,
                 coords: vec2<f32>) -> vec4<f32>
fn textureGather(t: texture_depth_2d,
                 s: sampler,
                 coords: vec2<f32>,
                 offset: vec2<i32>) -> vec4<f32>
fn textureGather(t: texture_depth_cube,
                 s: sampler,
                 coords: vec3<f32>) -> vec4<f32>
A is i32, or u32
fn textureGather(t: texture_depth_2d_array,
                 s: sampler,
                 coords: vec2<f32>,
                 array_index: A) -> vec4<f32>
A is i32, or u32
fn textureGather(t: texture_depth_2d_array,
                 s: sampler,
                 coords: vec2<f32>,
                 array_index: A,
                 offset: vec2<i32>) -> vec4<f32>
A is i32, or u32
fn textureGather(t: texture_depth_cube_array,
                 s: sampler,
                 coords: vec3<f32>,
                 array_index: A) -> vec4<f32>

Parameters:

component Only applies to non-depth textures.
The index of the channel to read from the selected texels.
When provided, the component expression must be a const-expression (e.g. 1).
Its value must be at least 0 and at most 3. Values outside of this range will result in a shader-creation error.
t The sampled or depth texture to read from.
s The sampler type.
coords The texture coordinates.
array_index The 0-based texture array index.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
The offset expression must be a const-expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will result in a shader-creation error.

Returns:

A four-component vector with components extracted from the specified channel from the selected texels, as described above.

EXAMPLE: Gather components from texels in 2D texture
@group(0) @binding(0) var t: texture_2d<f32>;
@group(0) @binding(1) var dt: texture_depth_2d;
@group(0) @binding(2) var s: sampler;

fn gather_x_components(c: vec2<f32>) -> vec4<f32> {
  return textureGather(0,t,s,c);
}
fn gather_y_components(c: vec2<f32>) -> vec4<f32> {
  return textureGather(1,t,s,c);
}
fn gather_z_components(c: vec2<f32>) -> vec4<f32> {
  return textureGather(2,t,s,c);
}
fn gather_depth_components(c: vec2<f32>) -> vec4<f32> {
  return textureGather(dt,s,c);
}

17.5.3. textureGatherCompare

A texture gather compare operation performs a depth comparison on four texels in a depth texture and collects the results into a single vector, as follows:

Parameterization Overload
fn textureGatherCompare(t: texture_depth_2d,
                        s: sampler_comparison,
                        coords: vec2<f32>,
                        depth_ref: f32) -> vec4<f32>
fn textureGatherCompare(t: texture_depth_2d,
                        s: sampler_comparison,
                        coords: vec2<f32>,
                        depth_ref: f32,
                        offset: vec2<i32>) -> vec4<f32>
A is i32, or u32
fn textureGatherCompare(t: texture_depth_2d_array,
                        s: sampler_comparison,
                        coords: vec2<f32>,
                        array_index: A,
                        depth_ref: f32) -> vec4<f32>
A is i32, or u32
fn textureGatherCompare(t: texture_depth_2d_array,
                        s: sampler_comparison,
                        coords: vec2<f32>,
                        array_index: A,
                        depth_ref: f32,
                        offset: vec2<i32>) -> vec4<f32>
fn textureGatherCompare(t: texture_depth_cube,
                        s: sampler_comparison,
                        coords: vec3<f32>,
                        depth_ref: f32) -> vec4<f32>
A is i32, or u32
fn textureGatherCompare(t: texture_depth_cube_array,
                        s: sampler_comparison,
                        coords: vec3<f32>,
                        array_index: A,
                        depth_ref: f32) -> vec4<f32>

Parameters:

t The depth texture to read from.
s The sampler comparison type.
coords The texture coordinates.
array_index The 0-based texture array index.
depth_ref The reference value to compare the sampled depth value against.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
The offset expression must be a const-expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will result in a shader-creation error.

Returns:

A four-component vector with the comparison results for the selected texels, as described above.

EXAMPLE: Gather depth comparison
@group(0) @binding(0) var dt: texture_depth_2d;
@group(0) @binding(1) var s: sampler_comparison;

fn gather_depth_compare(c: vec2<f32>, depth_ref: f32) -> vec4<f32> {
  return textureGatherCompare(dt,s,c,depth_ref);
}

17.5.4. textureLoad

Reads a single texel from a texture without sampling or filtering.

Parameterization Overload
C is i32, or u32
L is i32, or u32
ST is i32, u32, or f32
fn textureLoad(t: texture_1d<ST>,
               coords: C,
               level: L) -> vec4<ST>
C is i32, or u32
L is i32, or u32
ST is i32, u32, or f32
fn textureLoad(t: texture_2d<ST>,
               coords: vec2<C>,
               level: L) -> vec4<ST>
C is i32, or u32
A is i32, or u32
L is i32, or u32
ST is i32, u32, or f32
fn textureLoad(t: texture_2d_array<ST>,
              coords: vec2<C>,
              array_index: A,
              level: L) -> vec4<ST>
C is i32, or u32
L is i32, or u32
ST is i32, u32, or f32
fn textureLoad(t: texture_3d<ST>,
               coords: vec3<C>,
               level: L) -> vec4<ST>
C is i32, or u32
S is i32, or u32
ST is i32, u32, or f32
fn textureLoad(t: texture_multisampled_2d<ST>,
               coords: vec2<C>,
               sample_index: S)-> vec4<ST>
C is i32, or u32
L is i32, or u32
fn textureLoad(t: texture_depth_2d,
               coords: vec2<C>,
               level: L) -> f32
C is i32, or u32
A is i32, or u32
L is i32, or u32
fn textureLoad(t: texture_depth_2d_array,
               coords: vec2<C>,
               array_index: A,
               level: L) -> f32
C is i32, or u32
S is i32, or u32
fn textureLoad(t: texture_depth_multisampled_2d,
               coords: vec2<C>,
               sample_index: S)-> f32
C is i32, or u32
fn textureLoad(t: texture_external,
               coords: vec2<C>) -> vec4<f32>

Parameters:

t The sampled, multisampled, depth, or external texture.
coords The 0-based texel coordinate.
array_index The 0-based texture array index.
level The mip level, with level 0 containing a full size version of the texture.
sample_index The 0-based sample index of the multisampled texture.

Returns:

The unfiltered texel data.
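Because textureLoad takes integer texel coordinates, callers often clamp them to the texture bounds first. The following is a minimal sketch, assuming a texture bound at group 0, binding 0; the helper name load_clamped is hypothetical.

```wgsl
@group(0) @binding(0) var t: texture_2d<f32>;

// Hypothetical helper: reads a texel from mip level 0, clamping the
// coordinates so that the access stays inside the texture.
fn load_clamped(coords: vec2<i32>) -> vec4<f32> {
  let dims = vec2<i32>(textureDimensions(t));
  let c = clamp(coords, vec2<i32>(0), dims - vec2<i32>(1));
  return textureLoad(t, c, 0);
}
```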

An out of bounds access occurs if:

If an out of bounds access occurs, the built-in function returns one of:

17.5.5. textureNumLayers

Returns the number of layers (elements) of an array texture.

Parameterization Overload
F is a texel format
A is an access mode
ST is i32, u32, or f32

T is texture_2d_array<ST>, texture_cube_array<ST>, texture_depth_2d_array, texture_depth_cube_array, or texture_storage_2d_array<F,A>
fn textureNumLayers(t: T) -> u32

Parameters:

t The sampled, depth or storage array texture.

Returns:

The number of layers (elements) of the array texture.
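One possible use of textureNumLayers is guarding an array index before sampling. The bindings and the helper name sample_layer below are assumptions made for this sketch.

```wgsl
@group(0) @binding(0) var layers: texture_2d_array<f32>;
@group(0) @binding(1) var s: sampler;

// Hypothetical helper: samples mip level 0 of the requested layer,
// falling back to the last layer when the index is too large.
fn sample_layer(uv: vec2<f32>, layer: u32) -> vec4<f32> {
  let last = textureNumLayers(layers) - 1u;
  return textureSampleLevel(layers, s, uv, min(layer, last), 0.0);
}
```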

17.5.6. textureNumLevels

Returns the number of mip levels of a texture.

Parameterization Overload
ST is i32, u32, or f32

T is texture_1d<ST>, texture_2d<ST>, texture_2d_array<ST>, texture_3d<ST>, texture_cube<ST>, texture_cube_array<ST>, texture_depth_2d, texture_depth_2d_array, texture_depth_cube, or texture_depth_cube_array
fn textureNumLevels(t: T) -> u32

Parameters:

t The sampled or depth texture.

Returns:

The number of mip levels for the texture.
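As a short illustration, textureNumLevels can be combined with textureDimensions to query the size of the smallest mip level. The binding and the helper name smallest_level_size are assumptions.

```wgsl
@group(0) @binding(0) var t: texture_2d<f32>;

// Hypothetical helper: returns the dimensions of the last mip level.
fn smallest_level_size() -> vec2<u32> {
  // Levels are numbered from 0, so the last valid level is count - 1.
  let last_level = textureNumLevels(t) - 1u;
  return textureDimensions(t, last_level);
}
```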

17.5.7. textureNumSamples

Returns the number of samples per texel in a multisampled texture.

Parameterization Overload
ST is i32, u32, or f32

T is texture_multisampled_2d<ST> or texture_depth_multisampled_2d
fn textureNumSamples(t: T) -> u32

Parameters:

t The multisampled texture.

Returns:

The number of samples per texel in the multisampled texture.
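The sample count is useful when iterating over the samples of a texel with textureLoad. The sketch below averages all samples of one texel; the binding and the helper name average_samples are assumptions, and the caller is assumed to pass in-bounds coordinates.

```wgsl
@group(0) @binding(0) var ms: texture_multisampled_2d<f32>;

// Hypothetical helper: averages every sample stored for one texel.
fn average_samples(coords: vec2<u32>) -> vec4<f32> {
  let count = textureNumSamples(ms);
  var sum = vec4<f32>(0.0);
  for (var i = 0u; i < count; i++) {
    sum += textureLoad(ms, coords, i);
  }
  return sum / f32(count);
}
```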

17.5.8. textureSample

Samples a texture.

Must only be used in a fragment shader stage. Must only be invoked in uniform control flow.

Parameterization Overload
fn textureSample(t: texture_1d<f32>,
                 s: sampler,
                 coords: f32) -> vec4<f32>
fn textureSample(t: texture_2d<f32>,
                 s: sampler,
                 coords: vec2<f32>) -> vec4<f32>
fn textureSample(t: texture_2d<f32>,
                 s: sampler,
                 coords: vec2<f32>,
                 offset: vec2<i32>) -> vec4<f32>
A is i32, or u32
fn textureSample(t: texture_2d_array<f32>,
                 s: sampler,
                 coords: vec2<f32>,
                 array_index: A) -> vec4<f32>
A is i32, or u32
fn textureSample(t: texture_2d_array<f32>,
                 s: sampler,
                 coords: vec2<f32>,
                 array_index: A,
                 offset: vec2<i32>) -> vec4<f32>
T is texture_3d<f32>, or texture_cube<f32>
fn textureSample(t: T,
                 s: sampler,
                 coords: vec3<f32>) -> vec4<f32>
fn textureSample(t: texture_3d<f32>,
                 s: sampler,
                 coords: vec3<f32>,
                 offset: vec3<i32>) -> vec4<f32>
A is i32, or u32
fn textureSample(t: texture_cube_array<f32>,
                 s: sampler,
                 coords: vec3<f32>,
                 array_index: A) -> vec4<f32>
fn textureSample(t: texture_depth_2d,
                 s: sampler,
                 coords: vec2<f32>) -> f32
fn textureSample(t: texture_depth_2d,
                 s: sampler,
                 coords: vec2<f32>,
                 offset: vec2<i32>) -> f32
A is i32, or u32
fn textureSample(t: texture_depth_2d_array,
                 s: sampler,
                 coords: vec2<f32>,
                 array_index: A) -> f32
A is i32, or u32
fn textureSample(t: texture_depth_2d_array,
                 s: sampler,
                 coords: vec2<f32>,
                 array_index: A,
                 offset: vec2<i32>) -> f32
fn textureSample(t: texture_depth_cube,
                 s: sampler,
                 coords: vec3<f32>) -> f32
A is i32, or u32
fn textureSample(t: texture_depth_cube_array,
                 s: sampler,
                 coords: vec3<f32>,
                 array_index: A) -> f32

Parameters:

t The sampled or depth texture to sample.
s The sampler type.
coords The texture coordinates used for sampling.
array_index The 0-based texture array index to sample.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
The offset expression must be a const-expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will result in a shader-creation error.

Returns:

The sampled value.
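The sketch below shows textureSample with and without the optional offset. The bindings and the fragment input uv are assumptions made for this example; both calls sit at the top level of the entry point so that they are invoked in uniform control flow.

```wgsl
@group(0) @binding(0) var t: texture_2d<f32>;
@group(0) @binding(1) var s: sampler;

@fragment
fn fs_main(@location(0) uv: vec2<f32>) -> @location(0) vec4<f32> {
  let center = textureSample(t, s, uv);
  // The offset must be a const-expression with components in [-8, 7].
  let right = textureSample(t, s, uv, vec2<i32>(1, 0));
  // Average the sample with its neighbor one texel to the right.
  return 0.5 * (center + right);
}
```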

17.5.9. textureSampleBias

Samples a texture with a bias to the mip level.

Must only be used in a fragment shader stage. Must only be invoked in uniform control flow.

Parameterization Overload
fn textureSampleBias(t: texture_2d<f32>,
                     s: sampler,
                     coords: vec2<f32>,
                     bias: f32) -> vec4<f32>
fn textureSampleBias(t: texture_2d<f32>,
                     s: sampler,
                     coords: vec2<f32>,
                     bias: f32,
                     offset: vec2<i32>) -> vec4<f32>
A is i32, or u32
fn textureSampleBias(t: texture_2d_array<f32>,
                     s: sampler,
                     coords: vec2<f32>,
                     array_index: A,
                     bias: f32) -> vec4<f32>
A is i32, or u32
fn textureSampleBias(t: texture_2d_array<f32>,
                     s: sampler,
                     coords: vec2<f32>,
                     array_index: A,
                     bias: f32,
                     offset: vec2<i32>) -> vec4<f32>
T is texture_3d<f32>, or texture_cube<f32>
fn textureSampleBias(t: T,
                     s: sampler,
                     coords: vec3<f32>,
                     bias: f32) -> vec4<f32>
fn textureSampleBias(t: texture_3d<f32>,
                     s: sampler,
                     coords: vec3<f32>,
                     bias: f32,
                     offset: vec3<i32>) -> vec4<f32>
A is i32, or u32
fn textureSampleBias(t: texture_cube_array<f32>,
                     s: sampler,
                     coords: vec3<f32>,
                     array_index: A,
                     bias: f32) -> vec4<f32>

Parameters:

t The texture to sample.
s The sampler type.
coords The texture coordinates used for sampling.
array_index The 0-based texture array index to sample.
bias The bias to apply to the mip level before sampling. bias must be between -16.0 and 15.99.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
The offset expression must be a const-expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will result in a shader-creation error.

Returns:

The sampled value.
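A positive bias typically pushes sampling toward a smaller, blurrier mip level, and a negative bias toward a larger, sharper one. The following is a minimal sketch with assumed bindings and an assumed bias of 1.0.

```wgsl
@group(0) @binding(0) var t: texture_2d<f32>;
@group(0) @binding(1) var s: sampler;

@fragment
fn fs_main(@location(0) uv: vec2<f32>) -> @location(0) vec4<f32> {
  // Bias the implicitly computed mip level by one level.
  return textureSampleBias(t, s, uv, 1.0);
}
```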

17.5.10. textureSampleCompare

Samples a depth texture and compares the sampled depth values against a reference value.

Must only be used in a fragment shader stage. Must only be invoked in uniform control flow.

Parameterization Overload
fn textureSampleCompare(t: texture_depth_2d,
                        s: sampler_comparison,
                        coords: vec2<f32>,
                        depth_ref: f32) -> f32
fn textureSampleCompare(t: texture_depth_2d,
                        s: sampler_comparison,
                        coords: vec2<f32>,
                        depth_ref: f32,
                        offset: vec2<i32>) -> f32
A is i32, or u32
fn textureSampleCompare(t: texture_depth_2d_array,
                        s: sampler_comparison,
                        coords: vec2<f32>,
                        array_index: A,
                        depth_ref: f32) -> f32
A is i32, or u32
fn textureSampleCompare(t: texture_depth_2d_array,
                        s: sampler_comparison,
                        coords: vec2<f32>,
                        array_index: A,
                        depth_ref: f32,
                        offset: vec2<i32>) -> f32
fn textureSampleCompare(t: texture_depth_cube,
                        s: sampler_comparison,
                        coords: vec3<f32>,
                        depth_ref: f32) -> f32
A is i32, or u32
fn textureSampleCompare(t: texture_depth_cube_array,
                        s: sampler_comparison,
                        coords: vec3<f32>,
                        array_index: A,
                        depth_ref: f32) -> f32

Parameters:

t The depth texture to sample.
s The sampler comparison type.
coords The texture coordinates used for sampling.
array_index The 0-based texture array index to sample.
depth_ref The reference value to compare the sampled depth value against.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
The offset expression must be a const-expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will result in a shader-creation error.

Returns:

A value in the range [0.0..1.0].

Each sampled texel is compared against the reference value using the comparison operator defined by the sampler_comparison, resulting in either a 0 or 1 value for each texel.

If the sampler uses bilinear filtering then the returned value is the filtered average of these values, otherwise the comparison result of a single texel is returned.
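Shadow mapping is a common use of this function. In the sketch below, the binding layout and the fragment input shadow_pos (texture coordinates in xy and the reference depth in z) are assumptions; how the result is interpreted depends on the comparison operator configured on the sampler.

```wgsl
@group(0) @binding(0) var shadow_map: texture_depth_2d;
@group(0) @binding(1) var shadow_sampler: sampler_comparison;

@fragment
fn fs_main(@location(0) shadow_pos: vec3<f32>) -> @location(0) vec4<f32> {
  // Result is in [0, 1]: the (possibly filtered) fraction of texels
  // that passed the depth comparison.
  let lit = textureSampleCompare(shadow_map, shadow_sampler,
                                 shadow_pos.xy, shadow_pos.z);
  return vec4<f32>(vec3<f32>(lit), 1.0);
}
```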

17.5.11. textureSampleCompareLevel

Samples a depth texture and compares the sampled depth values against a reference value.

Parameterization Overload
fn textureSampleCompareLevel(t: texture_depth_2d,
                             s: sampler_comparison,
                             coords: vec2<f32>,
                             depth_ref: f32) -> f32
fn textureSampleCompareLevel(t: texture_depth_2d,
                             s: sampler_comparison,
                             coords: vec2<f32>,
                             depth_ref: f32,
                             offset: vec2<i32>) -> f32
A is i32, or u32
fn textureSampleCompareLevel(t: texture_depth_2d_array,
                             s: sampler_comparison,
                             coords: vec2<f32>,
                             array_index: A,
                             depth_ref: f32) -> f32
A is i32, or u32
fn textureSampleCompareLevel(t: texture_depth_2d_array,
                             s: sampler_comparison,
                             coords: vec2<f32>,
                             array_index: A,
                             depth_ref: f32,
                             offset: vec2<i32>) -> f32
fn textureSampleCompareLevel(t: texture_depth_cube,
                             s: sampler_comparison,
                             coords: vec3<f32>,
                             depth_ref: f32) -> f32
A is i32, or u32
fn textureSampleCompareLevel(t: texture_depth_cube_array,
                             s: sampler_comparison,
                             coords: vec3<f32>,
                             array_index: A,
                             depth_ref: f32) -> f32

Parameters:

t The depth texture to sample.
s The sampler comparison type.
coords The texture coordinates used for sampling.
array_index The 0-based texture array index to sample.
depth_ref The reference value to compare the sampled depth value against.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
The offset expression must be a const-expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will result in a shader-creation error.

Returns:

A value in the range [0.0..1.0].

The textureSampleCompareLevel function is the same as textureSampleCompare, except that:

17.5.12. textureSampleGrad

Samples a texture using explicit gradients.

Parameterization Overload
fn textureSampleGrad(t: texture_2d<f32>,
                     s: sampler,
                     coords: vec2<f32>,
                     ddx: vec2<f32>,
                     ddy: vec2<f32>) -> vec4<f32>
fn textureSampleGrad(t: texture_2d<f32>,
                     s: sampler,
                     coords: vec2<f32>,
                     ddx: vec2<f32>,
                     ddy: vec2<f32>,
                     offset: vec2<i32>) -> vec4<f32>
A is i32, or u32
fn textureSampleGrad(t: texture_2d_array<f32>,
                     s: sampler,
                     coords: vec2<f32>,
                     array_index: A,
                     ddx: vec2<f32>,
                     ddy: vec2<f32>) -> vec4<f32>
A is i32, or u32
fn textureSampleGrad(t: texture_2d_array<f32>,
                     s: sampler,
                     coords: vec2<f32>,
                     array_index: A,
                     ddx: vec2<f32>,
                     ddy: vec2<f32>,
                     offset: vec2<i32>) -> vec4<f32>
T is texture_3d<f32>, or texture_cube<f32>
fn textureSampleGrad(t: T,
                     s: sampler,
                     coords: vec3<f32>,
                     ddx: vec3<f32>,
                     ddy: vec3<f32>) -> vec4<f32>
fn textureSampleGrad(t: texture_3d<f32>,
                     s: sampler,
                     coords: vec3<f32>,
                     ddx: vec3<f32>,
                     ddy: vec3<f32>,
                     offset: vec3<i32>) -> vec4<f32>
A is i32, or u32
fn textureSampleGrad(t: texture_cube_array<f32>,
                     s: sampler,
                     coords: vec3<f32>,
                     array_index: A,
                     ddx: vec3<f32>,
                     ddy: vec3<f32>) -> vec4<f32>

Parameters:

t The texture to sample.
s The sampler type.
coords The texture coordinates used for sampling.
array_index The 0-based texture array index to sample.
ddx The x direction derivative vector used to compute the sampling locations.
ddy The y direction derivative vector used to compute the sampling locations.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
The offset expression must be a const-expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will result in a shader-creation error.

Returns:

The sampled value.
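Explicit gradients are useful when the sampling call itself cannot be placed in uniform control flow. The sketch below, with assumed bindings and an assumed fragment input uv, computes the derivatives up front and then samples inside a branch.

```wgsl
@group(0) @binding(0) var t: texture_2d<f32>;
@group(0) @binding(1) var s: sampler;

@fragment
fn fs_main(@location(0) uv: vec2<f32>) -> @location(0) vec4<f32> {
  // Derivatives are taken here, in uniform control flow.
  let ddx = dpdx(uv);
  let ddy = dpdy(uv);
  var color = vec4<f32>(0.0);
  if (uv.x > 0.5) {
    // This branch depends on a per-fragment value, so control flow is
    // not uniform here. textureSampleGrad uses the gradients computed
    // above instead of computing them implicitly.
    color = textureSampleGrad(t, s, uv, ddx, ddy);
  }
  return color;
}
```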

17.5.13. textureSampleLevel

Samples a texture using an explicit mip level.

Parameterization Overload
fn textureSampleLevel(t: texture_2d<f32>,
                      s: sampler,
                      coords: vec2<f32>,
                      level: f32) -> vec4<f32>
fn textureSampleLevel(t: texture_2d<f32>,
                      s: sampler,
                      coords: vec2<f32>,
                      level: f32,
                      offset: vec2<i32>) -> vec4<f32>
A is i32, or u32
fn textureSampleLevel(t: texture_2d_array<f32>,
                      s: sampler,
                      coords: vec2<f32>,
                      array_index: A,
                      level: f32) -> vec4<f32>
A is i32, or u32
fn textureSampleLevel(t: texture_2d_array<f32>,
                      s: sampler,
                      coords: vec2<f32>,
                      array_index: A,
                      level: f32,
                      offset: vec2<i32>) -> vec4<f32>
T is texture_3d<f32>, or texture_cube<f32>
fn textureSampleLevel(t: T,
                      s: sampler,
                      coords: vec3<f32>,
                      level: f32) -> vec4<f32>
fn textureSampleLevel(t: texture_3d<f32>,
                      s: sampler,
                      coords: vec3<f32>,
                      level: f32,
                      offset: vec3<i32>) -> vec4<f32>
A is i32, or u32
fn textureSampleLevel(t: texture_cube_array<f32>,
                      s: sampler,
                      coords: vec3<f32>,
                      array_index: A,
                      level: f32) -> vec4<f32>
L is i32, or u32
fn textureSampleLevel(t: texture_depth_2d,
                      s: sampler,
                      coords: vec2<f32>,
                      level: L) -> f32
L is i32, or u32
fn textureSampleLevel(t: texture_depth_2d,
                      s: sampler,
                      coords: vec2<f32>,
                      level: L,
                      offset: vec2<i32>) -> f32
A is i32, or u32
L is i32, or u32
fn textureSampleLevel(t: texture_depth_2d_array,
                      s: sampler,
                      coords: vec2<f32>,
                      array_index: A,
                      level: L) -> f32
A is i32, or u32
L is i32, or u32
fn textureSampleLevel(t: texture_depth_2d_array,
                      s: sampler,
                      coords: vec2<f32>,
                      array_index: A,
                      level: L,
                      offset: vec2<i32>) -> f32
L is i32, or u32
fn textureSampleLevel(t: texture_depth_cube,
                      s: sampler,
                      coords: vec3<f32>,
                      level: L) -> f32
A is i32, or u32
L is i32, or u32
fn textureSampleLevel(t: texture_depth_cube_array,
                      s: sampler,
                      coords: vec3<f32>,
                      array_index: A,
                      level: L) -> f32

Parameters:

t The sampled or depth texture to sample.
s The sampler type.
coords The texture coordinates used for sampling.
array_index The 0-based texture array index to sample.
level The mip level, with level 0 containing a full size version of the texture. For the functions where level is an f32, fractional values may interpolate between two levels if the format is filterable according to the Texture Format Capabilities.
offset The optional texel offset applied to the unnormalized texture coordinate before sampling the texture. This offset is applied before applying any texture wrapping modes.
The offset expression must be a const-expression (e.g. vec2<i32>(1, 2)).
Each offset component must be at least -8 and at most 7. Values outside of this range will result in a shader-creation error.

Returns:

The sampled value.
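Because the mip level is given explicitly, this function is often used where implicit derivatives are unavailable, such as in a vertex shader. The sketch below assumes a height map bound at group 0 and hypothetical vertex inputs pos and uv.

```wgsl
@group(0) @binding(0) var height_map: texture_2d<f32>;
@group(0) @binding(1) var s: sampler;

@vertex
fn vs_main(@location(0) pos: vec2<f32>,
           @location(1) uv: vec2<f32>) -> @builtin(position) vec4<f32> {
  // Sample the full-size level (level 0) and take the first channel.
  let h = textureSampleLevel(height_map, s, uv, 0.0).r;
  // Use the sampled height as the z coordinate of the output position.
  return vec4<f32>(pos, h, 1.0);
}
```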

17.5.14. textureSampleBaseClampToEdge

Samples a texture view at its base level, with texture coordinates clamped to the edge as described below.

Parameterization Overload
T is texture_2d<f32> or texture_external
fn textureSampleBaseClampToEdge(t: T,
                                s: sampler,
                                coords: vec2<f32>) -> vec4<f32>

Parameters:

t The sampled or external texture to sample.
s The sampler type.
coords The texture coordinates used for sampling.

Before sampling, the given coordinates will be clamped to the rectangle

[ half_texel, 1 - half_texel ]

where

half_texel = vec2(0.5) / vec2<f32>(textureDimensions(t))

Note: The half-texel adjustment ensures that, independent of the sampler’s addressing and filter modes, wrapping will not occur. That is, when sampling near an edge, the sampled texels will be at or adjacent to that edge, and not selected from the opposite edge.

Returns:

The sampled value.
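The sketch below shows a call on an external texture, followed by a hypothetical helper, clamp_to_edge_coords, that writes out the clamping rule described above for an ordinary 2D texture. The bindings and names are assumptions, and the helper only illustrates the rule; it is not how an implementation is required to compute it.

```wgsl
@group(0) @binding(0) var video: texture_external;
@group(0) @binding(1) var s: sampler;
@group(0) @binding(2) var t: texture_2d<f32>;

@fragment
fn fs_main(@location(0) uv: vec2<f32>) -> @location(0) vec4<f32> {
  return textureSampleBaseClampToEdge(video, s, uv);
}

// Hypothetical helper: the coordinates that would be used for t after
// clamping to the rectangle [half_texel, 1 - half_texel].
fn clamp_to_edge_coords(uv: vec2<f32>) -> vec2<f32> {
  let half_texel = vec2<f32>(0.5) / vec2<f32>(textureDimensions(t));
  return clamp(uv, half_texel, vec2<f32>(1.0) - half_texel);
}
```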

17.5.15. textureStore

Writes a single texel to a texture.

Parameterization Overload
F is a texel format
C is i32, or u32
CF depends on the storage texel format F. See the texel format table for the mapping of texel format to channel format.
fn textureStore(t: texture_storage_1d<F,write>,
                coords: C,
                value: vec4<CF>)
F is a texel format
C is i32, or u32
CF depends on the storage texel format F. See the texel format table for the mapping of texel format to channel format.
fn textureStore(t: texture_storage_2d<F,write>,
                coords: vec2<C>,
                value: vec4<CF>)
F is a texel format
C is i32, or u32
A is i32, or u32
CF depends on the storage texel format F. See the texel format table for the mapping of texel format to channel format.
fn textureStore(t: texture_storage_2d_array<F,write>,
                coords: vec2<C>,
                array_index: A,
                value: vec4<CF>)
F is a texel format
C is i32, or u32
CF depends on the storage texel format F. See the texel format table for the mapping of texel format to channel format.
fn textureStore(t: texture_storage_3d<F,write>,
                coords: vec3<C>,
                value: vec4<CF>)

Parameters:

t The write-only storage texture.
coords The 0-based texel coordinate.
array_index The 0-based texture array index.
value The new texel value.

Note:

An out-of-bounds access occurs if:

If an out-of-bounds access occurs, the built-in function may do any of the following:

17.6. Atomic Built-in Functions

Atomic built-in functions can be used to read/write/read-modify-write atomic objects. They are the only operations allowed on § 5.2.8 Atomic Types.

All atomic built-in functions use a relaxed memory ordering. This means synchronization and ordering guarantees only apply among atomic operations acting on the same memory locations. No synchronization or ordering guarantees apply between atomic and non-atomic memory accesses, or between atomic accesses acting on different memory locations.

Atomic built-in functions must not be used in a vertex shader stage.

The address space AS of the atomic_ptr parameter in all atomic built-in functions must be either storage or workgroup.

T must be either u32 or i32.

17.6.1. Atomic Load

fn atomicLoad(atomic_ptr: ptr<AS, atomic<T>, read_write>) -> T

Returns the atomically loaded value pointed to by atomic_ptr. It does not modify the object.

17.6.2. Atomic Store

fn atomicStore(atomic_ptr: ptr<AS, atomic<T>, read_write>, v: T)

Atomically stores the value v in the atomic object pointed to by atomic_ptr.

17.6.3. Atomic Read-modify-write

fn atomicAdd(atomic_ptr: ptr<AS, atomic<T>, read_write>, v: T) -> T
fn atomicSub(atomic_ptr: ptr<AS, atomic<T>, read_write>, v: T) -> T
fn atomicMax(atomic_ptr: ptr<AS, atomic<T>, read_write>, v: T) -> T
fn atomicMin(atomic_ptr: ptr<AS, atomic<T>, read_write>, v: T) -> T
fn atomicAnd(atomic_ptr: ptr<AS, atomic<T>, read_write>, v: T) -> T
fn atomicOr(atomic_ptr: ptr<AS, atomic<T>, read_write>, v: T) -> T
fn atomicXor(atomic_ptr: ptr<AS, atomic<T>, read_write>, v: T) -> T
Each function performs the following steps atomically:
  1. Load the original value pointed to by atomic_ptr.

  2. Obtain a new value by performing the operation (e.g. max) from the function name with the value v.

  3. Store the new value using atomic_ptr.

Each function returns the original value stored in the atomic object.
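
The load, operate, store sequence above can be illustrated with a non-normative Python model. The class AtomicModel and its method names are hypothetical. A lock stands in for hardware atomicity, and 32-bit wraparound is ignored for brevity.

```python
import threading


class AtomicModel:
    """Toy stand-in for an atomic<T> object; not part of WGSL."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def _read_modify_write(self, op, v):
        with self._lock:                 # the three steps happen as one unit
            original = self._value       # 1. load the original value
            new_value = op(original, v)  # 2. apply the named operation with v
            self._value = new_value      # 3. store the new value
        return original                  # the caller receives the original value

    def atomic_add(self, v):
        return self._read_modify_write(lambda a, b: a + b, v)

    def atomic_max(self, v):
        return self._read_modify_write(max, v)

    def atomic_load(self):
        with self._lock:
            return self._value
```

Note that the return value is always the value held before the operation, even when the operation leaves the object unchanged, as with a max against a smaller operand.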

fn atomicExchange(atomic_ptr: ptr<AS, atomic<T>, read_write>, v: T) -> T

Atomically stores the value v in the atomic object pointed to by atomic_ptr and returns the original value stored in the atomic object.

fn atomicCompareExchangeWeak(atomic_ptr: ptr<AS, atomic<T>, read_write>, cmp: T, v: T) -> __atomic_compare_exchange_result<T>

struct __atomic_compare_exchange_result<T> {
  old_value : T,    // old value stored in the atomic
  exchanged : bool, // true if the exchange was done
}

Note: A value cannot be explicitly declared with the type __atomic_compare_exchange_result, but a value may infer the type.

Performs the following steps atomically:

  1. Load the original value pointed to by atomic_ptr.

  2. Compare the original value to the value cmp using an equality operation.

  3. Store the value v only if the result of the equality comparison was true.

Returns a two-member structure, where the first member, old_value, is the original value of the atomic object and the second member, exchanged, is whether or not the comparison succeeded.

Note: The equality comparison may spuriously fail on some implementations. That is, the exchanged member of the result may be false even if the old_value member equals cmp.
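
Because the comparison may fail spuriously, callers typically retry in a loop. The following non-normative Python sketch is a single-threaded model with no real atomicity. AtomicCell, its spurious_failures parameter, and atomic_multiply are hypothetical names introduced only for illustration. The loop builds a read-modify-write operation (multiplication) that has no dedicated built-in function.

```python
class AtomicCell:
    """Single-threaded toy model of an atomic object (hypothetical)."""

    def __init__(self, value, spurious_failures=0):
        self.value = value
        # Number of upcoming calls that fail even though the values match.
        self._spurious = spurious_failures

    def compare_exchange_weak(self, cmp, v):
        old_value = self.value
        if self._spurious > 0:
            self._spurious -= 1
            return (old_value, False)   # spurious failure: nothing is stored
        if old_value == cmp:
            self.value = v              # store only when the comparison holds
            return (old_value, True)
        return (old_value, False)


def atomic_multiply(cell, factor):
    """Multiply the cell by factor; return the value seen before the update."""
    expected = cell.value               # stands in for atomicLoad
    while True:
        old_value, exchanged = cell.compare_exchange_weak(
            expected, expected * factor)
        if exchanged:
            return old_value
        expected = old_value            # retry with the freshly observed value
```

The loop tests the exchanged member rather than comparing old_value with the expected value, since the two can disagree when a failure is spurious.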

17.7. Data Packing Built-in Functions

Data packing built-in functions can be used to encode values using data formats that do not correspond directly to types in WGSL. This enables a program to write many densely packed values to memory, which can reduce a shader’s memory bandwidth demand.

Each built-in function applies the inverse of a channel transfer function to several input values, then combines their results into a single output value.

Note: For packing unorm values, the normalized floating point values are in the interval [0.0, 1.0].

Note: For packing snorm values, the normalized floating point values are in the interval [-1.0, 1.0].

17.7.1. pack4x8snorm

Overload
@const fn pack4x8snorm(e: vec4<f32>) -> u32
Description Converts four normalized floating point values to 8-bit signed integers, and then combines them into one u32 value.
Component e[i] of the input is converted to an 8-bit two's complement integer value ⌊ 0.5 + 127 × min(1, max(-1, e[i])) ⌋ which is then placed in bits 8 × i through 8 × i + 7 of the result.
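
The conversion formula can be checked with a non-normative Python model. The function below reuses the WGSL name for readability but is only a sketch. It computes in Python's double precision rather than f32, so results may differ from a GPU for inputs that fall very close to a rounding boundary.

```python
import math


def pack4x8snorm(e):
    """Model of pack4x8snorm for a 4-element sequence of floats."""
    result = 0
    for i, x in enumerate(e):
        # Clamp to [-1, 1], scale to [-127, 127], round to nearest.
        q = math.floor(0.5 + 127.0 * min(1.0, max(-1.0, x)))
        # Keep the low 8 bits: the two's complement encoding of q.
        result |= (q & 0xFF) << (8 * i)
    return result
```

With the input (1.0, -1.0, 0.0, 0.5), the four channels encode as 0x7F, 0x81, 0x00 and 0x40, with channel 0 in the least significant byte.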

17.7.2. pack4x8unorm

Overload
@const fn pack4x8unorm(e: vec4<f32>) -> u32
Description Converts four normalized floating point values to 8-bit unsigned integers, and then combines them into one u32 value.
Component e[i] of the input is converted to an 8-bit unsigned integer value ⌊ 0.5 + 255 × min(1, max(0, e[i])) ⌋ which is then placed in bits 8 × i through 8 × i + 7 of the result.
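
As a non-normative illustration, the Python sketch below models the unsigned formula. It reuses the WGSL function name, computes in double precision rather than f32, and is a model rather than a definition.

```python
import math


def pack4x8unorm(e):
    """Model of pack4x8unorm for a 4-element sequence of floats."""
    result = 0
    for i, x in enumerate(e):
        # Clamp to [0, 1], scale to [0, 255], round to nearest.
        q = math.floor(0.5 + 255.0 * min(1.0, max(0.0, x)))
        result |= q << (8 * i)
    return result
```

Inputs outside [0.0, 1.0] are clamped before scaling, so a component of 2.0 packs to the same byte as 1.0.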

17.7.3. pack2x16snorm

Overload
@const fn pack2x16snorm(e: vec2<f32>) -> u32
Description Converts two normalized floating point values to 16-bit signed integers, and then combines them into one u32 value.
Component e[i] of the input is converted to a 16-bit two's complement integer value ⌊ 0.5 + 32767 × min(1, max(-1, e[i])) ⌋ which is then placed in bits 16 × i through 16 × i + 15 of the result.

17.7.4. pack2x16unorm

Overload
@const fn pack2x16unorm(e: vec2<f32>) -> u32
Description Converts two normalized floating point values to 16-bit unsigned integers, and then combines them into one u32 value.
Component e[i] of the input is converted to a 16-bit unsigned integer value ⌊ 0.5 + 65535 × min(1, max(0, e[i])) ⌋ which is then placed in bits 16 × i through 16 × i + 15 of the result.
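
The two 16-bit normalized packings differ from the 8-bit ones only in scale factor and field width. The following non-normative Python sketch models both. It reuses the WGSL names and computes in double precision rather than f32.

```python
import math


def pack2x16snorm(e):
    """Model of pack2x16snorm for a 2-element sequence of floats."""
    result = 0
    for i, x in enumerate(e):
        q = math.floor(0.5 + 32767.0 * min(1.0, max(-1.0, x)))
        result |= (q & 0xFFFF) << (16 * i)  # low 16 bits, two's complement
    return result


def pack2x16unorm(e):
    """Model of pack2x16unorm for a 2-element sequence of floats."""
    result = 0
    for i, x in enumerate(e):
        q = math.floor(0.5 + 65535.0 * min(1.0, max(0.0, x)))
        result |= q << (16 * i)
    return result
```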

17.7.5. pack2x16float

Overload
@const fn pack2x16float(e: vec2<f32>) -> u32
Description Converts two floating point values to half-precision floating point numbers, and then combines them into one u32 value.
Component e[i] of the input is converted to an IEEE-754 binary16 value, which is then placed in bits 16 × i through 16 × i + 15 of the result. See § 13.6.2 Floating Point Conversion.

If either e[0] or e[1] is outside the finite range of binary16 then the result is any value of type u32.
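
A non-normative Python model of this packing can lean on the struct module, whose "e" format code is IEEE-754 binary16. Two assumptions apply. The rounding is whatever Python's struct performs, which need not match every implementation. For finite inputs too large for binary16, struct raises OverflowError, whereas WGSL permits any u32 result.

```python
import struct


def pack2x16float(e):
    """Model of pack2x16float for a 2-element sequence of floats."""
    result = 0
    for i, x in enumerate(e):
        # Encode x as binary16, then read the same two bytes back as a
        # 16-bit unsigned integer to obtain the raw bit pattern.
        (bits,) = struct.unpack("<H", struct.pack("<e", x))
        result |= bits << (16 * i)
    return result
```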

17.8. Data Unpacking Built-in Functions

Data unpacking built-in functions can be used to decode values in data formats that do not correspond directly to types in WGSL. This enables a program to read many densely packed values from memory, which can reduce a shader’s memory bandwidth demand.

Each built-in function breaks up an input value into channels, then applies a channel transfer function to each.

Note: For unpacking unorm values, the normalized floating point result is in the interval [0.0, 1.0].

Note: For unpacking snorm values, the normalized floating point result is in the interval [-1.0, 1.0].

17.8.1. unpack4x8snorm

Overload
@const fn unpack4x8snorm(e: u32) -> vec4<f32>
Description Decomposes a 32-bit value into four 8-bit chunks, then reinterprets each chunk as a signed normalized floating point value.
Component i of the result is max(v ÷ 127, -1), where v is the interpretation of bits 8×i through 8×i + 7 of e as a two's complement signed integer.
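
The max(…, -1) term matters for the bit pattern 0x80, which decodes to -128 and would otherwise map slightly below -1. The following non-normative Python sketch models the decoding. It reuses the WGSL name and divides in double precision rather than f32.

```python
def unpack4x8snorm(e):
    """Model of unpack4x8snorm for a 32-bit unsigned integer e."""
    out = []
    for i in range(4):
        bits = (e >> (8 * i)) & 0xFF
        # Reinterpret the 8-bit field as a two's complement signed integer.
        v = bits - 256 if bits >= 128 else bits
        out.append(max(v / 127.0, -1.0))
    return tuple(out)
```

Both 0x81 (-127) and 0x80 (-128) therefore unpack to exactly -1.0.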

17.8.2. unpack4x8unorm

Overload
@const fn unpack4x8unorm(e: u32) -> vec4<f32>
Description Decomposes a 32-bit value into four 8-bit chunks, then reinterprets each chunk as an unsigned normalized floating point value.
Component i of the result is v ÷ 255, where v is the interpretation of bits 8×i through 8×i + 7 of e as an unsigned integer.
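
As a non-normative illustration, the Python sketch below models the unsigned decoding. It reuses the WGSL name and divides in double precision rather than f32.

```python
def unpack4x8unorm(e):
    """Model of unpack4x8unorm for a 32-bit unsigned integer e."""
    out = []
    for i in range(4):
        v = (e >> (8 * i)) & 0xFF  # 8-bit field read as an unsigned integer
        out.append(v / 255.0)
    return tuple(out)
```

Packing followed by unpacking is lossy in general. For example, 0.5 packs to the byte 0x80 under the unorm formula, and 0x80 unpacks to 128 ÷ 255, which is slightly greater than 0.5.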

17.8.3. unpack2x16snorm

Overload
@const fn unpack2x16snorm(e: u32) -> vec2<f32>
Description Decomposes a 32-bit value into two 16-bit chunks, then reinterprets each chunk as a signed normalized floating point value.
Component i of the result is max(v ÷ 32767, -1), where v is the interpretation of bits 16×i through 16×i + 15 of e as a two's complement signed integer.

17.8.4. unpack2x16unorm

Overload
@const fn unpack2x16unorm(e: u32) -> vec2<f32>
Description Decomposes a 32-bit value into two 16-bit chunks, then reinterprets each chunk as an unsigned normalized floating point value.
Component i of the result is v ÷ 65535, where v is the interpretation of bits 16×i through 16×i + 15 of e as an unsigned integer.

17.8.5. unpack2x16float

Overload
@const fn unpack2x16float(e: u32) -> vec2<f32>
Description Decomposes a 32-bit value into two 16-bit chunks, and reinterprets each chunk as a floating point value.
Component i of the result is the f32 representation of v, where v is the interpretation of bits 16×i through 16×i + 15 of e as an IEEE-754 binary16 value. See § 13.6.2 Floating Point Conversion.
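
A non-normative Python model of this decoding can again use the struct module's "e" (binary16) format. The function reuses the WGSL name. It returns Python floats, which represent every binary16 value exactly.

```python
import struct


def unpack2x16float(e):
    """Model of unpack2x16float for a 32-bit unsigned integer e."""
    out = []
    for i in range(2):
        bits = (e >> (16 * i)) & 0xFFFF
        # Write the raw 16 bits, then read them back as a binary16 value.
        (v,) = struct.unpack("<e", struct.pack("<H", bits))
        out.append(v)
    return tuple(out)
```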

17.9. Synchronization Built-in Functions

WGSL provides the following synchronization functions:

fn storageBarrier()
fn workgroupBarrier()

All synchronization functions execute a control barrier with Acquire/Release memory ordering. That is, all synchronization functions, and affected memory and atomic operations are ordered in program order relative to the synchronization function. Additionally, the affected memory and atomic operations program-ordered before the synchronization function must be visible to all other threads in the workgroup before any affected memory or atomic operation program-ordered after the synchronization function is executed by a member of the workgroup.

All synchronization functions use the Workgroup memory scope. All synchronization functions have a Workgroup execution scope. All synchronization functions must only be used in the compute shader stage.

storageBarrier affects memory and atomic operations in the storage address space.

workgroupBarrier affects memory and atomic operations in the workgroup address space.
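
The write, barrier, read pattern that these functions enable can be sketched by analogy with CPU threads. The following Python model is non-normative and only an analogy. threading.Barrier stands in for workgroupBarrier, a Python list stands in for a variable in the workgroup address space, and run_workgroup is a hypothetical helper. Python's threading semantics are not the WGSL memory model.

```python
import threading


def run_workgroup(size):
    """Each 'invocation' writes one slot, waits, then reads its neighbour."""
    shared = [0] * size                 # stands in for workgroup memory
    results = [0] * size
    barrier = threading.Barrier(size)   # stands in for workgroupBarrier()

    def invocation(local_id):
        shared[local_id] = local_id * 10            # write before the barrier
        barrier.wait()                              # every write is now done
        results[local_id] = shared[(local_id + 1) % size]  # read after it

    threads = [threading.Thread(target=invocation, args=(i,))
               for i in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Without the barrier, an invocation could read its neighbour's slot before the neighbour had written it.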

18. Grammar for Recursive Descent Parsing

This section is non-normative.

The WGSL grammar is specified in a form suitable for an LALR(1) parser. An implementation may want to use a recursive-descent parser instead.

The normative grammar cannot be used directly in a recursive-descent parser, because several of its rules are left-recursive. A grammar rule is directly left-recursive when the nonterminal being defined appears first in one of its productions.
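
To see why left recursion is a problem, consider a toy rule, not taken from the WGSL grammar: sum: sum ('+' | '-') NAME | NAME. A recursive-descent function for sum would call itself before consuming any token and never terminate. The usual remedy is to rewrite the rule as NAME (('+' | '-') NAME)* and parse it with a loop. The non-normative Python sketch below does so for a pre-split token list. It omits error handling.

```python
def parse_sum(tokens):
    """Parse NAME (('+' | '-') NAME)* into a left-associative tree."""
    pos = 0
    tree = tokens[pos]          # the first operand
    pos += 1
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op, rhs = tokens[pos], tokens[pos + 1]
        tree = (op, tree, rhs)  # the tree so far becomes the left operand
        pos += 2
    return tree
```

The loop folds each new operand onto the tree built so far, so the left-associative grouping of the original left-recursive rule is preserved.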

The following is the WGSL grammar, but mechanically transformed to:

However, it is not LL(1). For some nonterminals, several productions have common lookahead sets. For example, all productions for the attribute nonterminal start with the attr token. A more subtle example is global_decl, where three productions start with an attribute * phrase, but then are distinguished by tokens fn, override, and var.
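
One way a hand-written parser can cope with the shared attribute * prefix is to consume the attributes first and only then branch on the keyword. The following non-normative Python sketch assumes a pre-split token list and assumes the attr token is spelled "@". It handles only attributes without arguments. classify_global_decl is a hypothetical helper name.

```python
def classify_global_decl(tokens):
    """Return (keyword, attribute_names) for a simplified global_decl."""
    pos = 0
    attrs = []
    # Consume the common prefix: zero or more argument-less attributes.
    while pos + 1 < len(tokens) and tokens[pos] == "@":
        attrs.append(tokens[pos + 1])
        pos += 2
    keyword = tokens[pos] if pos < len(tokens) else None
    # Only now is there enough information to pick a production.
    if keyword in ("fn", "override", "var"):
        return keyword, attrs
    raise SyntaxError("expected 'fn', 'override' or 'var', got %r" % (keyword,))
```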

For the sake of brevity, many token definitions are not repeated. Use token definitions from the main part of the specification.

access_mode:

| 'read'

| 'read_write'

| 'write'

additive_operator:

| minus

| plus

address_space:

| 'function'

| 'private'

| 'storage'

| 'uniform'

| 'workgroup'

attribute:

| attr 'align' paren_left expression comma ? paren_right

| attr 'binding' paren_left expression comma ? paren_right

| attr 'builtin' paren_left builtin_value_name comma ? paren_right

| attr 'compute'

| attr 'const'

| attr 'fragment'

| attr 'group' paren_left expression comma ? paren_right

| attr 'id' paren_left expression comma ? paren_right

| attr 'interpolate' paren_left interpolation_type_name comma ? paren_right

| attr 'interpolate' paren_left interpolation_type_name comma interpolation_sample_name comma ? paren_right

| attr 'invariant'

| attr 'location' paren_left expression comma ? paren_right

| attr 'size' paren_left expression comma ? paren_right

| attr 'vertex'

| attr 'workgroup_size' paren_left expression comma ? paren_right

| attr 'workgroup_size' paren_left expression comma expression comma ? paren_right

| attr 'workgroup_size' paren_left expression comma expression comma expression comma ? paren_right

bitwise_expression.post.unary_expression:

| and unary_expression ( and unary_expression )*

| or unary_expression ( or unary_expression )*

| xor unary_expression ( xor unary_expression )*

bool_literal:

| 'false'

| 'true'

builtin_value_name:

| 'frag_depth'

| 'front_facing'

| 'global_invocation_id'

| 'instance_index'

| 'local_invocation_id'

| 'local_invocation_index'

| 'num_workgroups'

| 'position'

| 'sample_index'

| 'sample_mask'

| 'vertex_index'

| 'workgroup_id'

callable:

| ident

| mat_prefix

| type_specifier_without_ident

| vec_prefix

| 'array'

case_selector:

| expression

| 'default'

component_or_swizzle_specifier:

| bracket_left expression bracket_right component_or_swizzle_specifier ?

| period member_ident component_or_swizzle_specifier ?

| period swizzle_name component_or_swizzle_specifier ?

compound_assignment_operator:

| and_equal

| division_equal

| minus_equal

| modulo_equal

| or_equal

| plus_equal

| shift_left_equal

| shift_right_equal

| times_equal

| xor_equal

compound_statement:

| brace_left statement * brace_right

core_lhs_expression:

| ident

| paren_left lhs_expression paren_right

decimal_float_literal:

| /0[fh]/

| /[0-9]*\.[0-9]+([eE][+-]?[0-9]+)?[fh]?/

| /[0-9]+[eE][+-]?[0-9]+[fh]?/

| /[0-9]+\.[0-9]*([eE][+-]?[0-9]+)?[fh]?/

| /[1-9][0-9]*[fh]/

decimal_int_literal:

| /0[iu]?/

| /[1-9][0-9]*[iu]?/

depth_texture_type:

| 'texture_depth_2d'

| 'texture_depth_2d_array'

| 'texture_depth_cube'

| 'texture_depth_cube_array'

| 'texture_depth_multisampled_2d'

element_count_expression:

| unary_expression ( multiplicative_operator unary_expression )* ( additive_operator unary_expression ( multiplicative_operator unary_expression )* )*

| unary_expression bitwise_expression.post.unary_expression

expression:

| unary_expression bitwise_expression.post.unary_expression

| unary_expression relational_expression.post.unary_expression

| unary_expression relational_expression.post.unary_expression and_and unary_expression relational_expression.post.unary_expression ( and_and unary_expression relational_expression.post.unary_expression )*

| unary_expression relational_expression.post.unary_expression or_or unary_expression relational_expression.post.unary_expression ( or_or unary_expression relational_expression.post.unary_expression )*

float_literal:

| decimal_float_literal

| hex_float_literal

for_init:

| ident func_call_statement.post.ident

| variable_statement

| variable_updating_statement

for_update:

| ident func_call_statement.post.ident

| variable_updating_statement

func_call_statement.post.ident:

| paren_left ( expression ( comma expression )* comma ? )? paren_right

global_decl:

| attribute * 'fn' ident paren_left ( attribute * ident colon type_specifier ( comma param )* comma ? )? paren_right ( arrow attribute * type_specifier )? brace_left statement * brace_right

| attribute * 'override' optionally_typed_ident ( equal expression )? semicolon

| attribute * 'var' ( less_than address_space ( comma access_mode )? greater_than )? optionally_typed_ident ( equal expression )? semicolon

| semicolon

| 'const' optionally_typed_ident equal expression semicolon

| 'static_assert' expression semicolon

| 'struct' ident brace_left attribute * member_ident colon type_specifier ( comma attribute * member_ident colon type_specifier )* comma ? brace_right

| 'type' ident equal type_specifier semicolon

hex_float_literal:

| /0[xX][0-9a-fA-F]*\.[0-9a-fA-F]+([pP][+-]?[0-9]+[fh]?)?/

| /0[xX][0-9a-fA-F]+[pP][+-]?[0-9]+[fh]?/

| /0[xX][0-9a-fA-F]+\.[0-9a-fA-F]*([pP][+-]?[0-9]+[fh]?)?/

ident: ident_pattern_token
int_literal:

| decimal_int_literal

| hex_int_literal

interpolation_sample_name:

| 'center'

| 'centroid'

| 'sample'

interpolation_type_name:

| 'flat'

| 'linear'

| 'perspective'

lhs_expression:

| ( star | and ) * core_lhs_expression component_or_swizzle_specifier ?

literal:

| bool_literal

| float_literal

| int_literal

mat_prefix:

| 'mat2x2'

| 'mat2x3'

| 'mat2x4'

| 'mat3x2'

| 'mat3x3'

| 'mat3x4'

| 'mat4x2'

| 'mat4x3'

| 'mat4x4'

member_ident: ident_pattern_token
multiplicative_operator:

| forward_slash

| modulo

| star

optionally_typed_ident:

| ident ( colon type_specifier )?

param:

| attribute * ident colon type_specifier

primary_expression:

| callable paren_left ( expression ( comma expression )* comma ? )? paren_right

| ident

| literal

| paren_left expression paren_right

| 'bitcast' less_than type_specifier greater_than paren_left expression paren_right

relational_expression.post.unary_expression:

| shift_expression.post.unary_expression

| shift_expression.post.unary_expression equal_equal unary_expression shift_expression.post.unary_expression

| shift_expression.post.unary_expression greater_than unary_expression shift_expression.post.unary_expression

| shift_expression.post.unary_expression greater_than_equal unary_expression shift_expression.post.unary_expression

| shift_expression.post.unary_expression less_than unary_expression shift_expression.post.unary_expression

| shift_expression.post.unary_expression less_than_equal unary_expression shift_expression.post.unary_expression

| shift_expression.post.unary_expression not_equal unary_expression shift_expression.post.unary_expression

sampled_texture_type:

| 'texture_1d'

| 'texture_2d'

| 'texture_2d_array'

| 'texture_3d'

| 'texture_cube'

| 'texture_cube_array'

sampler_type:

| 'sampler'

| 'sampler_comparison'

shift_expression.post.unary_expression:

| ( multiplicative_operator unary_expression )* ( additive_operator unary_expression ( multiplicative_operator unary_expression )* )*

| shift_left unary_expression

| shift_right unary_expression

statement:

| compound_statement

| ident paren_left ( expression ( comma expression )* comma ? )? paren_right semicolon

| variable_statement semicolon

| variable_updating_statement semicolon

| break_statement semicolon

| continue_statement semicolon

| semicolon

| 'discard' semicolon

| 'for' paren_left for_init ? semicolon expression ? semicolon for_update ? paren_right compound_statement

| 'if' expression compound_statement ( 'else' 'if' expression compound_statement )* ( 'else' compound_statement )?

| 'loop' brace_left statement * ( 'continuing' brace_left statement * ( 'break' 'if' expression semicolon )? brace_right )? brace_right

| 'return' expression ? semicolon

| 'static_assert' expression semicolon

| 'switch' expression brace_left switch_body * brace_right

| 'while' expression compound_statement

storage_texture_type:

| 'texture_storage_1d'

| 'texture_storage_2d'

| 'texture_storage_2d_array'

| 'texture_storage_3d'

switch_body:

| 'case' case_selector ( comma case_selector )* comma ? colon ? compound_statement

| 'default' colon ? compound_statement

swizzle_name:

| '/[rgba]/'

| '/[rgba][rgba]/'

| '/[rgba][rgba][rgba]/'

| '/[rgba][rgba][rgba][rgba]/'

| '/[xyzw]/'

| '/[xyzw][xyzw]/'

| '/[xyzw][xyzw][xyzw]/'

| '/[xyzw][xyzw][xyzw][xyzw]/'

texel_format:

| 'r32float'

| 'r32sint'

| 'r32uint'

| 'rg32float'

| 'rg32sint'

| 'rg32uint'

| 'rgba16float'

| 'rgba16sint'

| 'rgba16uint'

| 'rgba32float'

| 'rgba32sint'

| 'rgba32uint'

| 'rgba8sint'

| 'rgba8snorm'

| 'rgba8uint'

| 'rgba8unorm'

texture_and_sampler_types:

| depth_texture_type

| sampled_texture_type less_than type_specifier greater_than

| sampler_type

| storage_texture_type less_than texel_format comma access_mode greater_than

| multisampled_texture_type less_than type_specifier greater_than

translation_unit:

| ( 'enable' float16 semicolon )* global_decl *

type_specifier:

| ident

| type_specifier_without_ident

type_specifier_without_ident:

| mat_prefix less_than type_specifier greater_than

| texture_and_sampler_types

| vec_prefix less_than type_specifier greater_than

| float16

| float32

| int32

| pointer less_than address_space comma type_specifier ( comma access_mode )? greater_than

| uint32

| 'array' less_than type_specifier ( comma element_count_expression )? greater_than

| 'atomic' less_than type_specifier greater_than

| 'bool'

unary_expression:

| primary_expression component_or_swizzle_specifier ?

| and unary_expression

| bang unary_expression

| minus unary_expression

| star unary_expression

| tilde unary_expression

variable_decl:

| 'var' ( less_than address_space ( comma access_mode )? greater_than )? optionally_typed_ident

variable_statement:

| variable_decl

| variable_decl equal expression

| 'const' optionally_typed_ident equal expression

| 'let' optionally_typed_ident equal expression

variable_updating_statement:

| ( star | and ) * core_lhs_expression component_or_swizzle_specifier ? ( equal | compound_assignment_operator ) expression

| ( star | and ) * core_lhs_expression component_or_swizzle_specifier ? minus_minus

| ( star | and ) * core_lhs_expression component_or_swizzle_specifier ? plus_plus

| underscore equal expression

vec_prefix:

| 'vec2'

| 'vec3'

| 'vec4'
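
The regular-expression productions for decimal_int_literal and decimal_float_literal above can be exercised directly. The following non-normative Python sketch copies those patterns verbatim and classifies a complete token. classify is a hypothetical helper name. A real scanner would also need the hexadecimal literal rules and a longest-match strategy, which are out of scope here.

```python
import re

DECIMAL_INT = [
    r"0[iu]?",
    r"[1-9][0-9]*[iu]?",
]

DECIMAL_FLOAT = [
    r"0[fh]",
    r"[0-9]*\.[0-9]+([eE][+-]?[0-9]+)?[fh]?",
    r"[0-9]+[eE][+-]?[0-9]+[fh]?",
    r"[0-9]+\.[0-9]*([eE][+-]?[0-9]+)?[fh]?",
    r"[1-9][0-9]*[fh]",
]


def classify(text):
    """Name the decimal literal production that matches all of text."""
    if any(re.fullmatch(p, text) for p in DECIMAL_INT):
        return "decimal_int_literal"
    if any(re.fullmatch(p, text) for p in DECIMAL_FLOAT):
        return "decimal_float_literal"
    return None
```

Under these patterns a leading zero is accepted only for the literal zero itself, so "01" matches neither production.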

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Conformant Algorithms

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

Conformance requirements phrased as algorithms or specific steps can be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to understand and are not intended to be performant. Implementers are encouraged to optimize.

References

Normative References

[DeRemer1969]
Practical Translators for LR(k) Languages. 24 October 1969. URL: http://publications.csail.mit.edu/lcs/pubs/pdf/MIT-LCS-TR-065.pdf
[IEEE-754]
IEEE Standard for Floating-Point Arithmetic. 29 August 2008. URL: http://ieeexplore.ieee.org/servlet/opac?punumber=4610933
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[UAX14]
Christopher Chapman. Unicode Line Breaking Algorithm. 16 August 2022. Unicode Standard Annex #14. URL: https://www.unicode.org/reports/tr14/tr14-49.html
[UAX31]
Mark Davis; Robin Leroy. Unicode Identifier and Pattern Syntax. 31 August 2022. Unicode Standard Annex #31. URL: https://www.unicode.org/reports/tr31/tr31-37.html
[UnicodeVersion14]
The Unicode Standard, Version 14.0.0. URL: http://www.unicode.org/versions/Unicode14.0.0/
[VanWyk2007]
Eric R. Van Wyk; August C. Schwerdfeger. Context-Aware Scanning for Parsing Extensible Languages. 2007. URL: https://dl.acm.org/doi/10.1145/1289971.1289983
[VulkanMemoryModel]
Jeff Bolz; et al. Vulkan Memory Model. URL: https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#memory-model
[WebGPU]
Dzmitry Malyshau; Kai Ninomiya; Brandon Jones. WebGPU. Editor's Draft. URL: https://gpuweb.github.io/gpuweb/
[WEBIDL]
Edgar Chen; Timothy Gu. Web IDL Standard. Living Standard. URL: https://webidl.spec.whatwg.org/

Informative References

[CHARMOD-NORM]
Addison Phillips; et al. Character Model for the World Wide Web: String Matching. 11 August 2021. NOTE. URL: https://www.w3.org/TR/charmod-norm/

Issues Index

Can we query upper bounds on workgroup size dimensions? Is it independent of the shader, or a property to be queried after creating the shader module?
Check behavior of the f32 to f16 conversion for numbers just beyond the max normal f16 values. I’ve written what an NVIDIA GPU does. See https://github.com/google/amber/pull/918 for an executable test case.
https://github.com/gpuweb/gpuweb/issues/1621