This specification defines the term URL, and defines various algorithms for dealing with URLs, because for historical reasons the rules defined by the URI and IRI specifications are not a complete description of what HTML user agents need to implement to be compatible with Web content.
The term "URL" in this specification is used in a manner distinct from the precise technical meaning it is given in RFC 3986. Readers familiar with that RFC will find it easier to read this specification if they pretend the term "URL" as used herein is really called something else altogether. This is a willful violation of RFC 3986. [RFC3986]
A URL is a string used to identify a resource.
A URL is a valid URL if at least one of the following conditions holds:
The URL is a valid IRI reference and it has no query component. [RFC3987]
The URL is a valid IRI reference and its query component contains no unescaped non-ASCII characters. [RFC3987]
The URL is a valid IRI reference and the character encoding of the URL's Document is UTF-8 or a UTF-16 encoding. [RFC3987]
A string is a valid non-empty URL if it is a valid URL but it is not the empty string.
A string is a valid URL potentially surrounded by spaces if, after stripping leading and trailing whitespace from it, it is a valid URL.
A string is a valid non-empty URL potentially surrounded by spaces if, after stripping leading and trailing whitespace from it, it is a valid non-empty URL.
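The validity conditions above can be sketched in Python as follows. This is a minimal illustration, not a conformance checker: the RFC 3987 grammar is not implemented here, so the caller supplies a hypothetical `is_valid_iri_reference` predicate. The function names, the set of encoding labels, and the choice of HTML space characters as "whitespace" are assumptions made for the example.

```python
from typing import Callable, Optional

# Assumption: "whitespace" means the HTML space characters.
HTML_SPACE_CHARACTERS = " \t\n\f\r"

# Assumption: illustrative labels for UTF-8 and the UTF-16 encodings.
UNICODE_ENCODINGS = {"utf-8", "utf-16", "utf-16le", "utf-16be"}


def query_component(url: str) -> Optional[str]:
    """Return the query component, or None if the URL has none."""
    before_fragment = url.split("#", 1)[0]
    if "?" not in before_fragment:
        return None
    return before_fragment.split("?", 1)[1]


def is_valid_url(url: str, document_encoding: str,
                 is_valid_iri_reference: Callable[[str], bool]) -> bool:
    """True if at least one of the three conditions holds."""
    if not is_valid_iri_reference(url):
        return False
    query = query_component(url)
    if query is None:
        return True  # no query component
    if query.isascii():
        return True  # percent-escaped characters are ASCII
    return document_encoding.lower() in UNICODE_ENCODINGS


def is_valid_non_empty_url(url, document_encoding, is_valid_iri_reference):
    return url != "" and is_valid_url(
        url, document_encoding, is_valid_iri_reference)


def is_valid_url_potentially_surrounded_by_spaces(
        url, document_encoding, is_valid_iri_reference):
    stripped = url.strip(HTML_SPACE_CHARACTERS)
    return is_valid_url(stripped, document_encoding, is_valid_iri_reference)


def is_valid_non_empty_url_potentially_surrounded_by_spaces(
        url, document_encoding, is_valid_iri_reference):
    stripped = url.strip(HTML_SPACE_CHARACTERS)
    return is_valid_non_empty_url(
        stripped, document_encoding, is_valid_iri_reference)
```

Passing the IRI check in as a parameter keeps the sketch honest about what it does not verify; a real implementation would substitute a parser for the RFC 3987 grammar.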
This specification defines the URL about:legacy-compat as a reserved, though unresolvable, about: URI, for use in DOCTYPEs in HTML documents when needed for compatibility with XML tools. [ABOUT]
This specification defines the URL about:srcdoc as a reserved, though unresolvable, about: URI, that is used as the document's address of iframe srcdoc documents. [ABOUT]
The fallback base URL of a Document object is the absolute URL obtained by running these substeps:
If the Document is an iframe srcdoc document, then return the document base URL of the Document's browsing context's browsing context container's Document and abort these steps.
If the document's address is about:blank, and the Document's browsing context has a creator browsing context, then return the document base URL of the creator Document, and abort these steps.
Return the document's address.
The document base URL of a Document object is the absolute URL obtained by running these substeps:
Let fallback base url be the Document's fallback base URL.
If there is no base element that has an href attribute, then the document base URL is fallback base url; abort these steps. Otherwise, let url be the value of the href attribute of the first such element.
Resolve url relative to fallback base url (thus, the base href attribute isn't affected by xml:base attributes).
The document base URL is the result of the previous step if it was successful; otherwise it is fallback base url.
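Because the fallback base URL and document base URL algorithms refer to each other, it can help to see them side by side. The following Python sketch models a Document with a small hypothetical dataclass; the field names are assumptions for illustration, and `urllib.parse.urljoin` stands in for this specification's resolve algorithm, which it only approximates.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urljoin, urlsplit


@dataclass
class Document:
    """Hypothetical stand-in for a Document object."""
    address: str
    # href of the first base element that has an href attribute, if any
    base_href: Optional[str] = None
    is_iframe_srcdoc: bool = False
    # Document of the browsing context container (for srcdoc documents)
    container_document: Optional["Document"] = None
    # Document of the creator browsing context, if there is one
    creator_document: Optional["Document"] = None


def resolve(url: str, base: str) -> Optional[str]:
    """Approximate resolution with urljoin; None signals failure."""
    try:
        resolved = urljoin(base, url)
    except ValueError:
        return None
    return resolved if urlsplit(resolved).scheme else None


def fallback_base_url(doc: Document) -> str:
    if doc.is_iframe_srcdoc and doc.container_document is not None:
        return document_base_url(doc.container_document)
    if doc.address == "about:blank" and doc.creator_document is not None:
        return document_base_url(doc.creator_document)
    return doc.address


def document_base_url(doc: Document) -> str:
    fallback = fallback_base_url(doc)
    if doc.base_href is None:
        return fallback
    resolved = resolve(doc.base_href, fallback)
    # Use the fallback base URL if resolving the base href failed.
    return resolved if resolved is not None else fallback
```

For example, an about:blank document created by a page whose document base URL is http://example.com/assets/ inherits that URL as its fallback base URL, and its own base href is then resolved against it.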
Resolving a URL is the process of taking a relative URL and obtaining the absolute URL that it implies.
A URL is an absolute URL if resolving it results in the same output regardless of what it is resolved relative to, and that output is not a failure.
An absolute URL is a hierarchical URL if, when resolved and then parsed, there is a character immediately after the <scheme> component and it is a "/" (U+002F) character.
An absolute URL is an authority-based URL if, when resolved and then parsed, there are two characters immediately after the <scheme> component and they are both "/" (U+002F) characters (that is, "//").
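The three definitions above can be approximated in Python as shown below. This is a sketch under stated assumptions: `urllib.parse.urljoin` stands in for the resolve algorithm, "the same output regardless of what it is resolved relative to" is tested against only two arbitrary base URLs, and "immediately after the <scheme> component" is read as "immediately after the colon that ends the scheme". The function names and the example bases are illustrative only.

```python
from urllib.parse import urljoin, urlsplit

# Assumption: two unrelated bases are enough to expose relative URLs.
_BASES = ("http://a.example/dir/", "ftp://b.example/other/")


def is_absolute_url(url: str) -> bool:
    """True if resolving gives the same result against every test base."""
    try:
        results = {urljoin(base, url) for base in _BASES}
    except ValueError:
        return False  # resolution failed
    return len(results) == 1


def _after_scheme(url: str) -> str:
    """Return what follows the scheme and its terminating colon."""
    scheme = urlsplit(url).scheme
    return url[len(scheme) + 1:]


def is_hierarchical_url(url: str) -> bool:
    return is_absolute_url(url) and _after_scheme(url).startswith("/")


def is_authority_based_url(url: str) -> bool:
    return is_absolute_url(url) and _after_scheme(url).startswith("//")
```

Under this reading, every authority-based URL is also hierarchical, while a URL such as mailto:user@example.com is absolute but neither hierarchical nor authority-based.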
An interface that has a complement of URL decomposition IDL attributes has seven attributes with the following definitions:
           attribute DOMString protocol;
           attribute DOMString host;
           attribute DOMString hostname;
           attribute DOMString port;
           attribute DOMString pathname;
           attribute DOMString search;
           attribute DOMString hash;
protocol [ = value ]
Returns the current scheme of the underlying URL.
Can be set, to change the underlying URL's scheme.
host [ = value ]
Returns the current host and port (if it's not the default port) in the underlying URL.
Can be set, to change the underlying URL's host and port.
The host and the port are separated by a colon. The port part, if omitted, will be assumed to be the current scheme's default port.
hostname [ = value ]
Returns the current host in the underlying URL.
Can be set, to change the underlying URL's host.
port [ = value ]
Returns the current port in the underlying URL.
Can be set, to change the underlying URL's port.
pathname [ = value ]
Returns the current path in the underlying URL.
Can be set, to change the underlying URL's path.
search [ = value ]
Returns the current query component in the underlying URL.
Can be set, to change the underlying URL's query component.
hash [ = value ]
Returns the current fragment identifier in the underlying URL.
Can be set, to change the underlying URL's fragment identifier.
The table below demonstrates how the getter for search returns different results depending on the exact original syntax of the URL:
Input URL | search value | Explanation |
---|---|---|
http://example.com/ | empty string | No <query> component in input URL. |
http://example.com/? | ? | There is a <query> component, but it is empty. |
http://example.com/?test | ?test | The <query> component has the value "test". |
http://example.com/?test# | ?test | The (empty) <fragment> component is not part of the <query> component. |
The following table is similar; it provides a list of what each of the URL decomposition IDL attributes returns for a given input URL.
Input | protocol | host | hostname | port | pathname | search | hash |
---|---|---|---|---|---|---|---|
http://example.com/carrot#question%3f | http: | example.com | example.com | (empty string) | /carrot | (empty string) | #question%3f |
https://www.example.com:4443? | https: | www.example.com:4443 | www.example.com | 4443 | / | ? | (empty string) |
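The getter behavior shown in the two tables above can be reproduced with a short Python sketch. It is an approximation rather than the specification's parsing rules: it relies on `urllib.parse.urlsplit`, assumes an authority-based input URL (so an empty path is reported as "/"), and uses a small hypothetical table of default ports. The `decompose` name and the dictionary layout are assumptions for illustration.

```python
from urllib.parse import urlsplit

# Assumption: a small illustrative table of default ports.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def decompose(url: str) -> dict:
    """Return the seven decomposition values for an authority-based URL."""
    parts = urlsplit(url)

    # urlsplit cannot distinguish "no query" from "empty query",
    # so look for the "?" delimiter directly.
    before_fragment, _, fragment = url.partition("#")
    _, query_delimiter, query = before_fragment.partition("?")

    hostname = parts.hostname or ""
    port = parts.port
    if port is not None and DEFAULT_PORTS.get(parts.scheme) == port:
        port = None  # the default port is not reported
    port_text = "" if port is None else str(port)

    return {
        "protocol": parts.scheme + ":",
        "host": hostname + (":" + port_text if port_text else ""),
        "hostname": hostname,
        "port": port_text,
        "pathname": parts.path or "/",
        # "?" is included whenever a query component is present, even if empty.
        "search": "?" + query if query_delimiter else "",
        # An empty fragment yields the empty string.
        "hash": "#" + fragment if fragment else "",
    }
```

Calling `decompose("https://www.example.com:4443?")` yields the values in the second row of the table above, including a `search` value of "?" for the present but empty query component.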
A CORS settings attribute is an enumerated attribute. The following table lists the keywords and states for the attribute — the keywords in the left column map to the states in the cell in the second column on the same row as the keyword.
Keyword | State | Brief description |
---|---|---|
anonymous | Anonymous | Cross-origin CORS requests for the element will have the omit credentials flag set. |
use-credentials | Use Credentials | Cross-origin CORS requests for the element will not have the omit credentials flag set. |
The empty string is also a valid keyword, and maps to the Anonymous state. The attribute's invalid value default is the Anonymous state. The missing value default, used when the attribute is omitted, is the No CORS state.
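The mapping from attribute value to state can be sketched as a small Python function. The function name and the use of `None` to represent an omitted attribute are assumptions for illustration; the sketch also assumes that keywords are matched ASCII case-insensitively, as is usual for enumerated attributes.

```python
from typing import Optional


def ascii_lowercase(value: str) -> str:
    """Lowercase only the ASCII letters A to Z."""
    return "".join(
        chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in value)


def cors_settings_state(value: Optional[str]) -> str:
    """Map a CORS settings attribute value to its state.

    None represents an omitted attribute (assumption for this sketch).
    """
    if value is None:
        return "No CORS"  # missing value default
    keyword = ascii_lowercase(value)
    if keyword == "use-credentials":
        return "Use Credentials"
    # "anonymous", the empty string, and any invalid value
    # all map to the Anonymous state.
    return "Anonymous"
```

Note that only an omitted attribute produces the No CORS state; an attribute that is present with an unrecognized value falls back to Anonymous.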