In MathML3, content markup is divided into two subsets: "Strict" and "Pragmatic" Content MathML. The first subset uses a minimal set of elements representing the meaning of a mathematical expression in a uniform structure, while the second one tries to strike a pragmatic balance between verbosity and formality. Both forms of content expressions are legitimate and have their role in representing mathematics. Strict Content MathML is canonical in a sense and simplifies the implementation of content MathML processors and the comparison of content expressions, whereas Pragmatic Content MathML is much simpler and more intuitive for humans to understand, read, and write.
Strict Content MathML3 expressions can directly be given a formal semantics in terms of "OpenMath Objects" [OpenMath2004], and we interpret Pragmatic Content MathML3 expressions by specifying equivalent Strict variants, so that they inherit their semantics.
Editorial note: MiKo | |
We are using the notions of "Strict" and "Pragmatic" Content MathML in this working draft, even though they do not fully convey the intention of the representation choices. However, they carry the intuition much better than the terms "canonical" and "legacy" we used before, since they are less judgmental. |
MathML content encoding is based on the concept of an expression tree built up from
basic expressions, i.e. Numbers, Symbols, and Identifiers
derived expressions, i.e. function applications and binding expressions, and
As a general rule, the terminal nodes in the tree represent basic mathematical objects such as numbers, variables, arithmetic operations and so on. The internal nodes in the tree generally represent some kind of function application or other mathematical construction that builds up a compound object. Function application provides the most important example; an internal node might represent the application of a function to several arguments, which are themselves represented by the terminal nodes underneath the internal node.
This section provides the basic XML Encoding of content MathML expression trees. General usage and the mechanism used to associate mathematical meaning with symbols are provided here. [mathml3cds] provides a complete listing of the specific Content MathML symbols defined by this specification along with full reference information including attributes, syntax, and examples. It also describes the intended semantics of those symbols and suggests default renderings. The rules for using presentation markup within content markup are explained in Section 5.4.2 Presentation Markup in Content Markup.
Strict Content MathML is designed to be an XML encoding of OpenMath Objects (see [OpenMath2004]), which constitute the semantics of strict content MathML expressions. The table below gives an element-by-element correspondence between the OpenMath XML encoding of OpenMath objects and strict content MathML.
strict Content MathML | OpenMath |
---|---|
cn | OMI, OMF |
csymbol | OMS |
ci | OMV |
apply | OMA |
bind | OMBIND |
bvar | OMBVAR |
condition | OMC |
share | OMR |
semantics | OMATTR, OMATP |
annotation, annotation-xml | OMFOREIGN |
error | OME |
Note that with this correspondence, strict content MathML also gains the OpenMath binary encoding as a space-efficient way of encoding content MathML expressions.
The cn element is the MathML token element used to represent numbers. The supported types of numbers include integers, real numbers, double precision floating point numbers, rational numbers and complex numbers. Where it makes sense, the base in which the number is written can be specified.
For most numeric values, the content of a cn element should be either PCDATA or other cn elements.
The permissible attributes on the cn element are:
Name | Values | Default |
---|---|---|
type | "integer", "real", "double", "e-notation", "rational", "complex-cartesian", "complex-polar" | real |
base | number | 10 |
The attribute type
is used to specify the kind of number being represented.
The pre-defined values are given in the table above. Unless otherwise specified,
the default "real" is used.
The attribute base
is used to specify how the content is to be parsed.
The attribute value is a base 10 positive integer giving the value of base in
which the PCDATA is to be interpreted.
The base
attribute should only be used on elements with type
"integer" or "real". Its use on cn
elements of other type is deprecated.
The default value for base
is "10".
Each data type implies that the content be of a certain form, as detailed below.
An integer is represented by an optional sign followed by a string of one or more "digits". How a "digit" is interpreted depends on the base attribute. If base is present, it specifies the base for the digit encoding, and its value is itself written in base 10. Thus base='16' specifies a hexadecimal encoding. When base > 10, letters are used in alphabetical order as digits. For example,
<cn base="16">7FE0</cn>
encodes the number written as 32736 in base ten. When base > 36, some integers cannot be represented using digits and letters alone, and it is up to the application what additional characters (if any) may be used for digits. For example,
<cn base="1000">10F</cn>
represents the number written in base 10 as 1,000,015. However, the number written in base 10 as 1,000,037 cannot be represented using letters and digits alone when base is 1000.
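The digit rule for integers can be sketched in a few lines of Python. This is only an illustration: the helper name parse_cn_integer is hypothetical, and treating lower-case letters like upper-case ones is an assumption the text above does not state. Digit values above 35 are rejected, since the choice of additional digit characters is left to the application.

```python
import string

# Assumption: digit characters are 0-9 followed by A-Z (case-insensitive),
# which covers digit values 0..35 only.
DIGITS = string.digits + string.ascii_uppercase


def parse_cn_integer(text, base=10):
    """Interpret the PCDATA of an integer cn element in the given base."""
    text = text.strip()
    sign = 1
    if text[:1] in ("+", "-"):
        sign = -1 if text[0] == "-" else 1
        text = text[1:]
    if not text:
        raise ValueError("an integer needs at least one digit")
    value = 0
    for ch in text.upper():
        digit = DIGITS.find(ch)
        if digit < 0 or digit >= base:
            raise ValueError(f"{ch!r} is not a digit in base {base}")
        value = value * base + digit
    return sign * value


print(parse_cn_integer("7FE0", 16))   # 32736
print(parse_cn_integer("10F", 1000))  # 1000015
```

Both printed values match the two examples given in the text.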
A real number is presented in radix notation. Radix notation consists of an optional sign ("+" or "-") followed by a string of digits possibly separated into an integer and a fractional part by a "decimal point". Some examples are 0.3, 1, and -31.56. If a different base is specified, then the digits are interpreted as being digits in that base (in the same way as described for type "integer").
The type "double" is used to mark up those double-precision floating point numbers that can be represented in the IEEE 754 standard. This includes a subset of the (mathematical) real numbers, negative zero, positive and negative real infinity and a set of "not a number" values.
The content of a cn element may be PCDATA (representing numeric values as described below), an infinity element (representing positive real infinity), a minfinity element (representing negative real infinity) or a notanumber element.
If the content is PCDATA, it is interpreted as a real number in scientific notation. The number then has one or two parts, a significand and possibly an exponent. The significand has the format of a base 10 real number, as described above. The exponent (if present) has the format of a base 10 integer as described above. If the exponent is not present, it is taken to have the value 0. The value of the number is then that of the significand times ten to the power of the exponent.
A special case of PCDATA content is recognized. If a number of the above form has a negative sign and all digits of the significand are zero, then it is taken to be a negative zero in the sense of the IEEE 754 standard.
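As a rough sketch of the PCDATA rule for double-precision numbers, the following Python function computes the significand times ten to the power of the exponent. The function name is hypothetical, and the use of the letter "e" to separate significand and exponent is an assumption, since the text above does not name the separator. Scaling by multiplication may also differ from a correctly rounded parse in the last bit.

```python
import math


def parse_cn_double(text):
    """Return significand * 10**exponent for the PCDATA of a double cn."""
    # Assumption: significand and exponent are separated by the letter "e".
    significand, _, exponent = text.strip().lower().partition("e")
    power = int(exponent) if exponent else 0  # a missing exponent counts as 0
    # float("-0") is IEEE 754 negative zero, and multiplying it by a positive
    # power of ten keeps the sign, so the special case needs no extra branch.
    return float(significand) * 10.0 ** power


print(parse_cn_double("1.5e3"))                        # 1500.0
print(math.copysign(1.0, parse_cn_double("-0.000")))   # -1.0
```

The second line shows that a negatively signed all-zero significand yields a negative zero.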
The type "e-notation" is deprecated. It is recommended to use double or real instead. A real number may be presented in scientific notation using this type. Such numbers have two parts (a significand and an exponent) separated by a <sep/> element. The first part is a real number, while the second part is an integer exponent indicating a power of the base. For example, 12.3<sep/>5 represents 12.3 times 10^5. The default presentation of this example is 12.3e5.
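The two-part structure around <sep/> maps directly onto how a parser such as Python's xml.etree.ElementTree exposes mixed content: the significand is the text of cn, and the exponent is the tail text of sep. The sketch below assumes base 10 and uses a hypothetical function name.

```python
import xml.etree.ElementTree as ET


def parse_e_notation(cn):
    """Evaluate <cn type="e-notation">significand<sep/>exponent</cn> (base 10)."""
    sep = cn.find("sep")
    if sep is None:
        raise ValueError("expected a <sep/> between significand and exponent")
    significand = float(cn.text)  # text before <sep/>
    exponent = int(sep.tail)      # text after <sep/>
    return significand * 10.0 ** exponent


cn = ET.fromstring('<cn type="e-notation">12.3<sep/>5</cn>')
print(parse_e_notation(cn))
```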
A rational number is given as two integers giving the numerator
and denominator of a quotient. These should themselves be given as
nested cn
elements.
For backward compatibility, deprecated usage allows the two integers to
be given as PCDATA separated by <sep/>
.
If a base
is present in this deprecated use,
it specifies the base used for the digit encoding of both integers.
A complex cartesian number is given as two numbers giving the
real and imaginary parts. These should themselves be given
as nested cn
elements. As for rational numbers,
the deprecated use of <sep/>
is also allowed.
A complex polar number is given as two numbers giving the
magnitude and angle. These should themselves be given
as nested cn
elements. As for rational numbers,
the deprecated use of <sep/>
is also allowed.
This type was deprecated in MathML 2.0 and is now no longer supported.
The notion of constructing a general expression tree is essentially that of applying an operator to sub-objects. For example, the sum "x+y" can be thought of as an application of the addition operator to two arguments x and y, and the expression "cos(π)" as the application of the cosine function to the number π.
In Content MathML, elements are used for operators and functions to capture the
crucial semantic distinction between the function itself and the expression
resulting from applying that function to zero or more arguments. This is addressed
by making the functions self-contained objects with their own properties and
providing an explicit apply
construct corresponding to function
application. We will consider the apply
construct in the next section.
In the sum expression "x+y" above, x and y are typically taken to be "variables", since they have properties, but no fixed value, whereas the addition function is a "constant" or "symbol" as it denotes a specific function, which is defined somewhere externally. (Note that "symbol" is used here in the abstract sense and has no connection with any presentation of the construct on screen or paper.)
Strict Content MathML3 uses the ci element (for "content identifier") to construct a variable, or an identifier that is not a symbol. Its PCDATA content is interpreted as a name that identifies it. Two variables are considered equal if and only if their names are identical and they are in the same scope (see Section 4.2.6 Bindings and Bound Variables for a discussion). A type attribute indicates the type of object the identifier represents. Typically, ci represents a real scalar, but no default is specified.
Name | values | default |
---|---|---|
type | string | unspecified |
Due to the nature of mathematics, the meaning of mathematical expressions must be extensible. The key to extensibility is the ability of the user to define new functions and other symbols to expand the terrain of mathematical discourse. The csymbol element is used to represent a "symbol" in much the same way that ci is used to construct a variable. The difference is that csymbol should refer to some mathematically defined concept with an external definition referenced via the content dictionary attributes, whereas ci is used for identifiers that are essentially "local" to the MathML expression.
In MathML3, external definitions are grouped in Content Dictionaries (structured documents for the definition of mathematical concepts; see [OpenMath2004] and [mathml3cds]).
We need three bits of information to fully identify a symbol: a symbol name, a Content Dictionary name, and (optionally) a Content Dictionary base URI. We encode these in the textual content of the csymbol element (which is the symbol name) and in two of its attributes: cd and cdbase. The Content Dictionary is the location of the declaration of the symbol, consisting of a name and, optionally, a unique prefix called a cdbase, which is used to disambiguate multiple Content Dictionaries of the same name. There are multiple encodings for content dictionaries; this referencing scheme does not distinguish between them. If a symbol does not have an explicit cdbase attribute, then it inherits its cdbase from the first ancestor in the XML tree with one, should such an element exist. In this document we have tended to omit the cdbase for brevity.
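The inheritance rule for cdbase amounts to a walk from the symbol towards the root of the XML tree. The Python sketch below illustrates this under stated assumptions: effective_cdbase is a hypothetical helper, and the second base URI (http://cd.example.org) and the symbol g are made up for the example.

```python
import xml.etree.ElementTree as ET


def effective_cdbase(root, element):
    """Return the cdbase in force for element, or None if nothing sets one."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = element
    while node is not None:
        cdbase = node.get("cdbase")
        if cdbase is not None:
            return cdbase
        node = parent_of.get(node)
    return None


doc = ET.fromstring(
    '<apply cdbase="http://www.example.com">'
    '<csymbol cd="VectorCalculus">Christoffel</csymbol>'
    '<apply><csymbol cd="other" cdbase="http://cd.example.org">g</csymbol></apply>'
    '</apply>'
)
first, second = doc.iter("csymbol")
print(effective_cdbase(doc, first))   # http://www.example.com
print(effective_cdbase(doc, second))  # http://cd.example.org
```

The first symbol inherits the value from its enclosing apply, while the second uses its own explicit attribute.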
Name | values | default |
---|---|---|
cdbase | URI | inherited |
cd | URI | required |
Editorial note: MiKo | |
need to fix the default URI here |
Issue default_cd | wiki (member only) |
---|---|
Current CD default for csymbol |
|
We might make the |
|
Resolution | None recorded |
There are other properties of the symbol that are not explicit in these fields but whose values may be obtained by inspecting the Content Dictionary specified. These include the symbol definition, formal properties and examples and, optionally, a Role which is a restriction on where the symbol may appear in a MathML expression tree. The possible roles are described in Section 8.5 Symbol Roles.
<csymbol cdbase="http://www.example.com" cd="VectorCalculus">Christoffel</csymbol>
For backwards compatibility with MathML2 and to facilitate the use of MathML within a URI-based framework (such as RDF [rdf] or OWL [owl]), the name, cd, and cdbase can be combined in the definitionURL attribute: we provide the following scheme for constructing a canonical URI for a MathML symbol, which can be given in the definitionURL attribute.
URI = cdbase-value + '/' + cd-value + '#' + name-value
In the case of the Christoffel symbol above this would be the URL
http://www.example.com/VectorCalculus#Christoffel
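The construction scheme is plain string concatenation, as this small Python sketch shows. The function name and the inherited_cdbase parameter are hypothetical; the latter stands in for a cdbase obtained from an ancestor element.

```python
import xml.etree.ElementTree as ET


def canonical_symbol_uri(csymbol, inherited_cdbase=None):
    """Build cdbase + '/' + cd + '#' + name for a csymbol element."""
    cdbase = csymbol.get("cdbase", inherited_cdbase)
    if cdbase is None:
        raise ValueError("no cdbase available for this symbol")
    return f"{cdbase}/{csymbol.get('cd')}#{csymbol.text.strip()}"


sym = ET.fromstring(
    '<csymbol cdbase="http://www.example.com" cd="VectorCalculus">Christoffel</csymbol>'
)
print(canonical_symbol_uri(sym))
# http://www.example.com/VectorCalculus#Christoffel
```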
For backwards compatibility with MathML2, we do not require that the
definitionURL
point to a content dictionary. But if the URL in this
attribute is of the form above, it will be interpreted as the canonical URL of a
MathML3 symbol. So the representation above would be equivalent to the one below:
<csymbol definitionURL="http://www.example.com/VectorCalculus">Christoffel</csymbol>
Issue MathML_CDs_URI | wiki (member only) |
---|---|
What is the official URI for MathMLCDs | |
We still have to fix this. Maybe it should correspond to the final resting place for CDs. |
|
Resolution | None recorded |
Issue definitionURL_encoding | wiki (member only) ISSUE-17 (member only) |
---|---|
URI encoding of cdbase /cd /name triplet
|
|
The URI encoding of the triplet we propose here does not work (not yet for
MathMLCDs and not at all for OpenMath2 CDs). The URI reference proposed uses a bare
name pointer |
|
Resolution | None recorded |
Issue cdbase-default | wiki (member only) ISSUE-13 (member only) |
---|---|
cdbase default value | |
For the inheritance mechanism to be complete, it would make sense to define a
default cdbase attribute value, e.g. at the math element. We'd support
expressions ignorant of cdbase as they all are thus far. Something such as
|
|
Resolution | None recorded |
The most fundamental way of building a compound object in mathematics is by applying a function or an operator to some arguments. MathML supplies an infrastructure to represent this in expression trees, which we will present in this section.
An apply
element is used to build an expression tree that represents the
result of applying a function or operator to its arguments. The tree corresponds to
a complete mathematical expression. Roughly speaking, this means a piece of
mathematics that could be surrounded by parentheses or "logical
brackets" without changing its meaning.
Name | values | default |
---|---|---|
cdbase | URI | inherited |
For example, (x + y) might be encoded as
<apply><csymbol cd="algebra-logic">plus</csymbol><ci>x</ci><ci>y</ci></apply>
The opening and closing tags of apply
specify exactly the scope of any
operator or function. The most typical way of using apply
is simple and
recursive. Symbolically, the content model can be described as:
<apply> op a b </apply>
where the operands a and b are MathML
expression trees themselves, and op is a MathML expression tree that
represents an operator or function. Note that apply
constructs can be
nested to arbitrary depth.
An apply
may in principle have any number of operands:
<apply> op a b [c...] </apply>
For example, (x + y + z) can be encoded as
<apply> <csymbol cd="algebra-logic">plus</csymbol> <ci>x</ci> <ci>y</ci> <ci>z</ci> </apply>
Mathematical expressions involving a mixture of operations result in nested
occurrences of apply
. For example, a x + b
would be encoded as
<apply><csymbol cd="algebra-logic">plus</csymbol> <apply><csymbol cd="algebra-logic">times</csymbol> <ci>a</ci> <ci>x</ci> </apply> <ci>b</ci> </apply>
There is no need to introduce parentheses or to resort to operator precedence in
order to parse the expression correctly. The apply
tags provide the proper
grouping for the re-use of the expressions within other constructs. Any expression
enclosed by an apply
element is viewed as a single coherent object.
An expression such as (F+G)(x) might be a product, as in
<apply><csymbol cd="algebra-logic">times</csymbol> <apply><csymbol cd="algebra-logic">plus</csymbol> <ci>F</ci> <ci>G</ci> </apply> <ci>x</ci> </apply>
or it might indicate the application of the function F + G to the argument x. This is indicated by constructing the sum
<apply><csymbol cd="algebra-logic">plus</csymbol><ci>F</ci><ci>G</ci></apply>
and applying it to the argument x as in
<apply> <apply><csymbol cd="algebra-logic">plus</csymbol> <ci>F</ci> <ci>G</ci> </apply> <ci>x</ci> </apply>
Both the function and the arguments may be simple identifiers or more complicated expressions.
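To make the difference between the two readings of (F+G)(x) concrete, the following Python sketch prints an apply tree in prefix form, with the first child as the head and the remaining children as arguments. The helper to_prefix is hypothetical and assumes that all leaves carry plain text content.

```python
import xml.etree.ElementTree as ET


def to_prefix(node):
    """Render an apply tree as head(arg, ...); leaves render as their text."""
    if node.tag == "apply":
        head, *args = list(node)
        return f"{to_prefix(head)}({', '.join(to_prefix(a) for a in args)})"
    return node.text.strip()


product = ET.fromstring(
    '<apply><csymbol cd="algebra-logic">times</csymbol>'
    '<apply><csymbol cd="algebra-logic">plus</csymbol><ci>F</ci><ci>G</ci></apply>'
    '<ci>x</ci></apply>'
)
application = ET.fromstring(
    '<apply>'
    '<apply><csymbol cd="algebra-logic">plus</csymbol><ci>F</ci><ci>G</ci></apply>'
    '<ci>x</ci></apply>'
)
print(to_prefix(product))      # times(plus(F, G), x)
print(to_prefix(application))  # plus(F, G)(x)
```

In the second tree the head of the outer apply is itself an apply, which is what marks F + G as the function being applied.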
The apply
element is conceptually necessary in order to distinguish
between a function or operator, and an instance of its use. The expression
constructed by applying a function to 0 or more arguments is always an element from
the codomain of the function. Proper usage depends on the operator that is being
applied. For example, the plus
operator may have zero or more arguments,
while the minus
operator requires one or two arguments to be properly
formed.
If the object being applied as a function is not already one of the elements
known to be a function (such as sin
or plus
) then it is treated as
if it were a function.
Some complex mathematical objects are constructed by the use of bound variables. For instance, the integration variable in an integral expression is one.
Such expressions are represented as MathML expression trees using the bind element. Its first child is a MathML expression that represents a binding operator (the integral operator in our example). This can be followed by a non-empty list of bvar elements for the bound variables, possibly augmented by the qualifier element condition (see Section 4.2.7 Qualifiers). The last child is the body of the binding; it is another content MathML expression.
Name | values | default |
---|---|---|
cdbase | URI | inherited |
The bvar
element is a special qualifier element that is used to denote
the bound variable of a binding expression, e.g. in sums, products, and quantifiers
or user defined functions.
Name | values | default |
---|---|---|
cdbase | URI | inherited |
<bind> <csymbol cd="algebra-logic">forall</csymbol> <bvar><ci>x</ci></bvar> <apply><csymbol cd="relations">eq</csymbol> <apply><csymbol cd="algebra-logic">minus</csymbol><ci>x</ci><ci>x</ci></apply> <cn>0</cn> </apply> </bind>
<bind> <csymbol cd="calculus_veccalc">int</csymbol> <bvar><ci xml:id="var-x">x</ci></bvar> <apply><csymbol cd="algebra-logic">power</csymbol> <ci definitionURL="#var-x"><mi>x</mi></ci> <cn>7</cn> </apply> </bind>
Editorial note: MiKo | |
We need to say something about alpha-conversion here for OpenMath compatibility. |
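The scoping effect of bind can be illustrated by computing which identifiers remain free in an expression. The Python sketch below is a simplification under stated assumptions: free_variables is a hypothetical helper, identifiers are compared by their text content only, and ci elements with presentation content or definitionURL references are not handled.

```python
import xml.etree.ElementTree as ET


def free_variables(node):
    """Names of ci identifiers in node that no enclosing bind binds."""
    if node.tag == "ci":
        return {node.text.strip()}
    free = set()
    for child in node:
        if child.tag != "bvar":  # bvar children declare, they do not use
            free |= free_variables(child)
    if node.tag == "bind":
        bound = {bvar.find("ci").text.strip() for bvar in node.findall("bvar")}
        free -= bound
    return free


closed = ET.fromstring(
    '<bind><csymbol cd="algebra-logic">forall</csymbol><bvar><ci>x</ci></bvar>'
    '<apply><csymbol cd="relations">eq</csymbol>'
    '<apply><csymbol cd="algebra-logic">minus</csymbol><ci>x</ci><ci>x</ci></apply>'
    '<cn>0</cn></apply></bind>'
)
open_term = ET.fromstring(
    '<bind><csymbol cd="algebra-logic">forall</csymbol><bvar><ci>x</ci></bvar>'
    '<apply><csymbol cd="relations">eq</csymbol><ci>x</ci><ci>y</ci></apply></bind>'
)
print(free_variables(closed))     # set()
print(free_variables(open_term))  # {'y'}
```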
The integrals we have seen so far have all been indefinite, i.e. the range of the bound variables is unspecified. In many situations, we also want to specify the range of bound variables, e.g. in definite integrals. MathML3 provides the optional condition element as a general restriction mechanism for binding expressions.
A condition
element contains a single child that represents a truth
condition. Compound conditions are indicated by applying operators such as
and
in the condition. Consider for instance the following representation of a
definite integral.
Name | values | default |
---|---|---|
cdbase | URI | inherited |
<bind> <int/> <bvar><ci>x</ci></bvar> <condition> <apply><csymbol cd="sets">in</csymbol> <ci>x</ci> <apply><interval/><cn>0</cn><infinity/></apply> </apply> </condition> <apply><sin/><ci>x</ci></apply> </bind>
Here the condition element restricts the bound variable to range over the non-negative real numbers. A number of common mathematical constructions involve such restrictions, either implicit in conventional notation, such as a bound variable, or thought of as part of the operator rather than an argument, as is the case with the limits of a definite integral.
A typical use of the condition
qualifier is to define sets by rule, rather
than enumeration. The following markup, for instance, encodes the set {x |
x < 1}:
<bind><set/> <bvar><ci>x</ci></bvar> <condition><apply><lt/><ci>x</ci><cn>1</cn></apply></condition> <ci>x</ci> </bind>
In the context of quantifier operators, this corresponds to the "such that" construct used in mathematical expressions. The next example encodes "for all x in N there exist prime numbers p, q such that p+q = 2x".
<bind><csymbol cd="algebra-logic">forall</csymbol> <bvar><ci>x</ci></bvar> <condition> <apply><csymbol cd="sets">in</csymbol> <ci>x</ci> <csymbol cd="constants">naturalnumbers</csymbol> </apply> </condition> <bind><csymbol cd="algebra-logic">exists</csymbol> <bvar><ci>p</ci></bvar> <bvar><ci>q</ci></bvar> <condition> <apply><csymbol cd="algebra-logic">and</csymbol> <apply><csymbol cd="sets">in</csymbol><ci>p</ci><primes/></apply> <apply><csymbol cd="sets">in</csymbol><ci>q</ci><primes/></apply> </apply> </condition> <apply><csymbol cd="relations">eq</csymbol> <apply><csymbol cd="algebra-logic">plus</csymbol><ci>p</ci><ci>q</ci></apply> <apply><csymbol cd="algebra-logic">times</csymbol><cn>2</cn><ci>x</ci></apply> </apply> </bind> </bind>
This use extends to multivariate domains by using extra bound variables and a domain corresponding to a cartesian product as in
<bind><intexp/> <bvar><ci>x</ci></bvar> <bvar><ci>y</ci></bvar> <condition> <apply><csymbol cd="algebra-logic">and</csymbol> <apply><csymbol cd="relations">leq</csymbol><cn>0</cn><ci>x</ci></apply> <apply><csymbol cd="relations">leq</csymbol><ci>x</ci><cn>1</cn></apply> <apply><csymbol cd="relations">leq</csymbol><cn>0</cn><ci>y</ci></apply> <apply><csymbol cd="relations">leq</csymbol><ci>y</ci><cn>1</cn></apply> </apply> </condition> <apply> <csymbol cd="algebra-logic">times</csymbol> <apply><csymbol cd="algebra-logic">power</csymbol><ci>x</ci><cn>2</cn></apply> <apply><csymbol cd="algebra-logic">power</csymbol><ci>y</ci><cn>3</cn></apply> </apply> </bind>
To conserve space, MathML3 expression trees can make use of structure sharing through the share element. This element has an href attribute whose value is a URI referencing an xml:id attribute of a MathML expression tree. When building the MathML expression tree, the share element is replaced by a copy of the MathML expression tree referenced by the href attribute. Note that this copy is structurally equal, but not identical to the element referenced. The values of the href attribute will often be relative URI references, in which case they are resolved using the base URI of the document containing the share element.
Name | values | default |
---|---|---|
href | URI |
For instance, the mathematical object f(f(f(a,a),f(a,a)),f(f(a,a),f(a,a))) can be encoded as either one of the following representations (and some intermediate versions as well). The first is the fully expanded form:
<math> <apply> <ci>f</ci> <apply> <ci>f</ci> <apply> <ci>f</ci> <ci>a</ci> <ci>a</ci> </apply> <apply> <ci>f</ci> <ci>a</ci> <ci>a</ci> </apply> </apply> <apply> <ci>f</ci> <apply> <ci>f</ci> <ci>a</ci> <ci>a</ci> </apply> <apply> <ci>f</ci> <ci>a</ci> <ci>a</ci> </apply> </apply> </apply> </math>
The second uses structure sharing:
<math> <apply> <ci>f</ci> <apply xml:id="t1"> <ci>f</ci> <apply xml:id="t11"> <ci>f</ci> <ci>a</ci> <ci>a</ci> </apply> <share href="#t11"/> </apply> <share href="#t1"/> </apply> </math>
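The replacement of share elements by copies of their targets can be sketched in Python as follows. This is an illustration rather than a normative algorithm: expand_shares is a hypothetical helper, it only handles same-document references of the form "#id", it drops xml:id attributes from the copies to avoid duplicate identifiers, and it assumes the input contains no reference cycles.

```python
import xml.etree.ElementTree as ET

# ElementTree reports xml:id under the XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def expand_shares(root):
    """Return a new tree in which each share is replaced by a copy of its target."""
    targets = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}

    def expand(node):
        if node.tag == "share":
            # Assumption: href is a same-document reference of the form "#id".
            return expand(targets[node.get("href").lstrip("#")])
        attrib = {k: v for k, v in node.attrib.items() if k != XML_ID}
        clone = ET.Element(node.tag, attrib)
        clone.text = node.text
        for child in node:
            clone.append(expand(child))
        return clone

    return expand(root)


shared = ET.fromstring(
    '<math><apply><ci>f</ci>'
    '<apply xml:id="t1"><ci>f</ci>'
    '<apply xml:id="t11"><ci>f</ci><ci>a</ci><ci>a</ci></apply>'
    '<share href="#t11"/></apply>'
    '<share href="#t1"/></apply></math>'
)
expanded = expand_shares(shared)
print(sum(1 for ci in expanded.iter("ci") if ci.text == "a"))  # 8
```

The shared form mentions a only twice, while the expanded tree contains eight occurrences.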
We say that an element dominates all its children and all elements they dominate. A share element dominates its target, i.e. the element that carries the xml:id attribute pointed to by the href attribute. For instance in the representation above the apply element with xml:id="t1" and also the second share dominate the apply element with xml:id="t11".
The occurrences of the share element must obey the following global acyclicity constraint: An element may not dominate itself. For instance the following representation violates this constraint:
<apply xml:id="foo"> <csymbol cd="algebra-logic">plus</csymbol> <cn>1</cn> <apply> <csymbol cd="algebra-logic">plus</csymbol> <cn>1</cn> <share href="#foo"/> </apply> </apply>
Here, the apply element with xml:id="foo" dominates its third child, which dominates the share element, which dominates its target: the element with xml:id="foo". So by transitivity, this element dominates itself, and by the acyclicity constraint, it is not a MathML expression tree. Even though it could be given the interpretation of a continued fraction, this would correspond to an infinite tree of applications, which is not admitted by Content MathML.
Note that the acyclicity constraint is not restricted to such simple cases, as the following pair of representations shows:
<apply xml:id="bar"> <csymbol cd="algebra-logic">plus</csymbol> <cn>1</cn> <share href="#baz"/> </apply>
<apply xml:id="baz"> <csymbol cd="algebra-logic">plus</csymbol> <cn>1</cn> <share href="#bar"/> </apply>
Here, the apply with xml:id="bar" dominates its third child, the share with href="#baz", which dominates its target apply with xml:id="baz", which in turn dominates its third child, the share with href="#bar"; this finally dominates its target, the original apply element with xml:id="bar". So this pair of representations violates the acyclicity constraint.
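Checking the acyclicity constraint is a cycle search in the graph whose edges lead from each element to its children and from each share to its target. The following Python sketch uses a depth-first search; the function name is hypothetical, and references are matched with or without a leading "#", which is an assumption made here for robustness.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def violates_acyclicity(roots):
    """True if some element dominates itself via children and share targets."""
    targets = {el.get(XML_ID): el
               for root in roots for el in root.iter() if el.get(XML_ID)}

    def dominated(el):
        if el.tag == "share":
            # Accept both "#id" and bare "id" reference forms.
            target = targets.get(el.get("href", "").lstrip("#"))
            return [] if target is None else [target]
        return list(el)

    in_progress, finished = set(), set()

    def reaches_cycle(el):
        if el in finished:
            return False
        if el in in_progress:
            return True  # el is dominated by itself
        in_progress.add(el)
        found = any(reaches_cycle(nxt) for nxt in dominated(el))
        in_progress.discard(el)
        finished.add(el)
        return found

    return any(reaches_cycle(el) for root in roots for el in root.iter())


bar = ET.fromstring('<apply xml:id="bar"><csymbol cd="algebra-logic">plus</csymbol>'
                    '<cn>1</cn><share href="#baz"/></apply>')
baz = ET.fromstring('<apply xml:id="baz"><csymbol cd="algebra-logic">plus</csymbol>'
                    '<cn>1</cn><share href="#bar"/></apply>')
fine = ET.fromstring('<apply xml:id="t1"><ci>f</ci>'
                     '<apply xml:id="t11"><ci>f</ci><ci>a</ci></apply>'
                     '<share href="#t11"/></apply>')
print(violates_acyclicity([bar, baz]))  # True
print(violates_acyclicity([fine]))      # False
```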
Note that the share element is a syntactic referencing mechanism: a share element stands for the exact element it points to. In particular, referencing does not interact with binding in a semantically intuitive way, since it allows for variable capture. Consider for instance
<bind xml:id="outer"> <lambda/> <bvar><ci>x</ci></bvar> <apply> <ci>f</ci> <bind xml:id="inner"> <lambda/> <bvar><ci>x</ci></bvar> <share xml:id="copy" href="#orig"/> </bind> <apply xml:id="orig"><ci>g</ci><ci>x</ci></apply> </apply> </bind>
It represents the term λx.f(λx.g(x), g(x)), which has two sub-terms of the form g(x), one with xml:id="orig" (the one explicitly represented) and one with xml:id="copy", represented by the share element. In the original, the variable x is bound by the outer bind element, and in the copy, the variable x is bound by the inner bind element. We say that the inner bind has captured the variable x.
It is well-known that variable capture does not conserve semantics. For instance, we could use α-conversion to rename the inner occurrence of x into, say, y, arriving at the (same) object λx.f(λy.g(y), g(x)). Using references that capture variables in this way can easily lead to representation errors, and is not recommended.
semantics
Content elements can be adorned with additional information via the
semantics
element, see Section 5.3 Semantic Annotations beyond Alternate Representations for details. As
such, the semantics
element should be considered part of both presentation
MathML and content MathML. MathML3 considers a semantics
element (strict)
content MathML, if and only if its first child is (strict) content MathML. All MathML
processors should process the semantics
element, even if they only process
one of those subsets.
Editorial note: MiKo | |
Give an elaborated example from the types note here (or in the primer?), reference Section 8.4 Type Declarations |
A content error expression is made up of a symbol and a sequence of zero or more MathML expression trees. This object has no direct mathematical meaning. Errors occur as the result of some treatment on an expression tree and are thus of real interest only when some sort of communication is taking place. Errors may occur inside other objects and also inside other errors.
Name | values | default |
---|---|---|
cdbase | URI | inherited |
To encode an error caused by a division by zero, we would employ an aritherror Content Dictionary with a DivisionByZero symbol with role error, and use the following expression tree:
<cerror> <csymbol cd="aritherror">DivisionByZero</csymbol> <apply><divide/><ci>x</ci><cn>0</cn></apply> </cerror>
Note that the error should cover the smallest erroneous subexpression, so cerror can be a subexpression of a bigger one, e.g.
<apply><csymbol cd="relations">eq</csymbol> <cerror> <csymbol cd="aritherror">DivisionByZero</csymbol> <apply><divide/><ci>x</ci><cn>0</cn></apply> </cerror> <cn>0</cn> </apply>
If an application wishes to signal that the content MathML expression it has received is invalid or is not well-formed then the offending data must be encoded as a string. For example:
<cerror> <csymbol cd="parser">invalid_XML</csymbol> <mtext> &lt;apply&gt;&lt;cos&gt; &lt;ci&gt;v&lt;/ci&gt; &lt;/apply&gt; </mtext> </cerror>
Note that the < and > characters have been escaped as is usual in an XML document.
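An application that builds such an error object with an XML library usually gets the escaping for free. The Python sketch below wraps a raw string in a cerror element and lets xml.etree.ElementTree escape it on serialization. The helper name invalid_xml_error is hypothetical; the parser content dictionary and its invalid_XML symbol are taken from the example above.

```python
import xml.etree.ElementTree as ET


def invalid_xml_error(raw):
    """Wrap offending input in a cerror element, stored as an escaped string."""
    cerror = ET.Element("cerror")
    symbol = ET.SubElement(cerror, "csymbol", {"cd": "parser"})
    symbol.text = "invalid_XML"
    mtext = ET.SubElement(cerror, "mtext")
    mtext.text = raw  # the serializer escapes <, > and & in text content
    return ET.tostring(cerror, encoding="unicode")


raw = "<apply><cos> <ci>v</ci> </apply>"
serialized = invalid_xml_error(raw)
print(serialized)
```

Parsing the serialized error again returns the original string as the text content of mtext.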
MathML3 content markup differs from earlier versions of MathML in that it has been regularized and based on the content dictionary model introduced by OpenMath [OpenMath2004].
MathML3 also supports MathML2 markup as a pragmatic representation that is easier to read and more intuitive for humans. We will discuss this representation in the following and indicate the equivalent strict representations. Thus the "pragmatic content MathML" representations inherit the meaning from their strict counterparts.
The cn element can be used with the value "constant" for the type attribute and the Unicode symbols for the content. This use of cn is deprecated in favor of the number constants exponentiale, imaginaryi, true, false, notanumber, pi, eulergamma, and infinity in the constants content dictionary, or the use of csymbol with an appropriate value for the definitionURL attribute. For example, instead of using the pi element, an instance of <cn type="constant">π</cn> could be used.
csymbol Elements with Presentation MathML
Issue csymbol_pmathml_strict | wiki (member only) |
---|---|
Strict equivalent for csymbol with pMathML content
|
|
What is the strict equivalent for the case of a |
|
Resolution | None recorded |
In pragmatic MathML3 the csymbol
element can contain presentation MathML
instead of the symbol name. For example,
<csymbol definitionURL="http://www.example.com/ContDiffFuncs.htm"> <msup><mi>C</mi><mn>2</mn></msup> </csymbol>
encodes an atomic symbol that displays visually as C² and that, for purposes of content, is treated as a single symbol representing the space of twice-differentiable continuous functions. This pragmatic representation is equivalent to
<semantics> <csymbol definitionURL="http://www.example.com/ContDiffFuncs.htm">C2</csymbol> <annotation-xml encoding="MathMLP"> <msup><mi>C</mi><mn>2</mn></msup> </annotation-xml> </semantics>
Both can be used interchangeably.
In Pragmatic Content MathML, the ci and csymbol elements can contain a general presentation construct (see Section 3.1.6 Summary of Presentation Elements), which is used for rendering (see Section 8.6 Rendering of Content Elements). In this case, the definitionURL attribute can be used to associate a name with a ci element, which identifies it. See Section 4.2.6 Bindings and Bound Variables for a discussion of an important instance of this. For example,
<ci definitionURL="c1"><msub><mi>c</mi><mn>1</mn></msub></ci>
encodes an atomic symbol that displays visually as c₁ and which, for purposes of content, is treated as an atomic concept representing a real number.
Instances of the bound variables are normally recognized by comparing the XML information sets of the relevant ci elements after first carrying out XML space normalization. Such identification can be made explicit by placing an xml:id on the ci element in the bvar element and referring to it using the definitionURL attribute on all other instances. An example of this approach is
This xml:id-based approach is especially helpful when constructions involving bound variables are nested.
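As an illustration of the xml:id-based identification, the following Python sketch resolves each referring ci element to the ci inside bvar that it points at. The sample markup, the identifier "var-x" and the function name resolve_bound_variables are assumptions made for this illustration, not part of the specification.

```python
import xml.etree.ElementTree as ET

# The xml prefix is predefined, so ElementTree exposes xml:id under this key.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: the set of all x with x < 1.
SOURCE = """
<set>
  <bvar><ci xml:id="var-x">x</ci></bvar>
  <condition>
    <apply><lt/>
      <ci definitionURL="#var-x"><mi>x</mi></ci>
      <cn>1</cn>
    </apply>
  </condition>
</set>
"""

def resolve_bound_variables(root):
    """Map each referring ci element to the bvar ci it points at."""
    declared = {
        ci.get(XML_ID): ci
        for ci in root.findall(".//bvar/ci")
        if ci.get(XML_ID) is not None
    }
    links = {}
    for ci in root.iter("ci"):
        url = ci.get("definitionURL", "")
        # Only local fragment references such as "#var-x" are resolved here.
        if url.startswith("#") and url[1:] in declared:
            links[ci] = declared[url[1:]]
    return links

root = ET.fromstring(SOURCE)
links = resolve_bound_variables(root)
for use, binder in links.items():
    print(use.get("definitionURL"), "->", binder.get(XML_ID))
```

The sketch ignores scoping, so with nested binders a real processor would have to look up the nearest enclosing bvar instead of a single global table.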
It can be necessary to associate additional information with a bound variable or with one or more instances of it. The information might be something like a detailed mathematical type, an alternative presentation or encoding, or a domain of application. Such associations are accomplished in the standard way by replacing a ci element (even inside the bvar element) by a semantics element containing both it and the additional information. Recognition of an instance of the bound variable is still based on the actual ci elements and not on the semantics elements or anything else they may contain. The xml:id-based approach outlined above may still be used.
A ci element with Presentation MathML content is equivalent to a semantics construction whose first child is a ci whose content is the value of the definitionURL attribute and whose second child is an annotation-xml element with the MathML Presentation. For example, the Strict Content MathML equivalent to the example above would be
<semantics> <ci>c1</ci> <annotation-xml encoding="PMathML"> <msub><mi>c</mi><mn>1</mn></msub> </annotation-xml> </semantics>
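The rewriting rule just stated is mechanical, and a minimal Python sketch of it may help. The function name ci_to_semantics is hypothetical, and the encoding value "PMathML" is simply copied from the example above.

```python
import xml.etree.ElementTree as ET

def ci_to_semantics(ci):
    """Rewrite a ci holding Presentation MathML into the semantics form."""
    if ci.tag != "ci" or len(ci) == 0:
        return ci  # plain identifier with text content: nothing to rewrite
    semantics = ET.Element("semantics")
    name = ET.SubElement(semantics, "ci")
    # The name comes from definitionURL; it stays empty if the attribute is absent.
    name.text = ci.get("definitionURL")
    annotation = ET.SubElement(semantics, "annotation-xml", {"encoding": "PMathML"})
    annotation.extend(list(ci))  # move the presentation children across
    return semantics

pragmatic = ET.fromstring(
    '<ci definitionURL="c1"><msub><mi>c</mi><mn>1</mn></msub></ci>'
)
print(ET.tostring(ci_to_semantics(pragmatic), encoding="unicode"))
```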
The ci element uses the type attribute to specify the basic type of object that it represents. While any CDATA string is a valid type, the predefined types include "integer", "rational", "real", "complex", "complex-polar", "complex-cartesian", "constant", "function", and more generally, any of the names of the MathML container elements (e.g. vector) or their type values. For a more advanced treatment of types, the type attribute is inappropriate. Advanced types require significant structure of their own (for example, vector(complex)) and are probably best constructed as mathematical objects and then associated with a MathML expression through use of the semantics element.
Editorial note: MiKo | |
Give the Strict equivalent here by techniques from the Types Note |
For convenience and backwards compatibility MathML3 provides empty token elements for the operators and functions of the K-14 fragment of mathematics. The general rule is that for any symbol defined in the MathML3 content dictionaries (see Chapter 8 MathML3 Content Dictionaries), there is an empty content element with the same name. For instance, the empty MathML element
<plus/>
is equivalent to the element
<csymbol cdbase="http://w3.org/Math/CD" cd="algebra-logic" name="plus"/>
Both can be used interchangeably.
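A processor could apply this equivalence with a simple lookup table. The Python sketch below is one way to do so. The table CD_OF is a small illustrative subset that follows the dictionary grouping used in this draft, and the function name expand_empty_elements is hypothetical.

```python
import xml.etree.ElementTree as ET

CDBASE = "http://w3.org/Math/CD"
# Partial, illustrative table; a real mapping would be generated from the
# content dictionaries themselves.
CD_OF = {
    "plus": "algebra-logic", "times": "algebra-logic", "power": "algebra-logic",
    "diff": "calculus_veccalc", "union": "sets", "setdiff": "sets",
}

def expand_empty_elements(root):
    """Replace known empty operator elements below root by csymbol elements."""
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag in CD_OF and len(child) == 0 and not child.text:
                symbol = ET.Element(
                    "csymbol",
                    {"cdbase": CDBASE, "cd": CD_OF[child.tag], "name": child.tag},
                )
                symbol.tail = child.tail  # keep surrounding whitespace
                parent[index] = symbol
    return root

doc = ET.fromstring("<apply><plus/><ci>a</ci><ci>b</ci></apply>")
print(ET.tostring(expand_empty_elements(doc), encoding="unicode"))
```

Only descendants of the element passed in are rewritten; an empty operator element that is itself the root is left unchanged in this sketch.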
In MathML2, the definitionURL attribute could be used to modify the meaning of an element to allow essentially the same notation to be re-used for a discussion taking place in a different mathematical domain. This use of the attribute is deprecated in MathML3, in favor of using a csymbol with the same definitionURL attribute.
In MathML2, the meaning of various token elements could be specialized via various attributes, usually the type attribute. Strict Content MathML does not have this possibility. Therefore these attributes are either passed to the symbols as extra arguments in the apply or bind elements, or MathML3 adds new symbols for the non-default case to the respective content dictionaries.
The following table summarizes the cases:
pragmatic Content MathML | strict Content MathML |
---|---|
<diff type="function"/> | <csymbol cd="calculus_veccalc">diff</csymbol> |
<diff type="algebraic"/> | <csymbol cd="calculus_veccalc">aDiff</csymbol> |
Editorial note: MiKo | |
systematically consider all the cases here |
To retain compatibility with MathML2, MathML3 provides an alternative representation for applications of constructor elements. For instance, for the set element, the following two representations are considered equivalent:
<set><ci>a</ci><ci>b</ci><ci>c</ci></set>
<apply><set/><ci>a</ci><ci>b</ci><ci>c</ci></apply>
and, following the discussion in Section 4.2.4 Symbols and Identifiers, they are equivalent to
<apply><csymbol cd="sets">set</csymbol><ci>a</ci><ci>b</ci><ci>c</ci></apply>
Other constructors are interval, list, matrix, matrixrow, vector, apply, lambda, piecewise, piece, and otherwise.
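The container-to-application equivalence can be sketched in Python as follows. The table CONSTRUCTOR_CD covers only three simple constructors, with dictionary names taken from the grouping in this draft. The function name constructor_to_apply is hypothetical, and constructors with qualifiers or special structure (such as lambda or piecewise) are deliberately not handled.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of constructors whose children are all plain arguments.
CONSTRUCTOR_CD = {"set": "sets", "list": "sets", "vector": "linear_algebra"}

def constructor_to_apply(element):
    """Turn a constructor element into an apply headed by a csymbol."""
    if element.tag not in CONSTRUCTOR_CD:
        return element
    apply = ET.Element("apply")
    head = ET.SubElement(apply, "csymbol", {"cd": CONSTRUCTOR_CD[element.tag]})
    head.text = element.tag
    apply.extend(list(element))  # the former children become the arguments
    return apply

strict = constructor_to_apply(
    ET.fromstring("<set><ci>a</ci><ci>b</ci><ci>c</ci></set>")
)
print(ET.tostring(strict, encoding="unicode"))
```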
The domainofapplication element was used in MathML2 inside an apply element to denote the domain over which a given function is being applied. In contrast to its use as a qualifier in the bind element, the usage in the apply element only marks the argument position for the range argument of the definite integral.
MathML3 supports this representation as a pragmatic form. For instance, the integral of a function f over an arbitrary domain C can be represented as
<apply><int/> <domainofapplication><ci>C</ci></domainofapplication> <ci>f</ci> </apply>
in the Pragmatic Content MathML representation. This is considered equivalent to
<apply><int/><ci>C</ci><ci>f</ci></apply>
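In an application without bound variables, the rewriting amounts to dropping the domainofapplication wrapper. A minimal Python sketch under that assumption follows; the function name unwrap_domain is hypothetical.

```python
import xml.etree.ElementTree as ET

def unwrap_domain(apply):
    """Replace a domainofapplication child by its single content element."""
    if any(child.tag == "bvar" for child in apply):
        return apply  # with bound variables a different rewrite is needed
    for index, child in enumerate(list(apply)):
        if child.tag == "domainofapplication" and len(child) == 1:
            domain = child[0]
            domain.tail = child.tail
            apply[index] = domain
    return apply

pragmatic = ET.fromstring(
    "<apply><int/>"
    "<domainofapplication><ci>C</ci></domainofapplication>"
    "<ci>f</ci></apply>"
)
print(ET.tostring(unwrap_domain(pragmatic), encoding="unicode"))
```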
Editorial note: MiKo | |
be careful with Int and int here
|
The domainofapplication element was intended as an alternative to specifying the range of bound variables with condition. Generally, a domain of application D can be specified by a condition element requiring that the bound variable be a member of D. For instance, we consider the Pragmatic Content MathML representation
<apply><int/> <bvar><ci>x</ci></bvar> <domainofapplication><ci type="set">D</ci></domainofapplication> <apply><ci type="function">f</ci><ci>x</ci></apply> </apply>
as equivalent to the Strict Content MathML representation
<bind><intexp/> <bvar><ci>x</ci></bvar> <condition><apply><in/><ci>x</ci><ci type="set">D</ci></apply></condition> <apply><ci type="function">f</ci><ci>x</ci></apply> </bind>
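The following Python sketch shows how such a rewriting might be carried out for a single bound variable. The function name domain_to_condition and its binder parameter are hypothetical. The default binder intexp is taken from the example above and may change as the draft evolves.

```python
import copy
import xml.etree.ElementTree as ET

def domain_to_condition(apply, binder="intexp"):
    """Rewrite apply/bvar/domainofapplication into bind/bvar/condition."""
    rest = list(apply)[1:]  # everything after the operator
    bvars = [c for c in rest if c.tag == "bvar"]
    domains = [c for c in rest if c.tag == "domainofapplication"]
    body = [c for c in rest if c.tag not in ("bvar", "domainofapplication")]
    if len(bvars) != 1 or len(domains) != 1:
        raise ValueError("expected exactly one bvar and one domainofapplication")

    bind = ET.Element("bind")
    ET.SubElement(bind, binder)
    bind.append(bvars[0])
    # Build <condition><apply><in/> x D </apply></condition>.
    condition = ET.SubElement(bind, "condition")
    member = ET.SubElement(condition, "apply")
    ET.SubElement(member, "in")
    member.append(copy.deepcopy(bvars[0][0]))  # a copy of the bound variable
    member.append(domains[0][0])               # the domain itself
    bind.extend(body)
    return bind

pragmatic = ET.fromstring(
    '<apply><int/><bvar><ci>x</ci></bvar>'
    '<domainofapplication><ci type="set">D</ci></domainofapplication>'
    '<apply><ci type="function">f</ci><ci>x</ci></apply></apply>'
)
print(ET.tostring(domain_to_condition(pragmatic), encoding="unicode"))
```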
MathML2 used the int element for the definite or indefinite integral of a function or algebraic expression on some sort of domain of application. There are several forms of calling sequences depending on the nature of the arguments, and whether or not it is a definite integral. Those forms using interval, condition, lowlimit, or uplimit provide convenient shorthand notations for an appropriate domainofapplication.
Editorial note: Miko | |
the following must be reworked |
MathML separates the functionality of the int element into three different symbols: int, defint, and defintset. The first two are integral operators that can be applied to functions, and the last is a binding operator for integrating an algebraic expression with respect to a bound variable.
The following two indefinite function integrals are equivalent.
<apply><int/><sin/></apply>
<apply><intfun/><sin/></apply>
The following two definite function integrals are equivalent (see also Section 4.3.8 Domain of Application in Applications).
<apply><int/> <domainofapplication><ci type="set">D</ci></domainofapplication> <sin/> </apply>
<apply><defintfun/><ci type="set">D</ci><sin/></apply>
The following two indefinite integrals over algebraic expressions are equivalent.
<apply><int/><bvar><ci>x</ci></bvar><apply><sin/><ci>x</ci></apply></apply>
<bind><intexp/><bvar><ci>x</ci></bvar><apply><sin/><ci>x</ci></apply></bind>
The following two definite integrals over algebraic expressions are equivalent.
<apply><int/> <bvar><ci>x</ci></bvar> <domainofapplication><ci type="set">D</ci></domainofapplication> <apply><sin/><ci>x</ci></apply> </apply>
<bind><intexp/> <bvar><ci>x</ci></bvar> <domainofapplication><ci type="set">D</ci></domainofapplication> <apply><sin/><ci>x</ci></apply> </bind>
The degree element is a qualifier used by some MathML containers to specify that, for example, a bound variable is repeated several times.
Editorial note: MiKo | |
specify a complete list of containers that allow degree elements,
so far I see diff , partialdiff , root |
The degree element is the container element for the "degree" or "order" of an operation. There are a number of basic mathematical constructs that come in families, such as derivatives and moments. Rather than introduce special elements for each of these families, MathML uses a single general construct, the degree element, for this concept of "order".
<bind><diff/> <bvar><ci>x</ci><degree><cn>2</cn></degree></bvar> <apply><power/><ci>x</ci><cn>5</cn></apply> </bind>
<bind> <partialdiff/> <bvar> <ci>x</ci> <degree><ci> n </ci></degree> </bvar> <bvar> <ci>y</ci> <degree><ci>m</ci></degree> </bvar> <apply><sin/> <apply><times/><ci>x</ci><ci>y</ci></apply> </apply> </bind>
A variable that is to be bound is placed in the bvar container. In a derivative, it indicates the variable with respect to which a function is being differentiated. When the bvar element is used to qualify a derivative, it may contain a child degree element that specifies the order of the derivative with respect to that variable.
<apply> <diff/> <bvar> <ci>x</ci> <degree><cn>2</cn></degree> </bvar> <apply><power/><ci>x</ci><cn>4</cn></apply> </apply>
This is equivalent to
<bind> <apply><diff/><cn>2</cn></apply> <bvar><ci>x</ci></bvar> <apply><power/><ci>x</ci><cn>4</cn></apply> </bind>
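A Python sketch of this rewriting, for the simple case of one bvar directly following the operator, could look as follows. The function name lift_degree is hypothetical, and the treatment of several bound variables (as in partialdiff) is left open here, as it is in the draft.

```python
import xml.etree.ElementTree as ET

def lift_degree(apply):
    """Move the degree out of bvar and make it an argument of the operator."""
    operator, bvar, *body = list(apply)
    if bvar.tag != "bvar":
        raise ValueError("expected a bvar directly after the operator")
    degree = bvar.find("degree")
    bind = ET.Element("bind")
    if degree is None:
        bind.append(operator)  # no degree: the operator is used as is
    else:
        head = ET.SubElement(bind, "apply")
        head.append(operator)
        head.extend(list(degree))  # the degree becomes a regular argument
    new_bvar = ET.SubElement(bind, "bvar")
    new_bvar.extend([c for c in bvar if c.tag != "degree"])
    bind.extend(body)
    return bind

pragmatic = ET.fromstring(
    "<apply><diff/>"
    "<bvar><ci>x</ci><degree><cn>2</cn></degree></bvar>"
    "<apply><power/><ci>x</ci><cn>4</cn></apply></apply>"
)
print(ET.tostring(lift_degree(pragmatic), encoding="unicode"))
```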
Editorial note: MiKo | |
what do we want to use for degree? |
Note that the degree element is only allowed in the container representation. The strict representation takes the degree as a regular argument, as the second child of the apply or bind element.
Editorial note: MiKo | |
Make sure that all MMLdefinition s of degree-carrying symbols get a
paragraph like the one for root .
|
The default rendering of the degree element and its contents depends on the context. In the example above, the degree elements would be rendered as the exponents in the differentiation symbols:
The uplimit and lowlimit elements are Pragmatic Content MathML qualifiers that can be used to restrict the range of a bound variable to an interval, e.g. in some integrals and sums. uplimit/lowlimit pairs can be expressed via the interval element from the CD Basic Content Elements. For instance, we consider the Pragmatic Content MathML representation
<apply><int/> <bvar><ci> x </ci></bvar> <lowlimit><ci>a</ci></lowlimit> <uplimit><ci>b</ci></uplimit> <apply><ci type="function">f</ci><ci>x</ci></apply> </apply>
as equivalent to the following strict representation
<bind><int/> <bvar><ci>x</ci></bvar> <condition> <apply><in/><ci>x</ci><apply><interval/><ci>a</ci><ci>b</ci></apply></apply> </condition> <lowlimit><ci>a</ci></lowlimit> <uplimit><ci>b</ci></uplimit> <apply><ci type="function">f</ci><ci>x</ci></apply> </bind>
If the lowlimit qualifier is missing, it is interpreted as negative infinity; similarly, if uplimit is missing, it is interpreted as positive infinity.
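The defaulting rule for missing limits can be sketched in Python as follows. The function name limits_to_interval is hypothetical. Encoding negative infinity as unary minus applied to infinity is an assumption of this sketch, since the draft does not yet fix a symbol for it.

```python
import xml.etree.ElementTree as ET

def limits_to_interval(apply):
    """Build the interval implied by lowlimit/uplimit, defaulting to infinities."""
    interval = ET.Element("apply")
    ET.SubElement(interval, "interval")
    low, up = apply.find("lowlimit"), apply.find("uplimit")
    if low is not None:
        interval.append(low[0])
    else:
        # Assumed encoding of negative infinity: <apply><minus/><infinity/></apply>
        negative = ET.SubElement(interval, "apply")
        ET.SubElement(negative, "minus")
        ET.SubElement(negative, "infinity")
    if up is not None:
        interval.append(up[0])
    else:
        ET.SubElement(interval, "infinity")
    return interval

pragmatic = ET.fromstring(
    "<apply><int/><bvar><ci>x</ci></bvar>"
    "<lowlimit><ci>a</ci></lowlimit>"
    '<apply><ci type="function">f</ci><ci>x</ci></apply></apply>'
)
print(ET.tostring(limits_to_interval(pragmatic), encoding="unicode"))
```

The resulting interval can then be placed inside a condition that requires the bound variable to be a member of it, as in the strict representation above.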
Issue lifted_operators | wiki (member only) ISSUE-8 (member only) |
---|---|
New Symbols for Lifted Operators | |
MathML2 allowed the use of n-ary operators as binding operators
with bound variables induced by them. For instance |
|
Resolution | None recorded |
MathML2 allowed associative operators to be "lifted" to "big operators", for instance the n-ary union operator to the union operator over sets, as in the union of the U-complements over a family F of sets in this construction:
<apply> <union/> <bvar><ci>S</ci></bvar> <condition> <apply><in/><ci>S</ci><ci>F</ci></apply> </condition> <apply><setdiff/><ci>U</ci><ci>S</ci></apply> </apply>
While the relation between the n-ary and the set-based operators is deterministic, i.e. the induced big operators are fully determined by them, the concepts are quite different in nature (different notational conventions, different types, different occurrence schemata). Therefore the MathML3 content dictionaries provide explicit symbols for the "big operators", much like MathML2 did with sum as the big operator for the n-ary plus symbol, and product for times. Concretely, these are big_union, big_intersect, big_max, big_min, big_gcd, big_lcm, big_or, big_and, and big_xor. With these, we can express all Pragmatic Content MathML expressions. For instance, the union above can be represented strictly as
<bind><Union/> <bvar><ci>S</ci></bvar> <condition> <apply><in/><ci>S</ci><ci>F</ci></apply> </condition> <apply><setdiff/><ci>U</ci><ci>S</ci></apply> </bind>
For the exact meaning of the new symbols, consult the content dictionaries.
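The correspondence between n-ary operators and big operators lends itself to a table-driven rewriting. The Python sketch below uses the big_ symbol names listed in the prose. The function name lift_operator is hypothetical, and the element spelling of the big operators is an assumption, since the draft has not settled on one.

```python
import xml.etree.ElementTree as ET

# n-ary operator -> big operator, following the names listed in the text.
BIG = {
    name: "big_" + name
    for name in ("union", "intersect", "max", "min", "gcd", "lcm", "or", "and", "xor")
}

def lift_operator(apply):
    """Rewrite an apply whose n-ary operator carries bound variables as a bind."""
    operator = apply[0]
    has_bvar = any(child.tag == "bvar" for child in apply)
    if operator.tag not in BIG or not has_bvar:
        return apply  # an ordinary n-ary application stays as it is
    bind = ET.Element("bind")
    ET.SubElement(bind, BIG[operator.tag])
    bind.extend(list(apply)[1:])  # bvar, condition and body carry over
    return bind

pragmatic = ET.fromstring(
    "<apply><union/><bvar><ci>S</ci></bvar>"
    "<condition><apply><in/><ci>S</ci><ci>F</ci></apply></condition>"
    "<apply><setdiff/><ci>U</ci><ci>S</ci></apply></apply>"
)
print(ET.tostring(lift_operator(pragmatic), encoding="unicode"))
```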
Issue large_ops | wiki (member only) ISSUE-18 (member only) |
---|---|
Large Operators | |
The large operators can be solved in two ways: in the way described here, by inventing large operators (and David does not like symbol names distinguished only by case, and I tend to agree with him), or by extending the role of roles to allow duplicate roles per symbol. Then we could re-use the symbols like we did in MathML2, but we would have to extend OpenMath for that |
|
Resolution | None recorded |
declare
Editorial note: MiKo | |
This should maybe be moved into a general section about changes or deprecated elements. Also Stan thinks the text should be improved. |
MathML2 provided the declare element, which allowed binding properties like types to symbols and variables and defining abbreviations for structure sharing. This element is deprecated in MathML3. Structure sharing can be obtained via the share element (see Section 4.2.8 Structure Sharing for details).
We will now give an overview of the MathML3 symbols: they are grouped into content dictionaries that broadly reflect the area of mathematics from which they come.
Editorial note: MiKo | |
The list will eventually be generated from the MathML3 Content Dictionaries, it is currently only very vaguely in sync with them |
basic_content_elements.mcd for the basic content elements.
interval
Token element, description, model, and examples to be generated from content dictionary
inverse
Token element, description, model, and examples to be generated from content dictionary
lambda
Token element, description, model, and examples to be generated from content dictionary
compose
Token element, description, model, and examples to be generated from content dictionary
ident
Token element, description, model, and examples to be generated from content dictionary
domain
Token element, description, model, and examples to be generated from content dictionary
codomain
Token element, description, model, and examples to be generated from content dictionary
image
Token element, description, model, and examples to be generated from content dictionary
piecewise
Token element, description, model, and examples to be generated from content dictionary
algebra-logic for arithmetic, algebra and logic.
quotient
Token element, description, model, and examples to be generated from content dictionary
factorial
Token element, description, model, and examples to be generated from content dictionary
divide
Token element, description, model, and examples to be generated from content dictionary
big_max
Token element, description, model, and examples to be generated from content dictionary
big_min
Token element, description, model, and examples to be generated from content dictionary
minus
Token element, description, model, and examples to be generated from content dictionary
plus
Token element, description, model, and examples to be generated from content dictionary
power
Token element, description, model, and examples to be generated from content dictionary
rem
Token element, description, model, and examples to be generated from content dictionary
times
Token element, description, model, and examples to be generated from content dictionary
root
Token element, description, model, and examples to be generated from content dictionary
gcd
Token element, description, model, and examples to be generated from content dictionary
big_gcd
Token element, description, model, and examples to be generated from content dictionary
and
Token element, description, model, and examples to be generated from content dictionary
big_and
Token element, description, model, and examples to be generated from content dictionary
big_or
Token element, description, model, and examples to be generated from content dictionary
xor
Token element, description, model, and examples to be generated from content dictionary
big_xor
Token element, description, model, and examples to be generated from content dictionary
not
Token element, description, model, and examples to be generated from content dictionary
implies
Token element, description, model, and examples to be generated from content dictionary
forall
Token element, description, model, and examples to be generated from content dictionary
exists
Token element, description, model, and examples to be generated from content dictionary
abs
Token element, description, model, and examples to be generated from content dictionary
conjugate
Token element, description, model, and examples to be generated from content dictionary
arg
Token element, description, model, and examples to be generated from content dictionary
real
Token element, description, model, and examples to be generated from content dictionary
imaginary
Token element, description, model, and examples to be generated from content dictionary
lcm
Token element, description, model, and examples to be generated from content dictionary
big_lcm
Token element, description, model, and examples to be generated from content dictionary
relations for relations.
equivalent
Token element, description, model, and examples to be generated from content dictionary
calculus_veccalc for calculus and vector calculus.
diff
Token element, description, model, and examples to be generated from content dictionary
partialdiff
Token element, description, model, and examples to be generated from content dictionary
divergence
Token element, description, model, and examples to be generated from content dictionary
grad
Token element, description, model, and examples to be generated from content dictionary
sets for theory of sets.
list
Token element, description, model, and examples to be generated from content dictionary
union
Token element, description, model, and examples to be generated from content dictionary
union
Token element, description, model, and examples to be generated from content dictionary
intersect
Token element, description, model, and examples to be generated from content dictionary
intersect
Token element, description, model, and examples to be generated from content dictionary
notin
Token element, description, model, and examples to be generated from content dictionary
subset
Token element, description, model, and examples to be generated from content dictionary
prsubset
Token element, description, model, and examples to be generated from content dictionary
notsubset
Token element, description, model, and examples to be generated from content dictionary
notprsubset
Token element, description, model, and examples to be generated from content dictionary
setdiff
Token element, description, model, and examples to be generated from content dictionary
sequences_series for sequences and series.
product
Token element, description, model, and examples to be generated from content dictionary
specfun for elementary classical functions.
sinh
Token element, description, model, and examples to be generated from content dictionary
cosh
Token element, description, model, and examples to be generated from content dictionary
tanh
Token element, description, model, and examples to be generated from content dictionary
sech
Token element, description, model, and examples to be generated from content dictionary
csch
Token element, description, model, and examples to be generated from content dictionary
coth
Token element, description, model, and examples to be generated from content dictionary
arcsin
Token element, description, model, and examples to be generated from content dictionary
arccos
Token element, description, model, and examples to be generated from content dictionary
arctan
Token element, description, model, and examples to be generated from content dictionary
arccosh
Token element, description, model, and examples to be generated from content dictionary
arccot
Token element, description, model, and examples to be generated from content dictionary
arccoth
Token element, description, model, and examples to be generated from content dictionary
arccsc
Token element, description, model, and examples to be generated from content dictionary
arccsch
Token element, description, model, and examples to be generated from content dictionary
arcsec
Token element, description, model, and examples to be generated from content dictionary
arcsech
Token element, description, model, and examples to be generated from content dictionary
statistics for statistics.
mean
Token element, description, model, and examples to be generated from content dictionary
sdev
Token element, description, model, and examples to be generated from content dictionary
variance
Token element, description, model, and examples to be generated from content dictionary
median
Token element, description, model, and examples to be generated from content dictionary
mode
Token element, description, model, and examples to be generated from content dictionary
linear_algebra for linear algebra.
vector
Token element, description, model, and examples to be generated from content dictionary
matrix
Token element, description, model, and examples to be generated from content dictionary
matrixrow
Token element, description, model, and examples to be generated from content dictionary
determinant
Token element, description, model, and examples to be generated from content dictionary
transpose
Token element, description, model, and examples to be generated from content dictionary
selector
Token element, description, model, and examples to be generated from content dictionary
vectorproduct
Token element, description, model, and examples to be generated from content dictionary
constants for constant and symbol elements.
integers
Token element, description, model, and examples to be generated from content dictionary
reals
Token element, description, model, and examples to be generated from content dictionary
rationals
Token element, description, model, and examples to be generated from content dictionary
naturalnumbers
Token element, description, model, and examples to be generated from content dictionary
complexes
Token element, description, model, and examples to be generated from content dictionary
primes
Token element, description, model, and examples to be generated from content dictionary
exponentiale
Token element, description, model, and examples to be generated from content dictionary
imaginaryi
Token element, description, model, and examples to be generated from content dictionary
notanumber
Token element, description, model, and examples to be generated from content dictionary
true
Token element, description, model, and examples to be generated from content dictionary
false
Token element, description, model, and examples to be generated from content dictionary
emptyset
Token element, description, model, and examples to be generated from content dictionary
pi
Token element, description, model, and examples to be generated from content dictionary
eulergamma
Token element, description, model, and examples to be generated from content dictionary
infinity
Token element, description, model, and examples to be generated from content dictionary
Editorial note: SMW | |
Changes required:
A minfinity element must be added for negative real infinity.
Furthermore, the notanumber element, which in MathML2 was canonically empty, can in MathML3 have CDATA content giving an integer specifying the NaN desired.
The values 0 < n < 2^52 enumerate all IEEE 754 double-precision NaNs.
For compatibility with other elements, notanumber now accepts a base attribute with default value "10" to give the radix with which the CDATA content is to be interpreted.
|