CellML Secondary Specification for Parameter Uncertainty : Unofficial working draft


Table of Contents

1. Terminology
2. CellML Secondary Specification - DAEs with Optional Resets
2.1. Preliminary matters
2.2. Requirements for compliant models
2.3. Rules for the specification of distributions
2.4. The uncertainty content dictionary

Abstract

This document is an unofficial working draft. The text below describes the intended status of the specification, not its current status.

This document is the CellML Secondary Specification for Uncertain Parameters. It provides additional constraints on top of those specified in the CellML 1.2 specification and the CellML Secondary Specification for Differential Algebraic Equations and Events; to be compliant with this secondary specification, processing software MUST be able to correctly process every model that is compliant with this specification.

This is the normative specification of a secondary specification. It is intended to provide the minimum amount of information needed to accurately describe the class of models allowed by this secondary specification.

1. Terminology

All terms defined in the CellML Core Specification version 1.2 and the CellML Secondary Specification for Differential Algebraic Equations with Events also apply.

Uncertainty

A lack of information about the exact true value of a variable.

Random variable

A variable about which there is uncertainty.

Distribution of a random variable

The set of all possible values for a random variable, along with sufficient information to determine the probability that the random variable takes a particular value, or that it takes a value which is a member of some subset of all possible values.

Probability density function (or p.d.f.)

A function f(x) which specifies the distribution of a continuous random variable X such that $P(a < X < b) = \int_a^b f(x)\,dx$, where P refers to the probability of events occurring.

Probability mass function (or p.m.f.)

A function, defined for a discrete random variable, which gives the probability of the random variable taking a particular value.

Cumulative distribution function (or c.d.f.)

The function $F(x) = \int_{-\infty}^{x} f(y)\,dy$, where f(y) represents the p.d.f.
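
The relationship between the p.d.f. and the c.d.f. defined above can be checked numerically. The following non-normative sketch assumes a standard normal distribution as the example; the names `pdf`, `cdf`, and `integrate` are illustrative only and are not defined by this specification.

```python
import math

def pdf(x):
    """Standard normal p.d.f. (an example choice, not mandated by the spec)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    """Standard normal c.d.f.: the integral of pdf from minus infinity to x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def integrate(f, a, b, steps=10000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

a, b = -1.0, 2.0
prob_from_pdf = integrate(pdf, a, b)  # P(a < X < b) as the integral of the p.d.f.
prob_from_cdf = cdf(b) - cdf(a)       # the same probability written as F(b) - F(a)
```

Both quantities should agree up to the error of the numerical integration, which is what the two definitions imply.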

2. CellML Secondary Specification - DAEs with Optional Resets

2.1. Preliminary matters

  1. This specification is a secondary specification to the CellML 1.2 Specification, and so should be read together with the CellML 1.2 specification. It acts to further narrow the CellML 1.2 specification. Models and software implementations must comply with the CellML 1.2 specification to comply with this specification.

  2. This specification additionally extends and modifies the secondary specification entitled "CellML Secondary Specification - DAEs with Optional Resets". All the rules provided in that specification apply in addition to those defined in this specification, except that where the rules in this specification conflict with the rules in the 'DAEs with Optional Resets' specification, the rules in this specification shall prevail.

  3. The rules provided in this specification are against the MathML 3 Strict form; the rules in this document apply to content MathML after all possible transformations to strict MathML defined in the MathML 3 specification have been applied.

  4. Except where stated otherwise, references to variables in this specification refer to model variables, and not to individual variable element information items.

2.2. Requirements for compliant models

  1. Every child of a math element information item in the MathML namespace must either comply with one of the two rules it is required to comply with according to the DAEs with Optional Resets specification, or with the following rule:

    1. The child of the MathML math element information item is an apply element information item, and the first MathML element information item child of that apply element is the sampleFromDistribution symbol from the uncertainty1 content dictionary, as defined in Section 2.4, “The uncertainty content dictionary”. In this case, the first argument of the sampleFromDistribution shall either:

      1. be a ci reference to a variable, or,

      2. be an application of the evaluatedAt operator in a manner consistent with the DAEs with Optional Resets specification, or,

      3. be an application of the vector constructor in content dictionary linalg2, such that each element of the vector is compliant with one of the above two options.

      The second argument of the sampleFromDistribution application shall be a distribution specification consistent with the rules in Section 2.3, “Rules for the specification of distributions”. Where the first argument is an application of the vector constructor, the second argument must describe the distribution of a vector; when the first argument is a ci reference or an evaluatedAt application, the second argument must describe the distribution of a real number.

  2. In addition to the two types of real constraint equation defined in the DAEs with Optional Resets specification, every application of sampleFromDistribution shall be treated as a real constraint equation, except that in the case that the first operand of the sampleFromDistribution operator is a vector constructor, it shall be treated as one real constraint equation for each element of the vector.

  3. In addition to the variable references that may appear in low complexity subset expressions according to the DAEs with Optional Resets specification, variables which appear as the first operand of a sampleFromDistribution operator, including variable references inside a vector constructor in that operand, but excluding variable references inside an evaluatedAt application, may be referenced in low complexity subset expressions. It must be possible to place the expression into the non-negative levels defined for low complexity subset expressions such that every variable used to define the expression is assigned a lower level.
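
The rules above can be illustrated with a non-normative sketch that parses strict content MathML and counts the real constraint equations contributed by one sampleFromDistribution application (one for a scalar first operand, one per element for a vector). The variable names `k1`, `k2`, `k1_samples`, and `k2_samples`, and the helper names `is_symbol` and `constraint_count`, are hypothetical; the sketch does not validate the second operand or an evaluatedAt application.

```python
import xml.etree.ElementTree as ET

M = "{http://www.w3.org/1998/Math/MathML}"  # MathML namespace in ElementTree form

def is_symbol(elem, cd, name):
    """True if elem is <csymbol cd="cd">name</csymbol>."""
    return (elem.tag == M + "csymbol" and elem.get("cd") == cd
            and (elem.text or "").strip() == name)

def constraint_count(apply_elem):
    """Real constraint equations contributed by one sampleFromDistribution
    application, following rule 2 of Section 2.2."""
    children = list(apply_elem)
    if apply_elem.tag != M + "apply" or len(children) != 3:
        raise ValueError("expected an apply with an operator and two operands")
    if not is_symbol(children[0], "uncertainty1", "sampleFromDistribution"):
        raise ValueError("not a sampleFromDistribution application")
    first = children[1]
    if (first.tag == M + "apply" and len(first) > 0
            and is_symbol(first[0], "linalg2", "vector")):
        return len(first) - 1  # one constraint per vector element
    return 1                   # ci reference or evaluatedAt application

# Hypothetical model fragment: a scalar random variable k1.
SCALAR = """<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol cd="uncertainty1">sampleFromDistribution</csymbol>
    <ci>k1</ci>
    <apply>
      <csymbol cd="uncertainty1">distributionFromRealisations</csymbol>
      <ci>k1_samples</ci>
    </apply>
  </apply>
</math>"""

# Hypothetical model fragment: a vector random variable (k1, k2).
VECTOR = """<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol cd="uncertainty1">sampleFromDistribution</csymbol>
    <apply><csymbol cd="linalg2">vector</csymbol><ci>k1</ci><ci>k2</ci></apply>
    <apply>
      <csymbol cd="uncertainty1">distributionFromRealisations</csymbol>
      <apply>
        <csymbol cd="linalg2">vector</csymbol>
        <ci>k1_samples</ci>
        <ci>k2_samples</ci>
      </apply>
    </apply>
  </apply>
</math>"""

scalar_count = constraint_count(ET.fromstring(SCALAR)[0])
vector_count = constraint_count(ET.fromstring(VECTOR)[0])
```

Under these assumptions the scalar fragment contributes one real constraint equation and the vector fragment contributes two.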

2.3. Rules for the specification of distributions

A distribution must be specified in one of the following ways:
  1. Through the application of the distributionFromPDF symbol from the content dictionary uncertainty1, where the operand is a binding using the lambda symbol from the fns1 content dictionary, with one bound variable; the bound expression must be a valid real-valued low complexity subset expression that only uses the bound variable and low complexity subset variables that have been assigned a lower level. The definite integral of the function over bound variable values between minus infinity and infinity must be one, and the function must be defined and non-negative for any value in that range.

  2. Through the application of the distributionFromPMF symbol from the content dictionary uncertainty1, where the operand is a binding using the lambda symbol from the fns1 content dictionary, with one bound variable; the bound expression must be a valid real-valued low complexity subset expression that only uses the bound variable and low complexity subset variables that have been assigned a lower level. The sum of the function values over bound variable values between minus infinity and infinity must be one, and the function must be defined and non-negative for any value in that range.

  3. Through the application of the distributionFromRealisations symbol from the content dictionary uncertainty1, where the operand is either a valid real-valued low complexity subset expression, or an application of the vector constructor vector from content dictionary linalg2, where each operand of vector is a valid real-valued low complexity subset expression.
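
The normalisation and non-negativity conditions in rules 1 and 2 can be checked approximately for a candidate function. The following non-normative sketch assumes a triangular p.d.f. and a Poisson p.m.f. as examples, and truncates the infinite range to a finite window outside which the function is negligible; all function names and the window bounds are illustrative assumptions.

```python
import math

def triangular_pdf(x):
    """Example p.d.f.: triangular on [0, 2], zero elsewhere."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0

def poisson_pmf(k, rate=3.0):
    """Example p.m.f.: Poisson with the given rate, zero for negative k."""
    if k < 0:
        return 0.0
    return math.exp(-rate) * rate ** k / math.factorial(k)

def pdf_total(f, lo=-10.0, hi=10.0, steps=20000):
    """Trapezoidal approximation of the integral of f over [lo, hi].
    Raises ValueError if f is negative at any sample point."""
    h = (hi - lo) / steps
    values = [f(lo + i * h) for i in range(steps + 1)]
    if min(values) < 0.0:
        raise ValueError("p.d.f. must be non-negative")
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def pmf_total(p, lo=-50, hi=50):
    """Sum of p(k) over the integers lo..hi.
    Raises ValueError if p is negative at any of them."""
    values = [p(k) for k in range(lo, hi + 1)]
    if min(values) < 0.0:
        raise ValueError("p.m.f. must be non-negative")
    return sum(values)

pdf_mass = pdf_total(triangular_pdf)  # should be close to one
pmf_mass = pmf_total(poisson_pmf)     # should be close to one
```

A check of this kind can only reject clearly invalid functions; it does not prove that the conditions hold over the whole real line.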

2.4. The uncertainty content dictionary

  1. This section defines the content dictionary uncertainty1. This section merely provides definitions of the symbols in the content dictionary. To be compliant with this specification, models must only use symbols from this content dictionary in ways specified elsewhere in this specification.

  2. The sampleFromDistribution symbol is a binary operator that relates a random variable, the first operand, to its distribution, the second operand. The random variable may be a scalar or a vector.

  3. The distributionFromPDF symbol is a unary operator that describes a distribution using a probability density function. It takes as its only operand a function of one variable (which may be vector or scalar). The function should be non-negative for all values in its domain, and the integral over its domain should be equal to one. The result is a probability distribution for a random variable which may take values on the domain of the function, with the probability density function specified by the function.

  4. The distributionFromPMF symbol is a unary operator that describes a distribution using a probability mass function. It takes as its only operand a function of one variable (which may be vector or scalar). The function should be non-negative for all values in its domain, and the sum over its domain should be equal to one. The result is a probability distribution for a random variable which may take values on the domain of the function, with the probability mass function specified by the function.

  5. The distributionFromRealisations symbol is a unary operator that partially describes a distribution by using a series of samples (realisations) from the true distribution. It takes as its only operand a finite list, describing the realisations. The list shall be such that one of the following applies:

    1. All members of the list shall be vectors of real numbers, and the length of all the vectors shall be the same, or,

    2. All members of the list shall be real numbers.
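
The two permitted shapes for a list of realisations can be sketched as a simple check. The following is a non-normative illustration; the function names are hypothetical, vectors are represented as Python lists or tuples, and rejecting an empty list is an assumption of this sketch rather than a rule stated above.

```python
from numbers import Real

def is_real(x):
    """True for real numbers, excluding booleans."""
    return isinstance(x, Real) and not isinstance(x, bool)

def classify_realisations(realisations):
    """Return 'scalar' if every member is a real number, or 'vector' if every
    member is a vector of real numbers and all vectors have the same length.
    Raise ValueError otherwise."""
    items = list(realisations)
    if not items:
        # Assumption of this sketch: an empty list describes no distribution.
        raise ValueError("no realisations given")
    if all(is_real(x) for x in items):
        return "scalar"
    if all(isinstance(v, (list, tuple)) for v in items):
        same_length = len({len(v) for v in items}) == 1
        if same_length and all(is_real(x) for v in items for x in v):
            return "vector"
    raise ValueError("members must be all reals or all equal-length real vectors")

scalar_kind = classify_realisations([0.9, 1.1, 1.0])
vector_kind = classify_realisations([(1.0, 2.0), (1.1, 1.9), (0.9, 2.2)])
```

A list that mixes real numbers with vectors, or that contains vectors of different lengths, is rejected by this check.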