CellMLGraphicalRepresentation
This section addresses the graphical representation of CellML models using a biologically relevant visual language.
This page is still very much a draft.
A proposal for graphical representations of CellML models
Theory
The goal is to represent CellML models in visual forms that help people to comprehend the biology of the model. Typically these are 2D representations that use a visual language familiar to people studying a particular area of biology. This proposal should be flexible enough to accommodate any visual language or purpose, but will focus on models of cellular physiology.
The current thinking for CellML ontologies (CellMLOntologies) is that each biological entity or relation is associated with CellML model elements through the use of the cmeta:id. Appropriate abstractions need to be declared in the model so that it can represent abstract concepts that may group more than one component. Given this level of annotation of a CellML model, we should be able to address the equivalent graphical representation of the biological entities and relations that the model represents.
For a given instance of a graphical representation, all that would be needed is for each rendered object that has biological significance to be associated with the cmeta:id of the corresponding CellML element. But we want to go beyond describing this final instance of a graphical representation and also discuss the process for generating these instances and how the ontology data can help with this. Our chosen rendering language is SVG, but that should not prevent these ideas from being applied to other languages.
Two aspects of building the graphical representation of a model are:
- choosing the visual language for the domain of biology
- laying out the graphical objects to form the final rendering
Requirements
Our choice of visual language needs to cover the biological entities and relations for cellular biology. We have not yet chosen exact visual representations for all of these. One requirement is that all instances of graphical representations are easily updated with new visual representations for the biological entities and relations as we evolve them.
The layouts for the graphical representations will be generated manually and stored in the instance documents. We will not be applying layout algorithms just yet.
To help build the diagrams, it is useful (almost necessary) that the graphical representations of the biological entities involved are automatically rendered into a document that can be loaded into the layout workspace.
One caveat: our target renderer at the moment is the intersection of what is supported by SVG in Mozilla and Adobe SVG Viewer 3.0. Neither platform supports non-local xlink targets on the otherwise useful SVG use element, i.e. all xlink targets must be ids within the same document. Neither supports XInclude either, so we cannot use that to load external SVG definitions into a local document. XSLT is perhaps the best method left for us to build local SVG documents using external libraries of SVG elements.
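The local-link constraint can be checked mechanically before a document is handed to either renderer. The following is a minimal sketch in Python, assuming the standard SVG and XLink namespaces; the function name `non_local_use_targets` and the example document are hypothetical and only illustrate the idea.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"


def non_local_use_targets(svg_text):
    """Return xlink:href values of <use> elements that the target renderers
    could not resolve: links that are not of the form '#id', or '#id' links
    whose id is not defined in the same document."""
    root = ET.fromstring(svg_text)
    local_ids = {el.get("id") for el in root.iter() if el.get("id")}
    problems = []
    for use in root.iter(f"{{{SVG_NS}}}use"):
        href = use.get(f"{{{XLINK_NS}}}href", "")
        if not href.startswith("#") or href[1:] not in local_ids:
            problems.append(href)
    return problems


# Hypothetical example: one local link and one link into an external library.
EXAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <defs><g id="enzyme"><ellipse rx="30" ry="15"/></g></defs>
  <use xlink:href="#enzyme" x="50" y="50"/>
  <use xlink:href="library.svg#channel" x="150" y="50"/>
</svg>"""

problems = non_local_use_targets(EXAMPLE_SVG)
```

Here `problems` contains only the link into `library.svg`, which is the kind of reference that has to be rewritten before the document will render on the target platforms.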
Desired outcome
Each CellML model will have biological annotation added according to the CellMLOntologies specification. The outcome of this is that a CellML model will have appropriate cmeta:id attributes on the relevant CellML elements in the model. The biological annotation for these elements can be obtained from the corresponding instances of the model annotation in the ontology.
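As an illustration of what a processor would read from an annotated model, here is a minimal Python sketch that collects every cmeta:id in a CellML document. It assumes the CellML 1.0 and CellML Metadata 1.0 namespaces; the function name `collect_cmeta_ids` and the toy model are hypothetical.

```python
import xml.etree.ElementTree as ET

CMETA_NS = "http://www.cellml.org/metadata/1.0#"


def collect_cmeta_ids(cellml_text):
    """Map each cmeta:id found in the model to the local name of the
    CellML element that carries it (e.g. 'component', 'variable')."""
    root = ET.fromstring(cellml_text)
    found = {}
    for el in root.iter():
        cmeta_id = el.get(f"{{{CMETA_NS}}}id")
        if cmeta_id:
            # Strip the '{namespace}' prefix that ElementTree puts on tags.
            found[cmeta_id] = el.tag.split("}")[-1]
    return found


# Hypothetical toy model: three annotated elements and one unannotated one.
EXAMPLE_MODEL = """<model xmlns="http://www.cellml.org/cellml/1.0#"
       xmlns:cmeta="http://www.cellml.org/metadata/1.0#"
       name="toy_model" cmeta:id="toy_model">
  <component name="sodium_channel" cmeta:id="sodium_channel">
    <variable name="i_Na" units="dimensionless" cmeta:id="i_Na"/>
  </component>
  <component name="helper"/>
</model>"""

ids = collect_cmeta_ids(EXAMPLE_MODEL)
```

Elements without a cmeta:id, such as the `helper` component, are simply skipped; under this proposal they carry no biological annotation and so get no graphical object.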
Each biological entity and relation that is present in the CellML model will, as an outcome of the annotation described above, be represented as an instance in the ontology. Each of these instances will have associated rendering template information. This may be available through properties that relate the instance to more generic parent instances, e.g. a class of enzymes may have a preferred representation that all specific members acquire. The resolution of acquired properties is left to the ontology interface to deal with, and is not part of this rendering specification.
The rendering template is used in combination with a short name, or other graphical objects (e.g. a chemical structure), to produce a specific graphical object for the final instance of the rendering.
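A minimal Python sketch of this step is given below. The template markup, the `class="label"` convention for marking where the short name goes, and the function name `instantiate` are all assumptions made for illustration; the proposal has not yet fixed a template format.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical rendering template for an enzyme: a glyph plus an empty label.
ENZYME_TEMPLATE = """<g xmlns="http://www.w3.org/2000/svg" id="enzyme-template">
  <ellipse rx="40" ry="20" fill="none" stroke="black"/>
  <text class="label" text-anchor="middle"/>
</g>"""


def instantiate(template_text, object_id, short_name):
    """Produce a specific graphical object from a rendering template by
    giving it its own id and filling in the label with the short name."""
    group = ET.fromstring(template_text)
    group.set("id", object_id)
    for text in group.iter(f"{{{SVG_NS}}}text"):
        if text.get("class") == "label":
            text.text = short_name
    return group


# Hypothetical usage: a specific object for hexokinase, labelled "HK".
obj = instantiate(ENZYME_TEMPLATE, "g_hexokinase", "HK")
```

Because the template is parsed afresh on every call, each specific graphical object is an independent copy, so updating the template later and regenerating the objects would be one way to meet the requirement that diagrams are easy to refresh with new visual representations.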
The biological annotation of a CellML model is parsed and the relevant specific graphical objects generated and accumulated into a rendering instance document. It would be nice to support libraries of templates dynamically using SVG constructs such as use or XInclude, but as the caveat above describes, this is not yet possible. We can, however, produce an intermediate form that does contain use elements with non-local links into a library, and use XSLT to transform these into something that works now, which means copying the referenced element into the document and referencing it through a local link.
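The transformation itself is intended to be done with XSLT. Purely to make the intended result concrete, the sketch below performs the same rewrite in Python instead; it is a stand-in, not the proposed implementation. The function name `localise_use_links`, the `libraries` mapping, and the example documents are hypothetical, and the sketch assumes that ids do not collide between libraries or with ids already in the document.

```python
import copy
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
HREF = f"{{{XLINK_NS}}}href"


def localise_use_links(doc_root, libraries):
    """Copy each externally referenced element into the document's <defs>
    and rewrite the <use> link to a local '#id' link.

    `libraries` maps a library file name to its parsed root element."""
    defs = doc_root.find(f"{{{SVG_NS}}}defs")
    if defs is None:
        defs = ET.Element(f"{{{SVG_NS}}}defs")
        doc_root.insert(0, defs)
    copied = set()
    # Snapshot the <use> elements first, since the tree is modified below.
    for use in list(doc_root.iter(f"{{{SVG_NS}}}use")):
        href = use.get(HREF, "")
        if href.startswith("#") or "#" not in href:
            continue  # already local, or not a fragment reference
        library_name, target_id = href.split("#", 1)
        if target_id not in copied:
            library_root = libraries[library_name]
            source = next(el for el in library_root.iter()
                          if el.get("id") == target_id)
            defs.append(copy.deepcopy(source))
            copied.add(target_id)
        use.set(HREF, "#" + target_id)
    return doc_root


# Hypothetical library and intermediate document.
LIBRARY = ET.fromstring("""<svg xmlns="http://www.w3.org/2000/svg">
  <defs><g id="enzyme"><ellipse rx="40" ry="20"/></g></defs>
</svg>""")

INTERMEDIATE = ET.fromstring("""<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <use xlink:href="library.svg#enzyme" x="10" y="10"/>
  <use xlink:href="library.svg#enzyme" x="120" y="10"/>
</svg>""")

result = localise_use_links(INTERMEDIATE, {"library.svg": LIBRARY})
```

Each library element is copied only once, however many use elements refer to it, so the resulting document stays small and every link resolves within the same file.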
The graphical objects within this rendering instance document will also reference the cmeta:ids of the elements they represent in the CellML model. This close association between the rendered objects and the CellML elements suggests a 1:1 relationship between the two. The discussion in CellMLOntologies addresses this. In summary, biological annotation should add CellML structures to a model (e.g. user-defined grouping) to explicitly declare all biological abstractions that warrant annotation.
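If the relationship really is 1:1, it can be verified automatically. The Python sketch below reports annotated elements with no graphical object, graphical objects that reference an unknown cmeta:id, and cmeta:ids referenced more than once. How an SVG object references a cmeta:id is not defined in this proposal, so the `cmeta-ref` attribute and its `http://example.org/cellml-rendering#` namespace are hypothetical placeholders, as is the function name `check_one_to_one`.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical namespace and attribute for the back-reference to a cmeta:id.
RENDER_NS = "http://example.org/cellml-rendering#"
REF_ATTR = f"{{{RENDER_NS}}}cmeta-ref"


def check_one_to_one(model_ids, svg_text):
    """Compare the cmeta:ids of a model with the references found in a
    rendering instance document and report every departure from 1:1."""
    root = ET.fromstring(svg_text)
    refs = Counter(el.get(REF_ATTR) for el in root.iter() if el.get(REF_ATTR))
    return {
        "unrendered": sorted(set(model_ids) - set(refs)),
        "dangling": sorted(set(refs) - set(model_ids)),
        "duplicated": sorted(r for r, n in refs.items() if n > 1),
    }


# Hypothetical rendering instance with one duplicate and one dangling reference.
EXAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:r="http://example.org/cellml-rendering#">
  <g id="g1" r:cmeta-ref="sodium_channel"/>
  <g id="g2" r:cmeta-ref="sodium_channel"/>
  <g id="g3" r:cmeta-ref="unknown_entity"/>
</svg>"""

report = check_one_to_one({"sodium_channel", "potassium_channel"}, EXAMPLE_SVG)
```

An entry under `unrendered` points to a missing graphical object, while an entry under `dangling` suggests that the model lacks the annotation or grouping structure needed to carry that biological abstraction.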
The rendering instance document is edited in an SVG-capable editor to link up the objects and add any that are missing. If there are objects missing, one needs to question whether all relevant biological annotation is present for the model.
The final rendering instance document is made available to the public as part of the repository. There should be explicit linking mechanisms (e.g. metadata references in the CellML model and the diagram linking the two), implicit linking (e.g. through relations defined in the ontology), or both, to connect the CellML model and its graphical representation.
Initial steps to implement this
1. Research current visual languages for representing cellular physiology. They should cover:
   - biochemical pathways
   - cell signalling (includes electrophysiology)
   - regulation of gene expression
2. Assess the outcome of the research in step 1 and decide whether some set of visual languages is adequate. If not, set about creating our own.
3. Take a small group of models from the repository that have diagrams associated with them and redraw these using the selected visual language set. Review these.
4. Implement an SVG template library for the visual set required to cover models in the repository.
5. Implement the graphic processor that takes a CellML model's biological annotation and creates a document containing all the relevant graphical structures for the model. This assumes that the biological annotation is present in the models in a form that can be parsed consistently. This is addressed in another document on CellML repository migration (CellMLRepositoryMigration). This can be done in parallel with step 4, using whatever templates are available for testing purposes.
6. For each model, render initial sets of graphical objects. Add biological annotation and graphical templates as necessary to achieve a full rendering of the model.
7. Implement the linking mechanisms between the CellML model and the visualisation. This can be done in parallel with step 6.
Considerations
- Abstraction of groups of entities into biological processes or complex biological entities. This is addressed only lightly in the discussion above.
- Scaling: what happens when we zoom out from a model representation? Do we collapse groups of biological entities and processes into single glyph representations? How are these collapsible groups defined?
Glossary
rendering instance
instance of a graphical representation
(the two terms above are synonymous)
specific graphical object
rendering template