planning

About the possible pathways the CellML team could take from here for tool development

About this document

This document is intended to enumerate the possibilities for the future of CellML tools. If you feel that you have anything to add, please make the changes (or, if you have insufficient access to the site to do this, e-mail your suggestions to ak.miller@auckland.ac.nz).

Intro

We are now at a point where we have to consider how best to continue our tool development projects. There are several reasons it is timely to consider this now.

We now have a new CellML API implementation. None of our existing software (cmiss, mozCellML, cell editor) uses it.
We want a more seamless work environment for authoring models and running simulations.
There are mixed views on the appropriateness of various design choices and technical implementations of what we currently have (cmgui, SVG, canvas, XUL, Java, CORBA, ...). The debate needs resolution so we can move forward more effectively. We want a framework that can incorporate other projects, such as Sarala's work.

Objectives

Easy to use environment for authoring models, running simulations, publishing and reviewing models.
Target users?
Target platforms?

Similar Software

There are currently several pieces of widely used software that provide similar functionality to what we are trying to achieve with these CellML tools. They are typically focussed on a specific domain, with some exceptions. While none of them are open source, they all have significant numbers of users, and we should draw on their experience in developing usable tools.

Current Status

Summary of our current tools.

mozCellML: Simulation environment written in XUL, Javascript and C++. Uses cmgui for graphing via CORBA.

pros

Built on the Mozilla toolkit, which is widely considered the best XML workbench, with support for MathML, SVG, XHTML etc. Good for building network-aware applications and for model repository interaction. Separation of GUI and backends. Note that this has nothing to do with a web browser.
Good pathway for cross-platform distribution and automatic updates.

cons

The current virtual desktop design has problems.
The GUI is not at all friendly to the casual or first-time user.
cmgui is not yet fully functional as a plotting tool.
Simulation speed is questionable (note: this is not due to the graphical environment but a consequence of the algorithms used in the back end).
Compatibility with non-Mozilla-based web browsers (IE, Opera). (CS: we need to support XULRunner to remove the browser dependence, which is not a big deal. The toolkit choice has nothing to do with web browsers.)

Cell Elite: CellML authoring environment written in Java
Around 80% of the code deals with user interface API like java.awt.* and javax.swing.*.
pros

"Write once, run anywhere" that has a Java interpreter (i.e. better portability of binaries, in theory).

cons

Integration of the CellML editor and the MathML editor is currently very poor: the document is serialised to XML, saved to disk, and then the MathML editor is invoked on that file. It is then re-serialised and passed back via the disk to the CellML editor. This seems potentially quite disruptive to users' normal workflow, as the CellML document isn't available while the user is editing the MathML.
A canvas is used to paint the boxes used for the graphical display, rather than a vector graphics approach like SVG. As such, some degree of flexibility, which would permit extensions like Sarala's work, is lost.
Currently there is no separation of concerns between the GUI and the code, as all packages reference the Java GUI APIs.

Pathways Forward

Migrating to the new CellML API

The CellML tools do not currently use the new CellML API, but it would be useful if they did, as we currently have a large number of incompatible CellML APIs.

mozCellML currently has its own CellML API, which is tightly coupled with the function of mozCellML. The C++ part of mozCellML will require a significant rewrite to make it work with the new API. However, changes to the Javascript/user interface code should not be necessary.

The current Java editor has about 3000 references to its own Java CellML API, spread throughout the codebase. Around 30-50% of the code needs to be reviewed to fully understand, and be able to modify, the API usage.
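A plain text scan could give a rough first estimate of where the API is referenced before any code is reviewed. The sketch below is illustrative only: the package prefix `cellml_api.` and the sample file contents are invented, and counting matching lines will miss indirect uses and over-count comments.

```python
import re
from typing import Dict


def count_api_references(sources: Dict[str, str], api_prefix: str) -> Dict[str, int]:
    """Count, per file, the lines that mention api_prefix at least once."""
    pattern = re.compile(re.escape(api_prefix))
    counts: Dict[str, int] = {}
    for filename, text in sources.items():
        hits = sum(1 for line in text.splitlines() if pattern.search(line))
        if hits:
            counts[filename] = hits
    return counts


# Invented sample input; a real run would read the .java files from disk.
SAMPLE_SOURCES = {
    "Editor.java": "import cellml_api.Model;\nModel m = new cellml_api.Model();\n",
    "Canvas.java": "import java.awt.Graphics;\n",
}
```

Files that never appear in the result are candidates for skipping during the review; everything else would still need to be read by hand.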

Other possible changes

The issue of integrating the tools actually has two almost orthogonal aspects to it:

Making the tools refer to the same data. This means that changes in one tool will take effect in the other tool, but doesn't mean that there will necessarily be any closer integration than that.
Making the tools appear in a consistent environment, for example, providing a single window in which both tools can be used, and as such, facilitating workflows that involve both tools.

We have a number of options with regard to implementing one or more of these:
1. Data sharing only: Continue development in the current frameworks.

Migrate both mozCellML and Cell Elite to use the new CellML API. We can then use the "CellML Events" specification, which is currently being developed, to communicate between the applications. Communication could be across CORBA, or we could develop code which automatically maps our IDLs to JNI (Java Native Interface) and XPCOM (Mozilla's component object model).
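The data-sharing idea can be pictured as a change-notification pattern. This is a minimal sketch under stated assumptions: `SharedModel`, `add_listener` and `set_attribute` are hypothetical names, everything runs in one process, and nothing here is taken from the CellML API or the CellML Events specification (which is still being developed).

```python
from typing import Callable, Dict, List


class SharedModel:
    """Hypothetical stand-in for a model that several tools refer to."""

    def __init__(self) -> None:
        self._attributes: Dict[str, str] = {}
        self._listeners: List[Callable[[str, str], None]] = []

    def add_listener(self, listener: Callable[[str, str], None]) -> None:
        """Register a callback that is invoked as listener(name, new_value)."""
        self._listeners.append(listener)

    def set_attribute(self, name: str, value: str) -> None:
        """Change the model, then notify every registered tool."""
        self._attributes[name] = value
        for listener in self._listeners:
            listener(name, value)

    def get_attribute(self, name: str) -> str:
        return self._attributes[name]


def demo() -> List[str]:
    model = SharedModel()
    seen_by_simulator: List[str] = []
    # The "simulator" records each change made through the "editor".
    model.add_listener(lambda name, value: seen_by_simulator.append(f"{name}={value}"))
    model.set_attribute("component_name", "membrane")
    return seen_by_simulator
```

With CORBA, JNI or XPCOM in between, the listener call would cross a language or process boundary, but the shape of the interaction would stay the same.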

This has the benefit of requiring fewer changes, although migration to the CellML API will still be a large undertaking.

2. Adopt a framework, e.g. Eclipse: Migrate to an established framework.

Eclipse is a well-established environment for building Java-based tools. It offers similar services to a window manager and, in addition, some services specific to editing files, such as notification when a file has changed (although these seem mainly intended for plugins which re-parse the entire file when any part of it changes, and so may not be well suited to CellML tools).

3. Migrate towards the Mozilla/XULRunner platform

For most people Mozilla means web browser. However, for developers the Mozilla platform is actually a toolkit, like Java, Qt or GTK. Firefox is a web browser built using the Mozilla toolkit. We use the XULRunner target in this discussion to emphasise that this development has no dependence upon a web browser. The name Mozilla is used here to refer to the toolkit, not a web browser.

Designate the XULRunner platform as the core technology of choice. Along with this we need to decide what supporting technology we will use or develop.

The Mozilla platform already offers, out of the box, a lot of technology which can replace work done on the editor. For example, it has native SVG support with a fairly complete implementation (note that Batik, a Java product, is also available). We can write a lot of the code in platform-neutral Javascript, while still easily connecting over XPCOM to native code (which can be largely source-portable due to the libraries provided by Mozilla).

How do we get there?

Embed the existing Java tools directly (perhaps as an applet)? This would be faster to implement, but it would give quite poor integration.

Rewrite the existing Java tools for the Mozilla platform? Migrating to the CellML API is likely to require a high level of rewriting anyway, so it is not clear that this is better or worse than any other option.

Exploring the Pathways: 2. Adopt a Framework

This would require either porting mozCellML to Java (which would likely be costly in terms of both developer time and performance, especially if the integrator is to be run in Java as well), or using Rhino (a Java-based Javascript engine) instead of the faster Spidermonkey engine used by Mozilla for the user interface, with Blackwood for Java-to-XPCOM access. The second route would likely also have performance consequences, although the integrator could be kept in C.

Exploring the Pathways: 3. Migrating to Mozilla/XULRunner Platform

The Mozilla platform is attractive because:

Good cross-platform toolkit.
Widely considered the best XML workbench, with support for presentation MathML, SVG, XHTML etc.
Good for building network-aware applications and for model repository interaction.
Separation of GUI and backends. Javascript for the GUI (Python support is not far away); C++, Python or anything wrapped in XPCOM for backend components.
Good pathway for cross-platform distribution and automatic updates.

mozCellML is already based on the Mozilla platform, so we only need to connect the Java-based tools. The Java tools could either be embedded largely as they are within XUL interfaces, like applets, or rewritten.

A significant rewrite of the Java tools is needed to move to the CellML API. Most of the code will need to be checked, and differences from the current API identified and fixed (around 20,000 lines could potentially contain calls to the CellML API, but we should be able to identify which ones are a problem just by attempting to compile).

Graphics in the CellML editor use a canvas to paint the controls, rather than using SVG. This could cause problems for work extending the editor, and so a move to SVG would be very beneficial. This would probably require most of the cml.draw.* package to be replaced (1817 lines to be deleted, but we could probably replace it with less code using SVG), and possibly other changes throughout the codebase too.

Useful resources in the current Java tools:

Imported object promotion: the Java code provides tools for splitting initial conditions and MathML into separate documents, and we do not have support for this anywhere else.
The Java API contains code for a property representation of CellML models, i.e. plain English strings which describe a model in terms of properties. These strings could be useful to us.

Strategies to leverage the existing Java code:

Refer to them when porting to another language.
Access them, perhaps across CORBA, JNI, or Blackwood (this could be more trouble than it's worth).
Automatically extract strings from them, either with an external parser or from Java using reflection, and convert these into a useful format (it's not clear how easy this will be, but it might be possible for the properties, and so could give a port a head start).
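As a rough idea of what the external-parser route might involve, the sketch below pulls double-quoted string literals out of Java source text with a regular expression. It is an assumption-laden shortcut rather than a real Java parser: it does not skip comments or character literals, and the sample source is invented.

```python
import re
from typing import List

# A double-quoted Java string literal on one line, allowing backslash escapes.
JAVA_STRING = re.compile(r'"((?:[^"\\\n]|\\.)*)"')


def extract_string_literals(java_source: str) -> List[str]:
    """Return the contents of every string literal, in source order."""
    return [match.group(1) for match in JAVA_STRING.finditer(java_source)]


# Invented example input, not taken from the actual Java tools.
SAMPLE = 'String label = "has initial value";\nString other = "is a component";'
```

The extracted strings would still need to be matched to the properties they describe, which is probably the harder part.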

Development times and costs

Need more documentation on the pathways, including time estimates. Also need discussion on issues such as

Resourcing: do we even have the time to consider this undertaking, and is it really worth it?
Available expertise: we can always get Java developers, but the Mozilla platform requires a higher level of expertise to develop beyond the XUL GUIs. The only people with experience are Andrew, Matt, Carey and Shane; Matt is part time, Carey is leaving, and Shane is very busy.

Currently the CellML Editor uses about 45,000 lines of code, although this would likely be less if it was written in a language like Javascript.

mozCellML Issues to Resolve

Issue 1: Layout
Continue with the virtual desktop approach (note that these points also apply to javax.swing.*, and so to using Java for Cell Elite):

Advantages:

Nicer user interfaces, as there is no other portable way to provide the user interfaces and workflows that were created as a result of user feedback.
Consistency cross-platform, and ease of porting.

Disadvantages:

Doesn't interact well with plugins or cross-platform embedded windows.
Doesn't preserve a user's window manager settings for window look-and-feel (Mozilla gets around this to some extent by writing different XULs for different platforms).
Javascript code can't see mouse events after the mouse leaves the document, and so it can't distinguish the sequence mouse down, mouse exit, mouse up, mouse enter from the sequence mouse down, mouse exit, mouse enter. This means that dragging virtual window boundaries to the edge of the parent window can be confusing. However, the current behaviour could still probably be improved, making this a minor issue.
Non-standard user interface, resulting in user confusion and extra code to be maintained.
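The mouse event ambiguity listed among the disadvantages can be shown with a small simulation. This is only a model of the behaviour described above, assuming that the exit and enter events themselves are delivered and that anything happening while the pointer is outside the document is not; the event names are shorthand, not DOM event types.

```python
from typing import List


def visible_events(actual: List[str]) -> List[str]:
    """Return the events the page sees, dropping those fired outside it."""
    visible: List[str] = []
    inside = True
    for event in actual:
        if event == "exit":
            visible.append(event)  # leaving the document is still observed
            inside = False
        elif event == "enter":
            inside = True
            visible.append(event)
        elif inside:
            visible.append(event)
    return visible


# Button released while outside the document, then pointer returns.
released_outside = ["down", "exit", "up", "enter"]
# Button still held when the pointer returns.
held_outside = ["down", "exit", "enter"]
```

Both sequences reduce to the same visible list, so on re-entry the drag code cannot tell whether the button is still held.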

Issue 2: Graphing

Continue with cmgui for plotting?

Advantages:

Could be leveraged by cmgui users (?)
Provides a use-case for cmgui, and so drives improvements to cmgui.
Political implications of using our own product over external options.

Disadvantages:

A heavy-weight solution to a simple problem (binary size, memory usage, startup time, etc. are less than optimal).
Portability issues with embedding windows cross-process. This is especially painful to get right on Win32, due to race conditions between Win32 messages and communication with cmgui. Win32 also doesn't get the repaint order right in all circumstances. Clearly this is not often used on Windows, and so not extensively tested by Microsoft.
Difficult to build outside of the institute, especially on Win32. Since the addition of cmgui, the advice to users has been "don't try to build this unless you have to" rather than the tradition for open source software of encouraging source builds.
Currently makes heavy use of Perl, although this may change with cmgui updates.
Embedding windows cross-process makes proper integration tricky. It is incompatible with virtual windows (or Swing if we port to Java), as one system window is either completely behind another or completely in front. This makes it impossible to implement some user interface features correctly.

Use Canvas element?

Advantages:

Lightweight (comes with the Mozilla platform already, very fast to set up, and only stores a single bitmap of the graph).
Possibility of using an existing graphing library (although it is not clear that any will be useful to us without extensive modifications, so it might be easier and cleaner to write our own).
Easy to maintain (only a few screens of Javascript code were needed to implement axis labelling). C++ code to plot data should be simpler than the current code to send data and keep the scales up to date.
Will interact properly with both virtual windows and system windows.

Disadvantages:

Inconsistency with SVG-based tools. This is only really an issue if we want to plot graphs amongst the shapes in Cell Elite. Also inconsistent with others in the institute using cmgui.
Text is a pain, as it cannot be drawn on the Canvas. Instead we need to use CSS position: absolute and move text over the graph. However, this only required a page or so of code to get right in my mock-up scale-free graph axis page.
Need to repaint all the data every time we change the scale due to new data or user activities like zooming.
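The repaint cost in the last point comes from the fact that a bitmap keeps no record of the data, so every pixel position has to be recomputed when a range changes. The sketch below shows that mapping under simple assumptions (linear axes, a hypothetical `repaint` helper that returns pixel coordinates instead of drawing them).

```python
from typing import List, Tuple

Point = Tuple[float, float]


def to_pixel(value: float, lo: float, hi: float, size_px: int) -> float:
    """Linearly map a value in [lo, hi] to a pixel offset in [0, size_px]."""
    return (value - lo) / (hi - lo) * size_px


def repaint(points: List[Point], x_range: Point, y_range: Point,
            width: int, height: int) -> List[Point]:
    """Recompute every pixel position; needed whenever either range changes."""
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    # Canvas y coordinates grow downwards, so the vertical axis is flipped.
    return [(to_pixel(x, x_lo, x_hi, width),
             height - to_pixel(y, y_lo, y_hi, height))
            for x, y in points]


DATA = [(0.0, 0.0), (1.0, 1.0)]
```

Widening the x range from (0, 1) to (0, 2), as new data might require, moves the second point from x = 100 to x = 50 on a 100 pixel wide plot, so the whole bitmap has to be redrawn.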

Use SVG?

Advantages:

Lightweight for graphs with small numbers of points.
Can re-use SVG in a number of contexts, e.g. to draw graphs (possibly with transformations applied) in the editor, or served up via the CellML repository.
Can simply apply different transformations to existing graphs to change scale etc.
More widely accepted standard.

Disadvantages:

Memory and time utilisation increase faster with the number of points than in the other approaches, and so graphs with large numbers of points (probably 1000s to 100,000s, depending on the memory, CPU, etc. available) are likely to result in memory exhaustion or very poor performance. This is because DOM/XML technologies are generally inefficient for representing large amounts of data. Mozilla SVG has known efficiency problems for large documents.
Although we don't have to explicitly redraw, the SVG engine still has to when we change the scale, so we are not necessarily gaining any performance, just shifting who does the work.
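To make the trade-off concrete, the sketch below builds an SVG polyline as text. The helper name `svg_polyline` is hypothetical and real code would build DOM nodes rather than strings, but it shows both sides: a rescale only touches the `transform` attribute, while the markup grows with every data point.

```python
from typing import List, Tuple


def svg_polyline(points: List[Tuple[float, float]], transform: str = "") -> str:
    """Serialise the points as one SVG polyline element."""
    coords = " ".join(f"{x:g},{y:g}" for x, y in points)
    attr = f' transform="{transform}"' if transform else ""
    return f'<polyline{attr} fill="none" stroke="black" points="{coords}"/>'


POINTS = [(0, 0), (1, 2)]
plain = svg_polyline(POINTS)
# Rescaling leaves the point list untouched; only the transform changes.
stretched = svg_polyline(POINTS, "scale(2 1)")
```

The point list is identical in both results, which is the convenience; it is also stored in full in the document, which is where the memory cost comes from.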

Use OpenGL directly?

Advantages:

Can create draw lists for the data, and so get better graphics acceleration.
Much more lightweight than a complete cmgui process.

Disadvantages:

Still need a plugin, which brings all the problems of having a system window.
Not clear how much work needs to be done on the XPCOM-OpenGL code to make it work on all platforms, and bring it up to date with changes made since it was created.

Issue 3: Access to the CellML API
Use CORBA

This provides an obvious strategy for integrating CellML tools: they simply connect to the local model server via CORBA, get a list of open models, and then modify them. Other applications can monitor for changes.

Advantages

Easily extensible, as any number of new tools could interact using the standardised CellML API and the upcoming CellML Events API, from a diverse range of different languages.
Very little effort required to create distributed CellML processing systems on top of this.
The standards compliant approach.
CellML tools can easily be run separately, and will be isolated from each other by process boundaries(giving additional robustness).
XPCORBA already written, so will let the distributed refcounting work properly from Javascript.

Disadvantages

CORBA requires a number of libraries to use.
CORBA imposes a performance penalty, as it must marshall, send across some form of socket, and unmarshall the request. Whether this is significant is unclear.
On Java, also need to generate finalisation code to make distributed refcounting work.

Use of a custom mapping from XPCOM to the CellML API

We write code to translate the API IDLs into code to connect the CellML API to XPCOM and/or JNI, and any other languages we need to access it from.
omniidl, part of omniORB, already provides an extensible framework for writing IDL compilers in Python. All the IDL parsing work is already done, so we only need to write the code generation step. We have already used this to generate the C++ header files for the CellML API.
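The code generation step can be pictured with a toy generator. This sketch does not use omniidl: it assumes the IDL has already been parsed into a plain dictionary (a made-up format), and it emits an abstract C++ class per interface. The interface and method names in the example are hypothetical.

```python
from typing import Dict, List, Tuple

# Made-up parsed form: interface name -> [(return type, method, parameters)].
# A real back end would receive the parser's syntax tree instead.
Interfaces = Dict[str, List[Tuple[str, str, str]]]


def generate_cpp_header(interfaces: Interfaces) -> str:
    """Emit one abstract C++ class per interface description."""
    lines: List[str] = []
    for name, methods in interfaces.items():
        lines.append(f"class {name}")
        lines.append("{")
        lines.append("public:")
        lines.append(f"  virtual ~{name}() {{}}")
        for return_type, method, params in methods:
            lines.append(f"  virtual {return_type} {method}({params}) = 0;")
        lines.append("};")
    return "\n".join(lines)


EXAMPLE: Interfaces = {
    "ExampleModel": [("void", "setName", "const char* name")],
}
header = generate_cpp_header(EXAMPLE)
```

A binding for XPCOM or JNI would follow the same pattern with a different set of templates, which is why only the generation step needs to be written per target.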

Advantages:

XPCOM access faster. JNI probably faster, although JNI is known to be slow as well.

Disadvantages:

No cross-process boundary, which has implications for robustness against programmer error.
Need to write a new binding for every new language we want to work with (unless we can go via XPCOM).
Cannot easily be distributed (although this option does not preclude building a CORBA server into the Mozilla process at a later time).