CellMLRepositoryMigration
This section discusses a pathway to implementing the CellMLRepositories.
CellML repository migration
This is the process a model in our current repository needs to undergo to become part of the new repository systems as described in the cellml repository documentation.
Current status
The current status of information relating to models is as follows:
The model source is located in the CellML model repository file system, which is under CVS control (in the cellml cvs repository) and is served to the public over HTTP (Apache web server).
Example URL:
http://www.cellml.org/examples/models/cAMP_PKA_cascade_2000.xml
There is some human-readable documentation about the model and the publication(s) it is supposed to represent, on a web page that is generated from DocBook-formatted data in the cellml cvs repository.
Example of rendered documentation:
http://www.cellml.org/examples/repository/qualitative/cAMP_PKA_cascade_doc.html
Some models have biological annotation defined for them in the anatomy ontology. The web interface for the anatomy ontology does not display these annotations yet, but the data is available through the RDF representation that can be exported from the Protege project file. There are some pressing issues with the ontology representation, but these are dealt with elsewhere (see cellml ontologies for any relevant information).
The initial instances for all the cellml models, their components, and the variables of the components were added through a script, but the actual linkage to biological concepts is being done by hand.
Initial outcome sought
The outcome we are looking for initially is not the complete set of features described in the cellml repository document. The most immediate ones to implement are:
Models to be shifted to new location in the repository based on the CellML naming convention described in the cellml repository document. There is one location where publicly available models are placed.
All associated data to be serialised into these model files also. This means we need to:
- keep existing meta-data in these files
- add the biological annotation from the anatomy ontology (see the anatomy ontology item under Current status) to these files
- add the descriptions from the web page (see the rendered documentation item under Current status) to these files. These descriptions could first be added to the anatomy ontology and then pulled out in the same way as the biological annotation in the previous item.
The models are made available through the CellML plone site. Each model is represented as an XML object, and a default XSLT transform is used to render some of the meta-data as the public interface. This requires that we settle on one particular RDF serialisation form, such as the abbreviated form, so that XSLT becomes a viable option.
A simple work-flow is set up to allow people to add new models that are to be reviewed and published as part of the repository. The simplest approach is that public users are allowed to upload new file objects to a particular directory on the site (dubbed the public inbox). The names of the models should follow the naming convention set out in the cellml repository document. Any associated biological data should be added either as meta-data within that model file or in a plain text file with the same name as the model file but with a .txt extension instead of .xml. Uploaded files are marked as private so that they are not publicly available. A reviewer with the appropriate role has access to these files and can download them for review. We should perhaps put these files into the cvs under a review subsection, so that conflicts do not occur between reviewers as they review and modify the files in their local workspaces. Using the Plone/Zope locking/version system is probably not a good option. One possible solution is that all files uploaded to the public inbox are automatically passed through into the cvs repository. The other option is that a reviewer checks a submission in the inbox and pushes it through to the cvs repository by calling a method themselves. The reviewer can then resolve any conflicts.
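As a rough illustration of the inbox convention above, a reviewer-side helper could pair each uploaded .xml model with its optional .txt companion and flag text files that have no model. This is only a sketch: the function names and the sample file names are hypothetical, and it works on plain file-name lists rather than on Plone objects.

```python
from pathlib import PurePosixPath


def pair_inbox_files(filenames):
    """Map each .xml model file to its .txt companion, or None if absent."""
    names = set(filenames)
    pairs = {}
    for name in sorted(names):
        path = PurePosixPath(name)
        if path.suffix.lower() != ".xml":
            continue
        # Same name as the model file, but with .txt instead of .xml.
        companion = str(path.with_suffix(".txt"))
        pairs[name] = companion if companion in names else None
    return pairs


def orphan_text_files(filenames):
    """Return .txt files that have no matching .xml model in the inbox."""
    names = set(filenames)
    return sorted(
        name
        for name in names
        if PurePosixPath(name).suffix.lower() == ".txt"
        and str(PurePosixPath(name).with_suffix(".xml")) not in names
    )


if __name__ == "__main__":
    # Hypothetical inbox listing, for illustration only.
    inbox = ["model_a.xml", "model_a.txt", "model_b.xml", "notes.txt"]
    print(pair_inbox_files(inbox))
    print(orphan_text_files(inbox))
```

A check like this could run before files are passed through to the review section of the cvs repository, whichever of the two hand-over options is chosen.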
In the future we can build in automatic creation of tickets for reviewers to accept responsibility for individual models that are to be reviewed. This will be fleshed out in more detail later in the cellml repository document.
Once a reviewer thinks a model is ready to publish, they move it to the public portion of the CVS repository. Models placed there are automatically published to the repository interface of the CellML site.
(I don't know about others, but maybe it is time to move from CVS to Subversion. It's quite nice.)
An RSS feed is established for CellML models through the CellML site repository interface.
Public users should be able to add natural language comments to model objects through the CellML website interface.
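To make the RSS item above more concrete, here is a minimal sketch of building an RSS 2.0 document for published models with the Python standard library. The function name, the channel title and description, and the shape of the `models` records are assumptions for illustration; the real feed would presumably be produced by the Plone site itself.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_model_feed(site_url, models):
    """Return an RSS 2.0 document (as a string) listing published models.

    Each entry in `models` is a dict with 'title', 'url' and 'published'
    (a timezone-aware datetime). This record shape is an assumption.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = "CellML model repository"
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = "Recently published CellML models"
    for model in models:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = model["title"]
        ET.SubElement(item, "link").text = model["url"]
        ET.SubElement(item, "guid").text = model["url"]
        # RSS dates use the RFC 822 style produced by format_datetime.
        ET.SubElement(item, "pubDate").text = format_datetime(model["published"])
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    feed = build_model_feed(
        "http://www.cellml.org",
        [
            {
                "title": "cAMP_PKA_cascade_2000",
                "url": "http://www.cellml.org/examples/models/cAMP_PKA_cascade_2000.xml",
                "published": datetime(2006, 1, 1, tzinfo=timezone.utc),  # made-up date
            }
        ],
    )
    print(feed)
```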
Test case
We should implement the features from Initial outcome sought on one or two models to see what we think. To do this we need to:
- Identify two models from the repository for which there is sufficient biological annotation in the anatomy ontology.
- Add the website descriptions to the ontology data for these models. At the moment this should just be escaped HTML. We can work on a better schema for this in the future.
- Make copies of the original models in the CVS repository and put these (named appropriately) into the new location.
- Export the RDF form of the biological annotation for the two models from the anatomy ontology. Run this through an RDF serialiser to transform into the RDF-abbreviated form. Insert this manually into the two new model files in the CVS repository. This will require adding the appropriate cmeta:id identifiers on the relevant CellML elements.
- Implement the tool that turns model files in the CVS repository into CellML website objects.
- Implement the XSLT transform for the models.
- Turn on public commenting and RSS syndication.
- Add the public inbox. For now a reviewer can download the files and manually add them to the review section of the CVS repository and delete them from the inbox.
- Add appropriate reviewer roles and check permissions.
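The manual step of inserting the exported RDF and adding cmeta:id identifiers could later be scripted. The sketch below, using only the Python standard library, sets cmeta:id on one named component and appends an rdf:RDF block to the model. It assumes CellML 1.0 namespaces, assumes the exported RDF already refers to the component as `#<cmeta:id>` through rdf:about, and uses a made-up `ex:locatedIn` property and sample model; none of these details come from the anatomy ontology itself.

```python
import xml.etree.ElementTree as ET

CELLML_NS = "http://www.cellml.org/cellml/1.0#"
CMETA_NS = "http://www.cellml.org/metadata/1.0#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CMETA_ID = f"{{{CMETA_NS}}}id"
RDF_ABOUT = f"{{{RDF_NS}}}about"

# Use conventional prefixes when the model is written back out.
for prefix, uri in (("cellml", CELLML_NS), ("cmeta", CMETA_NS), ("rdf", RDF_NS)):
    ET.register_namespace(prefix, uri)


def attach_annotation(model_xml, component_name, cmeta_id, rdf_xml):
    """Set cmeta:id on a component and append the RDF block to the model."""
    model = ET.fromstring(model_xml)
    rdf = ET.fromstring(rdf_xml)
    if rdf.tag != f"{{{RDF_NS}}}RDF":
        raise ValueError("expected an rdf:RDF root element")

    # The RDF must describe the resource '#<cmeta_id>' at its top level.
    if not any(node.get(RDF_ABOUT) == f"#{cmeta_id}" for node in rdf):
        raise ValueError(f"RDF does not describe '#{cmeta_id}'")

    component = next(
        (c for c in model.iter(f"{{{CELLML_NS}}}component")
         if c.get("name") == component_name),
        None,
    )
    if component is None:
        raise LookupError(f"no component named {component_name!r}")

    # Refuse to overwrite a different identifier that is already in use.
    existing = component.get(CMETA_ID)
    if existing not in (None, cmeta_id):
        raise ValueError(f"component already has cmeta:id {existing!r}")
    component.set(CMETA_ID, cmeta_id)

    model.append(rdf)
    return ET.tostring(model, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical model and annotation, for illustration only.
    model_xml = (
        '<model xmlns="http://www.cellml.org/cellml/1.0#" name="demo">'
        '<component name="membrane"/></model>'
    )
    rdf_xml = (
        '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
        'xmlns:ex="http://example.org/terms#">'
        '<rdf:Description rdf:about="#membrane">'
        "<ex:locatedIn>cell membrane</ex:locatedIn>"
        "</rdf:Description></rdf:RDF>"
    )
    print(attach_annotation(model_xml, "membrane", "membrane", rdf_xml))
```

The function refuses to overwrite an existing, different cmeta:id, since other meta-data already in the file may refer to it.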