Securing the future of important pharmacology software
21 August 2015
When the original developers of an important suite of research software retired, the Research Software Development Team were asked to contribute to the efforts being made to ensure that this code is updated and maintained for future generations of researchers.
DCprogs is a suite of programs developed by Professor David Colquhoun for the analysis of stochastic data generated by observation of currents that flow through single ion channels.
The patch clamp technique, developed by Erwin Neher and Bert Sakmann in the late 1970s, makes it possible to measure currents generated by the opening and closing of single ion channels in excitable cells. Time series of measured open/shut time intervals provide information about the conformational transitions of the ion channels. This makes it possible to estimate the transition rates in the system, which is modelled as a Markov process with discrete states in continuous time. For example, experimental observations can be used to estimate quantities such as the energy barriers for conformational changes of the protein and the binding and unbinding rates of ligands.
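To make the idea of estimating transition rates concrete, here is a minimal Python sketch of the simplest possible case: a channel with one shut and one open state. It is not the DCprogs or HJCFIT method, which handles multi-state mechanisms. The rate values, function names and sample size are assumptions chosen purely for illustration. In a continuous-time Markov process the time spent in a state is exponentially distributed, so with only two states each rate can be estimated as the reciprocal of the mean dwell time.

```python
import random


def simulate_dwell_times(rate, n, rng):
    """Draw n dwell times (s) from an exponential distribution with exit rate `rate` (1/s)."""
    return [rng.expovariate(rate) for _ in range(n)]


def estimate_rate(dwell_times):
    """Maximum-likelihood estimate of an exponential exit rate: 1 / mean dwell time."""
    return len(dwell_times) / sum(dwell_times)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical rates for a two-state shut <-> open channel (illustrative values only).
true_opening_rate = 50.0    # shut -> open, per second
true_closing_rate = 1000.0  # open -> shut, per second

# Shut times end with an opening, open times end with a closing.
shut_times = simulate_dwell_times(true_opening_rate, 20000, rng)
open_times = simulate_dwell_times(true_closing_rate, 20000, rng)

estimated_opening_rate = estimate_rate(shut_times)
estimated_closing_rate = estimate_rate(open_times)

print(f"opening rate: true {true_opening_rate}, estimated {estimated_opening_rate:.1f}")
print(f"closing rate: true {true_closing_rate}, estimated {estimated_closing_rate:.1f}")
```

Real mechanisms have several shut and open states that cannot be told apart in the recording, which is why a full likelihood calculation over the whole sequence of intervals is needed rather than a simple mean.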
DCprogs implements the general stochastic theory of single ion channel measurements (developed largely by Colquhoun & Hawkes from 1977 onwards) for the interpretation and fitting of data. No other commercial or academic software carries out the same analysis. In particular, no other software implements an exact allowance for the “missed events” problem in single channel recording where, due to physical limits in recording bandwidth, some short-lived transitions do not appear in the data. This makes DCprogs unique and important for the ion channel community at large, beyond UCL and the UK.
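The effect of missed events can be illustrated with a short simulation. The sketch below is only a demonstration of the problem, not the exact correction that DCprogs implements. The mean durations, the resolution `t_res` and the simplified rule for merging unresolved intervals are all assumptions made for this example. Any interval shorter than the resolution goes undetected, so the intervals on either side of it appear as one longer interval.

```python
import random


def simulate_record(n_pairs, mean_open, mean_shut, rng):
    """Return alternating (is_open, duration) intervals, starting with an opening."""
    record = []
    for _ in range(n_pairs):
        record.append((True, rng.expovariate(1.0 / mean_open)))
        record.append((False, rng.expovariate(1.0 / mean_shut)))
    return record


def impose_resolution(record, t_res):
    """Simplified resolution rule: an interval shorter than t_res is absorbed
    into the preceding observed interval, and neighbours that then share the
    same state are merged into one apparent interval."""
    observed = []
    for is_open, duration in record:
        if observed and (duration < t_res or observed[-1][0] == is_open):
            prev_state, prev_duration = observed[-1]
            observed[-1] = (prev_state, prev_duration + duration)
        elif duration >= t_res:
            observed.append((is_open, duration))
        # Unresolved intervals before the first resolved one are dropped.
    return observed


rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical values, in seconds (illustrative only).
mean_open = 1e-3    # true mean open time
mean_shut = 0.2e-3  # true mean shut time
t_res = 50e-6       # shortest interval the recording can resolve

record = simulate_record(20000, mean_open, mean_shut, rng)
observed = impose_resolution(record, t_res)

apparent_open_times = [d for is_open, d in observed if is_open]
apparent_mean_open = sum(apparent_open_times) / len(apparent_open_times)

print(f"true mean open time:     {mean_open * 1e3:.3f} ms")
print(f"apparent mean open time: {apparent_mean_open * 1e3:.3f} ms")
print(f"intervals: {len(record)} simulated, {len(observed)} observed")
```

Because brief shuttings are missed, the apparent open times come out longer than the true ones. Rates estimated naively from the observed intervals would therefore be biased, which is the distortion an exact missed-events allowance is designed to account for.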
For the last 10 years Professor D Colquhoun and Professor L Sivilotti have been running a summer workshop to train scientists from all over the world to analyse single channel data (http://www.onemol.org.uk). This course is always oversubscribed.
Some of the code in DCprogs was written for PDP11 minicomputers in the 1970s, but most of the software was produced as DOS programs in the 1990s, and later some of it was rewritten for Windows.
The present version of DCprogs is functional and is currently in constant use in Prof L Sivilotti’s lab and others. However, the retirement of the key developers and the considerable changes in computing over the last 30 years make it difficult to support the programs and adapt them to new research requirements.
This puts the future of this significant research pipeline at high risk, as the research demands that increasingly complex kinetic models are fitted to larger and larger data sets. Analysis techniques that achieve this are needed by the ion channel field, to allow biophysicists to combine functional and structural information and provide a firm quantitative basis for in silico approaches such as molecular dynamics.
What we did
The Research Software Development Team (RSDT) were brought in to help the group update DCprogs so that it can be maintained, improved and extended over time. Within the time available it would not have been possible to produce a fully re-written suite of programs, so the RSDT focused on redesigning a part of HJCFIT, one of the core DCprogs tools. The team has fully re-written the likelihood calculation, a core function of HJCFIT.
The original version of HJCFIT was written in the FORTRAN programming language, and attempts had been made to convert it into Python and MATLAB. All these versions were, to varying degrees, a compromise between speed and readability of the code. The RSDT’s efforts therefore focused on re-writing the code in a form that optimised both speed and readability. The solution was to write the key algorithms in the C++ programming language, which is very efficient when it comes to performing calculations, and then to create ‘bindings’ for this code, which allow these processes to be called from Python, a more user-friendly programming language.
Throughout the project the RSDT maintained regular two-way communication with Professor Sivilotti’s group to ensure that the new version of HJCFIT was fit for purpose and that the researchers who would be responsible for maintaining DCprogs understood the changes being made. The code is now held in a GitHub repository, making it easier to manage individual contributions to the project in future.
Results / impact of the work
From the outset, the objective of the project was to have a code base that is not only computationally efficient, but also documented and maintainable. Part and parcel of RSDT’s work was therefore to ensure that the code base contains detailed, accurate and testable documentation. This allowed Professor Sivilotti’s group to add MATLAB bindings (written by Michael Epstein) with little input from RSDT, thus showing that RSDT’s efforts will be sustainable in the future.
Currently the lab is intensively testing a prototype version of Python-based HJCFIT that uses the new core calculations written in C++. This should be ready for release as a working version by the end of 2013. In practice, RSDT’s involvement has made it possible to transform what used to be a legacy code into an active research tool upon which further research can be built.