Background: This project addresses the need to rapidly create functional prototypes for advanced image and data processing pipelines in modern life science research laboratories. Research data are big, heterogeneous, fast-evolving, and distributed. These data are typically processed with general-purpose office applications (e.g., MS Excel) or custom vendor software on a variety of platforms (lab PCs, user workstations, Linux compute clusters, cloud, etc.). Enterprise-level applications for managing, integrating, and analyzing these data are expensive to build, maintain, and evolve, and they don't always address the emerging needs of new research platforms and smaller-scale projects/teams. Finding solutions to these challenges that can quickly adapt to new requirements, data types, and timelines requires some 'outside the box' thinking.
Goals: Rapid prototyping of life science workflows and data integration.
Solution & Results: We have used Jenkins and the rich Jenkins plugin ecosystem to build easily accessible, highly functional, reproducible, and well-managed image and data processing/analysis pipelines for life science research applications.
Jenkins freestyle jobs (with the obligatory use of the Active Choices Plugin and some Groovy/JavaScript glue code) provide enough functionality to quickly build rather sophisticated 'one-page web applications' with the interactivity and richness expected from other life science applications. Researchers can easily perform the required image and data analysis tasks on the Jenkins server, integrating the discovery, processing, visualization, and management of both results and workflow.
When I demonstrated 'Jenkins for Life Science Continuous Integration' a few years ago, Kohsuke commented that this is a totally different domain application for Jenkins. Indeed true, yet it is built on the same foundational work of the Jenkins Community and The Plugin Contributors! Let's start thinking of Jenkins as the Way to 'Think Outside the Box'.
The provenance and reproducibility of the results and computational tasks (highly desirable in a research environment) are well supported by Jenkins' extensive logging, fingerprinting, and storage of each job's parameters. Additionally, Jenkins gives laboratory scientists transparent access to high-performance compute cluster resources for tasks that would otherwise require the dedicated involvement of a research computing engineer.
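As a rough illustration of how the stored job parameters can serve as a provenance record, the sketch below pulls the parameter names and values out of the JSON that Jenkins returns for a single build. It is not taken from our pipelines: the job parameters (`IMAGE_SET`, `THRESHOLD`) and the sample response are hypothetical, and the function `build_parameters` is a name introduced here only for illustration. It assumes the usual layout of a build's `api/json` output, where parameters appear under an entry of the `actions` list.

```python
import json


def build_parameters(build_json: dict) -> dict:
    """Collect the name/value pairs recorded for one Jenkins build."""
    params = {}
    for action in build_json.get("actions", []):
        # The actions list can contain empty entries; skip any without parameters.
        for param in (action or {}).get("parameters", []):
            params[param["name"]] = param.get("value")
    return params


# Hypothetical, trimmed response of <jenkins-url>/job/<job-name>/<build-number>/api/json
sample = json.loads("""
{
  "number": 42,
  "result": "SUCCESS",
  "actions": [
    {},
    {
      "_class": "hudson.model.ParametersAction",
      "parameters": [
        {"name": "IMAGE_SET", "value": "plate_07"},
        {"name": "THRESHOLD", "value": "0.35"}
      ]
    }
  ]
}
""")

# A minimal provenance record: which build ran, how it ended, and with what inputs.
record = {
    "build": sample["number"],
    "result": sample["result"],
    "parameters": build_parameters(sample),
}
print(json.dumps(record, indent=2, sort_keys=True))
```

In practice the JSON would be fetched from the Jenkins server with an authenticated HTTP request; the parsing step stays the same.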
With accumulated experience and the development of reusable strategies and components, these freestyle research applications can be developed in a matter of days and put into the hands of real users for functional testing and requirement refinement. Importantly, these Jenkins rapid prototypes have proven invaluable in providing guidance and requirements for building successful research enterprise applications in a fraction of the typical time.
Here's what we used:
And here's what we got in return: