Project logistics
- Mentors: Rudolph Pienaar (rudolph.pienaar-at-childrens.harvard.edu), Kristi Nikolla (knikolla at bu dot edu), Ata Turk (ataturk at bu dot edu)
- Min-max team size: 4-5
- Expected project hours per week (per team member): 6-8
- Will the project be open source? Yes
Preferred past experience
- Job handling / concepts of distributed computing (Very important)
- Linux job basics (ssh to remote hosts, running/handling jobs) (Rather important)
- Python (Valuable)
- Docker / HPC scheduling (Nice to have, but will learn)
- OpenShift (Nice to have, but will learn)
Project Overview
Cloud-based medical image processing is an emerging component of the larger distributed processing field and offers many opportunities to explore new software approaches to collecting, disseminating, processing, and sharing both compute and data.
This project seeks to architect practical solutions to medical data processing. Our group at Boston Children's Hospital has developed a web-based workflow manager called ChRIS that supports collecting and processing image data and collaborating on it in real time.
In Spring of 2015, a BU team designed and built a Python-based scheduler for ChRIS on the Massachusetts Open Cloud (MOC). In 2016, a second team demoed the encapsulation of a processing pipeline within a Docker container. This pipeline was integrated into the web-based front end.
This year, we seek to integrate several components into a web-based distributed system. The team will deploy ChRIS on a publicly accessible web server, to which it will have login/ssh access. On this server the team will also instantiate a containerized PACS (Picture Archiving and Communication System) image server called Orthanc and populate it with anonymized MRI data provided by the mentor. The team will adapt the existing image search plugin within ChRIS to query the dockerized server (currently the plugin queries the commercial PACS server at BCH). Once the system can pull images from the dockerized server, the team will implement a solution for transferring data from the web server filesystem out to a remote compute platform. As with the search, the BCH team has already developed modules to do this, but some tweaking is necessary to deploy them in a new system. Finally, the team will execute a dockerized analysis pipeline on the remote compute platform and return the results to the web server, where ChRIS will present them to the user.
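As a rough illustration of the search step, the sketch below builds a study-level query against Orthanc's REST interface using only the Python standard library. It assumes Orthanc's default HTTP port (8042) on localhost and uses a hypothetical anonymized patient ID; the real host, credentials, and query fields will depend on the server the mentor provides, and this is not the existing ChRIS search plugin. Authentication is omitted.

```python
import json
import urllib.request

# Assumption: Orthanc running locally on its default HTTP port.
# The actual host is the server provided by the mentor.
ORTHANC_URL = "http://localhost:8042"


def build_find_request(patient_id, base_url=ORTHANC_URL):
    """Build (without sending) a POST request to Orthanc's /tools/find endpoint."""
    payload = {
        "Level": "Study",                    # search at the study level
        "Query": {"PatientID": patient_id},  # DICOM tag to match
    }
    return urllib.request.Request(
        url=base_url + "/tools/find",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def find_studies(patient_id, base_url=ORTHANC_URL):
    """Send the query and return the decoded JSON list of matching study IDs."""
    request = build_find_request(patient_id, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a running Orthanc instance, `find_studies("ANON001")` would return the identifiers of the matching studies, which a search plugin could then use to retrieve the images.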
In summary:
- Instantiate a Docker instance of a PACS server (Orthanc) on a publicly accessible machine (provided by the mentor)
- Deploy the existing ChRIS system on this machine in its own docker container.
- Connect/tweak/adapt the existing ChRIS PACS search plugin to query/retrieve from the dockerized Orthanc server.
- Adapt existing Python modules to transfer data (via REST over HTTP) from the server to a remote HPC.
- Execute a dockerized pipeline on the remote HPC.
- Adapt existing Python modules to pull processed data back to ChRIS.
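The transfer and execution steps above can be sketched as follows. This is a minimal illustration, not the BCH team's existing modules: the helper names, the upload URL, the image name, and the `/incoming` and `/outgoing` mount points are all hypothetical. It packs a directory into an in-memory archive, prepares an HTTP PUT for it, and builds the `docker run` argument list that the remote side might execute.

```python
import io
import tarfile
import urllib.request
from pathlib import Path


def pack_directory(src_dir):
    """Return the top-level entries of src_dir as gzipped tar bytes."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for path in sorted(Path(src_dir).iterdir()):
            # Directories are added recursively by tarfile.
            archive.add(str(path), arcname=path.name)
    return buffer.getvalue()


def unpack_archive(data, dest_dir):
    """Extract gzipped tar bytes into dest_dir (creating it if needed)."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        archive.extractall(str(dest_dir))


def build_upload_request(data, url):
    """Build (without sending) an HTTP PUT carrying the archive bytes."""
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/gzip"},
        method="PUT",
    )


def docker_run_command(image, in_dir, out_dir):
    """Build a `docker run` argument list for a containerized pipeline.

    Assumption: the pipeline takes an input and an output directory as
    positional arguments; the mount points are hypothetical.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{in_dir}:/incoming:ro",  # input data, read-only
        "-v", f"{out_dir}:/outgoing",    # results written here
        image, "/incoming", "/outgoing",
    ]
```

Pulling processed data back to ChRIS would reuse the same packing and unpacking helpers in the opposite direction. Building the command as a list (rather than a shell string) lets it be passed directly to `subprocess.run` without shell quoting issues.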
Some Technologies you will learn/use:
- Linux process management and distributed system design
- Docker usage and architecture in a real-world practical system
- HPC, particularly as it pertains to self-contained compute.