This repository provides a ready-to-use Docker image for Cloudgene that installs all requirements, including Apache Hadoop (CDH5). Cloudgene is a workflow system for managing Hadoop jobs graphically. Hadoop workflows (or simply “apps”) can be connected to Cloudgene. The idea of Cloudgene is summarized here.

Connecting Apps to Cloudgene

When starting a new Cloudgene Docker container, a repository with apps needs to be specified. By default, our repository is connected, which includes the following apps:

  • WordCount: The “Hello World” of Hadoop
  • mtDNA-Server: A contamination and heteroplasmy pipeline, available also as a service.
  • Michigan Imputation Server: Currently only available as a service, but soon on Docker!

Pull & Start Cloudgene

docker pull seppinho/cloudgene-docker

Start Cloudgene with remote repository

sudo docker run --privileged -it -p 8082:8082 -p 50030:50030 -p 50060:50060 seppinho/cloudgene-docker --repository https://github.com/seppinho/cloudgene-apps-docker

Start Cloudgene with local repository

sudo docker run --privileged -it -p 8082:8082 -p 50030:50030 -p 50060:50060 -v <local-app-repository>:/opt/cloudgene/apps/ seppinho/cloudgene-docker
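
The <local-app-repository> placeholder should be an absolute host path, because docker treats a non-absolute source in -v as a named volume rather than a directory to mount. The following is a minimal sketch of one way to fill in the placeholder; build_run_cmd is a hypothetical helper introduced for illustration and is not part of Cloudgene or the Docker image. It only prints the command and does not start a container.

```shell
#!/bin/sh
# Sketch only: build_run_cmd is a hypothetical helper, not part of Cloudgene.
# It prints the "local repository" command from above with the placeholder
# replaced by the absolute path of the given directory.
build_run_cmd() {
    repo_dir=$1
    # Resolve to an absolute path; fail if the directory does not exist.
    abs_dir=$(cd "$repo_dir" && pwd) || return 1
    printf 'sudo docker run --privileged -it -p 8082:8082 -p 50030:50030 -p 50060:50060 -v %s:/opt/cloudgene/apps/ seppinho/cloudgene-docker\n' "$abs_dir"
}
```

Calling build_run_cmd ./my-apps (where ./my-apps is an assumed directory name) prints the full command, which can then be reviewed and run by hand. Paths containing spaces would need additional quoting before the printed command is executed.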



© 2010-2016 by Lukas Forer and Sebastian Schoenherr • Cloudgene is licensed under AGPL-3.0 • Template based on jetstrap-docs