Snakemake for scalable and reproducible data analysis

Date 3 November 2017
Speaker(s) Johannes Köster
Cancellation deadline 1 Nov 2017
City Zurich

Overview

Data analyses usually entail the application of many command-line tools or scripts to transform, filter, aggregate or plot data and results. With ever-increasing amounts of data being collected in science, reproducible and scalable automatic workflow management becomes increasingly important. Snakemake is a workflow management system consisting of a text-based workflow specification language and a scalable execution environment. It allows the parallelized execution of workflows on workstations, compute servers and clusters without modification of the workflow definition. A scheduling algorithm based on a multidimensional knapsack problem allows Snakemake to maximize workflow execution speed while not exceeding given constraints such as the number of available processor cores, cluster nodes or auxiliary hardware like graphics cards.
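
To give an intuition for the scheduling idea described above, here is a minimal Python sketch of a multidimensional knapsack selection. It is not Snakemake's actual scheduler: the job names, the "value" scores, the resource names and the function select_jobs are all hypothetical, and the brute-force search is only practical for a handful of jobs.

```python
from itertools import combinations


def select_jobs(jobs, capacity):
    """Return the subset of ready jobs with the highest total value
    that stays within every resource limit, plus that total value.

    jobs:     {name: {"value": int, "needs": {resource: amount}}}  (hypothetical format)
    capacity: {resource: available amount}
    """
    best, best_value = (), 0
    names = sorted(jobs)
    # Brute force: try every non-empty subset of the ready jobs.
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            # Total demand of this subset along each resource dimension.
            usage = {
                res: sum(jobs[name]["needs"].get(res, 0) for name in subset)
                for res in capacity
            }
            fits = all(usage[res] <= capacity[res] for res in capacity)
            value = sum(jobs[name]["value"] for name in subset)
            if fits and value > best_value:
                best, best_value = subset, value
    return set(best), best_value


# Hypothetical jobs competing for 8 cores and 1 graphics card.
jobs = {
    "align": {"value": 4, "needs": {"cores": 4}},
    "sort": {"value": 2, "needs": {"cores": 2}},
    "call": {"value": 5, "needs": {"cores": 4, "gpus": 1}},
    "plot": {"value": 1, "needs": {"cores": 1}},
}
capacity = {"cores": 8, "gpus": 1}

selected, value = select_jobs(jobs, capacity)
print(sorted(selected), value)
```

Each resource (cores, graphics cards) is one dimension of the knapsack: a set of jobs is only admissible if it fits along all dimensions at once.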

Since its publication, Snakemake has been widely adopted and has been used to build analysis workflows for a variety of high-impact publications. With about 5000 homepage visits per month, it has a large and stable user community.

Audience

This course is addressed to bioinformaticians and life scientists interested in learning how to create data management workflows.

Learning objectives

This course will introduce the Snakemake workflow definition language. After completing it, participants should be able to:

  • use the Snakemake execution environment to scale workflows to compute servers and clusters while adapting to hardware-specific constraints.

  • create reproducible analyses that can be adapted to new data with little effort.

Prerequisites

Knowledge / competencies

Participants should have basic programming skills in Python.

Technical

Participants are required to bring their own laptop running Linux, Mac OS X, or Linux in a virtual machine.

Application

Please note that this course is oversubscribed. You can still apply for it by clicking on the button below, and your name will be added to the waiting list.

The registration fee for academics is 30 CHF. This includes course material and coffee breaks. Participants from non-academic institutions should contact us before applying.

The deadline for registration and free-of-charge cancellation is 1 November 2017. Fees will not be reimbursed for cancellations after this date. Please note that participation in SIB courses is subject to our general conditions.

You will receive confirmation of your registration by email.

Location

Irchel Campus, University of Zurich.

Additional information

Coordination: Mark Robinson and Patricia Palagi

You are welcome to subscribe to the SIB courses mailing list, using the form here, to be informed of all future courses and workshops as well as all important deadlines.

For more information, please contact training@sib.swiss.