Bioinformatics of long read sequencing

Overview

The aim of this course is to familiarise participants with long read (also called “third generation”) sequencing technologies, their applications and the bioinformatics tools used to analyse and assemble this kind of data. Multiple sequencing platforms, including Pacific Biosciences and Oxford Nanopore MinION, can now generate reads that are several kilobases long. It is also possible to assemble Illumina reads into in silico long reads. These improvements have greatly facilitated genome assembly, and other applications are emerging, for example haplotype phasing and the study of alternative splicing using RNA-seq.

This course will consist of an introduction to the techniques and data analysis methods, a minisymposium and a hands-on session. The minisymposium will consist of short presentations by SIB researchers on the applications of these technologies. It will be followed by a panel discussion between the speakers and the audience, giving an opportunity to debate the advantages and pitfalls of these technologies for research projects. The hands-on session will consist of computer exercises that will enable participants to familiarise themselves with real datasets from different technologies and with the bioinformatics tools used to assemble genomes.

Audience

This course is aimed at PhD students, postdoctoral fellows and researchers in the life sciences who would like to gain an understanding of these technologies or who are planning to use them, and analyze the resulting data, in their research.

Learning objectives

At the end of the course participants should be able to:

Identify the various applications, advantages and limitations of the methods presented
Assess the quality of their datasets
Extract raw reads and align them to a reference genome
Use the reads to assemble genomes de novo

Prerequisites

Knowledge / competencies:

Participants should have a good understanding of command line tools on Linux or Windows-based operating systems. If you do not feel comfortable with UNIX commands, please take our UNIX fundamentals e-learning module.

We also recommend a basic knowledge of the file formats used in short-read NGS techniques (FASTQ, SAM/BAM, annotation file formats).

Technical:

Participants should bring a laptop with at least 4 GB of RAM, 50 GB of free disk space, and working Wi-Fi.

Application

Please note that this course is oversubscribed.

The registration fee for academics is 100 CHF. This includes course content material and coffee breaks. Participants from non-academic institutions should contact us before applying.

You will be informed by email whether your registration is confirmed. Upon receipt of the confirmation email, participants will be asked to confirm attendance by paying the fee within 5 days.

The deadline for registration and free-of-charge cancellation is September 18, 2017. Cancellations after this date will not be reimbursed. Please note that participation in SIB courses is subject to this and other general conditions, available here.

Location

Irchel Campus, University of Zurich, Switzerland.

Schedule

Day 1
Introduction	09:00	Introduction to the current technologies used for long read sequencing - Andrea Patrignani (ETHZ/SIB)
Mini-symposium	09:45	Methods for clustering highly similar isoforms from targeted PacBio sequencing - Mark Robinson (University of Zurich/SIB)
	10:10	Coffee break
	10:30	Using long read sequencing technologies to improve the study of alternative splicing - Amina Echchiki (University of Lausanne/SIB)
	10:55	The application of third generation, single molecule sequencing to alternative splicing and low frequency variant detection - Giancarlo Russo (ETHZ/SIB)
	11:20	Integrating NGS data for the assembly of prokaryotic genomes & a proteogenomics approach to identify their complete protein-coding potential - Christian Ahrens (Agroscope/SIB)
	11:45	Long read sequencing and assembly - fairytale and reality - Emanuel Schmid-Siegert (SIB)
	12:10	Round table with the speakers
	13:00	Lunch
Hands-on	14:00 - 17:30	Hands-on - Amina Echchiki, Dan Jeffries and Kamil Jaroň (UniL/SIB); Cloud computing and Docker image - Walid Gharib (UniBe/SIB). In this session, participants will familiarise themselves with raw long reads and learn how to assemble genomic reads, how to map genomic and RNA-seq reads to a reference genome, and finally how to perform transcriptome assembly.
Day 2
Hands-on	09:00 - 12:30	Hands-on - Amina Echchiki, Dan Jeffries and Kamil Jaroň (UniL/SIB); Cloud computing and Docker image - Walid Gharib (UniBe/SIB). In this session, participants will interpret the assemblies they computed on the first day, learning how to assess the quality of a genome assembly and run downstream analyses.
	12:30	Lunch
Hands-on	14:00 - 17:30	Analysis of PacBio Long Reads using the SMRT Analysis Package and SUSHI - Weihong Qi and Giancarlo Russo (ETHZ/SIB). This session consists of lectures providing an overview of the individual pipelines in the SMRT Analysis Package, as well as hands-on exercises using the SUSHI data analysis framework developed at FGCZ, covering quality control and preparation of PacBio data, sequence alignment and variant detection, and de novo assembly.

Additional information

You are welcome to subscribe to the SIB courses mailing list, using the form here, to be informed when applications for this course open, as well as of all future courses, workshops and important deadlines.

For more information, please contact training@sib.swiss.