Bioinformatics of long read sequencing

08 - 09 November 2018
The aim of this course is to familiarise participants with long read (also called “third generation”) sequencing technologies, their applications and the bioinformatics tools used to assemble this kind of data. Multiple sequencing platforms, including Pacific Biosciences and Oxford Nanopore MinION, are now available to generate reads that are several kilobases long. It is also possible to assemble Illumina reads to generate in-silico long reads. These improvements have greatly facilitated the assembly of genomes, and other applications are emerging, for example haplotype phasing and the study of alternative splicing using RNA-seq.

This course will be composed of an introduction to the techniques and data analysis methods, a minisymposium and a hands-on session. The minisymposium will consist of short presentations by SIB researchers on the applications of these technologies. It will be followed by a panel discussion between the speakers and the audience, giving an opportunity to debate the advantages and pitfalls of these technologies for research projects. The hands-on session will consist of computer exercises that will enable participants to familiarize themselves with real datasets from different technologies and with the bioinformatics tools used to assemble genomes.


This course is aimed at PhD students, postdoctoral fellows and researchers in life sciences who would like to gain a grasp of these technologies or who are planning to use them and analyze the resulting data in their research.

Learning objectives

At the end of the course participants should be able to:

  • Identify the various applications, advantages and limitations of the methods presented
  • Assess the quality of their datasets
  • Extract raw reads and align them to a reference genome
  • Use the reads to assemble genomes de novo
  • Assemble a transcriptome
  • Work with a long-read transcriptome and distinguish it from a short-read transcriptome


Knowledge / competencies:

Participants should have a good understanding of command line tools on Linux or Windows-based operating systems. If you do not feel comfortable with UNIX commands, please take our UNIX fundamentals e-learning module.

We also recommend a basic knowledge of the file formats used in short-read NGS techniques (FASTQ, SAM/BAM, annotation file formats).


Irchel Campus, University of Zurich, Switzerland.


Thursday 08 November 2018


Introduction 09:00 Introduction to the current technologies used for long read sequencing - Andrea Patrignani (ETHZ and SIB)
09:45 Reference-based gene expression analysis with long-read sequencing data - Charlotte Soneson (University of Zurich, FMI and SIB)
10:10 Coffee break
10:30 Improving protein-coding gene calls using a long-reads based genome assembly - Amina Echchiki (University of Lausanne and SIB)
10:55 Applications of single molecule sequencing: alternative splicing, rare variants and methylation - Giancarlo Russo (ETHZ and SIB)
11:20 Integrating NGS data for the assembly of prokaryotic genomes & a proteogenomics approach to identify their complete protein-coding potential - Christian Ahrens (Agroscope and SIB)
11:45 Studying Bacterial gene flow in real time; The benefits of using long reads data - Marc Garcia Garcera (University of Lausanne and SIB)
12:10 Round table with the speakers
13:00 Lunch

End of the minisymposium

Hands-on 14:00 - 17:30

Hands-on - Amina Echchiki (University of Lausanne and SIB) and Walid Gharib (University of Bern and SIB)

In this session, participants will familiarise themselves with raw long reads and learn how to assemble genomic reads, how to map genomic and RNA-seq reads to a reference genome, and finally how to perform transcriptome assembly.

Friday 09 November 2018
Hands-on 09:00 - 12:30

Hands-on - Amina Echchiki (University of Lausanne and SIB) and Walid Gharib (University of Bern and SIB)

In this session, participants will interpret the assemblies they computed on the first day, learning how to assess the quality of a genome assembly and run downstream analyses.

12:30 Lunch
Hands-on 14:00 - 17:30

Analysis of PacBio Long Reads using the SMRT Analysis Package and SUSHI - Weihong Qi and Giancarlo Russo (ETHZ/SIB)

This session consists of lectures providing an overview of the individual pipelines in the SMRT Analysis Package, as well as hands-on exercises using the SUSHI data analysis framework developed at FGCZ. Topics include quality control of PacBio data and data preparation, sequence alignment and variant detection, and de novo assembly.

Additional information

