Protein sequence databases and sequence annotation

20 March 2018
Cancellation deadline:
13 March 2018
Ivo Pedruzzi, Elisabeth Gasteiger, Marie-Claude Blatter
0.25 credits

No future instance of this course is planned yet


Using high-throughput technologies, you can identify long lists of candidate genes that differ between two experimental conditions. In order to interpret these gene lists and to discover fundamental properties like gene function and disease relevance, you need to use the annotation linked to a given gene or protein sequence.

This course and the practical exercises that follow aim to provide basic theoretical and practical knowledge of protein sequence databases, with a focus on UniProtKB; of Gene Ontology; of the different manual and automated annotation pipelines (such as HAMAP); and, in particular, of how to make the best use of UniProt. During the theory and practical sessions, we will discuss questions such as:

  • Where do the protein sequences come from?
  • What are the differences between the major protein sequence databases?
  • What are the manual and automated gene / protein annotation pipelines?
  • What are the Gene Ontology (GO) annotation pipelines?
  • How can protein sequence accuracy and annotation quality be assessed?
  • How can biological knowledge be extracted from a BLAST result or a gene list?


This course targets biologists and bioinformaticians who seek to analyse protein data. It will also be useful for people who programmatically access protein sequence databases and need to understand the data.

Min 12 participants.

Learning objectives

At the end of the course, the participants are expected to:

  • know the differences between the major protein sequence databases
  • understand the major sequence annotation pipelines and the GO annotation pipelines
  • assess protein sequence accuracy and annotation quality


Knowledge / competencies



You are required to bring your own laptop with an Internet connection.


09h00 Welcome

09h15 Protein sequence databases, manual and automated annotation pipelines

10h30 Pause

11h00 Protein sequence databases… continued

13h00 Pause

14h00 Practicals

15h00 Pause

15h30 Practicals and discussion

17h00 End


The registration fee is 50 CHF for academics and 250 CHF for industrial participants. This includes course content material and coffee breaks.

The deadline for registration and free-of-charge cancellation is set to 08 March 2018. Cancellation after this date will not be reimbursed. Please note that participation in SIB courses is subject to our general conditions.

You will be informed by email of your registration confirmation.


University of Lausanne, Genopode building (Metro M1 line, Sorge station).

Additional information

Coordination: Patricia Palagi

We will recommend 0.25 ECTS credits for this course, provided you pass the exam at the end of the course.

You are welcome to subscribe to the SIB courses mailing list, using the form here, to be informed of all future courses and workshops as well as all important deadlines.

For more information, please contact