Protein Sequence Databases and Sequence Annotation at UniProtKB

28 April 2023
Application deadline: 20 April 2023

Cancellation deadline: 20 April 2023
Elisabeth Gasteiger, Marie-Claude Blatter, Ivo Pedruzzi
SIB resources
0.25 credits
Check out the course material.

Applications are closed because the course is full with a long waiting list or has just taken place. To receive notification when a new course is scheduled, register to the SIB courses mailing list.

No future instance of this course is planned yet


Using high-throughput technologies, you can identify long lists of candidate genes that differ between two experimental conditions. In order to interpret these gene lists and to discover fundamental properties like gene function and disease relevance, you need to use the annotation linked to a given gene or protein sequence.

The goal of this course is to provide basic theoretical and practical knowledge of protein sequence databases, with a focus on UniProtKB, of the different manual and automated annotation pipelines (such as HAMAP) and, in particular, of the optimal use of UniProtKB. UniProtKB and HAMAP are SIB resources; they are listed in Expasy, the Swiss Bioinformatics Portal. During the lecture and exercise sessions, we will cover questions such as:

  • Where do the protein sequences come from?
  • What are the differences between the major protein sequence databases?
  • What are the manual and automated gene / protein annotation pipelines?
  • What are the Gene Ontology (GO) annotation pipelines?
  • How to assess protein sequence accuracy and annotation quality?
  • How to extract biological knowledge from a Blast result or gene list?


This course targets biologists and bioinformaticians who seek to analyze protein data. It will also be useful to programmers and data scientists, be they from academia or industry, who programmatically access protein sequence databases and need to understand the data.

Learning outcomes

At the end of the course, the participants are expected to:

  • list the differences between the major protein sequence databases
  • describe the major protein sequence and GO annotation pipelines
  • assess the accuracy of a protein sequence and the quality of annotation
  • use the query interfaces on the UniProtKB website to make meaningful and productive requests


Knowledge / competencies

This course is designed for beginners. There are no requirements.


This course will be streamed; you are thus required to have your own computer with an Internet connection.


The registration fee is 100 CHF for academics and 500 CHF for for-profit companies. While participants are registered on a first come, first served basis, exceptions may be made to ensure diversity and equity, which may increase the time before your registration is confirmed.

You will be informed by email of your registration confirmation. Upon receipt of the confirmation email, you will be asked to confirm attendance by paying the fees within 5 days.

The deadline for free-of-charge cancellation is 14/04/2023. Fees will not be reimbursed for cancellations after this date. Please note that participation in SIB courses is subject to our general conditions.

Venue and Time

This course will take place at the University of Lausanne.

The course will start at 9:00 CET and end around 17:00 CET.

Precise location will be communicated to the participants in due time.

Additional information

Coordination: Grégoire Rossier, SIB training group.

We will recommend 0.25 ECTS credits for this course, provided that you pass the exam at the end of the course.

You are welcome to register to the SIB courses mailing list to be informed of all future courses and workshops, as well as all important deadlines using the form here.

SIB abides by the ELIXIR Code of Conduct. Participants of SIB courses are also required to abide by the same code.

For more information, please contact