Protein sequence databases and sequence annotation at UniProtKB

Date: 14 October 2022
Speaker(s): Elisabeth Gasteiger, Marie-Claude Blatter, Ivo Pedruzzi
ECTS: 0.25
Fees: *academic: 60 CHF - for-profit: 300 CHF
Cancellation deadline: 7 Oct 2022
City: Streamed
*The academic fee also applies to non-profit organisations and to participants who are unemployed at the time of application.

This course will be streamed to registered participants only. They will receive information from the course organizer in due time.


Using high-throughput technologies, you can identify long lists of candidate genes that differ between two experimental conditions. In order to interpret these gene lists and to discover fundamental properties like gene function and disease relevance, you need to use the annotation linked to a given gene or protein sequence.

The goals of this course are to provide basic theoretical and practical knowledge of protein sequence databases, with a focus on UniProtKB, of the different manual and automated annotation pipelines (such as HAMAP) and, in particular, of the optimal use of UniProtKB. UniProtKB and HAMAP are SIB resources; they are listed in Expasy, the Swiss Bioinformatics Portal. During the lecture and exercise sessions, we will cover questions such as:

  • Where do the protein sequences come from?
  • What are the differences between the major protein sequence databases?
  • What are the manual and automated gene / protein annotation pipelines?
  • What are the Gene Ontology (GO) annotation pipelines?
  • How can you assess protein sequence accuracy and annotation quality?
  • How can you extract biological knowledge from a BLAST result or gene list?


This course targets biologists and bioinformaticians who seek to analyze protein data. It will also be useful to programmers and data scientists, be they from academia or industry, who programmatically access protein sequence databases and need to understand the data.

Learning outcomes

At the end of the course, participants are expected to be able to:

  • list the differences between the major protein sequence databases
  • describe the major protein sequence and GO annotation pipelines
  • assess the accuracy of a protein sequence and the quality of annotation
  • use the query interfaces on the UniProtKB website to make meaningful and productive requests


Knowledge / competencies

This course is designed for beginners. There are no prerequisites.


This course will be streamed; you therefore need your own computer with an Internet connection.


The registration fee is 60 CHF for academics and 300 CHF for for-profit companies. While participants are registered on a first-come, first-served basis, exceptions may be made to ensure diversity and equity, which may increase the time before your registration is confirmed.

You will be informed by email of your registration confirmation. Upon receipt of the confirmation email, participants will be asked to confirm attendance by paying the fees within 5 days.

The deadline for free-of-charge cancellation is 7 October 2022. Fees will not be reimbursed for cancellations made after this date. Please note that participation in SIB courses is subject to our general conditions.

Venue and Time

This course will be streamed using Zoom.

The course will start at 9:00 CET and end around 17:00 CET. Precise information will be provided to the participants in due time.

Additional information

Coordination: Monique Zahn, SIB training group.

We will recommend 0.25 ECTS credits for this course, provided that you pass the exam at the end of the course.

You are welcome to subscribe to the SIB courses mailing list, using the form here, to be informed of all future courses and workshops, as well as all important deadlines.

SIB abides by the ELIXIR Code of Conduct. Participants in SIB courses are also required to abide by this code.

For more information, please contact