Focus on the group's mission

The Swiss-Prot team excels in the art of generating machine-readable knowledge of biology from the ever-growing body of scientific publications. It is harnessing the power of deep learning to accelerate literature triage and information extraction, thus delivering the most accurate and informative evidence to users in a timely manner.

Biocuration and software development

Our team of biocurators and software developers annotate, maintain and develop a range of internationally renowned expert-curated knowledge resources: 

  • The SIB Resources UniProtKB/Swiss-Prot, the most widely used protein information resource in the world, and Rhea, the database of biochemical reactions, are recognized as Global Core Data Resources and as ELIXIR Core Data Resources. 

  • The HAMAP and PROSITE databases of protein families and domains, the ENZYME database of enzyme nomenclature, the SwissLipids database of lipid structures and biological knowledge, the ViralZone portal, and SwissBioPics, a resource of interactive cellular images. 
  • The group participates in the development and maintenance of many of the protein analysis tools listed on Expasy, the Swiss Bioinformatics Resource Portal.  

The team supports the development of tools and resources for researchers and clinicians. One example is the SVIP-O platform, which offers a harmonized, clinician-reviewed interpretation of the variants found in cancer patients to support clinical research.

More about our biocuration and software development offering

Supporting AI with machine-readable biological knowledge

Knowledgebases like UniProtKB are an essential part of the AI ecosystem; the collective biological knowledge they contain, in the form of pathways, ontologies and networks, can be used to create generalizable and interpretable models that reveal actionable biological mechanisms.  

The expert-curated part of UniProt is, for instance, used as a reliable training set for large language models (LLMs) to support the design of new proteins with desirable functions (source: Nature Biotechnology paper, 2023).

  • Coudert E, Gehant S, de Castro E, Pozzato M, Baratin D, Neto T, Sigrist CJA, Redaschi N, Bridge A; UniProt Consortium. Annotation of biologically relevant ligands in UniProtKB using ChEBI. Bioinformatics. 2023 Jan 1;39(1):btac793. doi: 10.1093/bioinformatics/btac793. PMID: 36484697; PMCID: PMC9825770.
  • UniProt Consortium. UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res. 2023 Jan 6;51(D1):D523-D531. doi: 10.1093/nar/gkac1052. PMID: 36408920; PMCID: PMC9825514.
  • Gene Ontology Consortium. The Gene Ontology knowledgebase in 2023. Genetics. 2023 May 4;224(1):iyad031. doi: 10.1093/genetics/iyad031. PMID: 36866529; PMCID: PMC10158837.
  • Paysan-Lafosse T, Blum M, Chuguransky S, Grego T, Pinto BL, Salazar GA, Bileschi ML, Bork P, Bridge A, Colwell L, Gough J, Haft DH, Letunić I, Marchler-Bauer A, Mi H, Natale DA, Orengo CA, Pandurangan AP, Rivoire C, Sigrist CJA, Sillitoe I, Thanki N, Thomas PD, Tosatto SCE, Wu CH, Bateman A. InterPro in 2022. Nucleic Acids Res. 2023 Jan 6;51(D1):D418-D427. doi: 10.1093/nar/gkac993. PMID: 36350672; PMCID: PMC9825450.
  • Bansal P, Morgat A, Axelsen KB, Muthukrishnan V, Coudert E, Aimo L, Hyka-Nouspikel N, Gasteiger E, Kerhornou A, Neto TB, Pozzato M, Blatter MC, Ignatchenko A, Redaschi N, Bridge A. Rhea, the reaction knowledgebase in 2022. Nucleic Acids Res. 2022 Jan 7;50(D1):D693-D700. doi: 10.1093/nar/gkab1016. PMID: 34755880; PMCID: PMC8728268.

Members

View our group members here