SIB Profile 2021 - Data scientists for life



SPECIAL FOCUS: AROUND COVID-19 - AROUND MACHINE LEARNING


COVER IMAGE

A different view of the pandemic

Frequencies of SARS-CoV-2 variants, with a focus on sequences collected in Switzerland* (i.e. about two thirds of all the sequences in the analysis as of 6 April) since the beginning of the pandemic. Each variant is displayed in a different colour, and its frequency in the analysed samples varies over time, from left (older) to right (most recent).

* Data from Switzerland were contributed by a range of groups (see the full updated list on nextstrain.org/groups/swiss/ncov/switzerland), including the Computational Evolution group (SIB & ETH Zurich), and processed with the SIB Resource V-pipe developed by the Computational Biology group (SIB & ETH Zurich).




Forewords

COVID-19 has taught us the hard way how crucial the readiness of a sustained, long-term funded life-science infrastructure is. The more stable the infrastructure, the more agile the research community and its ability to quickly tackle major questions. SIB has been an actor and advocate for infrastructure sustainability for decades and has been building and consolidating partnerships to this end on a national and international level. In this defining year for society and science, the institute showed its ability to build upon the existing to adapt to new needs from the scientific community, and to position itself as an essential coordinator to accelerate research, in Switzerland as well as on the international scene. Never has such a neutral, nationwide and agile structure proved so important for addressing the challenges of today and tomorrow.

Felix Gutzwiller, President of the Foundation Council


Alongside this year’s overwhelming theme, SIB kept on producing exciting results on the machine learning (ML) front, from research to clinical applications. ML has been part of the routine bioinformatics data analysis toolkit for decades, supporting research in drug repurposing, biomarker discovery, etc. Today, the need for expertly curated databases, where SIB is a recognized leader, is also increasingly acknowledged worldwide in the context of artificial intelligence (AI). Indeed, AI cannot offer meaningful results without reliable input data: garbage in, garbage out. The creation of high-quality databases thus offers a robust basis for AI. As more biomedical data become available to research, such as through the Swiss Personalized Health Network (SPHN), such skills and the underlying knowledge are all the more crucial to ensuring that AI delivers the expected benefits for clinicians, and ultimately citizens.

Jérôme Wojcik, Chairman of the Board of Directors



From many points of view, 2020 was the worst of years. A year where hugging our loved ones was no longer a sign of care. A challenging year, where things as natural as meeting colleagues had to be reinvented. But it was also the best of years, when we enjoyed the flexibility of working from home and explored the regions we each live in.

On the scientific side, bioinformatics was suddenly everywhere and the Swiss National COVID-19 Science Task Force relied on the expertise of several of our Group Leaders. The international research community was more reliant than ever on our databases and tools to uncover insights about the virus. Together with our partners, we established the Swiss SARS-CoV-2 Data Hub to contribute to global data-sharing efforts.

The crisis also brought new ideas and formats: thanks to a modular and interactive online concept, our community conference, the SIB Days, remained true to its ambition of representing the scientific diversity of Swiss bioinformatics, and attracted a record number of participants. On the learning front, our virtual bioinformatics training offer reached new and more international audiences.

Alongside all these changes, there was nevertheless continuity in the battles and activities that make up the core of our institute. SIB continued to lend its expertise to support a sustainable biodata infrastructure, on the Swiss and global levels. It kept tailoring the secure BioMedIT network to fit the needs of research groups in the context of the Swiss Personalized Health Network (SPHN). In brief, scientific excellence did not flinch, despite the pressure.

Once more, we must pay tribute to the hard work, trust and dedication of our teams and colleagues, which reinforces our enthusiasm in steering our institute forward. We also thank the Federal government for its essential support, through the State Secretariat for Education, Research and Innovation (SERI): we are pleased to be able to convert this support into concrete benefits for society, through the work of our 800 members.

Christine Durinx, Joint Executive Director

Ron Appel, Joint Executive Director



Table of contents

06 Bioinformatics: a definition
08 Converting biological questions into answers
10 Data scientists for life
12 SIB in brief
14 Supporting our partners’ needs
20 For a lasting life-science infrastructure
26 A network of scientific expertise
34 Organization and governance
40 Around COVID-19
42 Supporting international research
44 A transformative impact
48 Around machine learning
50 In Swiss bioinformatics
52 Focus on biomedical applications
55 Index of Group and Team Leaders
59 Acknowledgements


Bringing bioinformatics to society

From precision medicine to drug design and DNA testing: bioinformatics is increasingly tied to health and societal issues. Through its public outreach activities, SIB informs the public about the discipline and its applications. 2020 was an unusual year in terms of events but no opportunity was missed to create ways to reach the public:
• An e-workshop to understand SARS-CoV-2 (SEE P. 45);
• A new version of ChromosomeWalk.ch to explore the world of biomedicine with 300 fascinating gene stories and an illustrated tour of precision medicine;
• The Protein Spotlight stories revisited as monthly comic strips, to discover the role proteins play in the grand scheme of things.

More activities and news on Facebook, our dedicated outreach channel in French and English: goo.gl/4c6xCZ

“Bioinformatics is the application of computer technology to the understanding and effective use of biological and clinical data.”



Bioinformatics: a definition

Thanks to computer-based approaches, researchers can improve their understanding of complex systems.

Life scientists and clinicians have always tried to assemble data and evidence to find the right answers to fundamental questions. Nowadays, there is no shortage of data. But a different kind of problem has emerged. New technologies are producing data at an unprecedented speed. Indeed, so much data – and of such variety – that they can no longer be interpreted by the human mind alone. Enter bioinformatics.

Bioinformatics is the application of computer technology to the understanding and effective use of biological and biomedical data. It is the discipline that stores, analyses and interprets the big data generated by life-science experiments, or collected in a clinical context. This multidisciplinary field is driven by experts from a variety of backgrounds: biologists, computer scientists, mathematicians, statisticians and physicists.

Bioinformatics encompasses:
• DATABASES for storing, retrieving and organizing information to maximize the value of biological data;
• SOFTWARE TOOLS for modelling, visualizing, interpreting and comparing biological data;
• ANALYSIS of complex biological datasets or systems using novel statistical approaches or machine learning techniques;
• RESEARCH in a wide variety of biological fields and leading to applications in diverse areas, from agriculture to precision medicine (SEE P. 36);
• COMPUTING AND STORAGE INFRASTRUCTURE to process and safeguard large amounts of data.

What sort of data are we talking about?

Bioinformatics deals with a broad spectrum of complex data types.
• Sequence data from DNA, RNA or proteins
• Expression data, such as the level of expression of a gene in a sample
• Imaging data
• Text
• And more...


Converting biological questions into answers

Infographic (labels recovered from the original layout):

Life sciences and health actors: hospitals and clinics; research institutes; private sector; international institutions.

Converting biological questions ...: massive amount of data and data types (genetics, text, biochemical, imaging, etc.).

SIB Swiss Institute of Bioinformatics, dedicated multidisciplinary experts: secure services for sensitive data; data management; software engineering and tailoring; biostatistics and bioinformatics analysis; process optimization; training; expert biocuration.

... into answers with various applications: basic research; medicine; environmental sciences; agriculture. Examples shown: tailoring treatment to cancer patients; understanding the origin of beetle diversity; real-time tracking of pandemics.



Data scientists for life

This is who we are: multidisciplinary experts safeguarding data, sharing their value and making them speak to solve biological questions.


DATA SCIENTISTS FOR LIFE - AROUND COVID-19 - AROUND MACHINE LEARNING

SIB in brief

As a non-profit foundation, we lead the field of bioinformatics in Switzerland, in order to foster advances in life sciences and health.

• 82 groups
• 784 members, including 189 employees
• 24 institutional partners across Switzerland
• Over 160 databases and software tools developed by our members and accessible via the Expasy web portal
• Over 3,225 peer-reviewed articles published since SIB’s creation in 1998

As of 1 January 2021

Infrastructure

SIB provides the national and international life-science community with a state-of-the-art bioinformatics infrastructure, including resources, collaborative support and services.

DATABASES AND SOFTWARE TOOLS
We create, maintain and disseminate worldwide a large portfolio of databases and software tools, including some of the world-leading resources for life sciences, enabling researchers to leverage knowledge about life and foster innovations.

COMPETENCE CENTRES
We offer in-depth expertise and support in bioinformatics, from secure infrastructure for sensitive data and analyses of all kinds of biological data to software development and data management.

Community

SIB brings together world-class researchers based in Switzerland and delivers training in bioinformatics.

SCIENTIFIC COLLABORATION
Through knowledge exchange networks, collaborative projects and events, we strengthen cooperation on shared issues among bioinformatics research and service groups from Swiss schools of higher education and research institutes.

TRAINING IN BIOINFORMATICS
To ensure that life scientists and clinicians make the best of the data, we provide them with a large portfolio of courses and workshops. We also foster exchanges and training among bioinformatics and computational biology PhD students.


A FEW WORDS FROM MEMBERS OF THE BOARD OF DIRECTORS

Katja, you joined SIB’s Board of Directors (BoD) in January 2021. How important is the institute to the Swiss and global biodata landscape?

Katja Bärenfaller: My enthusiasm for SIB was certainly the major motivation to join it as a Group Leader and to become a member of its BoD. While working on proteomics and data mining, I realized, for instance, how essential well-curated and accessible datasets and data analysis tools, such as those supported by SIB, are. The huge value of the institute has also become evident in the COVID-19 pandemic, with SIB Experts and Resources taking part in the global effort (SEE P. 42). Since the expertise and the resources were already available, they could rapidly focus on this crucial topic.

As one of 15 female Group Leaders at SIB, do you feel you are carrying more than the voice of Group Leaders on the BoD?

KB: It is important to challenge the persisting stereotype of male bioinformaticians, and to address the gender gap: I therefore find it very positive that both the Executive Directors and the Group Leaders’ representatives on the BoD are gender-balanced. In addition to carrying the voice of female Group Leaders, I am very motivated to strengthen the national relevance of SIB and its role and visibility in German-speaking Switzerland. Finally, I would also like to be a voice for different types of biodata and resources, since I used to work in plant science and was involved in the struggle to maintain model organism information resources.


From five groups in 1998 when SIB was created, to 82 today. Christophe, what do you foresee for the years to come in terms of new directions and challenges to be addressed by SIB?

Christophe Dessimoz: SIB will need to double down on its unique strengths, which I believe are 1) to provide, through globally recognized databases and resources, gold-standard data that are critical to serve as a quality source of datasets for machine-learning algorithms; 2) to bring bioinformatics resources and research together at a national and international level and continue to serve as a model in that respect; 3) to contribute to establishing standards of excellence in bioinformatics research, service, and infrastructure through its community activities and the collaborative projects it fosters, such as SVIP-O or BioMedIT (SEE P. 21).

What are some of the key initiatives undertaken to maintain a sense of belonging among SIB’s members?

CD: First, the systematic and proactive approach in terms of internal communication, both with members and employees, which has been greatly strengthened over recent years. Second, the SIB-wide events with a social dimension, and in particular the biennial “SIB Days” (SEE P. 46). The most recent edition, for instance, even though it was held virtually, saw a record attendance. Initiatives benefitting all members in terms of promoting scientific visibility and excellence are also key, such as the SIB Remarkable Outputs (SEE P. 32) or the Bioinformatics Awards. Finally, initiatives dedicated to specific interest groups, such as the Dev’Forums* or the SIB PhD Network.

* Dev’Forums are technical, informal and short meetings to promote networking and experience exchanges across the community of SIB Developers.


Katja Bärenfaller, Group Leader, SIB, Swiss Institute of Allergy and Asthma Research (SIAF), Davos

Christophe Dessimoz, Group Leader, SIB, University of Lausanne, University College London, Lausanne



Supporting our partners’ needs

Discover our competences, explore our portfolio of leading databases and software and meet our experts.

EXPERT BIOCURATION
Generating high-quality, up-to-date annotations

Our biocuration experts excel in the art of generating knowledge from a growing body of publications. We provide expert biocuration on various data types including proteomics, lipidomics and transcriptomics. This includes help with setting up expert-annotated resources for a wide range of applications, such as understanding protein function, facilitating clinical interpretation of cancer variants or enabling biomarker discovery.

DATA MANAGEMENT
Organizing data for long-term reuse

We assist our partners by: defining and implementing their Data Management Plans (DMP) for research proposals; reaching data interoperability targets, from local to international scales, within academic or regulated environments; ensuring the long-term management and storage of biological data.

SOFTWARE ENGINEERING AND TAILORING
Developing engaging and customized tools

Our software engineers and User eXperience specialists contribute to some of the world-leading databases and tools for life sciences, as well as tailored applications for personalized medicine in hospitals or industry settings. They assist our partners in creating user-friendly software – or adapting existing products to meet specific needs – based on the most up-to-date web technologies.



SECURE SERVICES FOR SENSITIVE DATA
Dedicated, secure IT environment to process sensitive human data

Our encrypted information technology infrastructure complies with all current data protection regulations (incl. GDPR) and IT security standards. Our partners can therefore process and host both sensitive data – such as genomic information or health records – and non-sensitive data with complete confidence. We use modern virtualization technologies such as OpenStack to protect our computing environments.

The national BioMedIT network, set up by SIB and operational at all three nodes in Basel (sciCORE, operated by the University of Basel), Lausanne (Core-IT, operated by SIB) and Zurich (SIS, operated by ETH Zurich), allows researchers to approach national collaborative projects involving human health data with trust and ease.

TRAINING
Boosting bioinformatics skills

Our comprehensive – and constantly evolving – course portfolio provides hands-on experience of the most up-to-date bioinformatics techniques and resources, including clinical applications for researchers or healthcare professionals. We offer about 100 course-days per year, making up over 50 courses provided by 80 trainers to 1,200 participants.

Find the full list of courses at sib.swiss/training

Who takes part in SIB courses?
• 33% PhD candidates
• 25% Postdoctoral researchers
• 20% Senior scientists / Principal investigators
• 10% Other scientists
• 9% Research assistants / Technicians
• 3% Master's students

PROCESS OPTIMIZATION
Gaining efficiency, from analysis pipelines to quality control

Our experts harmonize and optimize internal data management processes, analysis pipelines or software tools. We also organize and support clinical benchmarking activities with Swiss or international partners.

BIOSTATISTICS AND BIOINFORMATICS ANALYSIS
Making biological data speak

Our specific areas of expertise include: biomarker identification; de novo assembly of sequencing data; genome comparative data analysis; targeted, exome and whole genome sequencing analysis; metagenomic data analysis; omics analysis; data integration; gene prediction and annotation; machine learning.

Our approach: collaborative, independent and reliable

From one-off services to long-term collaborations, we turn our in-depth expertise into integrated solutions, in line with your goals and regulatory requirements, to make your projects a reality.



SOME OF OUR LATEST COLLABORATIONS

Find out how we put our expertise into practice through these recent partnerships with healthcare providers, pharma, start-ups and international consortia.

Towards improved obesity treatment with an international research consortium

The international, public-private research consortium ‘SOPHIA’ (Stratification of Obese Phenotypes to Optimize Future Obesity Therapy) aims to improve risk assessment of the complications of obesity and predict treatment response for people with obesity. Our Vital-IT and Statistical Genetics Groups, led by Mark Ibberson and Zoltán Kutalik respectively, are lending data management and analytical expertise to the project.

Taking part in External Quality Assessments (EQAs) with QCMD

SIB became a new bioinformatics partner of the Quality Control for Molecular Diagnostics (QCMD) organization. The partnership, spearheaded at SIB by the Clinical Bioinformatics group, led by Valérie Barbié, will expand the bioinformatics analytical pipeline on the data submitted by international laboratories that are aiming to participate in certified EQAs as part of their accreditation processes. It will initially focus on the development of new computational tools to support nucleic acid sequence data analysis in the context of viral metagenomics and drug resistance.

“The combination of QCMD’s External Quality Assessments expertise and global network with SIB’s bioinformatics know-how will help shape the future of our quality assessment schemes in rapidly developing areas of molecular diagnostics.”
Elaine McCulloch, Technical Project Manager at QCMD

Rolling out the cancer diagnostics platform in its latest version with HUG

In its latest version, OncoBench® – a cancer diagnostics platform developed jointly by the Geneva University Hospitals (HUG) and SIB* – offers new key molecular insights. In addition to guiding the interpretation of sequencing data from patients, it now enables clinicians to analyse variations in the number of copies of particular genes (CNVs) – information that is linked with several cancer therapies.

“We are now able to quickly and reliably identify the relevant CNVs and integrate them with the mutation data. This is a real breakthrough that paves the way towards a true integrative analysis of a patient’s molecular data.”
Yann Christinat, Clinical Bioinformatician at HUG Clinical Pathology Division

*Clinical Bioinformatics group, led by Valérie Barbié, and Vital-IT group, led by Mark Ibberson

Making human protein interaction data publicly available with ENYO Pharma SA

Understanding how proteins interact with each other offers key insights into a range of physiological and pathological processes. New knowledge on this topic has been made publicly available and FAIR via the SIB Resource neXtProt, thanks to a collaboration between the CALIPHO group, co-led by Amos Bairoch and Lydie Lane, and ENYO Pharma SA, based in Lyon, France.

“We made the protein-protein interaction data from ENYO easily discoverable to all, through the fantastic knowledge platform dedicated to human proteins that is neXtProt.”
Laurène Meyniel-Schicklin, co-founder of ENYO


Launching a new drug discovery programme with Cellestia Biotech

The programme, which addresses unmet medical needs in cancer, autoimmune and inflammatory disorders (AIIDs), is being launched in collaboration with the Computer-Aided Molecular Engineering group, led by Vincent Zoete.

And also:
• Repurposing a dopamine antagonist with Addex Therapeutics
• Bringing AI to image analytics in cancer diagnosis with Lunaphore and HUG

Read more about these two examples in our special focus on machine learning (P. 52-53).

Figure: Representation of the human interactome integrated into neXtProt. In red, proteins with knowledge brought by ENYO Pharma.



MEET OUR TEAMS

Our Internal Groups, composed of and headed by SIB Employees, harness their complementary expertise to collaborate with external partners and other SIB Groups on a daily basis.

Clinical Bioinformatics
Valérie Barbié

“We support health professionals from hospitals and the pharma industry to make the most out of an exponential flow of data, in order to enhance diagnostics and to foster optimal patient care and well-being. We do this through dedicated tools and methods, benchmarking and practice harmonization.”

EXAMPLES
• Diagnostic applications (cancer, genetic diseases, etc.) for the medical and pharmaceutical domain.
• Collaborative databases to enable data sharing for research or clinical purposes.

TAGS personalized medicine; oncology; infectious disease; human genetics; interoperability; medicine and health; outreach; training

Core-IT
Heinz Stockinger

“With our Sensitive Data Processing Platform, we support researchers to extract knowledge from biomedical human data for the benefit of society. We enable them to work with such data in a lawful and efficient way by using leading technologies and building on our expertise in IT, data protection and information security.”

EXAMPLE
Acting as a secure gateway to national (e.g. the Swiss Personalized Health Network via BioMedIT) and international (e.g. Innovative Medicines Initiative) data networks.

TAGS data protection; information security; high-performance computing; interoperability; medicine and health; personalized medicine; training; software engineering; infrastructure provision

Personalized Health Informatics
Katrin Crameri

“We are convinced that in order to ensure high-quality care and patient safety in the long term, healthcare and research must go hand in hand. In the context of the Swiss Personalized Health Network, we are thus making health-related data in Switzerland available to research in a lasting way. This is done through their FAIRification* and by building the national secure IT infrastructure BioMedIT.”

EXAMPLE
BioMedIT, the national infrastructure for the secure handling of health data, which can be jointly used by Swiss universities, research institutions and hospitals.

TAGS information security; interoperability; medicine and health; personalized medicine; training



Swiss-Prot
Alan Bridge

“As a competence centre for biocuration and knowledge management, we develop, annotate and maintain internationally renowned knowledge resources such as UniProtKB/Swiss-Prot. Our resources provide an essential framework for biological data science – supporting integrated analyses of genomic, proteomic and metabolomic data to promote human health and wellbeing.”

EXAMPLE
Some of our flagship resources include: UniProtKB/Swiss-Prot, ENZYME, Rhea, SwissLipids, HAMAP, PROSITE and ViralZone.

TAGS database curation; proteins and proteomes; systems biology; biochemistry; ontology; lipidomics; metabolomics; proteomics; semantic web

Training
Patricia Palagi

“Thanks to the unique pool of leading experts making up SIB’s scientific network, we are able to provide a rich nationwide training offer across the spectrum of bioinformatics techniques, methods and tools, and thus support the fast-evolving needs of researchers.”

KEY FIGURES 2020
• 18 SIB Groups engaged in teaching activities
• 78 experts and trainers

Some of our most taught topics and skills in 2020 (word cloud): Statistics; Single-cell techniques; Enrichment Analysis; Databases; Coding practices; Bgee; COVID-19; Machine learning; R; Python; NGS; Reproducible research; CRUNCH; RNA-Seq; Computational biology; Data analysis; Data management; Proteins and proteomes; Data visualization

Vital-IT
Mark Ibberson

“As both computational biologists and software developers, we understand data and how to manage them, as well as the underlying biological questions. Our focus is on finding innovative approaches to data analysis, such as overcoming constraints related to sensitive data.”

EXAMPLES
Setting up a federated data analysis system across several countries to enable access to large patient cohorts while addressing legal, ethical and FAIR principles*.

TAGS structural biology; systems biology; machine learning; data management; mass spectrometry; next-generation sequencing; data mining; genome reconstruction; software engineering; personalized medicine

*FAIR: a set of guiding principles to improve the Findability, Accessibility, Interoperability, and Reuse of digital assets



For a lasting life-science infrastructure

A lasting, agile and high-quality bioinformatics infrastructure is essential to support research. Offering such resources is at the core of SIB’s mission.

Improving knowledge discovery with the new Expasy.org, the Swiss bioinformatics resource portal

Created in 1993, Expasy, the SIB bioinformatics resource portal, underwent a major overhaul in 2020. Designed as a discovery tool connecting over 160 Swiss-made bioinformatics databases and software, including leading resources of particular importance to the life-science community, it is the ideal tool to support research and teaching in genomics, proteomics, structural biology, evolution, systems biology or text mining. (SEE P. 32)

15% increase* in usage figures since the launch of the new Expasy.org
*As compared with the same period the previous year

Co-piloting the set-up of an international coalition to ensure the sustainability of essential biological databases

Biological databases have become an intrinsic part of the life scientist’s toolkit and of biotechnological discoveries. However, an ELIXIR study* revealed a general lack of long-term funding for essential biodata resources in Europe. To ensure that such crucial infrastructure remains freely available to researchers around the globe, funding needs to be provided and coordinated on the global scale. SIB is co-piloting the set-up of the Global Biodata Coalition created to this end and gathering international research funders.
*DOI: 10.1093/bioinformatics/btz959

Building a lasting infrastructure to boost personalized health research

This is SIB’s mandate in the context of the Swiss Personalized Health Network (SPHN). In 2020, SIB continued to tailor the secure IT infrastructure network, BioMedIT, to provide all authorized researchers in Switzerland with easy access to collaborative analysis of confidential data. It also launched, in collaboration with ETH Zurich and HES-SO, the first public version of the nationwide collaborative platform to offer clinicians a harmonized interpretation of cancer variants (SVIP-O).

Figure: SPHN-funded projects (including SVIP-O, see above), covering the full spectrum of topics and issues relating to health data digitalization. Read more: sphn.ch/network/project-overview
Category labels in the figure: patient or citizen oriented; biomedical platforms; ethics, legal, security, patient information; national repositories, technology and analytical networks; bioinformatics, medical informatics, big data analytical platforms.
Other labels in the figure, in extraction order: SOIN; Swiss PKcdw; Governance; SPO; E-Consent; DeID; SACR; PSSS*; Pediatrics; SVIP-O; C3-StuDY; Frailty; SHFN*; PRECISE*; Immune-repertoire; QA4IQI*; SwissGenVar; CREATE PRIMA; SwissBioRef; SOCIBP*; IMAGINE; MedCo*; NLPforTC; L4CHLAB.
*Project co-funded by SPHN and PHRT (Personalized Health and Related Technologies)

An active part in tomorrow’s Swiss research infrastructure landscape for biology

What will the landscape look like for biology research infrastructure in 2025-2028? To answer this question, the Swiss Academy of Sciences (SCNAT) – appointed by the SERI – has formed a “Roundtable for Research Infrastructures in Biology (RoTaBio)” and has invited biologists from all fields and regions across Switzerland to participate in the development of a roadmap. Four infrastructure resources were identified in the course of an 18-month, bottom-up process starting in July 2019, in which SIB actively participated.

Two resources receive the ‘highest quality, sustainability and reliability label’

Cellosaurus – a cell lines encyclopedia – developed by Group Leader Amos Bairoch at the University of Geneva in the context of the CALIPHO group, and Rhea – a biochemical reactions knowledgebase – developed by the Swiss-Prot group led by Alan Bridge, joined the portfolio of ELIXIR Core Data Resources. This international recognition combines European data resources of fundamental importance to the wider life-science community and the long-term preservation of biological data.

“After UniProtKB/Swiss-Prot and STRING, Cellosaurus and Rhea are the third and fourth Swiss-made resources to be added to the ELIXIR Core Data Resource portfolio: this is a key step in ensuring their sustainability and long-term access to high-quality biological data for users from around the world.”
Christine Durinx, SIB Executive Director



DISCOVER OUR TOOLS AND DATABASES

To unlock life’s mysteries, scientists rely on a range of resources. Among the 160 databases and software tools developed by SIB Groups, 11 have been identified by our Scientific Advisory Board as of particular importance for life science. These benefit from SIB’s specific support. Discover some stories that exemplify their impact.

Categories: Genes and genomes · Proteins and proteomes · Lipids · Structural biology

Bgee
Gene expression expertise
TYPE Knowledgebase with expert curation and software tool
DESCRIPTION Gene expression data including all types of transcriptomes, allowing retrieval and comparison of expression patterns between animals, human, model organisms and diverse species of evolutionary or agronomical relevance.
HIGHLIGHT Only resource to provide homologous gene expression comparisons between species.
STORY Using Bgee to help identify clinically relevant variants in patients’ genomes
In a recent study, Bgee was used to map conserved sequences across vertebrates, thereby hinting at their functional role: this will help to identify clinically relevant variants in patients’ genomes in the future. DOI: 10.1093/nar/gkz1199

EPD
Eukaryotic Promoter Database
TYPE Knowledgebase with expert curation and software tools
DESCRIPTION Quality-controlled information on experimentally defined promoters of higher organisms, as well as web-based tools for promoter analysis.
HIGHLIGHT Over 190,000 promoters downloadable, analysable over a web interface and viewable in the UCSC genome browser.
STORY Using EPD to design a gene therapy to fight blindness in mice
By activating silent genes to compensate for defective ones using a CRISPR-Cas9 approach, scientists could improve retinal function and attenuate retinal degeneration in mice. EPD was used to identify relevant promoter regions in the DNA and to design the synthesis of the guiding element of the CRISPR-Cas9 system. DOI: 10.1126/sciadv.aba5614


23


SwissDrugDesign
Widening access to computer-aided drug design
TYPE Software tools
DESCRIPTION Web portal of computer-aided drug design tools, from molecular docking (SwissDock) to pharmacokinetics and drug-likeness (SwissADME), through virtual screening (SwissSimilarity), lead optimization (SwissBioisostere) and target prediction of small molecules (SwissTargetPrediction).
HIGHLIGHT Comprehensive and integrated web-based drug design environment.

SwissRegulon Portal
Tools and data for regulatory genomics
TYPE Software tools and knowledgebases
DESCRIPTION Web portal for regulatory genomics, including genome-wide annotations of regulatory sites and motifs, the web server ISMARA for automated inference of regulatory networks, CRUNCH for automated analysis of ChIP-seq data, and REALPHY for reconstructing phylogenies from raw sequence data.
HIGHLIGHT Allows users to upload raw microarray, RNA-seq or ChIP-seq data to automatically infer the core regulatory networks acting in their system of interest.

UniProtKB/Swiss-Prot
Protein knowledgebase
TYPE Knowledgebase with expert curation
DESCRIPTION Hundreds of thousands of protein descriptions, including function, domain structure, subcellular location, post-translational modifications and functionally characterized variants.
HIGHLIGHT Expert-curated part of UniProt, the most widely used protein information resource in the world, with over six million pageviews per month. An ELIXIR Core Data Resource.
STORY Using UniProtKB/Swiss-Prot to investigate schizophrenia
A recent study used UniProt and Rhea, also developed by the Swiss-Prot group and now an ELIXIR Core Data Resource (SEE PREVIOUS PAGE), in the identification of potential metabolites as early biomarkers for neurodevelopmental defects, and therapeutic targets for schizophrenia. DOI: 10.1038/s42003-020-01124-8

V-pipe
Viral genomics pipeline
TYPE Software tool
DESCRIPTION Pipeline integrating various open-source software packages for assessing viral genetic diversity from next-generation sequencing data.
HIGHLIGHT Enabling reliable and comparable viral genomics and epidemiological studies and facilitating clinical diagnostics of viruses.
STORY Using V-pipe to enable early-warning detection of SARS-CoV-2 variants in wastewater
Detecting evidence of variants of concern, often earlier than in clinical samples, is made possible by a collaborative effort led by the Computational Biology group developing V-pipe. This study has generated interest from researchers worldwide seeking to replicate the approach locally, including from wastewater analysis projects in Austria, Germany, Greece and, indeed, the International Water Association. The results were published in a preprint that was made public in January 2021. DOI: 10.1101/2021.01.08.21249379v1

SwissOrthology
One-stop shop for orthologs
TYPE Phylogenomics databases and software tools
DESCRIPTION Web portal of resources to infer orthologs, i.e. corresponding genes across different species, a key aspect to predicting gene function or reconstructing species trees. It includes OrthoDB, BUSCO as well as OMA and the Quest for Orthologs benchmark service.
HIGHLIGHT World-leading orthology and comparative genomic resources.
STORY Using OrthoDB to study the evolution of parental care in fish
Little is known about the changes that take place in the brain when it comes to fatherhood. In a fish species where fathers provide care, researchers report dramatic changes in neurogenomic state as they reach fatherhood. OrthoDB was used as part of the analysis, which showed that these gene-level changes resembled those associated with pregnancy in mammalian mothers. DOI: 10.1038/s41467-019-12212-7

SwissLipids
A knowledge resource for lipids
TYPE Knowledgebase
DESCRIPTION Information about lipids, including lipid structures, metabolism and interactions, providing a framework for the integration of lipid and lipidomics data with biological knowledge and models.
HIGHLIGHT Information on more than 590,000 lipid structures from over 640 lipid classes.


24


neXtProt
Human protein knowledgebase
TYPE Knowledgebase with expert curation and associated tools
DESCRIPTION Information on human proteins such as function, involvement in diseases, mRNA/protein expression, protein-protein interactions, post-translational modifications, protein variations and their phenotypic effects.
HIGHLIGHT High data coverage through integration of multiple sources and advanced semantic search functionalities. Tools specifically designed for the proteomics community.
STORY Using neXtProt as the reference database for the Human Proteome Project
The Human Proteome Project is ten years old and has been using neXtProt as its official reference database. The aim of this massive project, led by the Human Proteome Organisation (HUPO), is to generate the map of the protein-based molecular architecture of the human body, and to ultimately lay a foundation for the development of diagnostic, prognostic, therapeutic, and preventive medical applications. (SEE P. 33) DOI: 10.1038/s41467-020-19045-9

Figure: SIB Resources as interconnected data sources. Chord diagram showing data flow among our tools and databases (the flow has the same colour as the resource of origin); resources labelled include Bgee, EPD, neXtProt, STRING and SWISS-MODEL. The image was produced using Circos (circos.ca).


25


STRING
Protein-protein interaction networks and functional enrichment analysis
TYPE Knowledgebase and software tool
DESCRIPTION Resource for known and predicted protein-protein interactions, including direct (physical) and indirect (functional) associations derived from various sources, such as genomic context, high-throughput experiments, (conserved) co-expression and the literature.
HIGHLIGHT STRING networks cover over 5,000 different organisms with over 25 million high-confidence links between proteins. An ELIXIR Core Data Resource.

SWISS-MODEL
Protein structure homology-modelling
TYPE Software tools and repository
DESCRIPTION Automated protein structure homology-modelling platform for generating 3D models of a protein or a protein complex, using a comparative approach, and database of annotated models for key reference proteomes based on UniProtKB.
HIGHLIGHT Easy-to-use web-based platform processing over 5,000 model requests per day, providing model information for experts and non-specialists.
STORY Using SWISS-MODEL to understand how SARS-CoV-2 binds to our cells
In two highly cited articles published in the early days of the pandemic, the structure of the receptor binding domain of the SARS-CoV-2 spike protein was compared to that of related viruses using models generated with SWISS-MODEL. The models led to the hypothesis that SARS-CoV-2 binds to ACE2, which was later confirmed. DOI: 10.1038/s41586-020-2008-3 DOI: 10.1016/s0140-6736(20)30251-8


26


A network of scientific expertise

Bioinformatics is an interdisciplinary field, where the encounter between genetics, physiology, chemistry and physics leads to many fields of activities and applications.

Figures shown for each domain: number of groups per domain (only the groups that gave these themes as their main activities are listed) and key resources on Expasy (over 160 tools and databases developed).

Genes and genomes: Life’s instruction manual
Groups: 15 · Key resources on Expasy: 58
A genome is the sum of genetic material of an organism, including all of its genes. It is composed of DNA and contains all the information needed to create and maintain an organism, as well as the instructions on how this information should be expressed. Bioinformatics develops tools to read genomes, and to store, analyse and interpret the resulting data.

Proteins and proteomes: More than meets the eye
Groups: 8 · Key resources on Expasy: 85
A proteome is the sum of proteins expressed by a cell, a tissue or an organism, at a given time. Proteins are the products of genes, and are involved in nearly every task carried out within an organism – from carrying oxygen to fighting off pathogens. Bioinformatics develops tools to understand the role of proteins.

Evolution and phylogeny: Splitting ends
Groups: 15 · Key resources on Expasy: 29
Changes that occur in genomes tell life scientists how an organism has evolved over time. Comparisons made between genomes from different species or populations tell them how they are related to one another – this is the field of phylogenetics. Bioinformatics develops tools to compare the genomes of organisms, as well as computational methods to reconstruct their past and build their ‘family’ trees.


27


Structural biology: The third dimension
Groups: 6 · Key resources on Expasy: 18
Macromolecules such as DNA and proteins have specific 3D structures that are dictated by their sequence. A protein’s function is defined by its 3D structure, which in turn defines the way it interacts with other molecules. Bioinformatics develops software to create 3D models of proteins to study their interactions with other molecules, such as drugs.

Systems biology: Never alone
Groups: 14 · Key resources on Expasy: 33
Life occurs and is sustained by a mesh of interactions within and between cells, tissues, organisms, and their environment. Understanding how these complex systems function allows scientists to predict what happens if one of the components changes or the conditions are altered. Bioinformatics methods help to predict metabolic pathways.

Machine learning and text mining: Rise of the machines
Groups: 6 · Key resources on Expasy: 3
Machine learning (ML) techniques allow computers to learn from data without explicit instructions, and to draw inferences from data patterns. Text mining algorithms, often based on ML, are designed to recognize patterns within text, such as biomedical terms. Bioinformatics is supported by and feeds into ML algorithms, with diverse applications including drug design, biomarker discovery and text mining to facilitate literature triage (SEE P. 48).

Competence centres and core facilities: The means to an end
Groups: 15 · Key resources on Expasy: 10
The quantity of data generated by the life sciences has grown exponentially over the years, and needs to be stored and processed. Researchers also need support in making sense of their data. Core facilities centralize research resources, and provide tools, technologies, services and expert consultation to this end. Bioinformatics core facilities are located in the major Swiss academic institutions.

…AGRICULTURE (10 GROUPS)
from predicting the spread of bird flu outbreaks and understanding the lifecycle of agricultural pests, to improving crop productivity.

…BASIC RESEARCH (46 GROUPS)
from unravelling the evolutionary processes that have shaped today’s biodiversity, to solving the equation behind a lizard’s scale colour pattern.

…ENVIRONMENTAL SCIENCES (7 GROUPS)
from understanding how organisms adapt to climate change, to how microbial communities can be used to break down pollutants in oil spills.

…MEDICINE AND HEALTH (49 GROUPS)
from designing optimized proteins in cancer immunotherapy, to creating biomedical decision-support tools.

The activities of our Training group are transversal to all these fields and domains


28


OUR COMMUNITY AT A GLANCE

Through partnerships with major Swiss schools of higher education and renowned Swiss research institutes, we are proud to federate a diverse and skilled community.

Figures as of 1 January 2021

784 SIB Members, incl. 189 employees (SEE P. 38)
180 students taking part in the SIB PhD Training Network

BASEL: 173 MEMBERS, 15 GROUPS
BERN: 29 MEMBERS, 2 GROUPS
FRIBOURG: 17 MEMBERS, 3 GROUPS
GENEVA: 138 MEMBERS, 10 GROUPS
LAUSANNE: 237 MEMBERS, 25 GROUPS
YVERDON: 6 MEMBERS, 1 GROUP


29


BELLINZONA: 9 MEMBERS, 2 GROUPS
DAVOS: 3 MEMBERS, 1 GROUP
LUGANO: 4 MEMBERS, 2 GROUPS
ST GALLEN: 1 MEMBER, 1 GROUP
WÄDENSWIL: 15 MEMBERS, 2 GROUPS
ZURICH: 150 MEMBERS, 18 GROUPS

A multilingual community¹
Python, JavaScript, R, SQL, Java, Bash, Perl, SPARQL, PHP, C++, RDF, C, MATLAB
1 Source: COMPASS, the SIB Developers’ Community

Creation of a Diversity working group
How equal, diverse and inclusive is the Swiss bioinformatics community that SIB federates? What can be done to reinforce these attributes even further? Six SIB Members and Employees make up the group created in December to look into these questions.


30


RESEARCH HIGHLIGHTS

460 publications by SIB Groups in 2020

Understanding pollinators on the genomic level
Bumblebees are globally important pollinators in natural ecosystems and in agricultural food production. In this first genomic characterization of the genus, an international team co-led by SIB Researchers at the University of Lausanne sequenced 17 species. The results open up new possibilities to protect their biodiversity.
GROUP INVOLVED Evolutionary-Functional Genomics, led by Robert Waterhouse
Published in Molecular Biology and Evolution DOI: 10.1093/molbev/msaa240

Answering biological questions with federated queries across databases
Providing biologists with a single entry-point to the wealth of information contained in data resources, and enabling them to answer typical research questions: this is the purpose of BioQuery, an interface co-developed by SIB Researchers at the University of Lausanne. It integrates data from leading databases including SIB Resources.
GROUPS INVOLVED Laboratory of Computational Evolutionary Biology, led by Christophe Dessimoz; Evolutionary Bioinformatics, co-led by Marc Robinson-Rechavi & Frédéric Bastian
Published in Database DOI: 10.1093/database/baz106

Detecting the environment-genetics interplay underlying a disease
How much of our genome makes us susceptible to environmental risk factors, which in turn predispose us to certain pathologies – such as obesity? SIB Researchers at the University of Lausanne have developed a method to answer this question. This could, in the future, make it possible to prioritize subgroups based on their genetic risk, by assessing where disease intervention would be more effective.
GROUP INVOLVED Statistical Genetics, led by Zoltán Kutalik
Published in Nature Communications DOI: 10.1038/s41467-020-15107-0


31


An improved view of influenza evolution
Segmented viruses such as influenza have the property of exchanging the different parts of their parental genome as they replicate within hosts. This property makes it difficult to infer virus evolution and infection pathways. SIB Researchers at ETH Zurich have developed a framework, CoalRe, to take this phenomenon into account, leading to better estimates of effective population size and evolutionary rates.
GROUP INVOLVED Computational Evolution, led by Tanja Stadler
Published in PNAS DOI: 10.1073/pnas.1918304117

Putting FAIR principles into action for multi-omics
Making data more reproducible improves the quality of science. However, enabling data sharing and reuse still represents a practical challenge for researchers. A study co-led by a PhD student at the University of Lausanne and SIB offers a strategy and tools to maximize the value of complex, multi-omics data.
GROUP INVOLVED Vital-IT, led by Mark Ibberson
Published in Scientific Data DOI: 10.1038/s41597-019-0171-x

The DNA regions in our brain that contribute to making us human
With only 1% difference, the human and chimpanzee protein-coding genomes are remarkably similar. Researchers at SIB and the University of Lausanne have developed a new approach to pinpoint, for the first time, adaptive human-specific changes in the way genes are regulated in the brain. These results open new perspectives in the study of human evolution, developmental biology and neurosciences.
GROUP INVOLVED Evolutionary Bioinformatics, co-led by Marc Robinson-Rechavi & Frédéric Bastian
Published in Science Advances DOI: 10.1126/sciadv.abc9863

Full references of the papers mentioned are available at sib.swiss/about-sib/news/10036-news-2020


32


SIB REMARKABLE OUTPUTS 2020

Discover the 10 best achievements and work produced by our scientists over the last year. Staying abreast of the latest advances and bright ideas emerging in a field as diverse as bioinformatics is challenging. To provide the global bioinformatics community with a shortlist of work produced during the year by SIB Scientists that is particularly deserving of attention, the SIB Award Committee has launched the Remarkable Outputs initiative. These outputs can include peer-reviewed publications, preprints, resources, software tools, databases, videos, tutorials, outreach programmes, science advocacy, etc.

SIB Training courses online
YouTube training playlist
GROUP INVOLVED Training, led by Patricia Palagi, Lausanne
WHAT THE COMMITTEE SAID ABOUT THE WORK
“Bioinformatics education is one of SIB’s core missions. The SIB Training team carried out a truly impressive task in 2020 by responding quickly and dynamically to continue to serve the community.” (SEE P. 44)

SwissBioPics – an interactive library of cell images for the visualization of subcellular location data
swissbiopics.org
GROUP INVOLVED Swiss-Prot, led by Alan Bridge, Geneva
WHAT THE COMMITTEE SAID ABOUT THE WORK
“SwissBioPics offers intricate yet standardized images of various cells and their organelles. This is an outstanding resource that provides a visual representation of biological knowledge, and it is great artwork!”

Expasy, the Swiss Bioinformatics Resource Portal
expasy.org
GROUPS INVOLVED Resource Usability and Support team / Core-IT, led by Heinz Stockinger, Lausanne
WHAT THE COMMITTEE SAID ABOUT THE WORK
“Putting users at the centre with rich resource cross-links, the overhauled Expasy portal is an immense achievement that benefits all SIB Resources, increasing discoverability in a user-friendly way.” (SEE P. 20)

CoVariants: Tracking SARS-CoV-2 variants in real-time
covariants.org
GROUP INVOLVED Microbial Evolution, led by Richard Neher, Basel
WHAT THE COMMITTEE SAID ABOUT THE WORK
“Essential tool in communicating the evolving maps and characteristics of SARS-CoV-2 variants in real time. CoVariants is an excellent illustration of what bioinformatics can bring to the world.”


treeclimbR pinpoints the data-dependent resolution of hierarchical hypotheses
github.com/fionarhuang/treeclimbR/
GROUPS INVOLVED Statistical Bioinformatics, led by Mark Robinson & Bioinformatics / Systems Biology, led by Christian von Mering, Zurich
WHAT THE COMMITTEE SAID ABOUT THE WORK
“treeclimbR offers novel features with a data-driven approach for testing hierarchically organized hypotheses and selecting an optimal resolution. It has a wide range of research applications.”

The Bgee suite: integrated curated expression atlas and comparative transcriptomics in animals
DOI: 10.1093/nar/gkaa793
GROUP INVOLVED Evolutionary Bioinformatics, co-led by Marc Robinson-Rechavi & Frédéric Bastian, Lausanne
WHAT THE COMMITTEE SAID ABOUT THE WORK
“Bgee is an excellent resource with many applications. The milestone publication showcases how high-quality expression data are integrated, curated, and harmonized to be comparable across species.” (SEE P. 22)

33


Large-scale DNA-based phenotypic recording and deep learning enable highly accurate sequence-function mapping
DOI: 10.1038/s41467-020-17222-4
GROUP INVOLVED Machine Learning and Computational Biology Lab, led by Karsten Borgwardt, Zurich
WHAT THE COMMITTEE SAID ABOUT THE WORK
“A great interdisciplinary study, with innovations on both experimental and computational sides, and highlighting the power of deep learning applied to big ‘omics’ data.”

SARS-CoV-2 genome sequences: a Swiss resource for genomic epidemiology
nextstrain.org/groups/swiss/ncov/switzerland
GROUPS INVOLVED Microbial Evolution, led by Richard Neher / Computational Biology, led by Niko Beerenwinkel / Computational Evolution, led by Tanja Stadler, Basel
WHAT THE COMMITTEE SAID ABOUT THE WORK
“With open real-time analysis of the evolution and spread of SARS-CoV-2, the power of Nextstrain really shone in 2020. It informs journalists and the public, contributing to a wider understanding of bioinformatics.” (SEE COVER)

Genomic basis for RNA alterations in cancer
DOI: 10.1038/s41586-020-1970-0
GROUP INVOLVED Biomedical Informatics, led by Gunnar Rätsch, Zurich
WHAT THE COMMITTEE SAID ABOUT THE WORK
“Impressive and very important work representing a major step forward in cancer research. It was performed by two international consortia, in which the contribution of SIB is valuable and significant.”

A high-stringency blueprint of the human proteome
DOI: 10.1038/s41467-020-19045-9
GROUP INVOLVED Computer and Laboratory Investigation of Proteins of Human Origin (CALIPHO), co-led by Amos Bairoch and Lydie Lane, Geneva
WHAT THE COMMITTEE SAID ABOUT THE WORK
“The Human Proteome Project is an outstanding contribution to the field, validating the human proteome with experimental proteomics data, where the SIB Resource neXtProt played a central role.” (SEE P. 24)


34


Organization and governance

Federating a domain as pervasive as bioinformatics, even across a modestly sized country like Switzerland, requires a unique organizational structure.

COMPOSITION OF SIB’S GOVERNING BODIES

The Foundation Council
Each of SIB’s partner institutions is represented on the Council.

President
Prof. Felix Gutzwiller, Former Senator

Founding Members
Prof. Ron Appel, SIB Executive Director
Prof. Amos Bairoch, Group Leader, SIB and University of Geneva
Dr Philipp Bucher, Affiliate Group Leader, SIB
Prof. Denis Hochstrasser, Former Vice-Rector, University of Geneva
Prof. C. Victor Jongeneel, Carl R. Woese Institute for Genomic Biology, University of Illinois, USA
Prof. Manuel Peitsch, Chief Scientific Officer Research at Philip Morris International

Ex officio Members
Prof. Cezmi A. Akdis, Director, Swiss Institute of Allergy and Asthma Research (SIAF)
Mr Thomas Baenninger, Chief Financial Officer, Ludwig Institute for Cancer Research
Prof. Edouard Bugnion, EPFL
Prof. François Bussy, Vice-Rector for Research, International Relations and Continuing Education, University of Lausanne
Prof. Daniel Candinas, Vice-Rector Research, University of Bern
Prof. Carlo Catapano, Director, IOR Institute of Oncology Research
Prof. Alex Dommann, Head of Department “Materials meet Life”, Swiss Federal Laboratories for Materials Science and Technology (Empa)
Prof. Boas Erez, Rector, Università della Svizzera Italiana
Prof. Nicolas Fasel, Vice-Dean for Research and Innovation, Faculty of Biology and Medicine, University of Lausanne
Prof. Katharina Fromm, Vice-Rector, University of Fribourg
Prof. Cem Gabay, Dean, Faculty of Medicine, University of Geneva
Prof. Brigitte Galliot, Vice-Rector, University of Geneva
Prof. Antoine Geissbühler, Vice-Rector, University of Geneva; Head of eHealth and Telemedicine Division, Geneva University Hospitals (HUG)
Prof. Detlef Günther, Vice-President Research and Corporate Relations, ETH Zurich
Dr Corinne Jud, Head of the Competence Division Method Development and Analytics, Agroscope
Prof. Jérôme Lacour, Dean, Faculty of Science, University of Geneva
Dr Vincent Peiris, Dean, School of Business and Engineering Vaud (HEIG-VD), HES-SO
Prof. Jean-Marc Piveteau, President, Zurich University of Applied Sciences (ZHAW)
Prof. Giambattista Ravano, Director of Research and Development and Knowledge, University of Applied Sciences and Arts of Southern Switzerland (SUPSI)
Prof. Alexandre Reymond, Director, Centre for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne
Prof. Patrick Ruch, Head of Research, School of Business Administration (HEG-Geneva), HES-SO
Prof. Falko Schlottig, Director, FHNW School of Life Sciences
Prof. Dirk Schübeler, Co-Director, Friedrich Miescher Institute for Biomedical Research (FMI)
Prof. Torsten Schwede, Vice President of Research and Talent Promotion, University of Basel
Prof. Elisabeth Stark, Vice-President Research, University of Zurich
Prof. Juerg Utzinger, Director, Swiss Tropical and Public Health Institute

Co-opted Member
Prof. Alfonso Valencia, ICREA Professor, Life Sciences Department Director, Barcelona Supercomputing Centre, Spain

The Board of Directors (BoD)
The BoD consists of two Group Leaders elected jointly by the Council of Group Leaders and the BoD, two external members elected by the Foundation Council on the recommendation of the BoD, and the SIB Executive Directors. Members of the BoD are appointed for a renewable five-year period.

Dr Jérôme Wojcik (Chairman), Industrial Data Scientist & Entrepreneur
Prof. Ron Appel and Dr Christine Durinx, Joint SIB Executive Directors
PD Dr Katja Bärenfaller, Group Leader, SIB and Swiss Institute of Allergy and Asthma Research (SIAF)
Ms Martine Brunschwig Graf, Former National Councillor
Prof. Christophe Dessimoz, Group Leader, SIB and University of Lausanne

Council of Group Leaders
The Council consists of the Group Leaders and the SIB Executive Directors.

The Scientific Advisory Board (SAB)
The SAB is made up of at least five members, who are internationally renowned scientists from the institute’s fields of activity.

Prof. Alfonso Valencia (Chairman), ICREA Professor, Life Sciences Department Director, Barcelona Supercomputing Centre, Spain
Prof. Soren Brunak, Founder of the Centre for Biological Sequence Analysis, Technical University of Denmark, Denmark
Prof. Melissa Haendel, Director of the Ontology Development Group, Oregon Health & Science University, Portland, USA
Prof. Claudine Médigue, Head of the Laboratory of Bioinformatics Analyses for Genomics and Metabolism (LABGeM), Génoscope & CNRS, Evry, France
Prof. Alexey I. Nesvizhskii, Department of Pathology and Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, USA
Prof. Christine Orengo, Department of Structural and Molecular Biology, University College London, UK
Prof. Ron Shamir, Computational Genomics Group at the Blavatnik School of Computer Science, Tel Aviv University, Israel


35


SIB is a non-profit foundation with 24 partner institutions (SEE P. 28-29); its governance structure ensures both scientific independence and efficient internal functioning.

Foundation Council
Highest authority in the institute, with supervisory powers. Its responsibilities include changes to SIB’s statutes, nomination of Group Leaders, and approval of the annual budget and financial report.

Scientific Advisory Board
Acts as a consultative body, providing recommendations to the Board of Directors and the Council of Group Leaders. Its main tasks consist in monitoring service and infrastructure activities, such as the SIB Resources. (SEE P. 22)

Board of Directors
Takes the decisions necessary to achieve the aims of the institute, such as defining the scientific strategy and internal procedures, and allocating federal funds to service and infrastructure activities.
Members: external members from the political and industrial sectors, Executive Directors, Group Leaders.

Management and support teams
Define and implement the institute’s strategic goals and ensure the organization’s representation at the national and international level. Support functions include finance & grant services, legal & technology transfer, human resources and communication & scientific events.

Council of Group Leaders
Discusses all matters relating to SIB Groups as a whole, and proposes new Group Leaders for nomination.

SIB Internal Groups
Staffed and headed by SIB Employees, they focus on SIB’s core missions.

SIB Affiliated Groups
Academic groups from partner institutions across Switzerland. They include those groups maintaining and developing an SIB-supported infrastructure, such as an SIB Resource (SEE P. 22) or a core facility, and can thus include SIB Employees as well.

(SEE P. 18-19)


36


FINANCES

SIB’s funds remained stable in 2020, thanks to the sustained support of its funders.

Detail of funding sources (CHF)
39%  Swiss government – SERI: 9.7 million
22%  Swiss government – BioMedIT/SPHN²: 5.6 million
10%  European funds: 2.5 million
9%  Swiss universities: 2.3 million
9%  National Institutes of Health (NIH): 2.3 million
5%  Private sector / Foreign universities: 1.3 million
4%  Swiss National Science Foundation (SNSF) / Innosuisse: 1.1 million
1%  Other³: 0.3 million
1%  Swiss hospitals: 0.2 million
Total: 25.3 million

Allocation to SIB’s core missions (as per 2020 audited figures)
[Chart: funding by source for each core mission. INFRASTRUCTURE: Databases & software tools; Competence centres. COMMUNITY: Training; Scientific collaboration. Column totals shown: CHF 11.7 million, CHF 10.8 million, CHF 2 million and CHF 0.8 million. Sources shown: Swiss government¹, Swiss government BioMedIT/SPHN², Swiss universities, Swiss hospitals, National Institutes of Health (NIH), European funds, SNSF / Innosuisse, Private sector / Foreign universities, Other³.]

1 Swiss government funds are allocated to SIB Resources and Core facilities as per the recommendations of SIB’s external Scientific Advisory Board.
2 SIB received CHF 1.1 million of government funds for the SPHN Data Coordination Centre. In addition, SIB received CHF 4.5 million in 2020 for BioMedIT (SEE P. 21), out of which CHF 3.4 million were used for BioMedIT projects and nodes.
3 Loss of income insurances etc.



SIB Resources: anchoring infrastructure in research

With the exception of UniProtKB/Swiss-Prot, whose team is composed exclusively of SIB Employees, SIB Resources are developed and maintained in research groups by a mix of collaborators from the university (funded by the latter and/or grants) and SIB Employees (funded by SERI). This integrated model (see graph below) allows SIB databases and software tools to be anchored in an academic environment close to users. This ensures that they remain at the cutting edge of technology by evolving in conjunction with science.

SIB’s competence centres as an efficient federal lever to provide bioinformatics expertise and services to the community

Thanks to SERI’s support of CHF 937K this year, SIB can cover the management and some running costs of its Vital-IT and Clinical Bioinformatics groups to provide an unrivalled breadth of bioinformatics expertise and services to the community. From R&D to clinical applications, these internal teams actively engage in new collaborations and secure grants (SEE P. 16-17), for a combined annual budget of CHF 3.7 million.

Computer-based, expertise-driven activities

78% of SIB’s financial resources are allocated to the payment of salaries

An integrated model: out of 189 employees (2) (159 FTEs), 56 are embedded in university research groups (46.2 FTEs), mainly working on SIB Resources, and 133 are in internal groups or in management (112.8 FTEs).

Allocation by activities: 78% Infrastructure, 10% Community, 12% Management & support teams (1)

(1) SIB’s Management and support teams (see previous page) are financed by the Swiss government as well as through overheads on external funds. The groups having entrusted SIB with the management of their funds benefit from specific support in legal affairs, human resources and financial monitoring.
(2) As of 1 January 2021



EMPLOYEES

SIB Employees share a common passion: making a positive impact on society through biological and biomedical data science.

SIB has 189 employees of 26 different nationalities *

* As of 1 January 2021

“I joined SIB in 2003 as a software developer. Since then I have worked in data management, front-end and back-end development and UX Design. Although women in software development are still too rare, in my experience you do not have to change who you are to feel you belong, and be valued in a team.”

Séverine Duvaud Software developer & UX designer at SIB

Diversity, equality and inclusion: values we hold dear!

Both as an employer, and as the ambassador of the Swiss bioinformatics community, SIB has a critical role to play in the workplace as well as in the scientific ecosystem. The institute is committed to leveraging the diversity of profiles and backgrounds among its employees and members by creating a culture of equality and inclusion, and enabling everyone to develop their potential and skills.



Geneva: 76 employees, 7 groups
Lausanne: 86 employees, 9 groups (including Management and support)
Basel: 22 employees, 4 groups
Zurich: 5 employees, 2 groups

There are 92 women (49%) and 97 men (51%) working at SIB.

The median age at SIB is 43 years, with a balanced age pyramid favouring knowledge exchange by bringing early-career scientists together with senior experts.

The median length of service is 7 years, with 38% of employees having been at SIB for over 10 years.

A flexible environment for a better work-life balance: 43% of employees work part-time.

Being a scientist at SIB: a range of roles

■ 22% Software development
■ 17% Computational biology
■ 16% Biocuration
■ 13% Other
■ 10% Scientific IT and application support
■ 8% Research
■ 6% Group management
■ 4% Clinical data analysis
■ 4% Training

Other: Scientific coordination, Outreach, Grant, Resource and Project management, PhD students, Post-docs and Trainees, each role type representing fewer than 5 employees. Employees with two positions and two different hierarchical managers are listed in the category in which most time is spent. In the case of a 50/50 split of time, they appear in both associated categories.

Software developer, examples of career paths

Starting point: Software developer, then Senior Software developer
People management path: Team lead Software development, Head Software development, Director of the group
Technical expertise path: Lead Software developer, Principal Software engineer


Around COVID-19

The societal and health earthquake of 2020 triggered a global scientific response that pushed the life-science infrastructure, as well as key activities of institutes such as SIB, beyond their limits. Everything had to be revisited and, sometimes, reinvented.



Supporting international research

How did SARS-CoV-2 arise? How is it spreading and evolving? What are its weak points? Early on, SIB Groups were able to provide a range of tools and resources to help answer such questions.

The first coronavirus genome sequence was published on 10 January 2020. Our scientists reacted swiftly to offer support to clinical researchers, and to make sure that the science could move fast. This reaction involved developing new features in existing software tools or repurposing them (SEE V-PIPE P. 23), issuing new releases in record time through coordinated international efforts, etc. It was made possible by our scientists’ exceptional dedication, and by the fact that most resources were already available and funded, with teams ready to carry out the necessary developments.

Knowledge of the virus genome’s evolution, its proteins, their function and structure is key to understanding how it replicates and spreads, and to identifying potential targets for drugs or vaccines.

Here are some of the actions undertaken by resources managed at SIB in the context of the emergency:

- The coronavirus entries in the universal protein knowledgebase UniProt – an international collaboration between the Swiss-Prot group led by Alan Bridge, EMBL-EBI, and PIR – were made available as a pre-release, independently of and faster than the general UniProt release cycle.
- Nextstrain (SEE COVER), co-developed by the group led by Richard Neher and to which other SIB Groups and resources (e.g. V-pipe) contribute, started incorporating SARS-CoV-2 genomes as soon as they were shared publicly. It enables clinical researchers to track in real time how the coronavirus genome evolves. (SEE P. 33)
- Through a dedicated portal of SWISS-MODEL, developed by the group led by Torsten Schwede, three-dimensional models of the viral proteins could be created, which shed further light on their evolution, functional properties and potential weaknesses, in the context of drug development.
- PROSITE was used to highlight the potential role of integrins in host-cell entry by the virus, making them interesting targets for COVID-19 treatment.
- New SARS-CoV-2 dedicated pages on ViralZone provide further biological insights (see key figure), including a detailed comparison with the SARS virus genome as well as cross-links to complementary resources.
- Literature triage services, such as CovidTriage, developed by the group led by Patrick Ruch as part of their SIBiLS resource, make it possible to prioritize articles according to an ontology specific to COVID-19, thereby guiding scientists through the maze of published works on the virus.

Key figure: 162% increase in ViralZone monthly usage figures* since the SARS-CoV-2 pages were created
* as compared to the same period the previous year

“The topic of sustainable research infrastructure, and the funding associated with it, has become more visible than ever. At a meeting of the Global Biodata Coalition (SEE P. 20), it was noted that the availability of key databases, such as those developed at SIB, is instrumental in enabling a fast scientific response to COVID-19.”

Ron Appel SIB Executive Director



Joining European efforts to fight COVID-19 and future pandemics

Scientists at SIB and the University of Basel are part of Exscalate4CoV (E4C), a public-private consortium of 18 top EU organizations. The SIB Resource SWISS-MODEL allows scientists to generate 3D models of proteins whose structure has not yet been experimentally elucidated. This enables accelerated virtual screening for potential drugs for the current outbreak, but also in the event of future pandemics, in particular when the pathogen is new to science and when no treatment exists.

Revealing host factors that are critical for the infection

Several high-profile studies used the SIB Resource STRING, which focuses on protein-protein networks, in their analysis of genome-wide screens, an essential step in identifying ‘weak spots’ in the host’s genome that could facilitate the infection. DOI: 10.1016/j.cell.2020.10.028 DOI: 10.1016/j.cell.2020.12.006

The COVID-19 oriented version of STRING enables users to explore the host side of the disease, while keeping a focus on human proteins thought to interact with SARS-CoV-2.

Access the complete list of resources managed at SIB and supporting SARS-CoV-2 research: sib.swiss/about-sib/news/10660-sib-resources-supporting-sars-cov-2-research

Creation of a Swiss SARS-CoV-2 Data Hub facilitating data sharing for research

Through the secure Swiss Pathogen Surveillance Platform (SPSP), SIB has set up the Swiss SARS-CoV-2 Data Hub. This hub will facilitate the viral sequence submissions of Swiss laboratories to the open database European Nucleotide Archive (ENA), to further enable large-scale research on the virus. The platform is also conceived as an expandable infrastructure for the surveillance of other pathogens in the future. SPSP is one of several SIB Resources that joined the European COVID Data Platform to accelerate research on the virus through data and tool sharing.



A transformative impact

From the disruption came rapid change. Some of these changes proved to be positive and long-lasting solutions.

We began 2020 looking forward to a full year of events, either organized or supported by SIB, from scientific conferences to various outreach events. But it soon became evident that the disruption was here to stay and that workarounds had to be found to maintain services, activities and interactivity within the bioinformatics community. Within a few weeks, our in-house conference, the SIB Days, was welcoming registrations for a completely new online format. On the training side, thanks to the tremendous efforts made by our teams and trainers (SEE P. 32), not only could our diverse offer continue to be provided, but by making most courses accessible remotely, it also benefitted a wider and more international audience. Digital outreach projects, such as an e-workshop to understand the biology of the virus, were also put in place.

Honing one’s bioinformatics skills from home, anywhere in the world

902 participants attended the special webinar series on SIB Resources targeting SARS-CoV-2.

Country of origin of participants in our online courses:

Algeria, Angola, Argentina, Australia, Austria, Bahrain, Bangladesh, Belgium, Bolivia, Brazil, Cameroon, Canada, China, Colombia, Croatia, Cuba, Cyprus, Czech Republic, Democratic People’s Republic of Korea, Denmark, Ecuador, Egypt, Estonia, Ethiopia, Finland, France, Germany, Greece, Hungary, India, Indonesia, Iran, Iraq, Ireland, Israel, Italy, Japan, Kenya, Kuwait, Luxembourg, Malaysia, Mali, Mauritius, Mexico, Morocco, Netherlands, Niger, Nigeria, North Macedonia, Norway, Pakistan, Panama, Peru, Poland, Portugal, Republic of Korea, Romania, Russian Federation, Saudi Arabia, Singapore, South Africa, Spain, Sudan, Sweden, Switzerland, Turkey, Uganda, United Arab Emirates, United Kingdom, United Republic of Tanzania, United States of America, Uruguay

“We already had e-learning modules but this was on a whole new level: we had to find solutions fast to provide practical and interactive training to teach how to analyse data, how to program and how to use SIB Resources.”

Patricia Palagi Team lead, Training group at SIB

Outreach resources to understand the biology of the coronavirus

No option for meeting the public in person? Not a problem: together with the University of Lausanne, our outreach team worked on a workshop for classrooms, and associated public resources, to offer the opportunity to discover the SARS-CoV-2 coronavirus, in particular its genome and proteins, and how they are being studied by scientists. One example of research highlighted is the work by Sigrist et al., who used the PROSITE resource to uncover the potential role of integrins as alternative binding targets for the virus (SEE P. 42). DOI: 10.1016/j.antiviral.2020.104759



“Coming together for the SIB Days 2020 clearly demonstrated our connections as a community, as well as showcasing how bioinformatics is both an invaluable part of lifesciences research and a scientific discipline in its own right.”

Robert Waterhouse Group Leader SIB, University of Lausanne Member of the SIB Days 2020 Scientific Committee

How are bioinformaticians facing the COVID-19 crisis? A virtual discussion at the SIB Days

The themes discussed included: biocuration, resource sustainability, training and open science. The event also highlighted the opportunities and challenges of communicating with the media during public health crises. “Phylogenies are ‘beautifully dangerous’,” said Emma Hodcroft, co-developer of Nextstrain. She explained how she and her colleagues developed written narratives on the resource website to better explain such complex concepts to a broad audience.

Virtual SIB Days

The first virtual edition of our internal conference (8-10 June) remained true to its ambition of representing the scientific diversity of Swiss bioinformatics and of showcasing the latest computational advances, from diseases to ecosystems. It brought together 390 SIB Scientists from across Switzerland, as well as international keynote speakers Victoria Nembaware (Sickle Africa Data Coordinating Centre) and Flora Graham (Nature Briefing).

SARS-CoV-2 spike glycoprotein complex, illustration. SARS-CoV-2 spike glycoprotein trimer (red, blue, yellow) complexed with neutralizing antibody EY6A Fab. The Fab light chain is shown in purple, the heavy chain in green.



“During such challenging times, open communication, a flexible attitude and supportive measures are all the more essential for keeping everyone’s spirits up. The SIB Staff Committee worked closely with the Executive Management, People & Culture and Communications & Scientific Events departments to ensure strong support for employees’ well-being.”

Gerardo Tauriello Staff Committee Team lead, Software development SIB, University of Basel


Around machine learning

Artificial intelligence, machine learning, deep learning… For some, these represent the most exciting fields of computer science. For others, they are just hype. Let’s have a look at how these techniques support the work of SIB Scientists, and vice versa.



In Swiss bioinformatics

Machine learning (ML) techniques have been used, developed and built on for decades by Swiss bioinformaticians, with applications in text mining, evolution, structural modelling and biocuration.

Examples of applications of ML techniques abound in Swiss bioinformatics and among SIB’s Groups: diagnosing diabetic retinopathy across different populations, identifying particularly aggressive cancers from scanned tissue slides, enabling early detection of neonatal jaundice, predicting affinity between bacteriophages and bacteria, and supporting literature triage for biocuration.

One message cuts across all these areas: the importance of good data science based on domain expertise, and of the interaction between human and machine intelligence.

Artificial intelligence: Mimicking the intelligence or behavioural pattern of humans or any other living entity.

Machine learning: A technique by which a computer can “learn” from data, without using a complex set of different rules. This approach is mainly based on training a model from datasets.

Deep learning: A recent technique building on multiple layers of artificial neural networks, inspired by our brain’s own network of neurones.

GENERIC DEFINITIONS OF AI, ML AND DL FROM EN.WIKIPEDIA.ORG/WIKI/DEEP_LEARNING – AND MODIFIED BY SIB'S CARLOS PENA FOR DL
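To make the machine learning definition concrete, here is a minimal sketch of what "training a model from datasets" can look like. It is an illustration only and does not represent any SIB tool: the toy measurements, the labels "low" and "high", and the function names `train` and `predict` are all invented for this example. The model (a nearest-centroid classifier) contains no hand-written decision rule; everything it knows comes from the labelled examples.

```python
# Toy illustration only: data, labels and function names are invented.
from statistics import mean


def train(samples):
    """Learn one centroid (mean feature vector) per label from labelled examples."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(model, features):
    """Return the label whose learned centroid is closest to the new sample."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical training set: two measurements per sample, one label each.
training_data = [
    ((1.0, 1.2), "low"),
    ((0.8, 1.0), "low"),
    ((3.1, 2.9), "high"),
    ((2.9, 3.3), "high"),
]

model = train(training_data)
print(predict(model, (1.1, 0.9)))  # closest to the "low" centroid
print(predict(model, (3.0, 3.0)))  # closest to the "high" centroid
```

Changing the training examples changes the predictions without touching the code, which is the practical meaning of learning from data. It also shows why reliable, well-curated input data matter so much.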



Biocuration feeding from and into AI: a virtuous circle

At Swiss-Prot, deep learning (DL) supports expert biocuration to accelerate literature triage and facilitate information extraction in human- and machine-readable forms. At the same time, expertly curated knowledgebases such as UniProtKB/Swiss-Prot provide key prior knowledge (i.e. a reliable training set) to feed DL algorithms in biology and medicine. Examples of applications include predicting protein function, structures, gene-disease links and protein-drug interactions. Expertly curated databases can even contribute to making ML models more explicable. Indeed, the complex biological information such databases contain (gene sets, pathways, ontologies and metabolic models) can be used to create interpretable models that reveal biological mechanisms (e.g. causality) as well as statistical associations (e.g. correlations). An example: doi.org/10.1038/nmeth.4627.

“Good data science is at the core of bioinformatics: strong scripting and statistical skills, combined with domain experts who have the substantive expertise to curate and make sense of the data. These are the key ingredients to ensure the trust of our end users, from clinicians to life scientists and chemists.”

Aitana Lebrand Team lead data science at SIB Co-chair of the virtual panel discussion

A virtual panel discussion to dive into the topic

How well does a model generalize beyond its training dataset? Do all ML models necessarily need to be explainable? How can trust from end users of ML-powered applications be fostered? Invited speakers presented use cases and perspectives on ML from biocuration, digital pathology, biomarker discovery and algorithm development. The speakers were: Alan Bridge, SIB Group Leader, Swiss-Prot; Andrew Janowczyk, Senior Research Scientist in the Precision Oncology Department (CHUV) and Senior Bioinformatician at SIB; Carlos Andrés Peña, SIB Group Leader in Computational Intelligence for Computational Biology (HEIG-VD); and Julia Vogt, SIB Group Leader, Medical Data Science (ETH Zurich).

At the SIB Days 2020, our biennial community conference, various pieces of work using ML were presented: detecting jaundice, pneumonia, influenza and colorectal cancer; ensuring the privacy of electronic health records; predicting hospital readmission; literature triage for variants; predicting protein function; and more.



Focus on biomedical applications

Clinicians and biomedical scientists increasingly see the value of ML applications in their daily work.

In the biomedical sciences, and precision medicine in particular, ML is becoming essential in prevention, diagnosis and treatment. It makes it possible, for instance, to integrate a large variety of data types (e.g. images from CT scans and text from clinical reports) used to characterize each patient, as well as to identify hidden patterns in the resulting high-dimensional dataset. These can be used as biomarkers that predict susceptibility to a disease or as a diagnostic aid. But ML is also used to explore the functional side of metabolic pathways in the context of drug repurposing. Find out more about some of the recent projects led at SIB on these fronts.

Identifying new therapeutic indications for a molecule

Addex Therapeutics and SIB’s Vital-IT and Computer-Aided Molecular Engineering groups (led by Mark Ibberson and Vincent Zoete, respectively) have been awarded an Innosuisse grant to apply computational approaches developed by SIB, including deep learning and molecular modelling, to identify new therapeutic indications for ADX10061, a potent and selective dopamine D1 receptor antagonist. Dopamine is a major neurotransmitter in the central nervous system and D1 receptors are believed to play an important role in the control of diverse aspects of brain function, including cognition, motivation, motricity, sleep, and memory.

Bringing AI to image analytics in cancer diagnosis

Characterizing the various cell types and morphological features present in a tumour can help clinicians with their prognosis and guide treatment decisions in cancer patients. SIB’s Clinical Bioinformatics group (led by Valérie Barbié), Lunaphore and the Geneva University Hospitals (HUG) are collaborating as part of an Innosuisse project to develop an integrative solution for the phenotypic analysis of tumours powered by automated multiplex staining, image analysis and machine learning. The solution will provide a major leap towards the analysis and generation of biomarkers for routine diagnostic usage in clinical pathology.

Fluorescence image of biomarkers from a microfluidic multiplex



Fighting drug-resistant bacteria with viruses

Using viruses (bacteriophages) that specifically infect and kill bacteria during their life cycle is a promising re-emerging approach to curing multi-drug resistant infections. In collaboration with the University of Lausanne and the Inselspital, the Computational Intelligence for Computational Biology group (led by Carlos Peña) has developed ML models for predicting optimal phage-bacterium interactions based only on genomic data from both organisms, thereby guiding personalized phage therapy.

Coloured transmission electron micrograph of a T4 bacteriophage (orange)

Predicting newborn babies’ risk of developing jaundice

The Medical Data Science group (led by Julia Vogt) has developed a model, and a prototype app for clinicians, that can provide an early prediction of a newborn baby’s risk of developing severe jaundice in the next two days. Knowing this risk could safeguard against discharging the mother and baby too early from the hospital. The tool requires the clinicians to type in only four variables, instead of the complete list of 45 surveyed ones: a key timesaver in an environment where every minute counts.

Newborn baby undergoing bili therapy, a blue light phototherapy used to combat neonatal jaundice (hyperbilirubinemia).

A smart summary of radiology reports to support diagnostics

Being able, from a large collection of text-based reports, to offer a therapeutic answer to new patient cases: this is the goal of one of the projects of the Text Mining group (led by Patrick Ruch), as part of SOCIBP-SPO, an SPHN Driver project to improve precision oncology care and clinical outcomes. To do this, the team is developing a multiclass categorization tool using deep learning to mine huge collections of past clinical text reports, identify therapeutic responses, generate hypotheses and ultimately support the understanding of cancer mechanisms.



INDEX OF SIB GROUP AND TEAM LEADERS As of 1 January 2021

NAME | FIELDS OF ACTIVITY | LOCATION

Ahrens Christian | Proteins and proteomes | Agroscope
Anisimova Maria | Evolution and phylogeny | Zurich University of Applied Sciences (ZHAW)
Arguello Roman | Evolution and phylogeny | University of Lausanne
Baerenfaller Katja | Proteins and proteomes | SIAF – University of Zurich
Bairoch Amos | Proteins and proteomes | University of Geneva
Barbié Valérie | Competence centres and core facilities | SIB
Bastian Frédéric | Evolution and phylogeny | University of Lausanne
Baudis Michael | Genes and genomes | University of Zurich
Beerenwinkel Niko | Evolution and phylogeny | ETH Zurich, D-BSSE
Bergmann Sven | Genes and genomes | University of Lausanne
Bitbol Anne-Florence NEW | Evolution and phylogeny | EPFL
Boeva Valentina | Systems biology | ETH Zurich
Borgwardt Karsten | Text mining and machine learning | ETH Zurich
Bridge Alan | Proteins and proteomes | SIB
Bruggmann Rémy | Competence centres and core facilities | University of Bern
Buljan Marija | Systems biology | Empa
Carmona Santiago NEW | Systems biology | University of Lausanne
Cascione Luciano | Competence centres and core facilities | Institute of Oncology Research
Cavalli Andrea | Structural biology | Università della Svizzera italiana
Chopard Bastien | Systems biology | University of Geneva
Ciriello Giovanni | Systems biology | University of Lausanne
Correia Bruno | Structural biology | EPFL
Crameri Katrin | Competence centres and core facilities | SIB


Dal Peraro Matteo | Structural biology | EPFL
Delaneau Olivier | Genes and genomes | University of Lausanne
Delorenzi Mauro | Competence centres and core facilities | University of Lausanne
Deplancke Bart | Genes and genomes | EPFL
Dermitzakis Emmanouil | Genes and genomes | University of Geneva
Dessimoz Christophe | Evolution and phylogeny | University of Lausanne
Excoffier Laurent | Evolution and phylogeny | University of Bern
Falquet Laurent | Genes and genomes | University of Fribourg
Fellay Jacques | Genes and genomes | EPFL
Gfeller David | Proteins and proteomes | University of Lausanne
Gonnet Gaston | Evolution and phylogeny | ETH Zurich
Goudet Jérôme | Evolution and phylogeny | University of Lausanne
Ibberson Mark | Competence centres and core facilities | SIB
Iber Dagmar | Systems biology | ETH Zurich, D-BSSE
Ivanek Robert | Systems biology | University of Basel & University Hospital Basel
Kahraman Abdullah NEW | Competence centres and core facilities | University Hospital Zurich
Kriventseva Evgenia | Genes and genomes | University of Geneva
Kutalik Zoltán | Genes and genomes | University of Lausanne


Lane Lydie | Proteins and proteomes | University of Geneva
Lisacek Frédérique | Proteins and proteomes | University of Geneva
Malaspinas Anna-Sapfo | Genes and genomes | University of Lausanne
Mazza Christian | Systems biology | University of Fribourg
Michielin Olivier | Structural biology | University of Lausanne
Miho Enkelejda | Systems biology | FHNW University of Applied Sciences and Arts Northwestern Switzerland
Milinkovitch Michel | Systems biology | University of Geneva
Mitri Sara | Evolution and phylogeny | University of Lausanne
Neher Richard | Evolution and phylogeny | University of Basel
Palagi Patricia | Training | SIB
Panse Christian NEW | Competence centres and core facilities | ETH Zurich
Payne Joshua | Evolution and phylogeny | ETH Zurich
Pedrioli Patrick | Proteins and proteomes | ETH Zurich
Peña-Reyes Carlos-Andrés | Text mining and machine learning | HEIG-VD
Pivkin Igor | Systems biology | Università della Svizzera italiana
Rätsch Gunnar | Text mining and machine learning | ETH Zurich
Rehrauer Hubert | Competence centres and core facilities | ETH Zurich, University of Zurich
Riedi Marcel | Competence centres and core facilities | University of Zurich
Rinaldi Fabio | Text mining and machine learning | SUPSI
Rinn Bernd | Competence centres and core facilities | ETH Zurich, D-BSSE
Robinson Mark | Genes and genomes | University of Zurich
Robinson-Rechavi Marc | Evolution and phylogeny | University of Lausanne
Ruch Patrick | Text mining and machine learning | HES-SO - Geneva School of Business Administration (HEG)


Schütz Frédéric NEW | Competence centres and core facilities | University of Lausanne
Schwede Torsten | Structural biology, Competence centres and core facilities | University of Basel
Sengstag Thierry | Competence centres and core facilities | University of Basel
Snijder Berend | Systems biology | ETH Zurich
Stadler Michael | Genes and genomes | Friedrich Miescher Institute for Biomedical Research
Stadler Tanja | Evolution and phylogeny | ETH Zurich, D-BSSE
Stekhoven Daniel | Competence centres and core facilities | ETH Zurich
Stelling Jörg | Systems biology | ETH Zurich, D-BSSE
Stockinger Heinz | Competence centres and core facilities | SIB
Sunagawa Shinichi | Genes and genomes | ETH Zurich
van Nimwegen Erik | Genes and genomes | University of Basel
Vogt Julia | Text mining and machine learning | ETH Zurich
von Mering Christian | Proteins and proteomes | University of Zurich
Wagner Andreas | Evolution and phylogeny | University of Zurich
Waterhouse Robert | Genes and genomes | University of Lausanne
Wegmann Daniel | Evolution and phylogeny | University of Fribourg
Wollscheid Bernd | Proteins and proteomes | ETH Zurich
Zavolan Mihaela | Systems biology | University of Basel
Zdobnov Evgeny | Genes and genomes | University of Geneva
Zoete Vincent | Structural biology | University of Lausanne



ACKNOWLEDGEMENTS

We gratefully acknowledge the following funders, sponsors and partners for their financial support and encouragement in helping us fulfil our mission in 2020.

The Swiss government and in particular:
The State Secretariat for Education, Research and Innovation SERI
The Swiss National Science Foundation (SNSF)
Innosuisse

Our institutional partners
The European Commission
The Leenaards Foundation
The National Institutes of Health (NIH)
The Research for Life Foundation

We also thank all industrial and academic partners who trust SIB’s expertise.

IMPRESSUM

© 2021 – SIB Swiss Institute of Bioinformatics

ILLUSTRATION BY
Aurel Märki, aurelmaerki.ch

DESIGN AND LAYOUT BY
Bogsch & Bacco, bogsch-bacco.ch

IMAGE CREDITS (from top to bottom and from left to right)
P. 2 Nicolas Righetti, lundi13.ch
P. 3 Nicolas Righetti, lundi13.ch
P. 7 Franziska Gruhl - SIB. All rights reserved; Franziska Gruhl - SIB. All rights reserved; Sutthaburawonk / iStock; Fabio Rinaldi - SIB. All rights reserved
P. 13 Valentin Luggen
P. 17 ENYO Pharma
P. 18 Nicolas Righetti, lundi13.ch
P. 19 Nicolas Righetti, lundi13.ch
P. 22 Anya Ivanona / Shutterstock
P. 23 Wikimedia; Morphart Creation / Shutterstock
P. 30 UbjsP / Shutterstock
P. 31 Alfred Pasieka / Science Photo Library; Tek Image / Science Photo Library
P. 32 CC BY-NC-ND 4.0 | SwissBiopics, SIB
P. 33 nextstrain.org, CC BY 4.0; Matis75 / Shutterstock
P. 43 Modified from string-db.org/cgi/covid.pl, CC BY 4.0
P. 46 Felix Imhof
P. 47 Laguna Design / Science Photo Library; University of Basel
P. 51 Stéphane Praz
P. 52 Modified from Migliozzi D et al. in Microsystems & Nanoengineering 2019, doi.org/10.1038/s41378-019-0104-z, CC BY 4.0
P. 53 Biozentrum, University of Basel / Shutterstock Keystone




SIB Swiss Institute of Bioinformatics Quartier Sorge Bâtiment Amphipôle CH – 1015 Lausanne T. +41 21 692 40 50 www.sib.swiss

