SIB Profile 2021
Data scientists for life
SPECIAL FOCUS
AROUND COVID-19 AROUND MACHINE LEARNING
COVER IMAGE
A different view of the pandemic
Frequencies of SARS-CoV-2 variants, with a focus on sequences collected in Switzerland* (i.e. about two thirds of all the sequences in the analysis as of 6 April) since the beginning of the pandemic. Each variant is displayed in a different colour, and their respective frequency in the analysed samples varies in time, from left (older) to right (most recent).
* Data from Switzerland were contributed by a range of groups (see the full updated list on nextstrain.org/groups/swiss/ncov/switzerland), including the Computational Evolution group (SIB & ETH Zurich), and processed with the SIB Resource V-pipe developed by the Computational Biology group (SIB & ETH Zurich).
Forewords
COVID-19 has taught us the hard way how crucial the readiness of a sustained, long-term funded life-science infrastructure is. The more stable the infrastructure, the more agile the research community and its ability to quickly tackle major questions. SIB has been an actor and advocate for infrastructure sustainability for decades and has been building and consolidating partnerships to this end on a national and international level. In this defining year for society and science, the institute showed its ability to build upon the existing to adapt to new needs from the scientific community, and to position itself as an essential coordinator to accelerate research, in Switzerland as well as on the international scene. Never has such a neutral, nationwide and agile structure proved so important for addressing the challenges of today and tomorrow.
Felix Gutzwiller President of the Foundation Council
Alongside this year’s overwhelming theme, SIB kept on producing exciting results on the machine learning (ML) front, from research to clinical applications. ML has been part of the routine bioinformatics data analysis toolkit for decades, supporting research in drug repurposing, biomarker discovery, etc. Today, the need for expertly curated databases, where SIB is a recognized leader, is also increasingly acknowledged worldwide in the context of artificial intelligence (AI). Indeed, AI cannot offer meaningful results without reliable input data: garbage in, garbage out. The creation of high-quality databases thus offers a robust basis for AI. As more biomedical data become available to research, such as through the Swiss Personalized Health Network (SPHN), such skills and the underlying knowledge are all the more crucial to ensuring that AI delivers the expected benefits for clinicians, and ultimately citizens.
Jérôme Wojcik Chairman of the Board of Directors
From many points of view, 2020 was the worst of years. A year where hugging our loved ones was no longer a sign of care. A challenging year, where things as natural as meeting colleagues had to be reinvented. But it was also the best of years, when we enjoyed the flexibility of working from home and explored the regions we each live in.
On the scientific side, bioinformatics was suddenly everywhere and the Swiss National COVID-19 Science Task Force relied on the expertise of several of our Group Leaders. The international research community was more reliant than ever on our databases and tools to uncover insights about the virus. Together with our partners, we established the Swiss SARS-CoV-2 Data Hub to contribute to global data-sharing efforts. The crisis also brought new ideas and formats: thanks to a modular and interactive online concept, our community conference, the SIB Days, remained true to its ambition of representing the scientific diversity of Swiss bioinformatics, and attracted a record number of participants. On the learning front, our virtual bioinformatics training offer reached new and more international audiences.
Alongside all these changes, there was nevertheless continuity in the battles and activities that make up the core of our institute. SIB continued to lend its expertise to support a sustainable biodata infrastructure, on the Swiss and global levels. It kept tailoring the secure BioMedIT network to fit the needs of research groups in the context of the Swiss Personalized Health Network (SPHN). In brief, scientific excellence did not flinch, despite the pressure.
Once more, we must pay tribute to the hard work, trust and dedication of our teams and colleagues, which reinforces our enthusiasm in steering our institute forward. We also thank the Federal government for its essential support, through the State Secretariat for Education, Research and Innovation (SERI): we are pleased to be able to convert this support into concrete benefits for society, through the work of our 800 members.
Christine Durinx Joint Executive Director
Ron Appel Joint Executive Director
Table of contents
06 Bioinformatics: a definition
08 Converting biological questions into answers
10 Data scientists for life
12 SIB in brief
14 Supporting our partners’ needs
20 For a lasting life-science infrastructure
26 A network of scientific expertise
34 Organization and governance
40 Around COVID-19
42 Supporting international research
44 A transformative impact
48 Around machine learning
50 In Swiss bioinformatics
52 Focus on biomedical applications
55 Index of Group and Team Leaders
59 Acknowledgements
Bringing bioinformatics to society
From precision medicine to drug design and DNA testing: bioinformatics is increasingly tied to health and societal issues. Through its public outreach activities, SIB informs the public about the discipline and its applications. 2020 was an unusual year in terms of events but no opportunity was missed to create ways to reach the public:
• An e-workshop to understand SARS-CoV-2 (SEE P. 45);
• A new version of ChromosomeWalk.ch to explore the world of biomedicine with 300 fascinating gene stories and an illustrated tour of precision medicine;
• The Protein Spotlight stories revisited as monthly comic strips, to discover the role proteins play in the grand scheme of things.
More activities and news on Facebook, our dedicated outreach channel in French and English: goo.gl/4c6xCZ

Bioinformatics is the application of computer technology to the understanding and effective use of biological and clinical data
Bioinformatics: a definition
Thanks to computer-based approaches, researchers can improve their understanding of complex systems.
Life scientists and clinicians have always tried to assemble data and evidence to find the right answers to fundamental questions. Nowadays, there is no shortage of data. But a different kind of problem has emerged. New technologies are producing data at an unprecedented speed. Indeed, so much data – and of such variety – that they can no longer be interpreted by the human mind alone. Enter bioinformatics.
Bioinformatics is the application of computer technology to the understanding and effective use of biological and biomedical data. It is the discipline that stores, analyses and interprets the big data generated by life-science experiments, or collected in a clinical context. This multidisciplinary field is driven by experts from a variety of backgrounds: biologists, computer scientists, mathematicians, statisticians and physicists.
Bioinformatics encompasses:
• DATABASES for storing, retrieving and organizing information to maximize the value of biological data;
• SOFTWARE TOOLS for modelling, visualizing, interpreting and comparing biological data;
• ANALYSIS of complex biological datasets or systems using novel statistical approaches or machine learning techniques;
• RESEARCH in a wide variety of biological fields and leading to applications in diverse areas, from agriculture to precision medicine (SEE P. 36);
• COMPUTING AND STORAGE INFRASTRUCTURE to process and safeguard large amounts of data.
What sort of data are we talking about?
Bioinformatics deals with a broad spectrum of complex data types.
• Sequence data from DNA, RNA or proteins
• Expression data, such as the level of expression of a gene in a sample
• Imaging data
• Text
• And more...
Converting biological questions into answers
Life sciences and health actors: hospitals and clinics; research institutes; private sector; international institutions.
Biological questions: massive amount of data and data types (genetics, text, biochemical, imaging, etc.).
SIB Swiss Institute of Bioinformatics, dedicated multidisciplinary experts: secure services for sensitive data; data management; software engineering and tailoring; biostatistics and bioinformatics analysis; process optimization; training; expert biocuration.
Answers with various applications: basic research; medicine; environmental sciences; agriculture.
Examples: tailoring treatment to cancer patients; understanding the origin of beetle diversity; real-time tracking of pandemics.
Data scientists for life
This is who we are: multidisciplinary experts safeguarding data, sharing their value and making them speak to solve biological questions.
DATA SCIENTISTS FOR LIFE - AROUND COVID-19 - AROUND MACHINE LEARNING
SIB in brief
As a non-profit foundation, we lead the field of bioinformatics in Switzerland, in order to foster advances in life sciences and health.
As of 1 January 2021:
• 82 groups
• 784 members, including 189 employees
• 24 institutional partners across Switzerland
• Over 160 databases and software tools developed by our members and accessible via the Expasy web portal
• Over 3,225 peer-reviewed articles published since SIB’s creation in 1998

Infrastructure
SIB provides the national and international life-science community with a state-of-the-art bioinformatics infrastructure, including resources, collaborative support and services.
DATABASES AND SOFTWARE TOOLS
We create, maintain and disseminate worldwide a large portfolio of databases and software tools, including some of the world-leading resources for life sciences, enabling researchers to leverage knowledge about life and foster innovations.
COMPETENCE CENTRES
We offer in-depth expertise and support in bioinformatics, from secure infrastructure for sensitive data and analyses of all kinds of biological data to software development and data management.

Community
SIB brings together world-class researchers based in Switzerland and delivers training in bioinformatics.
SCIENTIFIC COLLABORATION
Through knowledge exchange networks, collaborative projects and events, we strengthen cooperation on shared issues among bioinformatics research and service groups from Swiss schools of higher education and research institutes.
TRAINING IN BIOINFORMATICS
To ensure that life scientists and clinicians make the best of the data, we provide them with a large portfolio of courses and workshops. We also foster exchanges and training among bioinformatics and computational biology PhD students.
A FEW WORDS FROM MEMBERS OF THE BOARD OF DIRECTORS

Katja, you joined SIB’s Board of Directors (BoD) in January 2021. How important is the institute to the Swiss and global biodata landscape?
Katja Bärenfaller: My enthusiasm for SIB was certainly the major motivation to join it as a Group Leader and to become a member of its BoD. While working on proteomics and data mining, I realized, for instance, how essential well-curated and accessible datasets and data analysis tools, such as those supported by SIB, are. The huge value of the institute has also become evident in the COVID-19 pandemic, with SIB Experts and Resources taking part in the global effort (SEE P. 42). Since the expertise and the resources were already available, they could rapidly focus on this crucial topic.

As one of 15 female Group Leaders at SIB, do you feel you are carrying more than the voice of Group Leaders on the BoD?
KB: It is important to challenge the persisting stereotype of male bioinformaticians, and to address the gender gap: I therefore find it very positive that both the Executive Directors and the Group Leaders’ representatives on the BoD are gender-balanced. In addition to carrying the voice of female Group Leaders, I am very motivated to strengthen the national relevance of SIB and its role and visibility in German-speaking Switzerland. Finally, I would also like to be a voice for different types of biodata and resources, since I used to work in plant science and was involved in the struggle to maintain model organism information resources.
From five groups in 1998 when SIB was created, to 82 today. Christophe, what do you foresee for the years to come in terms of new directions and challenges to be addressed by SIB?
Christophe Dessimoz: SIB will need to double down on its unique strengths, which I believe are 1) to provide, through globally recognized databases and resources, gold-standard data that are critical to serve as a quality source of datasets for machine-learning algorithms; 2) to bring bioinformatics resources and research together at a national and international level and continue to serve as a model in that respect; 3) to contribute to establishing standards of excellence in bioinformatics research, service, and infrastructure through its community activities and the collaborative projects it fosters, such as SVIP-O or BioMedIT (SEE P. 21).

What are some of the key initiatives undertaken to maintain a sense of belonging among SIB’s members?
CD: First, the systematic and proactive approach in terms of internal communication, both with members and employees, which has been greatly strengthened over recent years. Second, the SIB-wide events with a social dimension, and in particular the biennial “SIB Days” (SEE P. 46). The most recent edition, for instance, even though it was held virtually, saw a record attendance. Initiatives benefitting all members in terms of promoting scientific visibility and excellence are also key, such as the SIB Remarkable Outputs (SEE P. 32) or the Bioinformatics Awards. Finally, initiatives dedicated to specific interest groups, such as the Dev’Forums* or the SIB PhD Network.
* Dev’Forums are technical, informal and short meetings to promote networking and experience exchanges across the community of SIB Developers
Katja Bärenfaller Group Leader SIB, Swiss Institute of Allergy and Asthma Research (SIAF) Davos
Christophe Dessimoz Group Leader SIB, University of Lausanne, University College London Lausanne
Supporting our partners’ needs
Discover our competences, explore our portfolio of leading databases and software and meet our experts.
EXPERT BIOCURATION
Generating high-quality, up-to-date annotations
Our biocuration experts excel in the art of generating knowledge from a growing body of publications. We provide expert biocuration on various data types including proteomics, lipidomics and transcriptomics. This includes help with setting up expert-annotated resources for a wide range of applications, such as understanding protein function, facilitating clinical interpretation of cancer variants or enabling biomarker discovery.

DATA MANAGEMENT
Organizing data for long-term reuse
We assist our partners by defining and implementing their Data Management Plans (DMP) for research proposals; reaching data interoperability targets, from local to international scales, within academic or regulated environments; ensuring the long-term management and storage of biological data.

SOFTWARE ENGINEERING AND TAILORING
Developing engaging and customized tools
Our software engineers and User eXperience specialists contribute to some of the world-leading databases and tools for life sciences, as well as tailored applications for personalized medicine in hospitals or industry settings. They assist our partners in creating user-friendly software – or adapting existing products to meet specific needs – based on the most up-to-date web technologies.
SECURE SERVICES FOR SENSITIVE DATA
Dedicated, secure IT environment to process sensitive human data
Our encrypted information technology infrastructure complies with all current data protection regulations (incl. GDPR) and IT security standards. Our partners can therefore process and host both sensitive data – such as genomic information or health records – and non-sensitive data – with complete confidence. We use modern virtualization technologies such as OpenStack to protect our computing environments.
The national BioMedIT network, set up by SIB and operational at all three nodes in Basel (sciCORE, operated by the University of Basel), Lausanne (Core-IT, operated by SIB) and Zurich (SIS, operated by ETH Zurich), allows researchers to approach national collaborative projects involving human health data with trust and ease.

TRAINING
Boosting bioinformatics skills
Our comprehensive – and constantly evolving – course portfolio provides hands-on experience of the most up-to-date bioinformatics techniques and resources, including clinical applications for researchers or healthcare professionals. We offer about 100 course-days per year, making up over 50 courses provided by 80 trainers to 1,200 participants.
Find the full list of courses at sib.swiss/training
Who takes part in SIB courses?
• 33% PhD candidates
• 25% Postdoctoral researchers
• 20% Senior scientists / Principal investigators
• 10% Other scientists
• 9% Research assistants / Technicians
• 3% Master's students

PROCESS OPTIMIZATION
Gaining efficiency, from analysis pipelines to quality control
Our experts harmonize and optimize internal data management processes, analysis pipelines or software tools. We also organize and support clinical benchmarking activities with Swiss or international partners.

BIOSTATISTICS AND BIOINFORMATICS ANALYSIS
Making biological data speak
Our specific areas of expertise include: biomarker identification; de novo assembly of sequencing data; genome comparative data analysis; targeted, exome and whole genome sequencing analysis; metagenomic data analysis; omics analysis; data integration; gene prediction and annotation; machine learning.

Our approach: collaborative, independent and reliable
From one-off services to long-term collaborations, we turn our in-depth expertise into integrated solutions, in line with your goals and regulatory requirements, to make your projects a reality.
SOME OF OUR LATEST COLLABORATIONS
Find out how we put our expertise into practice through these recent partnerships with healthcare providers, pharma, start-ups and international consortia.

Towards improved obesity treatment with an international research consortium
The international, public-private research consortium ‘SOPHIA’ (Stratification of Obese Phenotypes to Optimize Future Obesity Therapy) aims to improve risk assessment of the complications of obesity and predict treatment response for people with obesity. Our Vital-IT and Statistical Genetics Groups, led by Mark Ibberson and Zoltán Kutalik respectively, are lending data management and analytical expertise to the project.

Taking part in External Quality Assessments (EQAs) with QCMD
SIB became a new bioinformatics partner of the Quality Control for Molecular Diagnostics (QCMD) organization. The partnership, spearheaded at SIB by the Clinical Bioinformatics group, led by Valérie Barbié, will expand the bioinformatics analytical pipeline on the data submitted by international laboratories that are aiming to participate in certified EQAs as part of their accreditation processes. It will initially focus on the development of new computational tools to support nucleic acid sequence data analysis in the context of viral metagenomics and drug resistance.
“The combination of QCMD’s External Quality Assessments expertise and global network with SIB’s bioinformatics knowhow will help shape the future of our quality assessment schemes in rapidly developing areas of molecular diagnostics.”
Elaine McCulloch Technical Project Manager at QCMD
Rolling out the cancer diagnostics platform in its latest version with HUG
In its latest version, OncoBench® – a cancer diagnostics platform developed jointly by the Geneva University Hospitals (HUG) and SIB* – offers new key molecular insights. In addition to guiding the interpretation of sequencing data from patients, it now enables clinicians to analyse variations in the number of copies of particular genes (CNVs) – information that is linked with several cancer therapies.
*Clinical Bioinformatics group, led by Valérie Barbié and Vital-IT group, led by Mark Ibberson
“We are now able to quickly and reliably identify the relevant CNVs and integrate them with the mutation data. This is a real breakthrough that paves the way towards a true integrative analysis of a patient’s molecular data.”
Yann Christinat Clinical Bioinformatician at HUG Clinical Pathology Division

Making human protein interaction data publicly available with ENYO Pharma SA
Understanding how proteins interact with each other offers key insights into a range of physiological and pathological processes. New knowledge on this topic has been made publicly available and FAIR via the SIB Resource neXtProt, thanks to a collaboration between the CALIPHO group, co-led by Amos Bairoch and Lydie Lane, and ENYO Pharma SA, based in Lyon, France.
“We made the protein-protein interaction data from ENYO easily discoverable to all, through the fantastic knowledge platform dedicated to human proteins that is neXtProt.”
Laurène Meyniel-Schicklin co-founder of ENYO
Launching a new drug discovery programme with Cellestia Biotech
The programme, which addresses unmet medical needs in cancer, autoimmune and inflammatory disorders (AIIDs), is being launched in collaboration with the Computer-Aided Molecular Engineering group, led by Vincent Zoete.

and also
• Repurposing a dopamine antagonist with Addex therapeutics
• Bringing AI to image analytics in cancer diagnosis with Lunaphore and HUG
Read more about these two examples in our special focus on machine learning P. 52-53
Representation of the human interactome integrated into neXtProt. In red, proteins with knowledge brought by ENYO Pharma
MEET OUR TEAMS
Our Internal Groups, composed of and headed by SIB Employees, harness their complementary expertise to collaborate with external partners and other SIB Groups on a daily basis.
Clinical Bioinformatics
Valérie Barbié
“We support health professionals from hospitals and the pharma industry to make the most out of an exponential flow of data, in order to enhance diagnostics and to foster optimal patient care and well-being. We do this through dedicated tools and methods, benchmarking and practice harmonization.”
EXAMPLES
Diagnostic applications (cancer, genetic diseases, etc.) for the medical and pharmaceutical domain. Collaborative databases to enable data sharing for research or clinical purposes.
TAGS personalized medicine; oncology; infectious disease; human genetics; interoperability; medicine and health; outreach; training

Core-IT
Heinz Stockinger
“With our Sensitive Data Processing Platform, we support researchers to extract knowledge from biomedical human data for the benefit of society. We enable them to work with such data in a lawful and efficient way by using leading technologies and building on our expertise in IT, data protection and information security.”
EXAMPLE
Acting as a secure gateway to national (e.g. the Swiss Personalized Health Network via BioMedIT) and international (e.g. Innovative Medicines Initiative) data networks.
TAGS data protection; information security; high-performance computing; interoperability; medicine and health; personalized medicine; training; software engineering; infrastructure provision

Personalized Health Informatics
Katrin Crameri
“We are convinced that in order to ensure high-quality care and patient safety in the long term, healthcare and research must go hand-in-hand. In the context of the Swiss Personalized Health Network, we are thus making health-related data in Switzerland available to research in a lasting way. This is done through their FAIRification* and by building the national secure IT infrastructure BioMedIT.”
EXAMPLE
BioMedIT, the national infrastructure for the secure handling of health data, which can be jointly used by Swiss universities, research institutions and hospitals.
TAGS information security; interoperability; medicine and health; personalized medicine; training
Some of our most taught topics and skills in 2020: Statistics; Single-cell techniques; Enrichment Analysis; Databases; Coding practices; Bgee; COVID-19; Machine learning; R; Python; NGS; Reproducible research; CRUNCH; RNA-Seq; Computational biology; Data analysis; Data management; Proteins and proteomes; Data visualization.
Swiss-Prot
Alan Bridge
“As a competence centre for biocuration and knowledge management, we develop, annotate and maintain internationally renowned knowledge resources such as UniProtKB/Swiss-Prot. Our resources provide an essential framework for biological data science – supporting integrated analyses of genomic, proteomic and metabolomic data to promote human health and wellbeing.”
EXAMPLES
Some of our flagship resources include: UniProtKB/Swiss-Prot, ENZYME, Rhea, SwissLipids, HAMAP, PROSITE and ViralZone.
TAGS database curation; proteins and proteomes; systems biology; biochemistry; ontology; lipidomics; metabolomics; proteomics; semantic web

Training
Patricia Palagi
“Thanks to the unique pool of leading experts making up SIB’s scientific network, we are able to provide a rich nationwide training offer across the spectrum of bioinformatics techniques, methods and tools, and thus support the fast-evolving needs of researchers.”
KEY FIGURES 2020
• 18 SIB Groups engaged in teaching activities
• 78 experts and trainers

Vital-IT
Mark Ibberson
“As both computational biologists and software developers, we understand data and how to manage them, as well as the underlying biological questions. Our focus is on finding innovative approaches to data analysis, such as overcoming constraints related to sensitive data.”
EXAMPLE
Setting up a federated data analysis system across several countries to enable access to large patient cohorts while addressing legal, ethical and FAIR principles*.
TAGS structural biology; systems biology; machine learning; data management; mass spectrometry; next-generation sequencing; data mining; genome reconstruction; software engineering; personalized medicine

*FAIR: a set of guiding principles to improve the Findability, Accessibility, Interoperability, and Reuse of digital assets
For a lasting life-science infrastructure
A lasting, agile and high-quality bioinformatics infrastructure is essential to support research. Offering such resources is at the core of SIB’s mission.
SPHN-funded projects (including SVIP-O, see below), covering the full spectrum of topics and issues relating to health data digitalization. Read more: sphn.ch/network/project-overview
Project categories shown: patient or citizen oriented; biomedical platforms; ethics, legal, security, patient information; national repositories, technology and analytical networks; bioinformatics, medical informatics, big data analytical platforms.
Projects shown: SOIN, Swiss PKcdw, Governance, SPO, E-Consent, DeID, SACR, PSSS*, Pediatrics, SVIP-O, C3-StuDY, Frailty, SHFN*, PRECISE*, Immune-repertoire, QA4IQI*, SwissGenVar, CREATE PRIMA, SwissBioRef, SOCIBP*, IMAGINE, MedCo*, NLPforTC, L4CHLAB.
*project co-funded by SPHN and PHRT (Personalized Health and Related Technologies)

Improving knowledge discovery with the new Expasy.org, the Swiss bioinformatics resource portal
Created in 1993, Expasy, the SIB bioinformatics resource portal, underwent a major overhaul in 2020. Designed as a discovery tool connecting over 160 Swiss-made bioinformatics databases and software, including leading resources of particular importance to the life-science community, it is the ideal tool to support research and teaching in genomics, proteomics, structural biology, evolution, systems biology or text mining. (SEE P. 32)
15% increase* in usage figures since the launch of the new Expasy.org
* as compared with the same period the previous year

Co-piloting the set-up of an international coalition to ensure the sustainability of essential biological databases
Biological databases have become an intrinsic part of the life scientist’s toolkit and of biotechnological discoveries. However, an ELIXIR study* revealed a general lack of long-term funding for essential biodata resources in Europe. To ensure that such crucial infrastructure remains freely available to researchers around the globe, funding needs to be provided and coordinated on the global scale. SIB is co-piloting the set-up of the Global Biodata Coalition created to this end and gathering international research funders.
*DOI: 10.1093/bioinformatics/btz959

Building a lasting infrastructure to boost personalized health research
This is SIB’s mandate in the context of the Swiss Personalized Health Network (SPHN). In 2020, SIB continued to tailor the secure IT infrastructure network, BioMedIT, to provide all authorized researchers in Switzerland with easy access to collaborative analysis of confidential data. It also launched, in collaboration with ETH Zurich and HES-SO, the first public version of the nationwide collaborative platform to offer clinicians a harmonized interpretation of cancer variants (SVIP-O).

An active part in tomorrow’s Swiss research infrastructure landscape for biology
What will the landscape look like for biology research infrastructure in 2025-2028? To answer this question, the Swiss Academy of Sciences (SCNAT) – appointed by the SERI – has formed a “Roundtable for Research Infrastructures in Biology (RoTaBio)” and has invited biologists from all fields and regions across Switzerland to participate in the development of a roadmap. Four infrastructure resources were identified in the course of an 18-month, bottom-up process starting in July 2019, in which SIB actively participated.

Two resources receive the ‘highest quality, sustainability and reliability label’
Cellosaurus – a cell lines encyclopedia – developed by Group Leader Amos Bairoch at the University of Geneva in the context of the CALIPHO group, and Rhea – a biochemical reactions knowledgebase – developed by the Swiss-Prot group led by Alan Bridge, joined the portfolio of ELIXIR Core Data Resources. This international recognition combines European data resources of fundamental importance to the wider life-science community and the long-term preservation of biological data.
“After UniProtKB/Swiss-Prot and STRING, Cellosaurus and Rhea are the third and fourth Swiss-made resources to be added to the ELIXIR Core Data Resource portfolio: this is a key step in ensuring their sustainability and long-term access to high-quality biological data for users from around the world.”
Christine Durinx SIB Executive Director
DISCOVER OUR TOOLS AND DATABASES
To unlock life’s mysteries, scientists rely on a range of resources. Among the 160 databases and software tools developed by SIB Groups, 11 have been identified by our Scientific Advisory Board as of particular importance for life science. These benefit from SIB’s specific support. Discover some stories that exemplify their impact.
Resource categories: Genes and genomes, Proteins and proteomes, Lipids, Structural biology

EPD
Eukaryotic Promoter Database
TYPE Knowledgebase with expert curation and software tools
DESCRIPTION Quality-controlled information on experimentally defined promoters of higher organisms, as well as web-based tools for promoter analysis.
HIGHLIGHT Over 190,000 promoters downloadable, analysable over a web interface and viewable in the UCSC genome browser.
STORY Using EPD to design a gene therapy to fight blindness in mice
By activating silent genes to compensate for defective ones using a CRISPR-Cas9 approach, scientists could improve retinal function and attenuate retinal degeneration in mice. EPD was used to identify relevant promoter regions in the DNA and to design the synthesis of the guiding element of the CRISPR-Cas9 system.
DOI: 10.1126/sciadv.aba5614

Bgee
Gene expression expertise
TYPE Knowledgebase with expert curation and software tool
DESCRIPTION Gene expression data including all types of transcriptomes, allowing retrieval and comparison of expression patterns between animals, human, model organisms and diverse species of evolutionary or agronomical relevance.
HIGHLIGHT Only resource to provide homologous gene expression comparisons between species.
STORY Using Bgee to help identify clinically relevant variants in patients’ genomes
In a recent study, Bgee was used to map conserved sequences across vertebrates, thereby hinting at their functional role: this will help to identify clinically relevant variants in patients’ genomes in the future.
DOI: 10.1093/nar/gkz1199
SwissDrugDesign
Widening access to computer-aided drug design
TYPE Software tools
DESCRIPTION Web portal of computer-aided drug design tools, from molecular docking (SwissDock) to pharmacokinetics and drug-likeness (SwissADME), through virtual screening (SwissSimilarity), lead optimization (SwissBioisostere) and target prediction of small molecules (SwissTargetPrediction).
HIGHLIGHT Comprehensive and integrated web-based drug design environment.

SwissRegulon Portal
Tools and data for regulatory genomics
TYPE Software tools and knowledgebases
DESCRIPTION Web portal for regulatory genomics, including genome-wide annotations of regulatory sites and motifs, the web server ISMARA for automated inference of regulatory networks, CRUNCH for automated analysis of ChIP-seq data, and REALPHY for reconstructing phylogenies from raw sequence data.
HIGHLIGHT Allows users to upload raw microarray, RNA-seq or ChIP-seq data to automatically infer the core regulatory networks acting in their system of interest.

UniProtKB/Swiss-Prot
Protein knowledgebase
TYPE Knowledgebase with expert curation
DESCRIPTION Hundreds of thousands of protein descriptions, including function, domain structure, subcellular location, post-translational modifications and functionally characterized variants.
HIGHLIGHT Expert-curated part of UniProt, the most widely used protein information resource in the world, with over six million pageviews per month. An ELIXIR Core Data Resource.
STORY Using UniProtKB/Swiss-Prot to investigate schizophrenia
A recent study used UniProt and Rhea, also developed by the Swiss-Prot group and now an ELIXIR Core Data Resource (SEE PREVIOUS PAGE), to identify potential metabolites as early biomarkers for neurodevelopmental defects, and therapeutic targets for schizophrenia.
DOI: 10.1038/s42003-020-01124-8

V-pipe
Viral genomics pipeline
TYPE Software tool
DESCRIPTION Pipeline integrating various open-source software packages for assessing viral genetic diversity from next-generation sequencing data.
HIGHLIGHT Enabling reliable and comparable viral genomics and epidemiological studies and facilitating clinical diagnostics of viruses.
STORY Using V-pipe to enable early-warning detection of SARS-CoV-2 variants in wastewater
Detecting evidence of variants of concern, often earlier than in clinical samples, is made possible by a collaborative effort led by the Computational Biology group developing V-pipe. This study has generated interest from researchers worldwide seeking to replicate the approach locally, including from wastewater analysis projects in Austria, Germany, Greece and, indeed, the International Water Association. The results were published in a preprint that was made public in January 2021.
DOI: 10.1101/2021.01.08.21249379v1

SwissOrthology
One-stop shop for orthologs
TYPE Phylogenomics databases and software tools
DESCRIPTION Web portal of resources to infer orthologs, i.e. corresponding genes across different species, a key aspect to predicting gene function or reconstructing species trees. It includes OrthoDB, BUSCO as well as OMA and the Quest for Orthologs benchmark service.
HIGHLIGHT World-leading orthology and comparative genomic resources.
STORY Using OrthoDB to study the evolution of parental care in fish
Little is known about the changes that take place in the brain when it comes to fatherhood. In a fish species where fathers provide care, researchers report dramatic changes in neurogenomic state as they reach fatherhood. OrthoDB was used as part of the analysis, which showed that these gene-level changes resembled those associated with pregnancy in mammalian mothers.
DOI: 10.1038/s41467-019-12212-7

SwissLipids
A knowledge resource for lipids
TYPE Knowledgebase
DESCRIPTION Information about lipids, including lipid structures, metabolism and interactions, providing a framework for the integration of lipid and lipidomics data with biological knowledge and models.
HIGHLIGHT Information on more than 590,000 lipid structures from over 640 lipid classes.
neXtProt
Human protein knowledgebase
TYPE Knowledgebase with expert curation and associated tools
DESCRIPTION Information on human proteins such as function, involvement in diseases, mRNA/protein expression, protein-protein interactions, post-translational modifications, protein variations and their phenotypic effects.
HIGHLIGHT High data coverage through integration of multiple sources and advanced semantic search functionalities. Tools specifically designed for the proteomics community.
STORY Using neXtProt as the reference database for the Human Proteome Project
The Human Proteome Project is ten years old and has been using neXtProt as its official reference database. The aim of this massive project, led by the Human Proteome Organisation (HUPO), is to generate the map of the protein-based molecular architecture of the human body, and to ultimately lay a foundation for the development of diagnostic, prognostic, therapeutic, and preventive medical applications. (SEE P. 33)
DOI: 10.1038/s41467-020-19045-9

[Figure] SIB Resources as interconnected data sources: chord diagram showing data flow among our tools and databases (the flow has the same colour as the resource of origin). The image was produced using Circos (circos.ca).
STRING
Protein-protein interaction networks and functional enrichment analysis
TYPE Knowledgebase and software tool
DESCRIPTION Resource for known and predicted protein-protein interactions, including direct (physical) and indirect (functional) associations derived from various sources, such as genomic context, high-throughput experiments, (conserved) co-expression and the literature.
HIGHLIGHT STRING networks cover over 5,000 different organisms with over 25 million high-confidence links between proteins. An ELIXIR Core Data Resource.

SWISS-MODEL
Protein structure homology-modelling
TYPE Software tools and repository
DESCRIPTION Automated protein structure homology-modelling platform for generating 3D models of a protein or a protein complex, using a comparative approach, and database of annotated models for key reference proteomes based on UniProtKB.
HIGHLIGHT Easy-to-use web-based platform processing over 5,000 model requests per day, providing model information for experts and non-specialists.
STORY Using SWISS-MODEL to understand how SARS-CoV-2 binds to our cells
In two highly cited articles published in the early days of the pandemic, the structure of the receptor binding domain of the SARS-CoV-2 spike protein was compared to that of related viruses using models generated with SWISS-MODEL. The models led to the hypothesis that SARS-CoV-2 binds to ACE2, which was later confirmed.
DOI: 10.1038/s41586-020-2008-3
DOI: 10.1016/s0140-6736(20)30251-8
A network of scientific expertise
Bioinformatics is an interdisciplinary field, where the encounter between genetics, physiology, chemistry and physics leads to many fields of activities and applications.

For each domain, two figures are given: the number of groups per domain (only the groups that gave these themes as their main activities are listed) and the key resources on Expasy (over 160 tools and databases developed).

Genes and genomes: Life’s instruction manual
A genome is the sum of genetic material of an organism, including all of its genes. It is composed of DNA and contains all the information needed to create and maintain an organism, as well as the instructions on how this information should be expressed. Bioinformatics develops tools to read genomes, and to store, analyse and interpret the resulting data.
Groups: 15 · Key resources on Expasy: 58

Proteins and proteomes: More than meets the eye
A proteome is the sum of proteins expressed by a cell, a tissue or an organism, at a given time. Proteins are the products of genes, and are involved in nearly every task carried out within an organism – from carrying oxygen to fighting off pathogens. Bioinformatics develops tools to understand the role of proteins.
Groups: 8 · Key resources on Expasy: 85

Evolution and phylogeny: Splitting ends
Changes that occur in genomes tell life scientists how an organism has evolved over time. Comparisons made between genomes from different species or populations tell them how they are related to one another – this is the field of phylogenetics. Bioinformatics develops tools to compare the genomes of organisms, as well as computational methods to reconstruct their past and build their ‘family’ trees.
Groups: 15 · Key resources on Expasy: 29
Structural biology: The third dimension
Macromolecules such as DNA and proteins have specific 3D structures that are dictated by their sequence. A protein’s function is defined by its 3D structure, which in turn defines the way it interacts with other molecules. Bioinformatics develops software to create 3D models of proteins to study their interactions with other molecules, such as drugs.
Groups: 6 · Key resources on Expasy: 18

Systems biology: Never alone
Life occurs and is sustained by a mesh of interactions within and between cells, tissues, organisms, and their environment. Understanding how these complex systems function allows scientists to predict what happens if one of the components changes or the conditions are altered. Bioinformatics methods help to predict metabolic pathways.
Groups: 14 · Key resources on Expasy: 33

Machine learning and text mining: Rise of the machines
Machine learning (ML) techniques allow computers to learn from data without explicit instructions, and to draw inferences from data patterns. Text mining algorithms, often based on ML, are designed to recognize patterns within text, such as biomedical terms. Bioinformatics is supported by and feeds into ML algorithms, with diverse applications including drug design, biomarker discovery and text mining to facilitate literature triage (SEE P. 48).
Groups: 6 · Key resources on Expasy: 3

Competence centres and core facilities: The means to an end
The quantity of data generated by the life sciences has grown exponentially over the years, and needs to be stored and processed. Researchers also need support in making sense of their data. Core facilities centralize research resources, and provide tools, technologies, services and expert consultation to this end. Bioinformatics core facilities are located in the major Swiss academic institutions.
Groups: 15 · Key resources on Expasy: 10

…AGRICULTURE 10 GROUPS
from predicting the spread of bird flu outbreaks and understanding the lifecycle of agricultural pests, to improving crop productivity.

…BASIC RESEARCH 46 GROUPS
from unravelling the evolutionary processes that have shaped today’s biodiversity, to solving the equation behind a lizard’s scale colour pattern.

…ENVIRONMENTAL SCIENCES 7 GROUPS
from understanding how organisms adapt to climate change, to how microbial communities can be used to break down pollutants in oil spills.

…MEDICINE AND HEALTH 49 GROUPS
from designing optimized proteins in cancer immunotherapy, to creating biomedical decision-support tools.

The activities of our Training group are transversal to all these fields and domains.
OUR COMMUNITY AT A GLANCE
Through partnerships with major Swiss schools of higher education and renowned Swiss research institutes, we are proud to federate a diverse and skilled community.

Figures as of 1 January 2021
784 SIB Members, incl. 189 employees (SEE P. 38)
180 students taking part in the SIB PhD Training Network

BASEL 173 MEMBERS 15 GROUPS
BERN 29 MEMBERS 2 GROUPS
FRIBOURG 17 MEMBERS 3 GROUPS
GENEVA 138 MEMBERS 10 GROUPS
LAUSANNE 237 MEMBERS 25 GROUPS
YVERDON 6 MEMBERS 1 GROUP
BELLINZONA 9 MEMBERS 2 GROUPS
DAVOS 3 MEMBERS 1 GROUP
LUGANO 4 MEMBERS 2 GROUPS
ST GALLEN 1 MEMBER 1 GROUP
WÄDENSWIL 15 MEMBERS 2 GROUPS
ZURICH 150 MEMBERS 18 GROUPS

A multilingual community¹
Python, Javascript, R, SQL, Java, Bash, Perl, SPARQL, PHP, C++, RDF, C, Matlab
¹ Source: COMPASS, the SIB Developers’ Community

Creation of a Diversity working group
How equal, diverse and inclusive is the Swiss bioinformatics community that SIB federates? What can be done to reinforce these attributes even further? Six SIB Members and Employees make up the group created in December to look into these questions.
RESEARCH HIGHLIGHTS
460 publications by SIB Groups in 2020

Understanding pollinators on the genomic level
Bumblebees are globally important pollinators in natural ecosystems and in agricultural food production. In this first genomic characterization of the genus, an international team co-led by SIB Researchers at the University of Lausanne sequenced 17 species. The results open up new possibilities to protect their biodiversity.
GROUP INVOLVED Evolutionary-Functional Genomics, led by Robert Waterhouse
Published in Molecular Biology and Evolution DOI: 10.1093/molbev/msaa240

Answering biological questions with federated queries across databases
Providing biologists with a single entry-point to the wealth of information contained in data resources, and enabling them to answer typical research questions: this is the purpose of BioQuery, an interface co-developed by SIB Researchers at the University of Lausanne. It integrates data from leading databases including SIB Resources.
GROUPS INVOLVED Laboratory of Computational Evolutionary Biology, led by Christophe Dessimoz; Evolutionary Bioinformatics, co-led by Marc Robinson-Rechavi & Frédéric Bastian
Published in Database DOI: 10.1093/database/baz106

Detecting the environment-genetics interplay underlying a disease
How much of our genome makes us susceptible to environmental risk factors, which in turn predispose us to certain pathologies – such as obesity? SIB Researchers at the University of Lausanne have developed a method to answer this question. This could, in the future, make it possible to prioritize subgroups based on their genetic risk, by assessing where disease intervention would be more effective.
GROUP INVOLVED Statistical Genetics, led by Zoltán Kutalik
Published in Nature Communications DOI: 10.1038/s41467-020-15107-0
An improved view of influenza evolution
Segmented viruses such as influenza have the property of exchanging the different parts of their parental genome as they replicate within hosts. This property makes it difficult to infer virus evolution and infection pathways. SIB Researchers at ETH Zurich have developed a framework, CoalRe, to take this phenomenon into account, leading to better estimates of effective population size and evolutionary rates.
GROUP INVOLVED Computational Evolution, led by Tanja Stadler
Published in PNAS DOI: 10.1073/pnas.1918304117

Putting FAIR principles into action for multi-omics
Making data more reproducible improves the quality of science. However, enabling data sharing and reuse still represents a practical challenge for researchers. A study co-led by a PhD student at the University of Lausanne and SIB offers a strategy and tools to maximize the value of complex, multi-omics data.
GROUP INVOLVED Vital-IT, led by Mark Ibberson
Published in Scientific Data DOI: 10.1038/s41597-019-0171-x

The DNA regions in our brain that contribute to making us human
With only 1% difference, the human and chimpanzee protein-coding genomes are remarkably similar. Researchers at SIB and the University of Lausanne have developed a new approach to pinpoint, for the first time, adaptive human-specific changes in the way genes are regulated in the brain. These results open new perspectives in the study of human evolution, developmental biology and neurosciences.
GROUP INVOLVED Evolutionary Bioinformatics, co-led by Marc Robinson-Rechavi & Frédéric Bastian
Published in Science Advances DOI: 10.1126/sciadv.abc9863
Full references of the papers mentioned are available at sib.swiss/about-sib/news/10036-news-2020
SIB REMARKABLE OUTPUTS 2020
Discover the 10 best achievements and work produced by our scientists over the last year.
Staying abreast of the latest advances and bright ideas emerging in a field as diverse as bioinformatics is challenging. To provide the global bioinformatics community with a shortlist of work produced during the year by SIB Scientists that is particularly deserving of attention, the SIB Award Committee has launched the Remarkable Outputs initiative. These outputs can include peer-reviewed publications, preprints, resources, software tools, databases, videos, tutorials, outreach programmes, science advocacy, etc.

SIB Training courses online
YouTube training playlist
GROUP INVOLVED Training, led by Patricia Palagi, Lausanne
WHAT THE COMMITTEE SAID ABOUT THE WORK “Bioinformatics education is one of SIB’s core missions. The SIB Training team carried out a truly impressive task in 2020 by responding quickly and dynamically to continue to serve the community.” (SEE P. 44)

SwissBioPics – an interactive library of cell images for the visualization of subcellular location data
swissbiopics.org
GROUP INVOLVED Swiss-Prot, led by Alan Bridge, Geneva
WHAT THE COMMITTEE SAID ABOUT THE WORK “SwissBioPics offers intricate yet standardized images of various cells and their organelles. This is an outstanding resource that provides a visual representation of biological knowledge, and it is great artwork!”

Expasy, the Swiss Bioinformatics Resource Portal
expasy.org
GROUPS INVOLVED Resource Usability and Support team / Core-IT, led by Heinz Stockinger, Lausanne
WHAT THE COMMITTEE SAID ABOUT THE WORK “Putting users at the centre with rich resource cross-links, the overhauled Expasy portal is an immense achievement that benefits all SIB Resources, increasing discoverability in a user-friendly way.” (SEE P. 20)

CoVariants: Tracking SARS-CoV-2 variants in real-time
covariants.org
GROUP INVOLVED Microbial Evolution, led by Richard Neher, Basel
WHAT THE COMMITTEE SAID ABOUT THE WORK “Essential tool in communicating the evolving maps and characteristics of SARS-CoV-2 variants in real time. CoVariants is an excellent illustration of what bioinformatics can bring to the world.”

treeclimbR pinpoints the data-dependent resolution of hierarchical hypotheses
github.com/fionarhuang/treeclimbR/
GROUPS INVOLVED Statistical Bioinformatics, led by Mark Robinson & Bioinformatics / Systems Biology, led by Christian von Mering, Zurich
WHAT THE COMMITTEE SAID ABOUT THE WORK “treeclimbR offers novel features with a data-driven approach for testing hierarchically organized hypotheses and selecting an optimal resolution. It has a wide range of research applications.”

The Bgee suite: integrated curated expression atlas and comparative transcriptomics in animals
DOI: 10.1093/nar/gkaa793
GROUP INVOLVED Evolutionary Bioinformatics, co-led by Marc Robinson-Rechavi & Frédéric Bastian, Lausanne
WHAT THE COMMITTEE SAID ABOUT THE WORK “Bgee is an excellent resource with many applications. The milestone publication showcases how high-quality expression data are integrated, curated, and harmonized to be comparable across species.” (SEE P. 22)
Large-scale DNA-based phenotypic recording and deep learning enable highly accurate sequence-function mapping
DOI: 10.1038/s41467-020-17222-4
GROUP INVOLVED Machine Learning and Computational Biology Lab, led by Karsten Borgwardt, Zurich
WHAT THE COMMITTEE SAID ABOUT THE WORK “A great interdisciplinary study, with innovations on both experimental and computational sides, and highlighting the power of deep learning applied to big ‘omics’ data.”

SARS-CoV-2 genome sequences: a Swiss resource for genomic epidemiology
nextstrain.org/groups/swiss/ncov/switzerland
GROUPS INVOLVED Microbial Evolution, led by Richard Neher / Computational Biology, led by Niko Beerenwinkel / Computational Evolution, led by Tanja Stadler, Basel
WHAT THE COMMITTEE SAID ABOUT THE WORK “With open real-time analysis of the evolution and spread of SARS-CoV-2, the power of Nextstrain really shone in 2020. It informs journalists and the public, contributing to a wider understanding of bioinformatics.” (SEE COVER)

Genomic basis for RNA alterations in cancer
DOI: 10.1038/s41586-020-1970-0
GROUP INVOLVED Biomedical Informatics, led by Gunnar Rätsch, Zurich
WHAT THE COMMITTEE SAID ABOUT THE WORK “Impressive and very important work representing a major step forward in cancer research. It was performed by two international consortia, in which the contribution of SIB is valuable and significant.”

A high-stringency blueprint of the human proteome
DOI: 10.1038/s41467-020-19045-9
GROUP INVOLVED Computer and Laboratory Investigation of Proteins of Human Origin (CALIPHO), co-led by Amos Bairoch and Lydie Lane, Geneva
WHAT THE COMMITTEE SAID ABOUT THE WORK “The Human Proteome Project is an outstanding contribution to the field, validating the human proteome with experimental proteomics data, where the SIB Resource neXtProt played a central role.” (SEE P. 24)
Organization and governance
Federating so pervasive a domain as bioinformatics, even across a modestly sized country like Switzerland, requires a unique organizational structure.
COMPOSITION OF SIB’S GOVERNING BODIES
The Foundation Council
Each of SIB’s partner institutions is represented on the Council.

President
Prof. Felix Gutzwiller, Former Senator

Founding Members
Prof. Ron Appel, SIB Executive Director
Prof. Amos Bairoch, Group Leader, SIB and University of Geneva
Dr Philipp Bucher, Affiliate Group Leader, SIB
Prof. Denis Hochstrasser, Former Vice-Rector, University of Geneva
Prof. C. Victor Jongeneel, Carl R. Woese Institute for Genomic Biology, University of Illinois, USA
Prof. Manuel Peitsch, Chief Scientific Officer Research at Philip Morris International

Ex officio Members
Prof. Cezmi A. Akdis, Director, Swiss Institute of Allergy and Asthma Research (SIAF)
Mr Thomas Baenninger, Chief Financial Officer, Ludwig Institute for Cancer Research
Prof. Edouard Bugnion, EPFL
Prof. François Bussy, Vice-Rector for Research, International Relations and Continuing Education, University of Lausanne
Prof. Daniel Candinas, Vice-Rector Research, University of Bern
Prof. Carlo Catapano, Director, IOR Institute of Oncology Research
Prof. Alex Dommann, Head of Department “Materials meet Life”, Swiss Federal Laboratories for Materials Science and Technology (Empa)
Prof. Boas Erez, Rector, Università della Svizzera Italiana
Prof. Nicolas Fasel, Vice-Dean for Research and Innovation, Faculty of Biology and Medicine, University of Lausanne
Prof. Katharina Fromm, Vice-Rector, University of Fribourg
Prof. Cem Gabay, Dean, Faculty of Medicine, University of Geneva
Prof. Brigitte Galliot, Vice-Rector, University of Geneva
Prof. Antoine Geissbühler, Vice-Rector, University of Geneva; Head of eHealth and Telemedicine Division, Geneva University Hospitals (HUG)
Prof. Detlef Günther, Vice-President Research and Corporate Relations, ETH Zurich
Dr Corinne Jud, Head of the Competence Division Method Development and Analytics, Agroscope
Prof. Jérôme Lacour, Dean, Faculty of Science, University of Geneva
Dr Vincent Peiris, Dean, School of Business and Engineering Vaud (HEIG-VD), HES-SO
Prof. Jean-Marc Piveteau, President, Zurich University of Applied Sciences (ZHAW)
Prof. Giambattista Ravano, Director of Research and Development and Knowledge, University of Applied Sciences and Arts of Southern Switzerland (SUPSI)
Prof. Alexandre Reymond, Director, Centre for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne
Prof. Patrick Ruch, Head of Research, School of Business Administration (HEG-Geneva), HES-SO
Prof. Falko Schlottig, Director, FHNW School of Life Sciences
Prof. Dirk Schübeler, Co-Director, Friedrich Miescher Institute for Biomedical Research (FMI)
Prof. Torsten Schwede, Vice President of Research and Talent Promotion, University of Basel
Prof. Elisabeth Stark, Vice-President Research, University of Zurich
Prof. Juerg Utzinger, Director, Swiss Tropical and Public Health Institute

Co-opted Member
Prof. Alfonso Valencia, ICREA Professor, Life Sciences Department Director, Barcelona Supercomputing Centre, Spain
The Board of Directors (BoD)
The BoD consists of two Group Leaders elected jointly by the Council of Group Leaders and the BoD, two external members elected by the Foundation Council on the recommendation of the BoD, and the SIB Executive Directors. Members of the BoD are appointed for a renewable five-year period.
Dr Jérôme Wojcik (Chairman), Industrial Data Scientist & Entrepreneur
Prof. Ron Appel and Dr Christine Durinx, Joint SIB Executive Directors
PD Dr Katja Bärenfaller, Group Leader, SIB and Swiss Institute of Allergy and Asthma Research (SIAF)
Ms Martine Brunschwig Graf, Former National Councillor
Prof. Christophe Dessimoz, Group Leader, SIB and University of Lausanne

The Scientific Advisory Board (SAB)
The SAB is made up of at least five members, who are internationally renowned scientists from the institute’s fields of activity.
Prof. Alfonso Valencia (Chairman), ICREA Professor, Life Sciences Department Director, Barcelona Supercomputing Centre, Spain
Prof. Soren Brunak, Founder of the Centre for Biological Sequence Analysis, Technical University of Denmark, Denmark
Prof. Melissa Haendel, Director of the Ontology Development Group, Oregon Health & Science University, Portland, USA
Prof. Claudine Médigue, Head of the Laboratory of Bioinformatics Analyses for Genomics and Metabolism (LABGeM), Génoscope & CNRS, Evry, France
Prof. Alexey I. Nesvizhskii, Department of Pathology and Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, USA
Prof. Christine Orengo, Department of Structural and Molecular Biology, University College London, UK
Prof. Ron Shamir, Computational Genomics Group at the Blavatnik School of Computer Science, Tel Aviv University, Israel

Council of Group Leaders
The Council consists of the Group Leaders and the SIB Executive Directors.
SIB is a non-profit foundation with 24 partner institutions (SEE P. 28-29). Its governance structure ensures both scientific independence and efficient internal functioning.

Foundation Council
Highest authority in the institute, with supervisory powers. Its responsibilities include changes to SIB’s statutes, nomination of Group Leaders, and approval of the annual budget and financial report.

Scientific Advisory Board
Acts as a consultative body, providing recommendations to the Board of Directors and the Council of Group Leaders. Its main tasks consist in monitoring service and infrastructure activities, such as the SIB Resources. (SEE P. 22)

Board of Directors
Composed of external members from the political and industrial sectors, the Executive Directors and Group Leaders. Takes the decisions necessary to achieve the aims of the institute, such as defining the scientific strategy and internal procedures, and allocating federal funds to service and infrastructure activities.

Management and support teams
Define and implement the institute’s strategic goals and ensure the organization’s representation at the national and international level. Support functions include finance & grant services, legal & technology transfer, human resources and communication & scientific events.

Council of Group Leaders
Discusses all matters relating to SIB Groups as a whole, and proposes new Group Leaders for nomination.

SIB Internal Groups
Staffed and headed by SIB Employees, they focus on SIB’s core missions.

SIB Affiliated Groups
Academic groups from partner institutions across Switzerland. They include those groups maintaining and developing an SIB-supported infrastructure, such as an SIB Resource (SEE P. 22) or a core facility, and can thus include SIB Employees as well.
(SEE P. 18-19)
FINANCES
SIB’s funds remained stable in 2020, thanks to the sustained support of its funders.

Detail of funding sources (CHF)
39% Swiss government – SERI: 9.7 million
22% Swiss government – BioMedIT/SPHN²: 5.6 million
10% European funds: 2.5 million
9% Swiss universities: 2.3 million
9% National Institutes of Health (NIH): 2.3 million
5% Private sector / Foreign universities: 1.3 million
4% Swiss National Science Foundation (SNSF) / Innosuisse: 1.1 million
1% Other³: 0.3 million
1% Swiss hospitals: 0.2 million
Total: 25.3 million

Allocation to SIB’s core missions (as per 2020 audited figures; amounts in CHF million)

INFRASTRUCTURE
Databases & software tools, CHF 10.8 million: Swiss government¹ 6.5; National Institutes of Health (NIH) 2.3; Private sector / Foreign universities 0.9; European funds 0.4; Swiss universities 0.3; SNSF / Innosuisse 0.3; Other³ 0.1
Competence centres, CHF 11.7 million: Swiss government BioMedIT/SPHN² 5.6; Swiss government¹ 1.9; European funds 1.9; Swiss universities 1.3; Private sector / Foreign universities 0.3; SNSF / Innosuisse 0.3; Swiss hospitals 0.2; Other³ 0.2

COMMUNITY
Training, CHF 2 million: Swiss government 0.7; Swiss universities 0.5; SNSF / Innosuisse 0.5; European funds 0.2; Private sector / Foreign universities 0.1
Scientific collaboration, CHF 0.8 million: Swiss government 0.6; Swiss universities 0.2

¹ Swiss government funds are allocated to SIB Resources and Core facilities as per the recommendations of SIB’s external Scientific Advisory Board.
² SIB received CHF 1.1 million of government funds for the SPHN Data Coordination Centre. In addition, SIB received CHF 4.5 million in 2020 for BioMedIT (SEE P. 21), out of which CHF 3.4 million were used for BioMedIT projects and nodes.
³ Loss of income insurances etc.
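The percentages in the funding breakdown can be cross-checked against the amounts. The following is a minimal Python sketch, not part of the audited report: it takes the nine amounts listed under "Detail of funding sources" (in CHF million), sums them, and recomputes each source's share of the total. The variable names are illustrative. Because the published amounts are rounded to one decimal, a recomputed share can differ from the printed percentage by about one point (for example, 9.7 / 25.3 is roughly 38.3%, against the printed 39%).

```python
# Funding sources for 2020 in CHF million, copied from "Detail of funding sources".
# The dictionary name and layout are illustrative, not part of the report.
funding_chf_million = {
    "Swiss government - SERI": 9.7,
    "Swiss government - BioMedIT/SPHN": 5.6,
    "European funds": 2.5,
    "Swiss universities": 2.3,
    "National Institutes of Health (NIH)": 2.3,
    "Private sector / Foreign universities": 1.3,
    "SNSF / Innosuisse": 1.1,
    "Other": 0.3,
    "Swiss hospitals": 0.2,
}

# Round the sum to one decimal to absorb floating-point noise.
total = round(sum(funding_chf_million.values()), 1)

# Share of each source as a percentage of the total.
shares = {name: 100 * amount / total for name, amount in funding_chf_million.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
print(f"Total: {total} million")
```

The recomputed total matches the printed total of CHF 25.3 million, which suggests the list of sources is complete.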
SIB Resources: anchoring infrastructure in research
With the exception of UniProtKB/Swiss-Prot, which is composed exclusively of SIB Employees, SIB Resources are developed and maintained in research groups by a mix of collaborators from the university (funded by the latter and/or grants) and SIB Employees (funded by SERI). This integrated model (see graph below) allows SIB databases and software tools to be anchored in an academic environment close to users. This ensures that they remain at the cutting edge of technology by evolving in conjunction with science.

SIB’s competence centres as an efficient federal lever to provide bioinformatics expertise and services to the community
Thanks to SERI’s support of CHF 937K this year, SIB can cover the management and some running costs of its Vital-IT and Clinical Bioinformatics groups to provide an unrivalled breadth of bioinformatics expertise and services to the community. From R&D to clinical applications, these internal teams actively engage in new collaborations, and secure grants (SEE P. 16-17), combining an annual budget of CHF 3.7 million.
Computer-based, expertise-driven activities
78% of SIB’s financial resources are allocated to the payment of salaries
An integrated model: out of 189 employees² (159 FTEs)...
56 are embedded in university research groups (46.2 FTEs), mainly working on SIB Resources
133 are in internal groups or in management (112.8 FTEs)

Allocation by activities
■ 78% Infrastructure ■ 10% Community ■ 12% Management & support teams¹

1 SIB’s Management and support teams (see previous page) are financed by the Swiss government as well as through overheads on external funds. The groups having entrusted SIB with the management of their funds benefit from specific support in legal affairs, human resources and financial monitoring.
2 As of 1 January 2021
EMPLOYEES
SIB Employees share a common passion: making a positive impact on society through biological and biomedical data science.
SIB has 189 employees of 26 different nationalities *
* As of 1 January 2021
“I joined SIB in 2003 as a software developer. Since then I have worked in data management, front-end and back-end development and UX Design. Although women in software development are still too rare, in my experience you do not have to change who you are to feel you belong, and be valued in a team.”
Séverine Duvaud Software developer & UX designer at SIB
Diversity, equality and inclusion: values we hold dear!
Both as an employer, and as the ambassador of the Swiss bioinformatics community, SIB has a critical role to play in the workplace as well as in the scientific ecosystem. The institute is committed to leveraging the diversity of profiles and backgrounds among its employees and members by creating a culture of equality and inclusion, and enabling everyone to develop their potential and skills.
Geneva: 76 employees, 7 groups
Lausanne: 86 employees, 9 groups (including Management and support)
Basel: 22 employees, 4 groups
Zurich: 5 employees, 2 groups

There are 92 women (49%) and 97 men (51%) working at SIB

The median age at SIB is 43 years, with a balanced age pyramid that favours knowledge exchange by bringing early-career scientists together with senior experts

The median length of service is 7 years, with 38% of employees having been at SIB for over 10 years

Being a scientist at SIB: a range of roles
■ 22% Software development ■ 17% Computational biology ■ 16% Biocuration ■ 13% Other ■ 10% Scientific IT and application support ■ 8% Research ■ 6% Group management ■ 4% Clinical data analysis ■ 4% Training
Other: Scientific coordination, Outreach, Grant, Resource and Project management, PhD students, Post-docs and Trainees, each role type representing fewer than 5 employees. Employees with two positions and two different hierarchical managers are listed in the category in which most time is spent. In the case of a 50/50 split of time, they appear in both associated categories.

Software developer, examples of career paths
Software developer
Senior Software developer
People management path / Technical expertise path: Team lead Software development, Lead Software developer, Head Software development, Principal Software engineer, Director of the group
A flexible environment for a better work-life balance: 43% of employees work part-time
Around COVID-19
The societal and health earthquake of 2020 triggered a global scientific response that pushed the life-science infrastructure, as well as key activities of institutes such as SIB, beyond their limits. Everything had to be revisited and, sometimes, reinvented.
Supporting international research
How did SARS-CoV-2 arise? How is it spreading and evolving? What are its weak points? Early on, SIB Groups were able to provide a range of tools and resources to help answer such questions.

The first coronavirus genome sequence was published on 10 January 2020. Our scientists reacted swiftly to offer support to clinical researchers, and to make sure that the science could move fast. This reaction involved developing new features in existing software tools or repurposing them (SEE V-PIPE P. 23), issuing new releases in record time through coordinated international efforts, etc. It was made possible by our scientists’ exceptional dedication, and by the fact that most resources were already available and funded, with teams ready to carry out the necessary developments.

Knowledge of the virus genome’s evolution, its proteins, their function and structure is key to understanding how it replicates and spreads, and to identifying potential targets for drugs or vaccines.

Here are some of the actions undertaken by resources managed at SIB in the context of the emergency:
— The coronavirus entries in the universal protein knowledgebase UniProt – an international collaboration between the Swiss-Prot group led by Alan Bridge, EMBL-EBI, and PIR – were made available as a pre-release, independently of and faster than the general UniProt release cycle.
— Nextstrain (SEE COVER), co-developed by the group led by Richard Neher and to which other SIB Groups and resources (e.g. V-pipe) contribute, started incorporating SARS-CoV-2 genomes as soon as they were shared publicly. It enables clinical researchers to track in real time how the coronavirus genome evolves (SEE P. 33).
— Through a dedicated portal of SWISS-MODEL, developed by the group led by Torsten Schwede, three-dimensional models of the viral proteins could be created, which shed further light on their evolution, functional properties and potential weaknesses, in the context of drug development.
— PROSITE was used to highlight the potential role of integrins in host-cell entry by the virus, making them interesting targets for COVID-19 treatment.
— New SARS-CoV-2 dedicated pages on ViralZone provide further biological insights (see key figure), including a detailed comparison with the SARS virus genome as well as cross-links to complementary resources.
— Literature triage services, such as CovidTriage, developed by the group led by Patrick Ruch as part of their SIBiLS resource, make it possible to prioritize articles according to an ontology specific to COVID-19, thereby guiding scientists through the maze of published works on the virus.

“The topic of sustainable research infrastructure, and the funding associated with it, has become more visible than ever. At a meeting of the Global Biodata Coalition (SEE P. 20), it was noted that the availability of key databases, such as those developed at SIB, is instrumental in enabling a fast scientific response to COVID-19.”
Ron Appel SIB Executive Director

162% increase in ViralZone monthly usage figures* since the SARS-CoV-2 pages were created
* as compared to the same period the previous year
Joining European efforts to fight COVID-19 and future pandemics
Scientists at SIB and the University of Basel are part of Exscalate4CoV (E4C), a public-private consortium of 18 top EU organizations. The SIB Resource SWISS-MODEL allows scientists to generate 3D models of proteins whose structure has not yet been experimentally elucidated. This enables accelerated virtual screening for potential drugs for the current outbreak, but also in the event of future pandemics, in particular when the pathogen is new to science and when no treatment exists.

Revealing host factors that are critical for the infection
Several high-profile studies used the SIB Resource STRING, which focuses on protein-protein networks, in their analysis of data from genome-wide screens, an essential step in identifying ‘weak spots’ in the host’s genome that could facilitate the infection. DOI: 10.1016/j.cell.2020.10.028 DOI: 10.1016/j.cell.2020.12.006

The COVID-19 oriented version of STRING enables users to explore the host-side of the disease, while keeping a focus on human proteins thought to interact with SARS-CoV-2.

Access the complete list of resources managed at SIB and supporting SARS-CoV-2 research: sib.swiss/about-sib/news/10660-sibresources-supportingsars-cov-2-research
Creation of a Swiss SARS-CoV-2 Data Hub facilitating data sharing for research
Through the secure Swiss Pathogen Surveillance Platform (SPSP), SIB has set up the Swiss SARS-CoV-2 Data Hub. This hub will facilitate the submission of viral sequences by Swiss laboratories to the open database European Nucleotide Archive (ENA), to further enable large-scale research on the virus. The platform is also conceived as an expandable infrastructure for the surveillance of other pathogens in the future. SPSP is one of several SIB Resources that joined the European COVID Data Platform to accelerate research on the virus through data and tool sharing.
A transformative impact
From the disruption came rapid change. Some of these changes proved to be positive and long-lasting solutions.

We began 2020 looking forward to a full year of events, either organized or supported by SIB, from scientific conferences to various outreach events. But it soon became evident that the disruption was here to stay and that workarounds had to be found to maintain services, activities and interactivity within the bioinformatics community. Within a few weeks, our in-house conference, the SIB Days, was welcoming registrations for a completely new online format. On the training side, thanks to the tremendous efforts made by our teams and trainers (SEE P. 32), not only could our diverse range of courses continue to be provided, but by making most courses accessible remotely, it also benefitted a wider and more international audience. Digital outreach projects, such as an e-workshop to understand the biology of the virus, were also put in place.
Honing one’s bioinformatics skills from home, anywhere in the world
902 participants attended the special webinar series on SIB Resources targeting SARS-CoV-2
ACCESS THE ONLINE TRAINING PLAYLIST

Countries of origin of participants in our online courses
Algeria, Angola, Argentina, Australia, Austria, Bahrain, Bangladesh, Belgium, Bolivia, Brazil, Cameroon, Canada, China, Colombia, Croatia, Cuba, Cyprus, Czech Republic, Democratic People’s Republic of Korea, Denmark, Ecuador, Egypt, Estonia, Ethiopia, Finland, France, Germany, Greece, Hungary, India, Indonesia, Iran, Iraq, Ireland, Israel, Italy, Japan, Kenya, Kuwait, Luxembourg, Malaysia, Mali, Mauritius, Mexico, Morocco, Netherlands, Niger, Nigeria, North Macedonia, Norway, Pakistan, Panama, Peru, Poland, Portugal, Republic of Korea, Romania, Russian Federation, Saudi Arabia, Singapore, South Africa, Spain, Sudan, Sweden, Switzerland, Turkey, Uganda, United Arab Emirates, United Kingdom, United Republic of Tanzania, United States of America, Uruguay
“We already had e-learning modules but this was on a whole new level: we had to find solutions fast to provide practical and interactive training to teach how to analyse data, how to program and how to use SIB Resources.” Patricia Palagi Team lead, Training group at SIB
Outreach resources to understand the biology of the coronavirus
No option for meeting the public in person? Not a problem: together with the University of Lausanne, our outreach team worked on a workshop for classrooms, and associated public resources, to offer the opportunity to discover the SARS-CoV-2 coronavirus, in particular its genome and proteins, and how they are being studied by scientists. One example of research highlighted is the work by Sigrist et al., who used the PROSITE resource to uncover the potential role of integrins as alternative binding targets for the virus (SEE P. 42). DOI: 10.1016/j.antiviral.2020.104759
“Coming together for the SIB Days 2020 clearly demonstrated our connections as a community, as well as showcasing how bioinformatics is both an invaluable part of life-sciences research and a scientific discipline in its own right.”
Robert Waterhouse Group Leader SIB, University of Lausanne Member of the SIB Days 2020 Scientific Committee
How are bioinformaticians facing the COVID-19 crisis? A virtual discussion at the SIB Days
The themes discussed included: biocuration, resource sustainability, training and open science. The event also highlighted the opportunities and challenges of communicating with the media during public health crises. “Phylogenies are ‘beautifully dangerous’,” said Emma Hodcroft, co-developer of Nextstrain. She explained how she and her colleagues developed written narratives on the resource website to better explain such complex concepts to a broad audience. WATCH THE PANEL DISCUSSION

Virtual SIB Days
The first virtual edition of our internal conference (8-10 June) remained true to its ambition of representing the scientific diversity of Swiss bioinformatics and of showcasing the latest computational advances, from diseases to ecosystems. It brought together 390 SIB Scientists from across Switzerland, as well as international keynote speakers Victoria Nembaware (Sickle Africa Data Coordinating Centre) and Flora Graham (Nature Briefing). ACCESS THE VIDEOS OF THE RECORDED SESSIONS AT SIB DAYS 2020

SARS-CoV-2 spike glycoprotein complex, illustration. SARS-CoV-2 spike glycoprotein trimer (red, blue, yellow) complexed with neutralizing antibody EY6A Fab. The light Fab chain is shown in purple, the heavy chain in green.
“During such challenging times, open communication, a flexible attitude and supportive measures are all the more essential for keeping everyone’s spirits up. The SIB Staff Committee worked closely with the Executive Management, People & Culture and Communications & Scientific Events departments to ensure strong support for employees’ well-being.”
Gerardo Tauriello Staff Committee Team lead, Software development SIB, University of Basel
Around machine learning
Artificial intelligence, machine learning, deep learning… For some, these represent the most exciting fields of computer science. For others, they are just hype. Let’s have a look at how these techniques support the work of SIB Scientists, and vice versa.
In Swiss bioinformatics
Machine learning (ML) techniques have been used, developed and built on for decades by Swiss bioinformaticians, with applications in text mining, evolution, structural modelling and biocuration.

Examples of applications of ML techniques abound in Swiss bioinformatics and among SIB’s Groups: diagnosing diabetic retinopathy across different populations, identifying particularly aggressive cancers from scanned tissue slides, enabling early detection of neonatal jaundice, predicting affinity between bacteriophages and bacteria, and supporting literature triage for biocuration.

One message cuts across all these areas: the importance of good data science based on domain expertise, and of the interaction between human and machine intelligence.

Artificial intelligence
Mimicking the intelligence or behavioural pattern of humans or any other living entity.

Machine learning
A technique by which a computer can “learn” from data, without using a complex set of different rules. This approach is mainly based on training a model from datasets.

Deep learning
A recent technique building on multiple layers of artificial neural networks, inspired by our brain’s own network of neurones.

GENERIC DEFINITIONS OF AI, ML AND DL FROM EN.WIKIPEDIA.ORG/WIKI/DEEP_LEARNING – AND MODIFIED BY SIB'S CARLOS PENA FOR DL
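To make the machine-learning definition above concrete, here is a minimal Python sketch, not taken from any SIB resource: instead of hard-coding a decision rule, the program derives a cut-off from a handful of labelled examples. The function names (`fit_threshold`, `predict`) and the toy values are assumptions made purely for illustration.

```python
def fit_threshold(values, labels):
    """Learn the cut-off that best separates two classes from labelled examples.

    Assumes class 1 tends to have higher values than class 0.
    """
    candidates = sorted(set(values))
    best_threshold, best_errors = None, len(values) + 1
    # Try the midpoint between each pair of neighbouring observed values.
    for left, right in zip(candidates, candidates[1:]):
        threshold = (left + right) / 2
        # Count the training examples this cut-off would misclassify.
        errors = sum((v > threshold) != bool(y) for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict(threshold, value):
    """Apply the learned cut-off to a new measurement."""
    return int(value > threshold)


# Made-up training data: one measurement per sample and its known class.
values = [1.0, 1.5, 2.0, 6.0, 7.0, 8.5]
labels = [0, 0, 0, 1, 1, 1]

threshold = fit_threshold(values, labels)
print(threshold, predict(threshold, 3.0), predict(threshold, 5.0))
```

Changing the training examples changes the learned cut-off without touching the code, which is the sense in which such a model is “trained” from a dataset.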
Biocuration feeding from and into AI: a virtuous circle
At Swiss-Prot, deep learning (DL) supports expert biocuration to accelerate literature triage and facilitate information extraction in human- and machine-readable forms. At the same time, expertly curated knowledgebases such as UniProtKB/Swiss-Prot provide key prior knowledge (i.e. a reliable training set) to feed DL algorithms in biology and medicine. Examples of applications include predicting protein function, structures, gene-disease links and protein-drug interactions. Expertly curated databases can even contribute to making ML models more explainable. Indeed, the complex biological information such databases contain (gene sets, pathways, ontologies and metabolic models) can be used to create interpretable models that reveal biological mechanisms (e.g. causality) as well as statistical associations (e.g. correlations). An example: doi.org/10.1038/nmeth.4627.
“Good data science is at the core of bioinformatics: strong scripting and statistical skills, combined with domain experts who have the substantive expertise to curate and make sense of the data. These are the key ingredients to ensure the trust of our end users, from clinicians to life scientists and chemists.”
Aitana Lebrand Team lead data science at SIB Co-chair of the virtual panel discussion

A virtual panel discussion to dive into the topic
How well does a model generalize beyond its training dataset? Do all ML models necessarily need to be explainable? How can trust from end users of ML-powered applications be fostered? Invited speakers presented use cases and perspectives on ML from biocuration, digital pathology, biomarker discovery and algorithm development. The speakers were: Alan Bridge, SIB Group Leader, Swiss-Prot; Andrew Janowczyk, Senior Research Scientist in the Precision Oncology Department (CHUV) and Senior Bioinformatician at SIB; Carlos Andrés Peña, SIB Group Leader in Computational Intelligence for Computational Biology (HEIG-VD); and Julia Vogt, SIB Group Leader, Medical Data Science (ETH Zurich).
WATCH THE PANEL DISCUSSION

At the SIB Days 2020, our biennial community conference, various pieces of work using ML were presented: detecting jaundice, pneumonia, influenza and colorectal cancer; ensuring the privacy of electronic health records; predicting hospital readmission; literature triage for variants; predicting protein function; and more.
Focus on biomedical applications
Clinicians and biomedical scientists increasingly see the value of ML applications in their daily work.

In the biomedical sciences and precision medicine in particular, ML is becoming essential, in prevention, diagnosis and treatment. It makes it possible, for instance, to integrate a large variety of data types (e.g. images from CT scans and text from clinical reports) used to characterize each patient, as well as to identify hidden patterns in the resulting high-dimensional dataset. These can be used as biomarkers that predict susceptibility to a disease or as a diagnostic aid. But ML is also used to explore the functional side of metabolic pathways in the context of drug repurposing. Find out more about some of the recent projects led at SIB on these fronts.

Identifying new therapeutic indications for a molecule
Addex Therapeutics and SIB’s Vital-IT and Computer-Aided Molecular Engineering groups (led by Mark Ibberson and Vincent Zoete, respectively) have been awarded an Innosuisse grant to apply computational approaches developed by SIB, including deep learning and molecular modelling, to identify new therapeutic indications for ADX10061, a potent and selective dopamine D1 receptor antagonist. Dopamine is a major neurotransmitter in the central nervous system and D1 receptors are believed to play an important role in the control of diverse aspects of brain function, including cognition, motivation, motricity, sleep, and memory.
Bringing AI to image analytics in cancer diagnosis
Characterizing the various cell types and morphological features present in a tumour can help clinicians with their prognosis and guide treatment decisions in cancer patients. SIB’s Clinical Bioinformatics group (led by Valérie Barbié), Lunaphore and the Geneva University Hospitals (HUG) are collaborating as part of an Innosuisse project to develop an integrative solution for the phenotypic analysis of tumours powered by automated multiplex staining, image analysis and machine learning. The solution will provide a major leap towards the analysis and generation of biomarkers for routine diagnostic usage in clinical pathology.
Fluorescence image of biomarkers from a microfluidic multiplex
Fighting drug-resistant bacteria with viruses
Using viruses (bacteriophages) that specifically infect and kill bacteria during their life cycle is a promising re-emerging approach to curing multi-drug resistant infections. In collaboration with the University of Lausanne and the Inselspital, the Computational Intelligence for Computational Biology group (led by Carlos Peña) has developed ML models for predicting optimal phage-bacterium interactions based only on genomic data from both organisms, thereby guiding personalized phage therapy.

Predicting newborn babies’ risk of developing jaundice
The Medical Data Science group (led by Julia Vogt) has developed a model – and prototype app for clinicians – that can provide an early prediction of a newborn baby’s risk of developing severe jaundice in the next two days. Knowing this risk could safeguard against discharging the mother and baby too early from the hospital. The tool requires the clinicians to type in only four variables, instead of the complete list of 45 surveyed ones – a key timesaver in an environment where every minute counts.
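How a long list of candidate variables can be narrowed down to a handful is not described here. One common and simple approach, shown below purely as an illustration and not as the Medical Data Science group’s method, is to rank variables by the strength of their correlation with the outcome and keep the top few. The variable names, the toy values and the helper functions `pearson` and `select_top_k` are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


def select_top_k(features, outcome, k):
    """Return the names of the k variables most correlated (in absolute value) with the outcome."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], outcome)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: 4 candidate variables measured on 6 cases.
outcome = [0, 0, 1, 1, 1, 0]  # 1 = outcome occurred (made-up labels)
features = {
    "var_a": [4.0, 5.0, 11.0, 12.0, 13.0, 6.0],
    "var_b": [1, 2, 1, 2, 1, 2],
    "var_c": [40, 39, 35, 36, 34, 40],
    "var_d": [3, 3, 3, 4, 4, 4],
}

selected = select_top_k(features, outcome, k=2)
print(selected)
```

Correlation ranking looks at each variable in isolation; it is used here only because it fits in a few lines, and a real clinical model would be checked on held-out data before any variable was dropped.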
Coloured transmission electron micrograph of a T4 bacteriophage (orange)
Newborn baby undergoing bili therapy, a blue light phototherapy used to combat neonatal jaundice (hyperbilirubinemia).

A smart summary of radiology reports to support diagnostics
Being able to offer a therapeutic answer to new patient cases from a large collection of text-based reports: this is the goal of one of the projects of the Text Mining group (led by Patrick Ruch), as part of SOCIBP-SPO, an SPHN Driver project to improve precision oncology care and clinical outcomes. To do this, the team is developing a multiclass categorization tool using deep learning to mine huge collections of past clinical text reports, identify therapeutic responses, generate hypotheses and ultimately support the understanding of cancer mechanisms.
INDEX OF SIB GROUP AND TEAM LEADERS
As of 1 January 2021

NAME | FIELDS OF ACTIVITY | LOCATION

A
Ahrens Christian | Proteins and proteomes | Agroscope
Anisimova Maria | Evolution and phylogeny | Zurich University of Applied Sciences (ZHAW)
Arguello Roman | Evolution and phylogeny | University of Lausanne

B
Baerenfaller Katja | Proteins and proteomes | SIAF – University of Zurich
Bairoch Amos | Proteins and proteomes | University of Geneva
Barbié Valérie | Competence centres and core facilities | SIB
Bastian Frédéric | Evolution and phylogeny | University of Lausanne
Baudis Michael | Genes and genomes | University of Zurich
Beerenwinkel Niko | Evolution and phylogeny | ETH Zurich, D-BSSE
Bergmann Sven | Genes and genomes | University of Lausanne
Bitbol Anne-Florence NEW | Evolution and phylogeny | EPFL
Boeva Valentina | Systems biology | ETH Zurich
Borgwardt Karsten | Text mining and machine learning | ETH Zurich
Bridge Alan | Proteins and proteomes | SIB
Bruggmann Rémy | Competence centres and core facilities | University of Bern
Buljan Marija | Systems biology | Empa

C
Carmona Santiago NEW | Systems biology | University of Lausanne
Cascione Luciano | Competence centres and core facilities | Institute of Oncology Research
Cavalli Andrea | Structural biology | Università della Svizzera italiana
Chopard Bastien | Systems biology | University of Geneva
Ciriello Giovanni | Systems biology | University of Lausanne
Correia Bruno | Structural biology | EPFL
Crameri Katrin | Competence centres and core facilities | SIB

D
Dal Peraro Matteo | Structural biology | EPFL
Delaneau Olivier | Genes and genomes | University of Lausanne
Delorenzi Mauro | Competence centres and core facilities | University of Lausanne
Deplancke Bart | Genes and genomes | EPFL
Dermitzakis Emmanouil | Genes and genomes | University of Geneva
Dessimoz Christophe | Evolution and phylogeny | University of Lausanne

E
Excoffier Laurent | Evolution and phylogeny | University of Bern

F
Falquet Laurent | Genes and genomes | University of Fribourg
Fellay Jacques | Genes and genomes | EPFL

G
Gfeller David | Proteins and proteomes | University of Lausanne
Gonnet Gaston | Evolution and phylogeny | ETH Zurich
Goudet Jérôme | Evolution and phylogeny | University of Lausanne

I
Ibberson Mark | Competence centres and core facilities | SIB
Iber Dagmar | Systems biology | ETH Zurich, D-BSSE
Ivanek Robert | Systems biology | University of Basel & University Hospital Basel

K
Kahraman Abdullah NEW | Competence centres and core facilities | University Hospital Zurich
Kriventseva Evgenia | Genes and genomes | University of Geneva
Kutalik Zoltán | Genes and genomes | University of Lausanne

L
Lane Lydie | Proteins and proteomes | University of Geneva
Lisacek Frédérique | Proteins and proteomes | University of Geneva

M
Malaspinas Anna-Sapfo | Genes and genomes | University of Lausanne
Mazza Christian | Systems biology | University of Fribourg
Michielin Olivier | Structural biology | University of Lausanne
Miho Enkelejda | Systems biology | FHNW University of Applied Sciences and Arts Northwestern Switzerland
Milinkovitch Michel | Systems biology | University of Geneva
Mitri Sara | Evolution and phylogeny | University of Lausanne

N
Neher Richard | Evolution and phylogeny | University of Basel

P
Palagi Patricia | Training | SIB
Panse Christian NEW | Competence centres and core facilities | ETH Zurich
Payne Joshua | Evolution and phylogeny | ETH Zurich
Pedrioli Patrick | Proteins and proteomes | ETH Zurich
Peña-Reyes Carlos-Andrés | Text mining and machine learning | HEIG-VD
Pivkin Igor | Systems biology | Università della Svizzera italiana

R
Rätsch Gunnar | Text mining and machine learning | ETH Zurich
Rehrauer Hubert | Competence centres and core facilities | ETH Zurich, University of Zurich
Riedi Marcel | Competence centres and core facilities | University of Zurich
Rinaldi Fabio | Text mining and machine learning | SUPSI
Rinn Bernd | Competence centres and core facilities | ETH Zurich, D-BSSE
Robinson Mark | Genes and genomes | University of Zurich
Robinson-Rechavi Marc | Evolution and phylogeny | University of Lausanne
Ruch Patrick | Text mining and machine learning | HES-SO - Geneva School of Business Administration (HEG)

S
Schütz Frédéric NEW | Competence centres and core facilities | University of Lausanne
Schwede Torsten | Structural biology, Competence centres and core facilities | University of Basel
Sengstag Thierry | Competence centres and core facilities | University of Basel
Snijder Berend | Systems biology | ETH Zurich
Stadler Michael | Genes and genomes | Friedrich Miescher Institute for Biomedical Research
Stadler Tanja | Evolution and phylogeny | ETH Zurich, D-BSSE
Stekhoven Daniel | Competence centres and core facilities | ETH Zurich
Stelling Jörg | Systems biology | ETH Zurich, D-BSSE
Stockinger Heinz | Competence centres and core facilities | SIB
Sunagawa Shinichi | Genes and genomes | ETH Zurich

V
van Nimwegen Erik | Genes and genomes | University of Basel
Vogt Julia | Text mining and machine learning | ETH Zurich
von Mering Christian | Proteins and proteomes | University of Zurich

W
Wagner Andreas | Evolution and phylogeny | University of Zurich
Waterhouse Robert | Genes and genomes | University of Lausanne
Wegmann Daniel | Evolution and phylogeny | University of Fribourg
Wollscheid Bernd | Proteins and proteomes | ETH Zurich

Z
Zavolan Mihaela | Systems biology | University of Basel
Zdobnov Evgeny | Genes and genomes | University of Geneva
Zoete Vincent | Structural biology | University of Lausanne
ACKNOWLEDGEMENTS
We gratefully acknowledge the following funders, sponsors and partners for their financial support and encouragement in helping us fulfil our mission in 2020.
The Swiss government and in particular:
The State Secretariat for Education, Research and Innovation SERI
The Swiss National Science Foundation (SNSF)
Innosuisse
Our institutional partners
The European Commission
The Leenaards Foundation
The National Institutes of Health (NIH)
The Research for Life Foundation
We also thank all industrial and academic partners who trust SIB’s expertise.

IMPRESSUM
© 2021 – SIB Swiss Institute of Bioinformatics
ILLUSTRATION BY Aurel Märki, aurelmaerki.ch
DESIGN AND LAYOUT BY Bogsch & Bacco, bogsch-bacco.ch
IMAGE CREDITS (from top to bottom and from left to right)
P. 2 Nicolas Righetti, lundi13.ch
P. 3 Nicolas Righetti, lundi13.ch
P. 7 Franziska Gruhl - SIB. All rights reserved Franziska Gruhl - SIB. All rights reserved Sutthaburawonk / iStock Fabio Rinaldi - SIB. All rights reserved
P. 13 Valentin Luggen
P. 17 ENYO Pharma
P. 18 Nicolas Righetti, lundi13.ch
P. 19 Nicolas Righetti, lundi13.ch
P. 22 Anya Ivanona / Shutterstock
P. 23 Wikimedia Morphart Creation / Shutterstock
P. 30 UbjsP / Shutterstock
P. 31 Alfred Pasieka / Science Photo Library Tek Image / Science Photo Library
P. 32 CC BY-NC-ND 4.0 | SwissBiopics, SIB
P. 33 nextstrain.org, CC BY 4.0 Matis75 / Shutterstock
P. 43 Modified from string-db.org/cgi/covid.pl, CC BY 4.0
P. 46 Felix Imhof
P. 47 Laguna Design / Science Photo Library University of Basel
P. 51 Stéphane Praz
P. 52 Modified from Migliozzi D et al. in Microsystems & Nanoengineering 2019, doi.org/10.1038/s41378-019-0104-z, CC BY 4.0
P. 53 Biozentrum, University of Basel / Shutterstock Keystone

SIB Swiss Institute of Bioinformatics Quartier Sorge Bâtiment Amphipôle CH – 1015 Lausanne T. +41 21 692 40 50 www.sib.swiss