Recent technological developments have increased the cost-effectiveness and speed of measurements in molecular biology, so that many biological endeavours are now “data-driven”. The increased availability of genome-wide quantitative information is opening up new avenues for the study of gene function and gene evolution, potentially revealing the origins of molecular and functional diversity, and the principles of microbial ecology. For example, in the field of DNA sequencing, researchers have begun to move beyond sequencing model organisms and are turning towards individual tissues and tumours, as well as environmental samples. Proteomics technology is likewise rapidly gaining momentum, becoming the technology of choice for system-wide quantification of proteins and their modifications. Together, these advances allow researchers, for the first time, to model (and measure) across scales, from functional cell subsystems and complete pathways to entire biological communities.

Targeted sequencing to unravel microbial communities

Here, we study the deep diversity of entire microbial communities, based on targeted sequencing of suitable marker genes (mostly ribosomal RNA subunit genes). This approach has become necessary because most naturally occurring microbes cannot be grown in the laboratory; as a result, the current molecular knowledge of microbes is vastly incomplete and quite biased. Only recently have researchers begun to fill this gap with cultivation-independent molecular techniques such as metagenomic sequencing. We use specialized algorithms and databases to analyse the vast amounts of molecular sequence data that these techniques produce.

Evolution of habitats and communities

Taking a comparative view across large sets of published microbial sampling studies, we systematically re-process, unify and quantify the underlying sequence data. For this, we develop procedures and databases that provide a consistent annotation of the sequences, and we assign hierarchical pathways and functional classes to the observed lineages based on completely sequenced genomes and annotated metagenomes.
By comparing independent environmental samples with one another, and by correlating their functional annotations with taxonomic assignments, we aim to exhaustively map functional capabilities onto the phylogenetic tree of life. This should reveal where and when certain phenotypes first developed, how they have been transferred between individual lineages, and which environments are the most promising for isolating genes of interest.

Proteome quantification and organization

The proteome (i.e. the set of expressed and functional proteins) constitutes the main ‘business end’ of the cell, carrying out the vast majority of catalytic, structural and regulatory functions. Our group is interested in how the proteome is made up in quantitative terms, and how the various components of the proteome interact with each other. To help study this, we produce and maintain online databases that are widely used and referenced in the life-science community (including STRING-db for protein networks, and PAX-db for protein abundance). We assess and compare protein networks and their quantitative makeup across evolutionary model organisms, aiming to uncover quantitative constraints, selective forces and high-level scaling laws.

More information about our research


“A network of yeast proteins expressed at different abundances”