We study the molecular evolution of functional genomic elements, such as protein-coding genes and microRNAs, interpreting the genomic sequence variability created by “Nature’s experimentation”. We use evolutionary models to digest high-throughput sequencing data. The novel data, in turn, allow us to revise these evolutionary models. Technological advances in genomic sequencing and computational data analytics are reshaping many aspects of molecular biology and are expected to have a major impact on healthcare through deeper testing of the genetics of both patients and pathogens.
Metagenomics is redefining microbiology, and it is expected to have a profound impact on future medicine. Shotgun, or whole-genome shotgun (WGS), metagenomics refers to sequencing the total DNA/RNA isolated from biological samples, capturing the entire community of microorganisms. The approach enables us to comprehensively sample all genes in all organisms present in a complex sample, including unculturable microorganisms that are otherwise difficult or impossible to assess. Nevertheless, while sequence data generation has been industrialized, the data analytics remains challenging.
We study the genomics of human viruses in close collaboration with our clinical colleagues. The scope ranges from finding recombination in enteroviruses and rhinoviruses using high-throughput phylogenomics, to studies of chronic viral infections with the human immunodeficiency virus (HIV), the human cytomegalovirus (CMV), and the Epstein-Barr virus (EBV) in the context of the human genetics of innate immunity genes, and to characterizing novel or underappreciated viruses, such as the Toscana virus or astroviruses. We use deep sequencing of clinical samples to determine the composition and evolutionary dynamics of the viral populations, which in turn can give clinicians the means to make more informed decisions regarding drug therapies.
We study the genomics of insects and their arthropod relatives within the framework of the i5K consortium, which aims to coordinate the sequencing of 5,000 species. Insects are the most diverse and successful terrestrial animals, encompassing by far the largest number of species. Thus, insect genomes provide an exceptional opportunity to explore the evolutionary processes acting on genes and genomes and to understand how these processes translate into new functions and phenotypes. Insects also compete with humans for food and transmit devastating diseases, such as malaria, which is transmitted by mosquitoes.
Gene families and orthology can be inferred by sequence analysis, providing the means to formulate hypotheses on gene functions from experimentation on model organisms. Comparative genomics and WGS metagenomics rely on the recognition of “equivalent” genes across organisms, and orthology, which refines the concept of homology, addresses this need. We are developing the OrthoDB resource, the hierarchical catalogue of orthologs, which is now among the top three orthology resources worldwide in both methodology and coverage. It provides the most precise way to link our limited experimental knowledge of gene functions to a much wider genomics space. Using the OrthoDB-derived sets of Benchmarking Universal Single-Copy Orthologs, we developed the BUSCO software for assessing genome assembly and annotation completeness. Orthologous gene anchors allow the identification of longer orthologous genomic regions, or synteny blocks, which can highlight gene arrangements constrained by natural selection and help to identify other conserved non-coding sequences (CNCs, including the subclass of ultra-conserved elements, UCEs) that are numerous but remain largely uncharacterized. Harnessing the power of multiple-species comparisons to detect genomic elements under purifying selection, we developed a resource of Conserved Elements from Genomic Alignments (CEGA). Identifying the orthology of genomic elements also helps to computationally predict genes, e.g. precursor miRNA genes, as well as to study their targets. MicroRNAs, or miRNAs, post-transcriptionally repress the expression of protein-coding genes, and evaluating the strength of mRNA repression by miRNAs remains challenging. We developed the miRmap software, which covers novel and all previously proposed approaches for the biologically informative ranking of potential miRNA targets.