Biocuration 2016

The Ninth International Biocuration Conference will be held in the hometown of the Swiss-Prot Database, Geneva, Switzerland, from April 10-14, 2016.

Workshops


There are 10 workshops and tutorials proposed for the Biocuration 2016 conference. Note that the available space may be limited for some workshops. When registering for the conference, please indicate the workshops you would like to attend. Priority will be given on a first-come, first-served basis.


Overview

Time | Session A | Session B | Session C | Session D | Session E
Morning | WS1: Community Curation; WS2: Curation innovation session | WS5: Cancer-Disease ontology | T7: Introduction to SPARQL, example applications: neXtProt and UniProt | WS9: Text Mining and Biocuration: Moving to Integration (all day) | WS10: Information extraction for glycobiology: building a wishlist for a toolbox (all day)
Afternoon | WS3: Training needs for biocuration; WS4: Making your database or data standard more visible, informative, and discoverable | WS6: Wikidata as a platform for biocuration | T8: Tutorial: Gene Enrichment Analysis: Tips and Strategies for Success | WS9 (continued) | WS10 (continued)

WS1: Community Curation

Chair: Mary Ann Tuli, Wormbase, Caltech, CA, USA

Summary: While we welcome the increase in the number of scientific papers, it has become impossible for professional curators to keep up with the valuable but labour-intensive task that manual curation involves. In addition, current journal scanning methods have allowed us to flag older papers for curation, thus increasing the backlog further. A number of Community Curation projects exist and representatives from those projects will share their experiences in this workshop. The workshop will cover methods and tools, outreach strategies, communications with journals, and results.

Duration: 2 hours

Target audience: Curators and tool developers (leaning more towards curators), journal editors.


WS2: Curation innovation session

Organisers: Pascale Gaudet1, Mike Cherry2, Susanna Lewis3

1 SIB Swiss Institute of Bioinformatics, Switzerland
2 Stanford University, CA, USA
3 Lawrence Berkeley National Laboratory, CA, USA

Summary: The objective of this workshop is to produce a white paper detailing the ISB community’s recommendations for data resource sustainability. Funding agencies are interested in hearing the thoughts, efforts and plans of the International Society for Biocuration (ISB) for developing, promoting and fostering innovative approaches to biocuration. A recent perspective paper acknowledges the valuable expertise of curators but calls for approaches that improve how data can be found, accessed, integrated and reused (FAIR). We have been encouraged to provide a distillation of the ISB community’s insight and recommendations for managing these knowledge assets. During the workshop, the initial draft document will be discussed and refined.

Duration: 2 hours

Target audience: All conference attendees


WS3: Training needs for biocuration

Organisers: Patricia Palagi1,5, Vicky Schneider2,5, Claire O’Donovan3,5, Marc Robinson-Rechavi1,4

1 SIB Swiss Institute of Bioinformatics, Switzerland
2 TGAC, United Kingdom
3 EMBL-European Bioinformatics Institute, United Kingdom
4 University of Lausanne, Switzerland
5 GOBLET

Summary: There is currently no recognised qualification in biocuration, either to provide new curators with a route for gaining the requisite skills to work within this field, or as a way for curators working in the field to gain recognition for the varied skill sets that such a career provides. This workshop is intended to stimulate discussion on the training biocurators should have, and on the role that ISB, GOBLET and this community should play in defining training needs, gaps and a set of competencies. We will work together to discuss this and other related questions, and propose concrete answers to advance biocuration training.

Duration: 2 hours

Target audience: All biocurators


WS4: Making your database or data standard more visible, informative, and discoverable

Organisers: Peter McQuilton and Susanna-Assunta Sansone, Oxford University, UK

Summary: The growing movement for reproducible research has led to a proliferation of community-developed standards, bringing with it new sociological and technological challenges. BioSharing aims to promote harmonisation and consistency by providing a one-stop shop for content standards, databases and data policies in the life sciences. In this workshop we will discuss how to make your resource more visible, by adhering to the BioDBcore guidelines and adding your resource to our BioSharing database registry; more informative, by using and linking to well-defined content standards; and more discoverable, by claiming the record for your resource on BioSharing and linking it to other relevant records in our registries.

Duration: 2 hours

Target audience: Biocurators in general, data standard and database maintainers in particular.


WS5: Cancer-Disease ontology

Organisers: Raja Mazumder1, Lynn Schriml2, Warren Kibbe3, Pascale Gaudet4

1 George Washington University, DC, USA
2 University of Maryland, MD, USA
3 National Cancer Institute, MD, USA
4 SIB Swiss Institute of Bioinformatics, Switzerland

Summary: Harmonizing the disparate usage of cancer terms across cancer data repositories creates a network of knowledge that connects information in a novel way and thus enables cancer analysis across datasets and resources. This session will focus on furthering the cancer term alignment activities initiated in 2014 (Wu et al. 2015) for pan-cancer data integration and analysis, with the aim of building a comprehensive cancer disease ontology. The workshop will concentrate on aligning cancer terms between the main disease vocabularies. We will first describe the current status of our collaborative efforts to build the first DO_cancer_slim, and then introduce the next steps:

  • aligning the larger (1000+) set of terms from ClinVar, COSMIC, EDRN, etc;
  • making this mapping resource available; and
  • aligning NCI thesaurus and DO.

Workshop Objectives: To identify:

  • patterns for term mapping for each cancer term resource (that we could implement);
  • community partners to collaborate on term mapping (kick-off of a community curation activity);
  • opportunities for implementing an integrated set of cancer terms (implementation examples);
  • barriers and proposed solutions to integration;
  • first steps to integrating NCI thesaurus and DO (share NCI-DO mappings); and
  • action items, objectives and participants for a follow-up 2-day meeting and workshop to be held in Washington DC (most likely in summer or fall 2016).

Workshop slides

Duration: 4 hours

Target audience: Anyone interested in cancer research.

Note: This workshop is limited to 50 attendees.

WS6: Wikidata as a platform for biocuration

Organisers: Benjamin Good1, Andrew I. Su1

Co-Organisers: Sebastian Burgstaller-Muehlbacher1, Timothy Putman1, Elvira Mitraka3, Andra Waagmeester2, Justin Leong4, Paul Pavlidis4, Lynn Schriml3

1 The Scripps Research Institute, La Jolla, CA, USA
2 micelio.be, Belgium
3 University of Maryland, USA
4 The University of British Columbia, Canada

Summary: This workshop will address the challenge of creating a knowledge commons for biology. This commons would provide: semantic integration of information spanning many domains, an interface for community contribution and curation of data, support for multiple languages, and API-level access for editing and retrieving data in external applications. The goal of the workshop is to introduce and propose Wikidata, the open public knowledge base now provided by the Wikimedia Foundation, as a platform upon which to build this commons. We will provide an introduction to Wikidata (and its relationship to Wikipedia), explain how it is already being used by members of the biocuration community, and open up a discussion about how it might be used in the context of other ongoing biocuration efforts.

Duration: 4 hours

Target audience: Those who curate biological databases and are interested in encouraging community curation, increasing the use of curated data, semantic integration with other resources, and internationalization. The workshop will be of particular interest to those with some knowledge of and experience with the Semantic Web.

Note: This workshop is limited to 50 attendees.

T7: Tutorial: Introduction to SPARQL, example applications: neXtProt and UniProt

Organisers: Pierre-André Michel, Alain Gateau, Daniel Teixeira and Jerven Bolleman
SIB Swiss Institute of Bioinformatics, Switzerland

Summary: This workshop will be divided into three parts. The first part is an introduction to SPARQL (approx. 2 h). The second part will focus on using and creating SPARQL queries on the neXtProt SPARQL engine (search.nextprot.org). This part will explore how to create complex queries that make the best use of the wealth of information available in neXtProt and take advantage of federated queries across "remote" resources such as DrugBank, ChEMBL and PDB. The last part focuses on the UniProt SPARQL engine (sparql.uniprot.org), introducing the UniProt data model and showing example queries.

Duration: 4 hours

Target audience: All conference attendees interested in learning SPARQL and in developing queries for neXtProt and UniProt.

Note: This workshop is limited to 25 attendees.
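
To give a flavour of the kind of query the SPARQL tutorial deals with, the sketch below builds a small query against the UniProt data model and encodes it as a request URL. It is only an illustration and not taken from the tutorial material: the endpoint path, the function names and the choice of human (taxon 9606) are assumptions, and no request is actually sent.

```python
from urllib.parse import urlencode

# Assumed endpoint URL; the tutorial description only names the host sparql.uniprot.org.
ENDPOINT = "https://sparql.uniprot.org/sparql"


def build_query(taxon_id: int, limit: int = 5) -> str:
    """Return a SPARQL query listing a few proteins from one organism.

    build_query is a hypothetical helper written for this sketch.
    """
    return f"""
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>

SELECT ?protein
WHERE {{
  ?protein a up:Protein ;
           up:organism taxon:{taxon_id} .
}}
LIMIT {limit}
""".strip()


def build_request_url(query: str) -> str:
    """Encode the query in the 'query' parameter of a GET request URL."""
    return ENDPOINT + "?" + urlencode({"query": query})


query = build_query(9606)       # 9606 is the NCBI taxonomy identifier for human
url = build_request_url(query)
print(query)
print(url)
```

The URL could then be fetched with any HTTP client; building the query as a plain string keeps the example independent of any particular SPARQL library.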

T8: Tutorial: Gene Enrichment Analysis: Tips and Strategies for Success

Organisers: Judith Blake1 and Paul Thomas2

1 Jackson Laboratory, ME, USA
2 University of Southern California, CA, USA

Summary: It is a certainty that functional genomics data uploaded to a gene enrichment analysis program will generate some output of results. What are the possible parameters of data analysis input? How can the output facilitate understanding of your data? This workshop will particularly focus on the use of gene ontology annotations for term and gene set enrichment analysis, although other methods for enrichment analysis will be included. A variety of tools and strategies will be presented for within and across species interpretation of whole genome functional genomics studies.

Duration: 4 hours

Target audience: Biocurators and data managers who work directly with experimentalists to facilitate appropriate use of data analysis tools, or who use these tools directly in their own work.

Note: This workshop is limited to 25 attendees.
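
As a rough illustration of what term enrichment analysis computes, the sketch below implements a one-sided hypergeometric (over-representation) test, one common statistic for this purpose. It is not a description of any tool presented in the tutorial; the function name, parameter names and the toy numbers are assumptions chosen for the example.

```python
from math import comb


def hypergeom_pvalue(population: int, annotated: int, sample: int, hits: int) -> float:
    """Probability of seeing at least `hits` annotated genes in the sample.

    population: number of genes in the background set
    annotated:  number of background genes carrying the term
    sample:     number of genes in the study list
    hits:       number of study genes carrying the term
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    # Sum the hypergeometric probabilities for k = hits .. upper.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers: 20 background genes, 5 annotated to a term,
# 5 genes in the study list, 3 of which carry the term.
p = hypergeom_pvalue(population=20, annotated=5, sample=5, hits=3)
print(f"p = {p:.4f}")
```

The choice of background set (`population`) is one of the input parameters that most strongly affects the result, and real analyses also correct for testing many terms at once, which this sketch omits.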

WS9: Text Mining and Biocuration: Moving to Integration

Organisers: Patrick Ruch, Julien Gobeill
SIB Swiss Institute of Bioinformatics & HES-SO/HEG Geneva, Switzerland

Note: Introduction and discussion panel will be in common with the workshop ‘Information extraction for glycobiology: building a wishlist for a toolbox’.

Summary: The workshop will combine different presentations related to advances and challenges in the field of Text Mining applied to Biocuration, with a special focus on chemistry data. Thus, together with regular presentations on Text Mining for the curation of molecular biology databases, joint sessions will be organized with the workshop on information extraction for glycobiology, organized by Frédérique Lisacek. In these joint sessions, the idea is to design and explore the creation of different resources (ontologies, standards, text mining services…) to support the annotation of glycobiology. The problem statement will be formulated at the beginning of the day with the presentation of the glycobiology use cases. Then at the end of the day, a round table will be jointly organized to tentatively draft a shared workplan.

Tentative schedule: Introduction to text mining, biocuration, glycan annotation, chemistry/patent curation, posters and demo, discussion panel

Duration: all day

Target audience: Primarily biologists and database curators as well as those active in biomedical data management in pharma and biotech companies.

Note: This workshop is limited to 30 attendees.

WS10: Information extraction for glycobiology: building a wishlist for a toolbox

Chair: Frédérique Lisacek
SIB Swiss Institute of Bioinformatics, Switzerland

Note: Introduction and discussion panel will be in common with the workshop ‘Text Mining and Biocuration: Moving to Integration’.

Summary: There are few databases covering glycobiology. To expand this coverage, we need guidelines to support both manual and automated curation. For the latter, glycobiology has very specific issues:

  • a glycan structure is often described in images/graphics or in idiosyncratic formulas (e.g., NeuAc(a2-3)Gal(b1-4)GlcNAc) that are not standardised in free text
  • several complementary methods are often used and deciphering the Materials and Methods (M&M) section is not straightforward
  • interaction data are very hard to capture and we can hardly benefit from experience with protein-protein interactions (PPI)
  • functional annotation very often means searching for a motif within a glycan structure that defines the actual binding

Duration: all day

Target audience: Glycobiologists, proteomicists, text and data miners and glycoinformatics specialists.

Note: This workshop is limited to 30 attendees.
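
To make the glycan notation issue raised in the WS10 summary concrete, the sketch below splits a linear formula such as NeuAc(a2-3)Gal(b1-4)GlcNAc into residues and linkages. It is a minimal illustration under strong assumptions: it handles only unbranched formulas written exactly in this style, the function name is hypothetical, and the reading of each linkage as (anomer, first position, second position) is this example's own convention.

```python
import re

# A linkage such as "(a2-3)": anomer letter, then two ring positions.
LINKAGE = re.compile(r"\(([ab])(\d+)-(\d+)\)")
RESIDUE = re.compile(r"[A-Za-z0-9]+")


def parse_linear_glycan(formula: str):
    """Split an unbranched glycan formula into residues and linkages."""
    # re.split with three capture groups yields:
    # residue, anomer, pos1, pos2, residue, anomer, pos1, pos2, residue, ...
    parts = LINKAGE.split(formula)
    residues = parts[0::4]
    linkages = [
        (anomer, int(first), int(second))
        for anomer, first, second in zip(parts[1::4], parts[2::4], parts[3::4])
    ]
    # Anything that is not a plain residue name signals unsupported notation.
    for residue in residues:
        if not RESIDUE.fullmatch(residue):
            raise ValueError(f"unsupported notation near {residue!r}")
    return residues, linkages


residues, linkages = parse_linear_glycan("NeuAc(a2-3)Gal(b1-4)GlcNAc")
print(residues)
print(linkages)
```

Branched structures, alternative linkage spellings and formulas embedded in running text would all defeat this parser, which is precisely the kind of variability the workshop summary points to.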