Vital-IT research software

Current:

DMP Canvas Generator: a tool aiming to help scientists generate Data Management Plans for SNSF funded projects

DMP Canvas Generator is a tool aiming to help scientists generate Data Management Plans for SNSF funded projects. A web form composed of seven sections guides users through the definition of the requirements for the management of their project data. The produced Word document is compliant with the SNSF instructions for DMP creation and consists of generic paragraphs corresponding to the user's inputs. The structure of the produced document follows that of the SNSF DMP questionnaire. The document must be further modified before submission to reflect the specific aspects of the project.

Access to this resource is possible with a Switch AAI account.

MetaNetX: Automated Model Construction and Genome Annotation for Large-Scale Metabolic Networks

Website

MetaNetX is a repository of genome-scale metabolic networks (GSMNs) and biochemical pathways from a number of major resources imported into a common namespace of chemical compounds, reactions, cellular compartments—namely MNXref—and proteins. The MetaNetX.org website provides access to these integrated data as well as a variety of tools that allow users to import their own GSMNs, map them to the MNXref reconciliation, and manipulate, compare, analyze, simulate (using flux balance analysis) and export the resulting GSMNs. MNXref and MetaNetX are regularly updated and freely available.

Publication: Sébastien Moretti, Olivier Martin, T Van Du Tran, Alan Bridge, Anne Morgat, Marco Pagni. MetaNetX/MNXref - reconciliation of metabolites and biochemical reactions to bring together genome-scale metabolic networks. Nucleic Acids Research, 44 (2016) doi:10.1093/nar/gkv1117

MyHits: an interactive resource for analyzing protein sequences

Hits is a free database devoted to protein domains. It is also a collection of tools for investigating the relationships between protein sequences and the motifs described on them. These motifs are defined by a heterogeneous collection of predictors, which currently includes regular expressions, generalized profiles and hidden Markov models.
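As an illustration of the regular-expression predictors mentioned above, the following minimal sketch converts a pattern written in PROSITE-style syntax into a Python regular expression and scans a sequence with it. The use of PROSITE syntax, the function names prosite_to_regex and find_motifs, and the test sequence are assumptions made for this example; this is not the MyHits implementation.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # '<' anchors the match at the N-terminus, '>' at the C-terminus.
        prefix = suffix = ""
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # Split off an optional repetition such as (3) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            core = "."                          # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"      # any residue except those listed
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)

def find_motifs(pattern, sequence):
    """Return (0-based start, matched segment) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]

# N-glycosylation site pattern; the sequence is made up for the example.
print(prosite_to_regex("N-{P}-[ST]-{P}."))
print(find_motifs("N-{P}-[ST]-{P}.", "ANGSAANPSA"))
```

Generalized profiles and hidden Markov models score a match quantitatively rather than accepting or rejecting it, so they are not covered by this sketch.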

Publication: Pagni M, Ioannidis V, Cerutti L, Zahn-Zabal M, Jongeneel CV, Hau J, Martin O, Kuznetsov D, Falquet L. MyHits: improvements to an interactive resource for analyzing protein sequences. Nucleic Acids Res. 2007 Jul; 35(Web Server issue):W433-7

OpenFlu: a database for human and animal influenza virus

The OpenFlu database (OpenFluDB) is part of a collaborative effort to share observations on the evolution of Influenza virus in both animals and humans. It contains genomic and protein sequences as well as epidemiological data from more than 25'000 isolates.

The isolate annotations include:

virus type, subtype and lineage
host
geographical location
experimentally tested antiviral resistance

Protein sequences are automatically derived from nucleotide sequences. From these, putative enhanced pathogenicity and human adaptation propensity are computed.
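The derivation of a protein sequence from a nucleotide sequence can be sketched as follows. This is a minimal example that assumes the standard genetic code and a single open reading frame starting at the first base; the names CODON_TABLE and translate are illustrative, and the actual OpenFluDB procedure (for instance, how it handles spliced gene products) is not described here.

```python
# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(nucleotides):
    """Translate frame 1 of a DNA or RNA sequence up to the first stop codon."""
    seq = nucleotides.upper().replace("U", "T")
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        # Codons containing ambiguous bases (e.g. N) are reported as 'X'.
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

print(translate("ATGGCCAAGTAA"))  # MAK
```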

Each virus isolate can be associated with the laboratories that collected, sequenced and submitted it.

Several analysis tools are available and enable rapid and efficient mining:

multiple sequence alignment (MUSCLE)
phylogenetic analysis
sequence similarity maps

The OpenFluDB content is supplied by direct user submission as well as by a daily automatic procedure importing data from public repositories (GenBank). Additionally, a simple mechanism facilitates the export of OpenFluDB records to GenBank.

Publication: Liechti R, Gleizes A, Kuznetsov D, Bougueleret L, Le Mercier Ph, Bairoch A, Xenarios I. OpenFluDB, a database for human and animal influenza virus. Database, 2010:baq004 (2010) doi:10.1093/database/baq004

pfsearch3: search a protein or DNA sequence library for sequence segments matching a profile

pftools

The new pfsearchV3 program replaces the original pfsearch program distributed with the pftools. It uses modern CPU instructions to exploit the capabilities of multicore processors and a new heuristic filter to rapidly score and select possible candidate matches. On a modern 8-core computer with two threads per core, these improvements increase the speed of pfsearch by two orders of magnitude.

SwissLipids: a knowledgebase for lipid biology

Website

Motivation: Lipids are a large and diverse group of biological molecules with roles in membrane formation, energy storage and signaling. Cellular lipidomes may contain tens of thousands of structures, a staggering degree of complexity whose significance is not yet fully understood. High-throughput mass spectrometry-based platforms provide a means to study this complexity, but the interpretation of lipidomic data and its integration with prior knowledge of lipid biology suffers from a lack of appropriate tools to manage the data and extract knowledge from it.

Results: To facilitate the description and exploration of lipidomic data and its integration with prior biological knowledge, we have developed a knowledge resource for lipids and their biology—SwissLipids. SwissLipids provides curated knowledge of lipid structures and metabolism which is used to generate an in silico library of feasible lipid structures. These are arranged in a hierarchical classification that links mass spectrometry analytical outputs to all possible lipid structures, metabolic reactions and enzymes. SwissLipids provides a reference namespace for lipidomic data publication, data exploration and hypothesis generation. We are continually updating the SwissLipids hierarchy with new lipid categories and new expert curated knowledge.

Publication: Aimo L, Liechti R, Hyka-Nouspikel N, Niknejad A, Gleizes A, Götz L, Kuznetsov D, David FP, van der Goot FG, Riezman H, Bougueleret L, Xenarios I, Bridge A. The SwissLipids knowledgebase for lipid biology. Bioinformatics, 31:2860-6 (2015) doi:10.1093/bioinformatics/btv285

Legacy:

boolSim: algorithms based on reduced ordered binary decision diagrams (ROBDDs) for Boolean modeling of gene regulatory networks

boolSim/genYsis is software written by Abhishek Garg. It implements algorithms based on reduced ordered binary decision diagrams (ROBDDs) to evaluate attractors and perform perturbation experiments on Boolean networks, using synchronous or asynchronous network dynamics (https://doi.org/10.1093/bioinformatics/btn336).
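To make the notion of an attractor concrete, here is a minimal sketch that finds all attractors of a small Boolean network under synchronous dynamics. It enumerates every state explicitly, which is only feasible for tiny networks; boolSim uses ROBDDs precisely to avoid this enumeration. The two-gene network RULES and the function name are hypothetical and are not part of boolSim.

```python
from itertools import product

# Hypothetical toy network: two genes that inhibit each other.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
}

def synchronous_attractors(rules):
    """Return (node names, sorted attractors) by exhaustive state enumeration."""
    names = sorted(rules)

    def step(state):
        # Synchronous update: every node is recomputed from the same state.
        env = dict(zip(names, state))
        return tuple(bool(rules[n](env)) for n in names)

    attractors = set()
    for start in product((False, True), repeat=len(names)):
        order = {}
        state = start
        while state not in order:       # follow the trajectory until it repeats
            order[state] = len(order)
            state = step(state)
        first = order[state]            # index at which the cycle begins
        cycle = [s for s, i in order.items() if i >= first]
        k = cycle.index(min(cycle))     # rotate to a canonical starting state
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return names, sorted(attractors)

names, attractors = synchronous_attractors(RULES)
for cycle in attractors:
    kind = "steady state" if len(cycle) == 1 else "cycle of length %d" % len(cycle)
    print(kind, cycle)
```

For this toy network the sketch reports two steady states (one gene on, the other off) and one cycle of length two in which both genes switch together.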

This version of boolSim was modified by Julien Dorier to:

interpret networks with complex logical rules.
accept networks in sbml-qual format (https://dx.doi.org/10.1186/1752-0509-7-135 and https://dx.doi.org/10.2390/biecoll-jib-2015-270).
output sets of reachable states from a user-defined set of initial states.
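The last item, computing the states reachable from a user-defined set of initial states, can be illustrated with a breadth-first search under asynchronous dynamics, where one node is updated at a time. This is a minimal sketch with explicit state sets rather than the ROBDD representation used by boolSim; the toy network RULES and the function name reachable_states are hypothetical.

```python
from collections import deque

# Hypothetical toy network: two genes that inhibit each other.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
}

def reachable_states(rules, initial_states):
    """Return (node names, set of states reachable by asynchronous updates)."""
    names = sorted(rules)
    seen = set(initial_states)
    queue = deque(seen)
    while queue:
        state = queue.popleft()
        env = dict(zip(names, state))
        for i, name in enumerate(names):
            new_value = bool(rules[name](env))
            if new_value != state[i]:
                # Asynchronous update: only node i changes in this transition.
                successor = state[:i] + (new_value,) + state[i + 1:]
                if successor not in seen:
                    seen.add(successor)
                    queue.append(successor)
    return names, seen

names, states = reachable_states(RULES, [(False, False)])
print(names, sorted(states))
```

States are tuples ordered like the sorted node names, so (False, False) means that both A and B are off.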

Version 1.2.0

Binary distribution

download Linux 64-bit (with libc>=2.12)
download Mac OS X 64-bit (>= 10.6)

Documentation

open PDF

Version 1.1.0

Binary distribution

download Linux 64-bit (with libc>=2.12)
download Mac OS X 64-bit (>= 10.6)

Documentation

open PDF

genYsis Toolbox

This is the original genYsis toolbox, stored here for archival purposes. Please use the updated version, boolSim, above.

genYsis Toolbox (v3.0 beta) binaries are available for the following platforms:

Linux (64-bit, gcc version 4.4.5, Debian 4.4.5-8)
Mac OS X (64-bit, gcc version 4.2.1, Mac OS X 10.8.5)

After downloading and extracting the Linux distribution, execute the following two commands in the 'genYsis' directory:

 chmod +x boolSim
 chmod +x genYsis

CentrioleScreen: functional genomic screen in human cells reveals novel regulators of centriole biogenesis

Website

We performed a genome-wide siRNA screen to identify genes required for proper centriole number in human cells. Correct centriole number ensures the faithful formation of primary cilia in resting cells and promotes the robust assembly of the bipolar spindle in mitotic cells. Normally, cells are born with two centrioles that duplicate during the cell cycle, such that there are four centrioles by the time of mitosis, two in each centrosome. Cells with a lower number of centrioles (“underduplication phenotype”) exhibit problems notably in spindle orientation, fidelity of chromosome segregation and cilium formation. By contrast, cells with a higher number of centrioles (“overamplification phenotype”) can form multipolar spindles and exhibit unfaithful chromosome segregation. Despite the forward genetic and functional genomic screens conducted in invertebrate systems, the mechanisms governing proper centriole number remain incompletely understood in human cells. These and related considerations led us to develop and execute a genome-wide functional genomic screen in this system to identify components that regulate centriole number.

This web interface is a resource that gives users full access to the data from the screen. Users can navigate through images and corresponding numerical values (e.g. for centrosome and nuclear features) for the 76138 independent experiments that were conducted, testing on average 4 distinct siRNAs per gene. Users can search for genes of interest as well as rank the entire dataset according to various criteria. Furthermore, users can add comments about the experiments, thus contributing to making this interface a rich resource for the exploration of cell biology by the scientific community.

Publication: Balestra et al., Discovering Regulators of Centriole Biogenesis through siRNA-Based Functional Genomics in Human Cells, Developmental Cell (2013) [pubmed: 23769972]

dbc454: taxonomy-independent, i.e. unsupervised, clustering of metagenomic amplicon sequences

Download

Taxonomy-independent, i.e. unsupervised, clustering of metagenomic amplicon sequences is essential for the definition of Operational Taxonomic Units. For this application, reproducibility and robustness should be the most sought-after qualities, but they have thus far largely been overlooked.

The method is described and benchmarked in: Density-based hierarchical clustering of pyro-sequences on a large scale—the case of fungal ITS1 (doi: 10.1093/bioinformatics/btt149)

FastEpistasis: a software tool for computing tests of epistasis for a large number of SNP pairs, an efficient parallel extension to the PLINK epistasis module

FastEpistasis, a software tool capable of computing tests of epistasis for a large number of SNP pairs, is an efficient parallel extension to the PLINK epistasis module. It tests epistatic effects in the normal linear regression of a quantitative response on the marginal effects of each SNP and an interaction effect of the SNP pair, where SNPs are coded as additive effects, taking user-defined values or the default 0, 1 and 2. The test for epistasis reduces to testing whether the interaction term is significantly different from zero.
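For one SNP pair, the regression described above is y = b0 + b1*g1 + b2*g2 + b3*g1*g2 + error, and the test asks whether b3 differs from zero. The following minimal sketch computes the estimate of b3, its standard error and the resulting t statistic. It solves the normal equations with a small Gauss-Jordan inversion for brevity, whereas FastEpistasis is based on the QR decomposition; the function names and the example data are assumptions made for this illustration, and the conversion of the statistic to a p-value is left out.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def epistasis_test(g1, g2, y):
    """Return (b3, standard error, t statistic) for the interaction term."""
    rows = [[1.0, a, b, a * b] for a, b in zip(g1, g2)]   # design matrix
    n, p = len(rows), 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    residuals = [v - sum(b * x for b, x in zip(beta, r)) for r, v in zip(rows, y)]
    sigma2 = sum(e * e for e in residuals) / (n - p)       # residual variance
    se = math.sqrt(sigma2 * inv[3][3])
    return beta[3], se, beta[3] / se

# Made-up data: every genotype combination twice, with noise of +0.1 and -0.1.
g1, g2, y = [], [], []
for a in (0, 1, 2):
    for b in (0, 1, 2):
        for noise in (0.1, -0.1):
            g1.append(a)
            g2.append(b)
            y.append(1.0 + 0.5 * a + 0.25 * b + 2.0 * a * b + noise)

b3, se, t = epistasis_test(g1, g2, y)
print(b3, se, t)
```

Under the null hypothesis of no interaction, the t statistic follows a Student t distribution with n - 4 degrees of freedom.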

FastEpistasis optimizes the computations by splitting the analysis tasks into three separate applications: pre-, core- and post-computation.

The precomputation phase loads PLINK binary format data files, reformats the data for faster computations and reduces the number of conditions to test for in the core phase.
The core computational phase is embarrassingly parallel: it iterates through SNP pairs and efficiently carries out the tests for epistasis. The computations are based on applying the QR decomposition to derive least squares estimates of the interaction coefficient and its standard error. The core computation software comes in several versions to take advantage of different high performance architectures: a Shared Memory Processor (SMP) version and a clustered Message Passing Interface (MPI) version.
An optional post-computation phase is provided to aggregate results from each processor or core, include detailed SNP information, compute p-values from each test, and convert to text files.

Sources are available at: https://gitlab.sib.swiss/tschuepb/FastEpistasis

Knoto-ID: a tool to study the entanglement of open protein chains using the concept of knotoids

Download Knoto-ID

The backbone of most proteins forms an open curve. To study their entanglement, a common strategy consists in searching for the presence of knots in their backbones using topological invariants. However, this approach requires closing the curve into a loop, which alters its geometry. Knoto-ID allows evaluating the entanglement of open curves without the need to close them, using the recent concept of knotoids, which is a generalization of classical knot theory to open curves. Knoto-ID can analyse the global topology of the full chain as well as the local topology by exhaustively studying all subchains or only determining the knotted core. The use of Knoto-ID is not limited to proteins; it can be used to analyse any open curve in 3D space such as chromosomes, synthetic polymers, random walks, etc.

If you use this software for a publication, please cite:
J. Dorier, D. Goundaroulis, F. Benedetti and A. Stasiak, "Knoto-ID: a tool to study the entanglement of open protein chains using the concept of knotoids", Bioinformatics 34, 3402-3404 (2018).

If you use the knotoid classification given in files examples/knotoid_names_sphere.txt or examples/knotoid_names_planar.txt, please cite:
D. Goundaroulis, J. Dorier and A. Stasiak, "A systematic classification of knotoids on the plane and on the sphere", arXiv:1902.07277 [math.GT].

Features

Knoto-ID is a collection of command line tools to study knot and knotoid diagrams on the sphere and on the plane. Its main features are:

Accept protein structures in Protein Data Bank (PDB) format, 3D curves in xyz format, knot and knotoid diagrams in extended Gauss code format or PD code format.
Draw knot(oid) diagrams.
Evaluate the following polynomial invariants: classical Jones polynomial for knots, Jones polynomial for knotoids and Turaev loop bracket for planar knotoids.
Evaluate polynomial invariants for multiple projection directions and produce projection maps on the plane of spherical coordinates or directly on a 3D globe (in interactive WebGL format) with the 3D curve used as input.
Find knotted cores of 3D curves.
Output a 3D curve (in interactive WebGL format) and highlight its knotted core.
Generate fingerprint matrices (for open curves) and disk matrices (for closed curves) to summarize entanglement of all subchains of a 3D curve.

Please read the user guide distributed with Knoto-ID for more information.

optimusqual: Boolean regulatory network reconstruction using literature-based knowledge

Download (Linux 64-bit with libc>=2.12)

Implementation of the method described in Dorier et al. 2016.

Support

For questions concerning the software, please contact @email

SQUAD: a software for the dynamic modelling of regulatory networks using the Standardized Qualitative Approach

Download for 32-bit Linux or Windows (only works with Java 1.6!)

SQUAD is software for the dynamic modelling of regulatory networks using the Standardized Qualitative Approach published by Mendoza and Xenarios. The software is described in Di Cara et al., 2007.

The method has three novel aspects with respect to other approaches. First, the user needs to provide only the connectivity of a regulatory network; no rate values, interaction strengths, or kinetic data are needed as input. Second, the stable steady states of activation of the continuous model are found automatically, without the need to run the system from several initial states. Third, the resulting equations have diverse tunable parameters, making it possible to fit the continuous model to existing experimental data. The algorithm behind SQUAD has already been shown to correctly describe the qualitative behavior of a large regulatory network.
