Dr. Mario Lauria
Systems Biology Lab
Telethon Institute of Genetics and Medicine (TIGEM), Naples, Italy
2008 Main Track
Title: Reverse engineering of gene networks: Overview & applications
Inferring or reverse-engineering gene networks can be defined as the process of identifying gene interactions from experimental data through computational analysis. Gene expression data from microarrays are typically used for this purpose. In this talk we will give an overview of our research, covering some of the different reverse-engineering methods proposed in the field. Specifically, we will describe a mutual-information-based algorithm as an example of the probabilistic approach to the problem. We will then introduce an algorithm based on multiple linear regression that exemplifies the machine learning approach to gene network inference. We will highlight some of the tradeoffs of the described algorithms and showcase their application to representative biological problems. Finally, we will describe a synthetic biological circuit we have developed as an innovative approach to benchmarking both reverse-engineering and modeling approaches.
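As a rough illustration of the probabilistic approach mentioned in the abstract, the sketch below scores gene pairs by the mutual information of their discretized expression profiles and keeps the pairs above a cutoff. It is not the algorithm presented in the talk: the equal-width binning, the threshold, the function names (`discretize`, `mutual_information`, `infer_edges`) and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter


def discretize(values, bins=3):
    """Equal-width binning of expression values into integer labels (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    # Clamp the maximum value into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Mutual information, in bits, between two equally long label sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), count in pxy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (px[a] * py[b]))
    return mi


def infer_edges(expression, threshold, bins=3):
    """Return (gene, gene, MI) triples whose mutual information reaches the threshold."""
    labels = {gene: discretize(profile, bins) for gene, profile in expression.items()}
    genes = sorted(labels)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            mi = mutual_information(labels[g], labels[h])
            if mi >= threshold:
                edges.append((g, h, mi))
    return edges


# Hypothetical toy data: gene B tracks gene A, gene C does not.
toy_expression = {
    "A": [1, 2, 3, 4, 5, 6],
    "B": [2, 4, 6, 8, 10, 12],
    "C": [5, 1, 4, 2, 6, 3],
}
toy_edges = infer_edges(toy_expression, threshold=1.0)
```

With so few samples the estimate is heavily biased, so the threshold here only serves the toy example; real microarray data would need a principled significance cutoff.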
Prof. Giorgio Valentini
DSI, Dipartimento di Scienze dell'Informazione, Università degli Studi di Milano
2008 Main Track
Title: Unsupervised stability-based ensembles to discover reliable structures in complex bio-molecular data (invited survey talk)
Assessing the reliability of clusters discovered in bio-molecular data is a central issue in several bioinformatics problems, ranging from the definition of new taxonomies of malignancies based on bio-molecular data, to the validation of clusters of co-regulated or co-expressed genes, to the discovery of functional relationships from protein-protein interaction data. Recently, several methods based on the concept of stability have been proposed to estimate the reliability of clusters. In this conceptual framework, a clustering ensemble is obtained through bootstrapping techniques, noise injection into the data, or random projections into lower-dimensional subspaces. A measure of the reliability of a given clustering is then obtained through specific stability/reliability scores based on the similarity of the clusterings composing the ensemble. Classical stability-based methods do not provide an assessment of the statistical significance of the clustering solutions and are not able to directly detect multiple structures (e.g. hierarchical structures) simultaneously present in the data. We discuss statistical approaches based on the chi-square distribution and on the Bernstein inequality, showing that stability-based methods can be successfully applied to the statistical assessment of the reliability of clusters and to the discovery of multiple structures underlying complex bio-molecular data.
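The stability idea described above can be sketched in a few lines, under several assumptions that are not from the talk: the ensemble is built by Gaussian noise injection only, the base clusterer is a plain k-means with a deterministic start, and the similarity between two clusterings is the Rand index. The function names and toy points are hypothetical, and the chi-square and Bernstein-based significance tests mentioned in the abstract are not implemented here.

```python
import itertools
import random


def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Plain k-means with farthest-first initialisation; returns one label per point."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def rand_index(a, b):
    """Fraction of point pairs on which two clusterings agree (together vs. apart)."""
    pairs = list(itertools.combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def stability(points, k, n_runs=10, noise=0.1, seed=0):
    """Mean pairwise Rand index over clusterings of noise-perturbed copies of the data."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_runs):
        noisy = [tuple(x + rng.gauss(0, noise) for x in p) for p in points]
        ensemble.append(kmeans(noisy, k))
    scores = [rand_index(a, b) for a, b in itertools.combinations(ensemble, 2)]
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated groups of three points each.
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
score_k2 = stability(toy_points, k=2)
score_k3 = stability(toy_points, k=3)
```

A score close to 1 for a given number of clusters suggests that the partition survives perturbation; comparing scores across several values of `k` is one simple way to look for more than one plausible structure, although it says nothing about statistical significance.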
Prof. Nicolas Le Novere
Computational Neurobiology, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, UK
session Computational Intelligence
Biological Data Visualization http://www.pa.icar.cnr.it/sscibb2008/
Title: The Systems Biology Graphical Notation (invited)
Standard graphical representations have played a crucial role in science and engineering throughout the last century. Without electrical symbols, it is very likely that our industrial society would not have evolved at the same pace. Similarly, specialised notations such as the Feynman notation or process flow diagrams did a lot for the adoption of concepts in their own fields. With the advent of Systems Biology, and more recently of Synthetic Biology, the need for precise and unambiguous descriptions of biochemical interactions has become more pressing. While some ideas have been advanced over the last decade, with a few detailed proposals, no actual community standard has emerged. The Systems Biology Graphical Notation (SBGN) is a graphical representation crafted over several years by a community of biochemists, modellers and computer scientists. Three orthogonal and complementary languages have been created: the Process Diagrams, the Entity Relationship Diagrams and the Activity Flow Diagrams. Using these three idioms, a scientist can represent any network of biochemical interactions, which can then be interpreted in an unambiguous way. The set of symbols used is limited, and the grammar quite simple, to allow its use in textbooks and its teaching directly in high school. The first level of the SBGN Process Diagram language has been publicly released. Software support for SBGN Process Diagrams was developed concurrently with the specification in order to speed up public adoption. Shared by the communities of biochemists, genomicians, theoreticians and computational biologists, the SBGN languages will foster efficient storage, exchange and reuse of information on signalling pathways, metabolic networks and gene regulatory maps.