Conceptual modelling of genomic information.
MOTIVATION: Genome sequencing projects are making available complete records of the genetic make-up of organisms. These core data sets are themselves complex, and present challenges to those who seek to store, analyse and present the information. However, in addition to the sequence data, high-throughput experiments are making available distinctive new data sets on protein interactions, on the phenotypic consequences of gene deletions, and on the transcriptome, proteome and metabolome. The effective description and management of such data is of considerable importance to bioinformatics in the post-genomic era. The provision of clear and intuitive models of complex information is surprisingly challenging, and this paper presents conceptual models for a range of important emerging information resources in bioinformatics. It is hoped that these can be of benefit to bioinformaticians as they attempt to integrate genetic and phenotypic data with data from genomic sequences, in order both to assign gene functions and to elucidate the different pathways of gene action and interaction. RESULTS: This paper presents a collection of conceptual (i.e. implementation-independent) data models for genomic data. These conceptual models are amenable to (more or less direct) implementation on different computing platforms.
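To illustrate what a "more or less direct" implementation of a conceptual model might look like, the following is a minimal sketch in Python. It is not taken from the paper: the classes `Gene`, `Protein`, `ProteinInteraction` and `GenomeModel`, their attributes, and the 1-based inclusive coordinate convention are all hypothetical, chosen only to show how entities and relationships in a conceptual model could map onto program types.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class Gene:
    """Hypothetical entity: a gene located on a chromosome."""
    name: str
    chromosome: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive

    def length(self) -> int:
        # Inclusive coordinates, so both end points are counted.
        return self.end - self.start + 1


@dataclass(frozen=True)
class Protein:
    """Hypothetical entity: a protein linked to the gene that encodes it."""
    name: str
    encoded_by: Gene


@dataclass(frozen=True)
class ProteinInteraction:
    """Hypothetical relationship: an observed interaction between two proteins."""
    bait: Protein
    prey: Protein
    experiment: str  # free-text label for the originating experiment


@dataclass
class GenomeModel:
    """Hypothetical container tying sequence-level and interaction data together."""
    genes: Dict[str, Gene] = field(default_factory=dict)
    interactions: List[ProteinInteraction] = field(default_factory=list)

    def add_gene(self, gene: Gene) -> None:
        self.genes[gene.name] = gene

    def add_interaction(self, interaction: ProteinInteraction) -> None:
        self.interactions.append(interaction)

    def partners(self, gene_name: str) -> Set[str]:
        """Return names of genes whose products interact with this gene's product."""
        found: Set[str] = set()
        for i in self.interactions:
            bait_gene = i.bait.encoded_by.name
            prey_gene = i.prey.encoded_by.name
            if bait_gene == gene_name:
                found.add(prey_gene)
            if prey_gene == gene_name:
                found.add(bait_gene)
        return found


# Usage with placeholder names (not real genes or experimental results).
gene_a = Gene("geneA", "chrI", 100, 199)
gene_b = Gene("geneB", "chrI", 500, 899)
model = GenomeModel()
model.add_gene(gene_a)
model.add_gene(gene_b)
model.add_interaction(
    ProteinInteraction(Protein("protA", gene_a), Protein("protB", gene_b), "example-screen")
)
```

The same conceptual entities could equally be realised as relational tables or object database classes; the point of the sketch is only that the model itself does not commit to any one of these platforms.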