The gene distribution of the human genome.
Linear correlations exist between the GC levels of third codon positions (GC3) of individual human genes and the GC levels of the long genomic sequences and DNA molecules (50-100 kb in size) in which the genes are embedded. These linear relationships allow the GC3 histogram of cDNA sequences from the databases to be positioned relative to the CsCl density-gradient profile of human DNA. This, in turn, allows the relative concentrations of genes in genomic regions of different GC content to be estimated. An estimate obtained from current sequence data and from Gaussian decompositions of the GC3 histogram and of the CsCl profile indicates that the GC-richest (non-ribosomal) component of the human genome is at least 17 times as gene-rich as the GC-poor regions. Moreover, our results suggest that the most recent physical maps of the human genome built from overlapping yeast artificial chromosomes (YACs) cover less than 50% of the genes.
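The arithmetic behind such an estimate can be illustrated with a minimal sketch. It assumes a linear relation of the form GC3 = slope × GC(genomic) + intercept and takes the relative gene concentration of a component as its share of genes divided by its share of DNA. The function names, the three-component split, and all numeric values are hypothetical placeholders chosen for easy arithmetic; none of them is taken from the paper.

```python
def embedding_gc_from_gc3(gc3, slope, intercept):
    """Invert an assumed linear relation GC3 = slope * GC_genomic + intercept.

    slope and intercept are hypothetical regression parameters, not
    values reported in the source.
    """
    if slope == 0:
        raise ValueError("slope must be non-zero to invert the relation")
    return (gc3 - intercept) / slope


def relative_gene_concentration(gene_fraction, dna_fraction):
    """Gene density of each component, scaled so the sparsest component is 1.

    gene_fraction: share of genes assigned to each component
                   (e.g. from a decomposition of the GC3 histogram).
    dna_fraction:  share of genomic DNA in each component
                   (e.g. from a decomposition of the CsCl profile).
    """
    if gene_fraction.keys() != dna_fraction.keys():
        raise ValueError("both inputs must describe the same components")
    # Density = genes per unit of DNA in each component.
    density = {name: gene_fraction[name] / dna_fraction[name]
               for name in gene_fraction}
    baseline = min(density.values())
    return {name: value / baseline for name, value in density.items()}


# Placeholder shares (NOT data from the paper).
gene_fraction = {"GC-poor": 0.25, "GC-intermediate": 0.25, "GC-rich": 0.5}
dna_fraction = {"GC-poor": 0.5, "GC-intermediate": 0.25, "GC-rich": 0.25}

ratios = relative_gene_concentration(gene_fraction, dna_fraction)
# {'GC-poor': 1.0, 'GC-intermediate': 2.0, 'GC-rich': 4.0}
```

With these placeholder shares, the GC-rich component holds half of the genes in a quarter of the DNA, so it comes out four times as gene-dense as the GC-poor component. The same ratio, computed from the real decompositions, is the kind of quantity the "at least 17 times" estimate refers to.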