<?xml version="1.0" encoding="UTF-8"?>

<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
   xmlns="http://purl.org/rss/1.0/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/"
   xmlns:dcterms="http://purl.org/dc/terms/"

>
<channel rdf:about="http://www.citeulike.org/about">
<pubDate>Sat, 05 Jul 2008 12:55:43 BST</pubDate>


	<title>CiteULike: dpollard's alignment</title>
	<description>CiteULike: dpollard's alignment</description>


	<link>http://www.citeulike.org/user/dpollard/tag/alignment</link>
	<dc:publisher>CiteULike.org</dc:publisher>
	<dc:language>en-gb</dc:language>
	<dc:rights>Copyright &#169; 2004-2008 citeulike.org</dc:rights>
	<items>
    <rdf:Seq>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/2288308"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/2318094"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/1891904"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/1818073"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/1746568"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/1415748"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/800570"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/581203"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/525518"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/1388733"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/1320157"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/1161181"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/1082428"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/1074003"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/976858"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/455156"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/878406"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/849684"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/816850"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/698675"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/690141"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/625347"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/610970"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/580530"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/dpollard/article/573481"/>

	</rdf:Seq>
	</items>
	</channel>


<item rdf:about="http://www.citeulike.org/user/dpollard/article/2288308">
    <title>Alignment Uncertainty and Genomic Analysis</title>
    <link>http://www.citeulike.org/user/dpollard/article/2288308</link>
    <description>&lt;i&gt;Science, Vol. 319, No. 5862. (25 January 2008), pp. 473-476.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;The statistical methods applied to the analysis of genomic data do not account for uncertainty in the sequence alignment. Indeed, the alignment is treated as an observation, and all of the subsequent inferences depend on the alignment being correct. This may not have been too problematic for many phylogenetic studies, in which the gene is carefully chosen for, among other things, ease of alignment. However, in a comparative genomics study, the same statistical methods are applied repeatedly on thousands of genes, many of which will be difficult to align. Using genomic data from seven yeast species, we show that uncertainty in the alignment can lead to several problems, including different alignment methods resulting in different conclusions.</description>
    <dc:title>Alignment Uncertainty and Genomic Analysis</dc:title>

    <dc:creator>Karen Wong</dc:creator>
    <dc:creator>Marc Suchard</dc:creator>
    <dc:creator>John Huelsenbeck</dc:creator>
    <dc:identifier>doi:10.1126/science.1151532</dc:identifier>
    <dc:source>Science, Vol. 319, No. 5862. (25 January 2008), pp. 473-476.</dc:source>
    <dc:date>2008-01-25T07:09:02-00:00</dc:date>
    <prism:publicationYear>2008</prism:publicationYear>
    <prism:publicationName>Science</prism:publicationName>
    <prism:volume>319</prism:volume>
    <prism:number>5862</prism:number>
    <prism:startingPage>473</prism:startingPage>
    <prism:endingPage>476</prism:endingPage>
    <prism:category>alignment</prism:category>
    <prism:category>alignment_accuracy</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/2318094">
    <title>The effect of the guide tree on multiple sequence alignments and subsequent phylogenetic analyses.</title>
    <link>http://www.citeulike.org/user/dpollard/article/2318094</link>
    <description>&lt;i&gt;Pac Symp Biocomput (2008), pp. 25-36.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Many multiple sequence alignment methods (MSAs) use guide trees in conjunction with a progressive alignment technique to generate a multiple sequence alignment but use differing techniques to produce the guide tree and to perform the progressive alignment. In this paper we explore the consequences of changing the guide tree used for the alignment routine. We evaluate four leading MSA methods (ProbCons, MAFFT, Muscle, and ClustalW) as well as a new MSA method (FTA, for &#34;Fixed Tree Alignment&#34;) which we have developed, on a wide range of simulated datasets. Although improvements in alignment accuracy can be obtained by providing better guide trees, in general there is little effect on the &#34;accuracy&#34; (measured using the SP-score) of the alignment by improving the guide tree. However, RAxML-based phylogenetic analyses of alignments based upon better guide trees tend to be much more accurate. This impact is particularly significant for ProbCons, one of the best MSA methods currently available, and our method, FTA. Finally, for very good guide trees, phylogenies based upon FTA alignments are more accurate than phylogenies based upon ProbCons alignments, suggesting that further improvements in phylogenetic accuracy may be obtained through algorithms of this type.</description>
    <dc:title>The effect of the guide tree on multiple sequence alignments and subsequent phylogenetic analyses.</dc:title>

    <dc:creator>S Nelesen</dc:creator>
    <dc:creator>K Liu</dc:creator>
    <dc:creator>D Zhao</dc:creator>
    <dc:creator>CR Linder</dc:creator>
    <dc:creator>T Warnow</dc:creator>
    <dc:source>Pac Symp Biocomput (2008), pp. 25-36.</dc:source>
    <dc:date>2008-02-01T06:13:52-00:00</dc:date>
    <prism:publicationYear>2008</prism:publicationYear>
    <prism:publicationName>Pac Symp Biocomput</prism:publicationName>
    <prism:issn>1793-5091</prism:issn>
    <prism:startingPage>25</prism:startingPage>
    <prism:endingPage>36</prism:endingPage>
    <prism:category>accuracy</prism:category>
    <prism:category>alignment</prism:category>
    <prism:category>alignment_accuracy</prism:category>
    <prism:category>method</prism:category>
    <prism:category>multiple_alignment</prism:category>
    <prism:category>phylogeny</prism:category>
    <prism:category>reconstruction</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/1891904">
    <title>MORPH: Probabilistic Alignment Combined with Hidden Markov Models of cis-Regulatory Modules</title>
    <link>http://www.citeulike.org/user/dpollard/article/1891904</link>
    <description>&lt;i&gt;PLoS Computational Biology, Vol. 3, No. 11. (1 November 2007), e216.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;The discovery and analysis of cis-regulatory modules (CRMs) in metazoan genomes is crucial for understanding the transcriptional control of development and many other biological processes. Cross-species sequence comparison holds much promise for improving computational prediction of CRMs, for elucidating their binding site composition, and for understanding how they evolve. Current methods for analyzing orthologous CRMs from multiple species rely upon sequence alignments produced by off-the-shelf alignment algorithms, which do not exploit the presence of binding sites in the sequences. We present here a unified probabilistic framework, called MORPH, that integrates the alignment task with binding site predictions, allowing more robust CRM analysis in two species. The framework sums over all possible alignments of two sequences, thus accounting for alignment ambiguities in a natural way. We perform extensive tests on orthologous CRMs from two moderately diverged species Drosophila melanogaster and D. mojavensis, to demonstrate the advantages of the new approach. We show that it can overcome certain computational artifacts of traditional alignment tools and provide a different, likely more accurate, picture of cis-regulatory evolution than that obtained from existing methods. The burgeoning field of cis-regulatory evolution, which is amply supported by the availability of many related genomes, is currently thwarted by the lack of accurate alignments of regulatory regions. Our work will fill in this void and enable more reliable analysis of CRM evolution.</description>
    <dc:title>MORPH: Probabilistic Alignment Combined with Hidden Markov Models of cis-Regulatory Modules</dc:title>

    <dc:creator>Saurabh Sinha</dc:creator>
    <dc:creator>Xin He</dc:creator>
    <dc:identifier>doi:10.1371/journal.pcbi.0030216</dc:identifier>
    <dc:source>PLoS Computational Biology, Vol. 3, No. 11. (1 November 2007), e216.</dc:source>
    <dc:date>2007-11-10T02:00:31-00:00</dc:date>
    <prism:publicationYear>2007</prism:publicationYear>
    <prism:publicationName>PLoS Computational Biology</prism:publicationName>
    <prism:volume>3</prism:volume>
    <prism:number>11</prism:number>
    <prism:startingPage>e216</prism:startingPage>
    <prism:category>alignment</prism:category>
    <prism:category>alignment_accuracy</prism:category>
    <prism:category>binding_site_alignment</prism:category>
    <prism:category>brant_presented</prism:category>
    <prism:category>cis_regulatory_elements</prism:category>
    <prism:category>eisen_journal_club</prism:category>
    <prism:category>method</prism:category>
    <prism:category>round_robin</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/1818073">
    <title>Enhancing the quality of phylogenetic analysis using fuzzy hidden Markov model alignments.</title>
    <link>http://www.citeulike.org/user/dpollard/article/1818073</link>
    <description>&lt;i&gt;Medinfo, Vol. 12, No. Pt 2. (2007), pp. 1245-1249.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Any effective phylogeny inference based on molecular data begins by performing efficient multiple sequence alignments. So far, the Hidden Markov Model (HMM) method for multiple sequence alignment has been proved competitive to the classical deterministic algorithms with respect to phylogenetic analysis; nevertheless, its stochastic nature does not help it cope with the existing dependence among the sequence elements. This paper deals with phylogenetic analysis of protein and gene data using multiple sequence alignments produced by fuzzy profile Hidden Markov Models. Fuzzy profile HMMs are a novel type of profile HMMs based on fuzzy sets and fuzzy integrals, which generalize the classical stochastic HMM by relaxing its independence assumptions. In this paper, alignments produced by the fuzzy HMM model are used in phylogenetic analysis of protein data, enhancing the quality of phylogenetic trees. The new methodology is implemented in HPV virus phylogenetic inference. The results of the analysis are compared against those obtained by the classical profile HMM model and depict the superiority of the fuzzy profile HMM in this field.</description>
    <dc:title>Enhancing the quality of phylogenetic analysis using fuzzy hidden Markov model alignments.</dc:title>

    <dc:creator>C Collyda</dc:creator>
    <dc:creator>S Diplaris</dc:creator>
    <dc:creator>P Mitkas</dc:creator>
    <dc:creator>N Maglaveras</dc:creator>
    <dc:creator>C Pappas</dc:creator>
    <dc:source>Medinfo, Vol. 12, No. Pt 2. (2007), pp. 1245-1249.</dc:source>
    <dc:date>2007-10-25T01:22:43-00:00</dc:date>
    <prism:publicationYear>2007</prism:publicationYear>
    <prism:publicationName>Medinfo</prism:publicationName>
    <prism:volume>12</prism:volume>
    <prism:number>Pt 2</prism:number>
    <prism:startingPage>1245</prism:startingPage>
    <prism:endingPage>1249</prism:endingPage>
    <prism:category>alignment</prism:category>
    <prism:category>alignment_accuracy</prism:category>
    <prism:category>method</prism:category>
    <prism:category>phylogeny</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/1746568">
    <title>Incorporating evolution of transcription factor binding sites into annotated alignments.</title>
    <link>http://www.citeulike.org/user/dpollard/article/1746568</link>
    <description>&lt;i&gt;J Biosci, Vol. 32, No. 5. (August 2007), pp. 841-850.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Identifying transcription factor binding sites (TFBSs) is essential to elucidate putative regulatory mechanisms. A common strategy is to combine cross-species conservation with single sequence TFBS annotation to yield &#34;conserved TFBSs&#34;. Most current methods in this field adopt a multi-step approach that segregates the two aspects. Again, it is widely accepted that the evolutionary dynamics of binding sites differ from those of the surrounding sequence. Hence, it is desirable to have an approach that explicitly takes this factor into account. Although a plethora of approaches have been proposed for the prediction of conserved TFBSs, very few explicitly model TFBS evolutionary properties, while additionally being multi-step. Recently, we introduced a novel approach to simultaneously align and annotate conserved TFBSs in a pair of sequences. Building upon the standard Smith-Waterman algorithm for local alignments, SimAnn introduces additional states for profiles to output extended alignments or annotated alignments. That is, alignments with parts annotated as gaplessly aligned TFBSs (pair-profile hits) are generated. Moreover, the pair-profile related parameters are derived in a sound statistical framework. In this article, we extend this approach to explicitly incorporate evolution of binding sites in the SimAnn framework. We demonstrate the extension in the theoretical derivations through two position-specific evolutionary models, previously used for modelling TFBS evolution. In a simulated setting, we provide a proof of concept that the approach works given the underlying assumptions, as compared to the original work. Finally, using a real dataset of experimentally verified binding sites in human-mouse sequence pairs, we compare the new approach (eSimAnn) to an existing multi-step tool that also considers TFBS evolution.
Although it is widely accepted that binding sites evolve differently from the surrounding sequences, most comparative TFBS identification methods do not explicitly consider this. Additionally, prediction of conserved binding sites is carried out in a multi-step approach that segregates alignment from TFBS annotation. In this paper, we demonstrate how the simultaneous alignment and annotation approach of SimAnn can be further extended to incorporate TFBS evolutionary relationships. We study how alignments and binding site predictions interplay at varying evolutionary distances and for various profile qualities.</description>
    <dc:title>Incorporating evolution of transcription factor binding sites into annotated alignments.</dc:title>

    <dc:creator>AS Bais</dc:creator>
    <dc:creator>S Grossmann</dc:creator>
    <dc:creator>M Vingron</dc:creator>
    <dc:source>J Biosci, Vol. 32, No. 5. (August 2007), pp. 841-850.</dc:source>
    <dc:date>2007-10-09T17:18:49-00:00</dc:date>
    <prism:publicationYear>2007</prism:publicationYear>
    <prism:publicationName>J Biosci</prism:publicationName>
    <prism:issn>0250-5991</prism:issn>
    <prism:volume>32</prism:volume>
    <prism:number>5</prism:number>
    <prism:startingPage>841</prism:startingPage>
    <prism:endingPage>850</prism:endingPage>
    <prism:category>alignment</prism:category>
    <prism:category>binding_site</prism:category>
    <prism:category>conserved</prism:category>
    <prism:category>method</prism:category>
    <prism:category>prediction</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/1415748">
    <title>Measuring the accuracy of genome-size multiple alignments</title>
    <link>http://www.citeulike.org/user/dpollard/article/1415748</link>
    <description>&lt;i&gt;Genome Biology, Vol. 8 (26 June 2007), R124.&lt;/i&gt;</description>
    <dc:title>Measuring the accuracy of genome-size multiple alignments</dc:title>

    <dc:creator>Amol Prakash</dc:creator>
    <dc:creator>Martin Tompa</dc:creator>
    <dc:identifier>doi:10.1186/gb-2007-8-6-r124</dc:identifier>
    <dc:source>Genome Biology, Vol. 8 (26 June 2007), R124.</dc:source>
    <dc:date>2007-06-27T10:59:45-00:00</dc:date>
    <prism:publicationYear>2007</prism:publicationYear>
    <prism:publicationName>Genome Biology</prism:publicationName>
    <prism:issn>1465-6906</prism:issn>
    <prism:volume>8</prism:volume>
    <prism:startingPage>R124</prism:startingPage>
    <prism:category>alignment</prism:category>
    <prism:category>alignment_accuracy</prism:category>
    <prism:category>whole_genome</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/800570">
    <title>Detecting the limits of regulatory element conservation and divergence estimation using pairwise and multiple alignments</title>
    <link>http://www.citeulike.org/user/dpollard/article/800570</link>
    <description>&lt;i&gt;BMC Bioinformatics, Vol. 7 (14 August 2006), 376.&lt;/i&gt;</description>
    <dc:title>Detecting the limits of regulatory element conservation and divergence estimation using pairwise and multiple alignments</dc:title>

    <dc:creator>Daniel Pollard</dc:creator>
    <dc:creator>Alan Moses</dc:creator>
    <dc:creator>Venky Iyer</dc:creator>
    <dc:creator>Michael Eisen</dc:creator>
    <dc:identifier>doi:10.1186/1471-2105-7-376</dc:identifier>
    <dc:source>BMC Bioinformatics, Vol. 7 (14 August 2006), 376.</dc:source>
    <dc:date>2006-08-14T05:59:05-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>BMC Bioinformatics</prism:publicationName>
    <prism:issn>1471-2105</prism:issn>
    <prism:volume>7</prism:volume>
    <prism:startingPage>376</prism:startingPage>
    <prism:category>alignment</prism:category>
    <prism:category>alignment_accuracy</prism:category>
    <prism:category>binding_site_alignment</prism:category>
    <prism:category>divergence_estimation</prism:category>
    <prism:category>tree_topology</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/581203">
    <title>Multiple sequence alignment accuracy and evolutionary distance estimation.</title>
    <link>http://www.citeulike.org/user/dpollard/article/581203</link>
    <description>&lt;i&gt;BMC Bioinformatics, Vol. 6 (2005)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;BACKGROUND: Sequence alignment is a common tool in bioinformatics and comparative genomics. It is generally assumed that multiple sequence alignment yields better results than pair wise sequence alignment, but this assumption has rarely been tested, and never with the control provided by simulation analysis. This study used sequence simulation to examine the gain in accuracy of adding a third sequence to a pair wise alignment, particularly concentrating on how the phylogenetic position of the additional sequence relative to the first pair changes the accuracy of the initial pair's alignment as well as their estimated evolutionary distance. RESULTS: The maximal gain in alignment accuracy was found not when the third sequence is directly intermediate between the initial two sequences, but rather when it perfectly subdivides the branch leading from the root of the tree to one of the original sequences (making it half as close to one sequence as the other). Evolutionary distance estimation in the multiple alignment framework, however, is largely unrelated to alignment accuracy and rather is dependent on the position of the third sequence; the closer the branch leading to the third sequence is to the root of the tree, the larger the estimated distance between the first two sequences. CONCLUSION: The bias in distance estimation appears to be a direct result of the standard greedy progressive algorithm used by many multiple alignment methods. These results have implications for choosing new taxa and genomes to sequence when resources are limited.</description>
    <dc:title>Multiple sequence alignment accuracy and evolutionary distance estimation.</dc:title>

    <dc:creator>MS Rosenberg</dc:creator>
    <dc:identifier>doi:10.1186/1471-2105-6-278</dc:identifier>
    <dc:source>BMC Bioinformatics, Vol. 6 (2005)</dc:source>
    <dc:date>2006-04-10T18:05:54-00:00</dc:date>
    <prism:publicationYear>2005</prism:publicationYear>
    <prism:publicationName>BMC Bioinformatics</prism:publicationName>
    <prism:issn>1471-2105</prism:issn>
    <prism:volume>6</prism:volume>
    <prism:category>alignment</prism:category>
    <prism:category>alignment_accuracy</prism:category>
    <prism:category>divergence_estimation</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/525518">
    <title>The many faces of sequence alignment.</title>
    <link>http://www.citeulike.org/user/dpollard/article/525518</link>
    <description>&lt;i&gt;Brief Bioinform, Vol. 6, No. 1. (March 2005), pp. 6-22.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Starting with the sequencing of the mouse genome in 2002, we have entered a period where the main focus of genomics will be to compare multiple genomes in order to learn about human biology and evolution at the DNA level. Alignment methods are the main computational component of this endeavour. This short review aims to summarise the current status of research in alignments, emphasising large-scale genomic comparisons and suggesting possible directions that will be explored in the near future.</description>
    <dc:title>The many faces of sequence alignment.</dc:title>

    <dc:creator>S Batzoglou</dc:creator>
    <dc:identifier>doi:10.1093/bib/6.1.6</dc:identifier>
    <dc:source>Brief Bioinform, Vol. 6, No. 1. (March 2005), pp. 6-22.</dc:source>
    <dc:date>2006-03-01T16:58:55-00:00</dc:date>
    <prism:publicationYear>2005</prism:publicationYear>
    <prism:publicationName>Brief Bioinform</prism:publicationName>
    <prism:issn>1467-5463</prism:issn>
    <prism:volume>6</prism:volume>
    <prism:number>1</prism:number>
    <prism:startingPage>6</prism:startingPage>
    <prism:endingPage>22</prism:endingPage>
    <prism:category>alignment</prism:category>
    <prism:category>review</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/1388733">
    <title>Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome</title>
    <link>http://www.citeulike.org/user/dpollard/article/1388733</link>
    <description>&lt;i&gt;Genome Res., Vol. 17, No. 6. (1 June 2007), pp. 760-774.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy). We describe the quantitative and qualitative trade-offs concomitant with alignment method choice and the levels of technical error that need to be accounted for in applications that require multisequence alignments. Using the generated alignments, we identified constrained regions using three different methods. While the different constraint-detecting methods are in general agreement, there are important discrepancies relating to both the underlying alignments and the specific algorithms. However, by integrating the results across the alignments and constraint-detecting methods, we produced constraint annotations that were found to be robust based on multiple independent measures. Analyses of these annotations illustrate that most classes of experimentally annotated functional elements are enriched for constrained sequences; however, large portions of each class (with the exception of protein-coding sequences) do not overlap constrained regions. The latter elements might not be under primary sequence constraint, might not be constrained across all mammals, or might have expendable molecular functions. Conversely, 40% of the constrained sequences do not overlap any of the functional elements that have been experimentally identified. 
Together, these findings demonstrate and quantify how many genomic functional elements await basic molecular characterization.</description>
    <dc:title>Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome</dc:title>

    <dc:creator>Elliott Margulies</dc:creator>
    <dc:creator>Gregory Cooper</dc:creator>
    <dc:creator>George Asimenos</dc:creator>
    <dc:creator>Daryl Thomas</dc:creator>
    <dc:creator>Colin Dewey</dc:creator>
    <dc:creator>Adam Siepel</dc:creator>
    <dc:creator>Ewan Birney</dc:creator>
    <dc:creator>Damian Keefe</dc:creator>
    <dc:creator>Ariel Schwartz</dc:creator>
    <dc:creator>Minmei Hou</dc:creator>
    <dc:creator>James Taylor</dc:creator>
    <dc:creator>Sergey Nikolaev</dc:creator>
    <dc:creator>Juan Montoya-Burgos</dc:creator>
    <dc:creator>Ari Loytynoja</dc:creator>
    <dc:creator>Simon Whelan</dc:creator>
    <dc:creator>Fabio Pardi</dc:creator>
    <dc:creator>Tim Massingham</dc:creator>
    <dc:creator>James Brown</dc:creator>
    <dc:creator>Peter Bickel</dc:creator>
    <dc:creator>Ian Holmes</dc:creator>
    <dc:creator>James Mullikin</dc:creator>
    <dc:creator>Abel Ureta-Vidal</dc:creator>
    <dc:creator>Benedict Paten</dc:creator>
    <dc:creator>Eric Stone</dc:creator>
    <dc:creator>Kate Rosenbloom</dc:creator>
    <dc:creator>James Kent</dc:creator>
    <dc:creator>Gerard Bouffard</dc:creator>
    <dc:creator>Xiaobin Guan</dc:creator>
    <dc:creator>Nancy Hansen</dc:creator>
    <dc:creator>Jacquelyn Idol</dc:creator>
    <dc:creator>Valerie Maduro</dc:creator>
    <dc:creator>Baishali Maskeri</dc:creator>
    <dc:creator>Jennifer Mcdowell</dc:creator>
    <dc:creator>Morgan Park</dc:creator>
    <dc:creator>Pamela Thomas</dc:creator>
    <dc:creator>Alice Young</dc:creator>
    <dc:creator>Robert Blakesley</dc:creator>
    <dc:creator>Donna Muzny</dc:creator>
    <dc:creator>Erica Sodergren</dc:creator>
    <dc:creator>David Wheeler</dc:creator>
    <dc:creator>Kim Worley</dc:creator>
    <dc:creator>Huaiyang Jiang</dc:creator>
    <dc:creator>George Weinstock</dc:creator>
    <dc:creator>Richard Gibbs</dc:creator>
    <dc:creator>Tina Graves</dc:creator>
    <dc:creator>Robert Fulton</dc:creator>
    <dc:creator>Elaine Mardis</dc:creator>
    <dc:creator>Richard Wilson</dc:creator>
    <dc:creator>Michele Clamp</dc:creator>
    <dc:creator>James Cuff</dc:creator>
    <dc:creator>Sante Gnerre</dc:creator>
    <dc:creator>David Jaffe</dc:creator>
    <dc:creator>Jean Chang</dc:creator>
    <dc:creator>Kerstin Lindblad-Toh</dc:creator>
    <dc:creator>Eric Lander</dc:creator>
    <dc:creator>Angie Hinrichs</dc:creator>
    <dc:creator>Heather Trumbower</dc:creator>
    <dc:creator>Hiram Clawson</dc:creator>
    <dc:creator>Ann Zweig</dc:creator>
    <dc:creator>Robert Kuhn</dc:creator>
    <dc:creator>Galt Barber</dc:creator>
    <dc:creator>Rachel Harte</dc:creator>
    <dc:creator>Donna Karolchik</dc:creator>
    <dc:creator>Matthew Field</dc:creator>
    <dc:creator>Richard Moore</dc:creator>
    <dc:creator>Carrie Matthewson</dc:creator>
    <dc:creator>Jacqueline Schein</dc:creator>
    <dc:creator>Marco Marra</dc:creator>
    <dc:creator>Stylianos Antonarakis</dc:creator>
    <dc:creator>Serafim Batzoglou</dc:creator>
    <dc:creator>Nick Goldman</dc:creator>
    <dc:creator>Ross Hardison</dc:creator>
    <dc:creator>David Haussler</dc:creator>
    <dc:creator>Webb Miller</dc:creator>
    <dc:creator>Lior Pachter</dc:creator>
    <dc:creator>Eric Green</dc:creator>
    <dc:creator>Arend Sidow</dc:creator>
    <dc:identifier>doi:10.1101/gr.6034307</dc:identifier>
    <dc:source>Genome Res., Vol. 17, No. 6. (1 June 2007), pp. 760-774.</dc:source>
    <dc:date>2007-06-14T00:04:53-00:00</dc:date>
    <prism:publicationYear>2007</prism:publicationYear>
    <prism:publicationName>Genome Res.</prism:publicationName>
    <prism:volume>17</prism:volume>
    <prism:number>6</prism:number>
    <prism:startingPage>760</prism:startingPage>
    <prism:endingPage>774</prism:endingPage>
    <prism:category>alignment</prism:category>
    <prism:category>constraint</prism:category>
    <prism:category>encode</prism:category>
    <prism:category>mammal</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/1320157">
    <title>STAMP: a web tool for exploring DNA-binding motif similarities.</title>
    <link>http://www.citeulike.org/user/dpollard/article/1320157</link>
    <description>&lt;i&gt;Nucleic Acids Res (3 May 2007)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;STAMP is a newly developed web server that is designed to support the study of DNA-binding motifs. STAMP may be used to query motifs against databases of known motifs; the software aligns input motifs against the chosen database (or alternatively against a user-provided dataset), and lists of the highest-scoring matches are returned. Such similarity-search functionality is expected to facilitate the identification of transcription factors that potentially interact with newly discovered motifs. STAMP also automatically builds multiple alignments, familial binding profiles and similarity trees when more than one motif is inputted. These functions are expected to enable evolutionary studies on sets of related motifs and fixed-order regulatory modules, as well as illustrating similarities and redundancies within the input motif collection. STAMP is a highly flexible alignment platform, allowing users to 'mix-and-match' between various implemented comparison metrics, alignment methods (local or global, gapped or ungapped), multiple alignment strategies and tree-building methods. Motifs may be inputted as frequency matrices (in many of the commonly used formats), consensus sequences, or alignments of known binding sites. STAMP also directly accepts the output files from 12 supported motif-finders, enabling quick interpretation of motif-discovery analyses. STAMP is available at http://www.benoslab.pitt.edu/stamp.</description>
    <dc:title>STAMP: a web tool for exploring DNA-binding motif similarities.</dc:title>

    <dc:creator>Shaun Mahony</dc:creator>
    <dc:creator>Panayiotis V Benos</dc:creator>
    <dc:source>Nucleic Acids Res (3 May 2007)</dc:source>
    <dc:date>2007-05-22T23:40:05-00:00</dc:date>
    <prism:publicationYear>2007</prism:publicationYear>
    <prism:publicationName>Nucleic Acids Res</prism:publicationName>
    <prism:issn>1362-4962</prism:issn>
    <prism:category>alignment</prism:category>
    <prism:category>method</prism:category>
    <prism:category>motif</prism:category>
    <prism:category>pwm</prism:category>
    <prism:category>website</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/1161181">
    <title>Incorporating Indel Information into Phylogeny Estimation for Rapidly Emerging Pathogens</title>
    <link>http://www.citeulike.org/user/dpollard/article/1161181</link>
    <description>&lt;i&gt;BMC Evolutionary Biology, Vol. 7 (14 March 2007), 40.&lt;/i&gt;</description>
    <dc:title>Incorporating Indel Information into Phylogeny Estimation for Rapidly Emerging Pathogens</dc:title>

    <dc:creator>Benjamin Redelings</dc:creator>
    <dc:creator>Marc Suchard</dc:creator>
    <dc:identifier>doi:10.1186/1471-2148-7-40</dc:identifier>
    <dc:source>BMC Evolutionary Biology, Vol. 7 (14 March 2007), 40.</dc:source>
    <dc:date>2007-03-14T19:28:41-00:00</dc:date>
    <prism:publicationYear>2007</prism:publicationYear>
    <prism:publicationName>BMC Evolutionary Biology</prism:publicationName>
    <prism:issn>1471-2148</prism:issn>
    <prism:volume>7</prism:volume>
    <prism:startingPage>40</prism:startingPage>
    <prism:category>alignment</prism:category>
    <prism:category>gene</prism:category>
    <prism:category>indels</prism:category>
    <prism:category>method</prism:category>
    <prism:category>phylogeny</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/1082428">
    <title>Multiple alignment by sequence annealing.</title>
    <link>http://www.citeulike.org/user/dpollard/article/1082428</link>
    <description>&lt;i&gt;Bioinformatics, Vol. 23, No. 2. (15 January 2007)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;MOTIVATION: We introduce a novel approach to multiple alignment that is based on an algorithm for rapidly checking whether single matches are consistent with a partial multiple alignment. This leads to a sequence annealing algorithm, which is an incremental method for building multiple sequence alignments one match at a time. Our approach improves significantly on the standard progressive alignment approach to multiple alignment. RESULTS: The sequence annealing algorithm performs well on benchmark test sets of protein sequences. It is not only sensitive, but also specific, drastically reducing the number of incorrectly aligned residues in comparison to other programs. The method allows for adjustment of the sensitivity/specificity tradeoff and can be used to reliably identify homologous regions among protein sequences. AVAILABILITY: An implementation of the sequence annealing algorithm is available at http://bio.math.berkeley.edu/amap/</description>
    <dc:title>Multiple alignment by sequence annealing.</dc:title>

    <dc:creator>AS Schwartz</dc:creator>
    <dc:creator>L Pachter</dc:creator>
    <dc:source>Bioinformatics, Vol. 23, No. 2. (15 January 2007)</dc:source>
    <dc:date>2007-02-01T19:46:44-00:00</dc:date>
    <prism:publicationYear>2007</prism:publicationYear>
    <prism:publicationName>Bioinformatics</prism:publicationName>
    <prism:issn>1460-2059</prism:issn>
    <prism:volume>23</prism:volume>
    <prism:number>2</prism:number>
    <prism:category>alignment</prism:category>
    <prism:category>method</prism:category>
    <prism:category>protein</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/1074003">
    <title>Simultaneous alignment and annotation of cis-regulatory regions.</title>
    <link>http://www.citeulike.org/user/dpollard/article/1074003</link>
    <description>&lt;i&gt;Bioinformatics, Vol. 23, No. 2. (15 January 2007)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;MOTIVATION: Current methods that annotate conserved transcription factor binding sites in an alignment of two regulatory regions perform the alignment and annotation step separately and combine the results in the end. If the site descriptions are weak or the sequence similarity is low, the local gap structure of the alignment poses a problem in detecting the conserved sites. It is therefore desirable to have an approach that is able to simultaneously consider the alignment as well as possibly matching site locations. RESULTS: With SimAnn we have developed a tool that serves exactly this purpose. By combining the annotation step and the alignment of the two sequences into one algorithm, it detects conserved sites more clearly. It has the additional advantage that all parameters are calculated based on statistical considerations. This allows for its successful application with any binding site model of interest. We present the algorithm and the approach for parameter selection and compare its performance with that of other, non-simultaneous methods on both simulated and real data. AVAILABILITY: A command-line based C++ implementation of SimAnn is available from the authors upon request. In addition, we provide Perl scripts for calculating the input parameters based on statistical considerations.</description>
    <dc:title>Simultaneous alignment and annotation of cis-regulatory regions.</dc:title>

    <dc:creator>AS Bais</dc:creator>
    <dc:creator>S Grossmann</dc:creator>
    <dc:creator>M Vingron</dc:creator>
    <dc:source>Bioinformatics, Vol. 23, No. 2. (15 January 2007)</dc:source>
    <dc:date>2007-01-29T08:42:16-00:00</dc:date>
    <prism:publicationYear>2007</prism:publicationYear>
    <prism:publicationName>Bioinformatics</prism:publicationName>
    <prism:issn>1460-2059</prism:issn>
    <prism:volume>23</prism:volume>
    <prism:number>2</prism:number>
    <prism:category>alignment</prism:category>
    <prism:category>binding_site</prism:category>
    <prism:category>prediction</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/976858">
    <title>Logarithmic gap costs decrease alignment accuracy</title>
    <link>http://www.citeulike.org/user/dpollard/article/976858</link>
    <description>&lt;i&gt;BMC Bioinformatics, Vol. 7 (05 December 2006), 527.&lt;/i&gt;</description>
    <dc:title>Logarithmic gap costs decrease alignment accuracy</dc:title>

    <dc:creator>Reed Cartwright</dc:creator>
    <dc:identifier>doi:10.1186/1471-2105-7-527</dc:identifier>
    <dc:source>BMC Bioinformatics, Vol. 7 (05 December 2006), 527.</dc:source>
    <dc:date>2006-12-06T14:56:51-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>BMC Bioinformatics</prism:publicationName>
    <prism:issn>1471-2105</prism:issn>
    <prism:volume>7</prism:volume>
    <prism:startingPage>527</prism:startingPage>
    <prism:category>accuracy</prism:category>
    <prism:category>alignment</prism:category>
    <prism:category>indels</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/455156">
    <title>LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA.</title>
    <link>http://www.citeulike.org/user/dpollard/article/455156</link>
    <description>&lt;i&gt;Genome Res, Vol. 13, No. 4. (April 2003), pp. 721-731.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;To compare entire genomes from different species, biologists increasingly need alignment methods that are efficient enough to handle long sequences, and accurate enough to correctly align the conserved biological features between distant species. We present LAGAN, a system for rapid global alignment of two homologous genomic sequences, and Multi-LAGAN, a system for multiple global alignment of genomic sequences. We tested our systems on a data set consisting of greater than 12 Mb of high-quality sequence from 12 vertebrate species. All the sequence was derived from the genomic region orthologous to an approximately 1.5-Mb region on human chromosome 7q31.3. We found that both LAGAN and Multi-LAGAN compare favorably with other leading alignment methods in correctly aligning protein-coding exons, especially between distant homologs such as human and chicken, or human and fugu. Multi-LAGAN produced the most accurate alignments, while requiring just 75 minutes on a personal computer to obtain the multiple alignment of all 12 sequences. Multi-LAGAN is a practical method for generating multiple alignments of long genomic sequences at any evolutionary distance. Our systems are publicly available at http://lagan.stanford.edu.</description>
    <dc:title>LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA.</dc:title>

    <dc:creator>M Brudno</dc:creator>
    <dc:creator>CB Do</dc:creator>
    <dc:creator>GM Cooper</dc:creator>
    <dc:creator>MF Kim</dc:creator>
    <dc:creator>E Davydov</dc:creator>
    <dc:creator>ED Green</dc:creator>
    <dc:creator>A Sidow</dc:creator>
    <dc:creator>S Batzoglou</dc:creator>
    <dc:identifier>doi:10.1101/gr.926603</dc:identifier>
    <dc:source>Genome Res, Vol. 13, No. 4. (April 2003), pp. 721-731.</dc:source>
    <dc:date>2006-01-04T08:56:20-00:00</dc:date>
    <prism:publicationYear>2003</prism:publicationYear>
    <prism:publicationName>Genome Res</prism:publicationName>
    <prism:issn>1088-9051</prism:issn>
    <prism:volume>13</prism:volume>
    <prism:number>4</prism:number>
    <prism:startingPage>721</prism:startingPage>
    <prism:endingPage>731</prism:endingPage>
    <prism:category>alignment</prism:category>
    <prism:category>noncoding</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/878406">
    <title>Vertebrate gene finding from multiple-species alignments using a two-level strategy.</title>
    <link>http://www.citeulike.org/user/dpollard/article/878406</link>
    <description>&lt;i&gt;Genome Biol, Vol. 7 Suppl 1 (2006)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;BACKGROUND: One way in which the accuracy of gene structure prediction in vertebrate DNA sequences can be improved is by analyzing alignments with multiple related species, since functional regions of genes tend to be more conserved. RESULTS: We describe DOGFISH, a vertebrate gene finder consisting of a cleanly separated site classifier and structure predictor. The classifier scores potential splice sites and other features, using sequence alignments between multiple vertebrate species, while the structure predictor hypothesizes coding transcripts by combining these scores using a simple model of gene structure. This also identifies and assigns confidence scores to possible additional exons. Performance is assessed on the ENCODE regions. We predict transcripts and exons across the whole human genome, and identify over 10,000 high confidence new coding exons not in the Ensembl gene set. CONCLUSION: We present a practical multiple species gene prediction method. Accuracy improves as additional species, up to at least eight, are introduced. The novel predictions of the whole-genome scan should support efficient experimental verification.</description>
    <dc:title>Vertebrate gene finding from multiple-species alignments using a two-level strategy.</dc:title>

    <dc:creator>D Carter</dc:creator>
    <dc:creator>R Durbin</dc:creator>
    <dc:identifier>doi:10.1186/gb-2006-7-s1-s6</dc:identifier>
    <dc:source>Genome Biol, Vol. 7 Suppl 1 (2006)</dc:source>
    <dc:date>2006-09-29T23:21:40-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>Genome Biol</prism:publicationName>
    <prism:issn>1465-6914</prism:issn>
    <prism:volume>7 Suppl 1</prism:volume>
    <prism:category>alignment</prism:category>
    <prism:category>gene</prism:category>
    <prism:category>prediction</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/849684">
    <title>Quantification of the variation in percentage identity for protein sequence alignments</title>
    <link>http://www.citeulike.org/user/dpollard/article/849684</link>
    <description>&lt;i&gt;BMC Bioinformatics, Vol. 7 (19 September 2006), 415.&lt;/i&gt;</description>
    <dc:title>Quantification of the variation in percentage identity for protein sequence alignments</dc:title>

    <dc:creator>Ps Raghava</dc:creator>
    <dc:creator>Geoffrey Barton</dc:creator>
    <dc:identifier>doi:10.1186/1471-2105-7-415</dc:identifier>
    <dc:source>BMC Bioinformatics, Vol. 7 (19 September 2006), 415.</dc:source>
    <dc:date>2006-09-19T12:09:59-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>BMC Bioinformatics</prism:publicationName>
    <prism:issn>1471-2105</prism:issn>
    <prism:volume>7</prism:volume>
    <prism:startingPage>415</prism:startingPage>
    <prism:category>accuracy</prism:category>
    <prism:category>alignment</prism:category>
    <prism:category>gene</prism:category>
    <prism:category>method</prism:category>
    <prism:category>protein</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/816850">
    <title>MACO: A gapped-alignment scoring tool for comparing transcription factor binding sites.</title>
    <link>http://www.citeulike.org/user/dpollard/article/816850</link>
    <description>&lt;i&gt;In Silico Biol, Vol. 6, No. 4. (29 May 2006)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;We have implemented a novel gapped-alignment algorithm to compare Position Frequency Matrices (PFM) for Transcription Factor Binding Sites. The application compares an input PFM with those collected from public databases and outputs similarity scores, sequence alignments and related PFM clusters. MACO is freely accessible on a web server located at www.nicemice.cn/bioinfo/MACO. Source code is distributed upon request to the authors.</description>
    <dc:title>MACO: A gapped-alignment scoring tool for comparing transcription factor binding sites.</dc:title>

    <dc:creator>Gang Su</dc:creator>
    <dc:creator>Binchen Mao</dc:creator>
    <dc:creator>Jin Wang</dc:creator>
    <dc:source>In Silico Biol, Vol. 6, No. 4. (29 May 2006)</dc:source>
    <dc:date>2006-08-25T16:55:51-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>In Silico Biol</prism:publicationName>
    <prism:issn>1386-6338</prism:issn>
    <prism:volume>6</prism:volume>
    <prism:number>4</prism:number>
    <prism:category>alignment</prism:category>
    <prism:category>matrix</prism:category>
    <prism:category>pwm</prism:category>
    <prism:category>similarity</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/698675">
    <title>Close sequence comparisons are sufficient to identify human cis-regulatory elements.</title>
    <link>http://www.citeulike.org/user/dpollard/article/698675</link>
    <description>&lt;i&gt;Genome Res (12 June 2006)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons. To address this problem, we identified evolutionarily conserved noncoding regions in primate, mammalian, and more distant comparisons using a uniform approach (Gumby) that facilitates unbiased assessment of the impact of evolutionary distance on predictive power. We benchmarked computational predictions against previously identified cis-regulatory elements at diverse genomic loci and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using an in vivo enhancer assay in transgenic mice. Human regulatory elements were identified with acceptable sensitivity (53%-80%) and true-positive rate (27%-67%) by comparison with one to five other eutherian mammals or six other simian primates. More distant comparisons (marsupial, avian, amphibian, and fish) failed to identify many of the empirically defined functional noncoding elements. Our results highlight the practical utility of close sequence comparisons, and the loss of sensitivity entailed by more distant comparisons. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole-genome comparative analysis that explains most of the observations from empirical benchmarking. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for in vivo testing at embryonic time points.</description>
    <dc:title>Close sequence comparisons are sufficient to identify human cis-regulatory elements.</dc:title>

    <dc:creator>Shyam Prabhakar</dc:creator>
    <dc:creator>Francis Poulin</dc:creator>
    <dc:creator>Malak Shoukry</dc:creator>
    <dc:creator>Veena Afzal</dc:creator>
    <dc:creator>Edward M Rubin</dc:creator>
    <dc:creator>Olivier Couronne</dc:creator>
    <dc:creator>Len A Pennacchio</dc:creator>
    <dc:identifier>doi:10.1101/gr.4717506</dc:identifier>
    <dc:source>Genome Res (12 June 2006)</dc:source>
    <dc:date>2006-06-16T19:46:25-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>Genome Res</prism:publicationName>
    <prism:issn>1088-9051</prism:issn>
    <prism:category>alignment</prism:category>
    <prism:category>annotation</prism:category>
    <prism:category>conserved</prism:category>
    <prism:category>constraint</prism:category>
    <prism:category>evolution</prism:category>
    <prism:category>human</prism:category>
    <prism:category>mammal</prism:category>
    <prism:category>prediction</prism:category>
    <prism:category>rate</prism:category>
    <prism:category>selection</prism:category>
    <prism:category>transcription</prism:category>
    <prism:category>turnover</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/690141">
    <title>MCALIGN2: Faster, accurate global pairwise alignment of non-coding DNA sequences based on explicit models of indel evolution</title>
    <link>http://www.citeulike.org/user/dpollard/article/690141</link>
    <description>&lt;i&gt;BMC Bioinformatics, Vol. 7 (08 June 2006), 292.&lt;/i&gt;</description>
    <dc:title>MCALIGN2: Faster, accurate global pairwise alignment of non-coding DNA sequences based on explicit models of indel evolution</dc:title>

    <dc:creator>Jun Wang</dc:creator>
    <dc:creator>Peter Keightley</dc:creator>
    <dc:creator>Toby Johnson</dc:creator>
    <dc:identifier>doi:10.1186/1471-2105-7-292</dc:identifier>
    <dc:source>BMC Bioinformatics, Vol. 7 (08 June 2006), 292.</dc:source>
    <dc:date>2006-06-08T22:15:15-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>BMC Bioinformatics</prism:publicationName>
    <prism:issn>1471-2105</prism:issn>
    <prism:volume>7</prism:volume>
    <prism:startingPage>292</prism:startingPage>
    <prism:category>alignment</prism:category>
    <prism:category>evolution</prism:category>
    <prism:category>indel</prism:category>
    <prism:category>model</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/625347">
    <title>Evolutionary turnover of mammalian transcription start sites.</title>
    <link>http://www.citeulike.org/user/dpollard/article/625347</link>
    <description>&lt;i&gt;Genome Res (10 May 2006)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Alignments of homologous genomic sequences are widely used to identify functional genetic elements and study their evolution. Most studies tacitly equate homology of functional elements with sequence homology. This assumption is violated by the phenomenon of turnover, in which functionally equivalent elements reside at locations that are nonorthologous at the sequence level. Turnover has been demonstrated previously for transcription factor-binding sites. Here, we show that transcription start sites of equivalent genes do not always reside at equivalent locations in the human and mouse genomes. We also identify two types of partial turnover, illustrating evolutionary pathways that could lead to complete turnover. These findings suggest that the signals encoding transcription start sites are highly flexible and evolvable, and have cautionary implications for the use of sequence-level conservation to detect gene regulatory elements.</description>
    <dc:title>Evolutionary turnover of mammalian transcription start sites.</dc:title>

    <dc:creator>Martin C Frith</dc:creator>
    <dc:creator>Jasmina Ponjavic</dc:creator>
    <dc:creator>David Fredman</dc:creator>
    <dc:creator>Chikatoshi Kai</dc:creator>
    <dc:creator>Jun Kawai</dc:creator>
    <dc:creator>Piero Carninci</dc:creator>
    <dc:creator>Yoshihide Hayashizaki</dc:creator>
    <dc:creator>Albin Sandelin</dc:creator>
    <dc:identifier>doi:10.1101/gr.5031006</dc:identifier>
    <dc:source>Genome Res (10 May 2006)</dc:source>
    <dc:date>2006-05-12T16:39:35-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>Genome Res</prism:publicationName>
    <prism:issn>1088-9051</prism:issn>
    <prism:category>alignment</prism:category>
    <prism:category>conserved</prism:category>
    <prism:category>evolution</prism:category>
    <prism:category>gene</prism:category>
    <prism:category>mammal</prism:category>
    <prism:category>promoter</prism:category>
    <prism:category>transcription</prism:category>
    <prism:category>turnover</prism:category>
    <prism:category>vertebrate</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/610970">
    <title>Evaluating phylogenetic footprinting for human-rodent comparisons.</title>
    <link>http://www.citeulike.org/user/dpollard/article/610970</link>
    <description>&lt;i&gt;Bioinformatics, Vol. 22, No. 4. (15 February 2006), pp. 430-437.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;MOTIVATION: 'Phylogenetic footprinting' is a widely applied approach to identify regulatory regions and potential transcription factor binding sites (TFBSs) using alignments of non-coding orthologous regions from two or more organisms. A systematic evaluation of its validity and usability based on known TFBSs is needed to use phylogenetic footprinting most effectively in the identification of unknown TFBSs. RESULTS: In this paper we use 2678 human, mouse and rat TFBSs from the TRANSFAC database for this evaluation. To ensure the retrieval of correct orthologous sequences, we combine gene annotation and sequence homology searches. Demanding a sequence identity of at least 65% is most effective in discriminating TFBSs from non-functional sequence parts, while different alignment algorithms only have a minor influence on TFBS identification by human-rodent comparisons. With this threshold approximately 72% of the known TFBSs are found conserved, a number which varies significantly between different transcription factors and also depends on the function of the regulated gene. TFBSs for certain transcription factors do not require strict sequence conservation but instead may show a high pattern conservation, limiting somewhat the validity of purely sequence-based phylogenetic footprinting.</description>
    <dc:title>Evaluating phylogenetic footprinting for human-rodent comparisons.</dc:title>

    <dc:creator>T Sauer</dc:creator>
    <dc:creator>E Shelest</dc:creator>
    <dc:creator>E Wingender</dc:creator>
    <dc:source>Bioinformatics, Vol. 22, No. 4. (15 February 2006), pp. 430-437.</dc:source>
    <dc:date>2006-05-01T21:43:56-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>Bioinformatics</prism:publicationName>
    <prism:issn>1367-4803</prism:issn>
    <prism:volume>22</prism:volume>
    <prism:number>4</prism:number>
    <prism:startingPage>430</prism:startingPage>
    <prism:endingPage>437</prism:endingPage>
    <prism:category>alignment</prism:category>
    <prism:category>binding</prism:category>
    <prism:category>conserved</prism:category>
    <prism:category>evolution</prism:category>
    <prism:category>mammal</prism:category>
    <prism:category>motif</prism:category>
    <prism:category>rate</prism:category>
    <prism:category>transcription</prism:category>
    <prism:category>vertebrate</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/580530">
    <title>Reference based annotation with GeneMapper.</title>
    <link>http://www.citeulike.org/user/dpollard/article/580530</link>
    <description>&lt;i&gt;Genome Biol, Vol. 7, No. 4. (5 April 2006)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;We introduce GeneMapper, a program for transferring annotations from a well annotated genome to other genomes. Drawing on high quality curated annotations, GeneMapper enables rapid and accurate annotation of newly sequenced genomes and is suitable for both finished and draft genomes. GeneMapper uses a profile based approach for mapping genes into multiple species, improving upon the standard pairwise approach. GeneMapper is freely available for academic use.</description>
    <dc:title>Reference based annotation with GeneMapper.</dc:title>

    <dc:creator>Sourav Chatterji</dc:creator>
    <dc:creator>Lior Pachter</dc:creator>
    <dc:identifier>doi:10.1186/gb-2006-7-4-r29</dc:identifier>
    <dc:source>Genome Biol, Vol. 7, No. 4. (5 April 2006)</dc:source>
    <dc:date>2006-04-08T22:23:11-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>Genome Biol</prism:publicationName>
    <prism:issn>1465-6914</prism:issn>
    <prism:volume>7</prism:volume>
    <prism:number>4</prism:number>
    <prism:category>alignment</prism:category>
    <prism:category>annotation</prism:category>
    <prism:category>gene</prism:category>
    <prism:category>method</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/dpollard/article/573481">
    <title>Sigma: multiple alignment of weakly-conserved non-coding DNA sequence.</title>
    <link>http://www.citeulike.org/user/dpollard/article/573481</link>
    <description>&lt;i&gt;BMC Bioinformatics, Vol. 7, No. 1. (16 March 2006)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;BACKGROUND: Existing tools for multiple-sequence alignment focus on aligning protein sequence or protein-coding DNA sequence, and are often based on extensions to Needleman-Wunsch-like pairwise alignment methods. We introduce a new tool, Sigma, with a new algorithm and scoring scheme designed specifically for non-coding DNA sequence. This problem acquires importance with the increasing number of published sequences of closely-related species. In particular, studies of gene regulation seek to take advantage of comparative genomics, and recent algorithms for finding regulatory sites in phylogenetically-related intergenic sequence require alignment as a preprocessing step. Much can also be learned about evolution from intergenic DNA, which tends to evolve faster than coding DNA. Sigma uses a strategy of seeking the best possible gapless local alignments (a strategy earlier used by DiAlign), at each step making the best possible alignment consistent with existing alignments, and scores the significance of the alignment based on the lengths of the aligned fragments and a background model which may be supplied or estimated from an auxiliary file of intergenic DNA. RESULTS: Comparative tests of Sigma with five earlier algorithms on synthetic data generated to mimic real data show excellent performance, with Sigma balancing high &#8220;sensitivity&#8221; (more bases aligned) with effective filtering of &#8220;incorrect&#8221; alignments. With real data, while &#8220;correctness&#8221; can't be directly quantified for the alignment, running the PhyloGibbs motif finder on pre-aligned sequence suggests that Sigma's alignments are superior. CONCLUSIONS: By taking into account the peculiarities of non-coding DNA, Sigma fills a gap in the toolbox of bioinformatics.</description>
    <dc:title>Sigma: multiple alignment of weakly-conserved non-coding DNA sequence.</dc:title>

    <dc:creator>Rahul Siddharthan</dc:creator>
    <dc:identifier>doi:10.1186/1471-2105-7-143</dc:identifier>
    <dc:source>BMC Bioinformatics, Vol. 7, No. 1. (16 March 2006)</dc:source>
    <dc:date>2006-04-02T22:07:36-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>BMC Bioinformatics</prism:publicationName>
    <prism:issn>1471-2105</prism:issn>
    <prism:volume>7</prism:volume>
    <prism:number>1</prism:number>
    <prism:category>alignment</prism:category>
    <prism:category>noncoding</prism:category>
</item>



</rdf:RDF>

