Gaps in structurally similar proteins: towards improvement of multiple sequence alignment.

Proteins, Vol. 54, No. 1. (1 January 2004), pp. 71-87.

analogAI's tags for this article

protein-sequence sequencealignment structuralalignment

Notes for this article

analogAI has 0 private notes and 1 public note for this article.

May be the answer to how to deal with gappy alignments.

analogAI (public note) - 2006-06-29 05:45:13

Abstract

An algorithm was developed to locally optimize gaps from the FSSP database. Over 2 million gaps were identified from all versus all FSSP structure comparisons, and datasets of non-identical gaps and flanking regions comprising between 90,000 and 135,000 sequence fragments were extracted for statistical analysis. Relative to background frequencies, gaps were enriched in residue types with small side chains and high turn propensity (D, G, N, P, S), and were depleted in residue types with hydrophobic side chains (C, F, I, L, V, W, Y). In contrast, regions flanking a gap exhibited opposite trends in amino acid frequencies, i.e., enrichment in hydrophobic residues and a high degree of secondary structure. Log-odds scores of residue type as a function of position in or around a gap were derived from the statistics. Three simple experiments demonstrated that these scores contained significant predictive information. First, regions where gaps were observed in single sequences taken from HOMSTRAD structure-based multiple sequence alignments generally scored higher than regions where gaps were not observed. Second, given the correct pairwise-aligned cores, the actual positions of gaps could be reproduced from sequence more accurately using the structurally-derived statistics than by using random pairwise alignments. Finally, revision of the Clustal-W residue-specific gap opening parameters with this new information improved the agreement of Clustal-W alignments with the structure-based alignments. At least three applications for these results are envisioned: improvement of gap penalties in pairwise (or multiple) sequence alignment, prediction of regions of single sequences likely (or unlikely) to contain indels, and more accurate placement of gaps in automated pairwise structure alignment.
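
The abstract mentions log-odds scores of residue type relative to background frequencies but does not state the formula. As a rough illustration only, the sketch below assumes the conventional definition, log2(observed frequency / background frequency), with an optional pseudocount. The function name `log_odds`, the four-letter alphabet, and all counts are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def log_odds(observed, background, pseudocount=1.0):
    """Return log2(p_observed / p_background) for each residue type.

    observed:   Counter of residue counts at one position class (e.g. inside gaps)
    background: dict mapping residue type to its background frequency
    This is the textbook log-odds ratio, assumed here; the paper's exact
    scoring scheme may differ.
    """
    residues = sorted(background)
    # Add the pseudocount to every residue so unseen types do not give log(0).
    total = sum(observed[r] for r in residues) + pseudocount * len(residues)
    scores = {}
    for r in residues:
        p_obs = (observed[r] + pseudocount) / total
        scores[r] = math.log2(p_obs / background[r])
    return scores


# Hypothetical toy data: a four-residue alphabet with uniform background.
background = {"G": 0.25, "P": 0.25, "L": 0.25, "V": 0.25}
gap_counts = Counter({"G": 40, "P": 30, "L": 15, "V": 15})

scores = log_odds(gap_counts, background, pseudocount=0.0)
for residue, score in scores.items():
    print(f"{residue}: {score:+.3f}")
```

In this toy input, residues that are over-represented in gaps (G, P) receive positive scores and under-represented ones (L, V) receive negative scores, which mirrors the direction of the trend the abstract reports for small, turn-forming versus hydrophobic residues.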

