
walshtp's library 129 articles

 
 

Conveyor: a workflow engine for bioinformatic analyses

Bioinformatics, Vol. 27, No. 7. (01 April 2011), pp. 903-911, doi:10.1093/bioinformatics/btr040
posted to bioinformatics workflow by walshtp on 2011-04-06 10:02:16, along with 16 people and 1 group

Abstract

Motivation: The rapidly increasing amounts of data available from new high-throughput methods have made data processing without automated pipelines infeasible. As was pointed out in several publications, integration of data and analytic resources into workflow systems provides a solution to this problem, simplifying the task of data analysis. Various applications for defining and running workflows in the field of bioinformatics have been proposed and published, e.g. Galaxy, Mobyle, Taverna, Pegasus or Kepler. One of the main aims of such workflow systems ...

 

A Quick Guide to Organizing Computational Biology Projects

PLoS Comput Biol, Vol. 5, No. 7. (31 July 2009), e1000424, doi:10.1371/journal.pcbi.1000424
posted to bioinformatics by walshtp on 2011-04-02 22:35:31, along with 133 people and 9 groups
 

Managing and Analyzing Next-Generation Sequence Data

PLoS Comput Biol, Vol. 5, No. 6. (26 June 2009), e1000369, doi:10.1371/journal.pcbi.1000369
posted to bigdata ngs by walshtp on 2011-04-02 22:33:17, along with 93 people and 6 groups
 

A Practical Comparison of De Novo Genome Assembly Software Tools for Next-Generation Sequencing Technologies

PLoS ONE, Vol. 6, No. 3. (14 March 2011), e17915, doi:10.1371/journal.pone.0017915
posted to ngs sequence tools by walshtp on 2011-04-02 10:45:15, along with 31 people and 6 groups

Abstract

The advent of next-generation sequencing technologies is accompanied with the development of many whole-genome sequence assembly methods and software, especially for de novo fragment assembly. Due to the poor knowledge about the applicability and performance of these software tools, choosing a befitting assembler becomes a tough task. Here, we provide the information of adaptivity for each program, then above all, compare the performance of eight distinct tools against eight groups of simulated datasets from Solexa sequencing platform. Considering the computational time, ...

 

Exact and complete short read alignment to microbial genomes using GPU programming

Bioinformatics, Vol. 27, No. 10. (30 March 2011), pp. 1351-1358, doi:10.1093/bioinformatics/btr151
posted to gpu ngs by walshtp on 2011-03-31 09:23:45, along with 15 people and 3 groups

Abstract

Motivation: The introduction of next generation sequencing techniques and especially the high-throughput systems Solexa (Illumina Inc.) and SOLiD (ABI) made the mapping of short reads to reference sequences a standard application in modern bioinformatics. Short read alignment is needed for reference based re-sequencing of complete genomes as well as for gene expression analysis based on transcriptome sequencing. Several approaches were developed during the last years allowing for a fast alignment of short sequences to a given template. Methods available to date ...

 

Sequencing delivers diminishing returns for homology detection: implications for mapping the protein universe

Bioinformatics, Vol. 26, No. 21. (1 November 2010), pp. 2664-2671, doi:10.1093/bioinformatics/btq527
posted to bioinformatics protein_sequence_alignment sequence sequence_search by walshtp on 2010-11-26 10:36:49, along with 6 people and 1 group

Abstract

Motivation: Databases of sequenced genomes are widely used to characterize the structure, function and evolutionary relationships of proteins. The ability to discern such relationships is widely expected to grow as sequencing projects provide novel information, bridging gaps in our map of the protein universe. Results: We have plotted our progress in protein sequencing over the last two decades and found that the rate of novel sequence discovery is in a sustained period of decline. Consequently, PSI-BLAST, the most widely used method to ...

 

Search and clustering orders of magnitude faster than BLAST.

Bioinformatics (Oxford, England), Vol. 26, No. 19. (1 October 2010), pp. 2460-2461, doi:10.1093/bioinformatics/btq461
posted to bioinformatics sequence_clustering sequence_search ublast usearch by walshtp on 2010-10-25 11:09:10, along with 26 people and 4 groups

Abstract

Biological sequence data is accumulating rapidly, motivating the development of improved high-throughput methods for sequence classification. UBLAST and USEARCH are new algorithms enabling sensitive local and global search of large sequence databases at exceptionally high speeds. They are often orders of magnitude faster than BLAST in practical applications, though sensitivity ...

 

Fast and accurate protein substructure searching with simulated annealing and GPUs.

BMC bioinformatics, Vol. 11, No. 1. (2010), 446, doi:10.1186/1471-2105-11-446
posted to gpu protein_structure_search by walshtp on 2010-09-03 13:19:26, along with 3 people and 1 group

Abstract

BACKGROUND: Searching a database of protein structures for matches to a query structure, or occurrences of a structural motif, is an important task in structural biology and bioinformatics. While there are many existing methods for structural similarity searching, faster and more accurate approaches are still required, and few current methods are capable of substructure (motif) searching. RESULTS: We developed an improved heuristic for tableau-based protein structure and substructure searching using simulated annealing, that is as fast or faster and comparable in ...

 

High quality protein sequence alignment by combining structural profile prediction and profile alignment using SABER-TOOTH.

BMC bioinformatics, Vol. 11, No. 1. (2010), 251, doi:10.1186/1471-2105-11-251

Abstract

BACKGROUND: Protein alignments are an essential tool for many bioinformatics analyses. While sequence alignments are accurate for proteins of high sequence similarity, they become unreliable as they approach the so-called 'twilight zone' where sequence similarity gets indistinguishable from random. For such distant pairs, structure alignment is of much better quality. Nevertheless, sequence alignment is the only choice in the majority of cases where structural data is not available. This situation demands development of methods that extend the applicability of accurate sequence ...

 

The case for cloud computing in genome informatics

Genome Biology, Vol. 11, No. 5. (5 May 2010), 207, doi:10.1186/gb-2010-11-5-207
posted to bigdata bioinformatics cloudcomputing nextgen by walshtp on 2010-05-05 16:02:34 (average rating 4.0), along with 51 people and 6 groups

Abstract

With DNA sequencing now getting cheaper more quickly than data storage or computation, the time may have come for genome informatics to migrate to the cloud. ...

 

Directionality in protein fold prediction.

BMC bioinformatics, Vol. 11, No. 1. (7 April 2010), 172, doi:10.1186/1471-2105-11-172
posted to no-tag by walshtp on 2010-04-07 14:21:29, along with 5 people and 1 group

Abstract

BACKGROUND: Ever since the ground-breaking work of Anfinsen et al. in which a denatured protein was found to refold to its native state, it has been frequently stated by the protein fold prediction community that all the information required for protein folding lies in the amino acid sequence. Recent in vitro experiments and in silico computational studies, however, have shown that cotranslation may affect the folding pathway of some proteins, especially those of ancient folds. In this paper aspects of ...

 

Improving pairwise sequence alignment accuracy using near-optimal protein sequence alignments

BMC Bioinformatics, Vol. 11, No. 1. (2010), 146, doi:10.1186/1471-2105-11-146
posted to no-tag by walshtp on 2010-03-22 12:58:37, along with 4 people and 1 group

Abstract

BACKGROUND: While the pairwise alignments produced by sequence similarity searches are a powerful tool for identifying homologous proteins - proteins that share a common ancestor and a similar structure; pairwise sequence alignments often fail to represent accurately the structural alignments inferred from three-dimensional coordinates. Since sequence alignment algorithms produce optimal alignments, the best structural alignments must reflect suboptimal sequence alignment scores. Thus, we have examined a range of suboptimal sequence alignments and a range of scoring parameters to understand better which sequence ...

 

ProbABEL package for genome-wide association analysis of imputed data

BMC Bioinformatics, Vol. 11, No. 1. (2010), 134, doi:10.1186/1471-2105-11-134
posted to probabel by walshtp on 2010-03-16 13:38:38, along with 4 people and 1 group

Abstract

BACKGROUND: Over the last few years, genome-wide association (GWA) studies became a tool of choice for the identification of loci associated with complex traits. Currently, imputed single nucleotide polymorphisms (SNP) data are frequently used in GWA analyses. Correct analysis of imputed data calls for the implementation of specific methods which take genotype imputation uncertainty into account. RESULTS: We developed the ProbABEL software package for the analysis of genome-wide imputed SNP data and quantitative, binary, and time-till-event outcomes under linear, logistic, and Cox proportional hazards ...

 

Fast statistical alignment.

PLoS computational biology, Vol. 5, No. 5. (29 May 2009), e1000392, doi:10.1371/journal.pcbi.1000392
posted to alignment fsa msa multiple sequence by walshtp on 2010-02-23 12:09:21, along with 25 people and 1 group

Abstract

We describe a new program for the alignment of multiple biological sequences that is both statistically motivated and fast enough for problem sizes that arise in practice. Our Fast Statistical Alignment program is based on pair hidden Markov models which approximate an insertion/deletion process on a tree and uses a sequence annealing algorithm to combine the posterior probabilities estimated from these models into a multiple alignment. FSA uses its explicit statistical model to produce multiple alignments which are accompanied by estimates ...

 

Large-scale comparison of protein sequence alignment algorithms with structure alignments

Proteins, Vol. 40, No. 1. (2000), pp. 6-22, doi:10.1002/(sici)1097-0134(20000701)40:1<6::aid-prot30>3.0.co;2-7
posted to alignment multiple sequence by walshtp on 2010-02-21 13:04:22

Abstract

Sequence alignment programs such as BLAST and PSI-BLAST are used routinely in pairwise, profile-based, or intermediate-sequence-search (ISS) methods to detect remote homologies for the purposes of fold assignment and comparative modeling. Yet, the sequence alignment quality of these methods at low sequence identity is not known. We have used the CE structure alignment program (Shindyalov and Bourne, Prot Eng 1998;11:739) to derive sequence alignments for all superfamily and family-level related proteins in the SCOP domain database. CE aligns structures ...

 

A Quick Guide for Developing Effective Bioinformatics Programming Skills

PLoS Comput Biol, Vol. 5, No. 12. (24 December 2009), e1000589, doi:10.1371/journal.pcbi.1000589
posted to bioinformatics programming by walshtp on 2009-12-24 13:29:11 (average rating 4.0), along with 163 people and 9 groups
 

BLAST+: architecture and applications.

BMC bioinformatics, Vol. 10, No. 1. (2009), 421, doi:10.1186/1471-2105-10-421
posted to bioinformatics blast software by walshtp on 2009-12-15 18:25:34, along with 41 people and 3 groups

Abstract

Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings in the user-interface of the current command-line applications. ...

 

Optimizing substitution matrix choice and gap parameters for sequence alignment

BMC Bioinformatics, Vol. 10, No. 1. (2 December 2009), 396, doi:10.1186/1471-2105-10-396
posted to sequence_alignment by walshtp on 2009-12-02 18:29:32, along with 5 people and 2 groups

Abstract

BACKGROUND: While substitution matrices can readily be computed from reference alignments, it is challenging to compute optimal or approximately optimal gap penalties. It is also not well understood which substitution matrices are the most effective when alignment accuracy is the goal rather than homolog recognition. Here a new parameter optimization procedure, POP, is described and applied to the problems of optimizing gap penalties and selecting substitution matrices for pair-wise global protein alignments. RESULTS: POP is compared to a recent method due to Kim and ...

 

Sector and Sphere: the design and implementation of a high-performance data cloud

Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, Vol. 367, No. 1897. (28 June 2009), pp. 2429-2445, doi:10.1098/rsta.2009.0053
posted to cloud cloudcomputing hpc sector sphere by walshtp on 2009-11-13 09:39:02, along with 12 people

Abstract

Cloud computing has demonstrated that processing very large datasets over commodity clusters can be done simply, given the right programming model and infrastructure. In this paper, we describe the design and implementation of the Sector storage cloud and the Sphere compute cloud. By contrast with the existing storage and compute clouds, Sector can manage data not only within a data centre, but also across geographically distributed data centres. Similarly, the Sphere compute cloud supports user-defined functions (UDFs) over data both within ...

 

PDBe: Protein Data Bank in Europe

Nucl. Acids Res., Vol. 38, No. suppl_1. (1 January 2010), pp. D308-317, doi:10.1093/nar/gkp916
posted to database ebi pdb pdbe protein_structure wwpdb by walshtp on 2009-11-04 15:26:46, along with 3 people and 1 group

Abstract

The Protein Data Bank in Europe (PDBe) (http://www.ebi.ac.uk/pdbe/) is actively working with its Worldwide Protein Data Bank partners to enhance the quality and consistency of the international archive of bio-macromolecular structure data, the Protein Data Bank (PDB). PDBe also works closely with its collaborators at the European Bioinformatics Institute and the scientific community around the world to enhance its databases and services by adding curated and actively maintained derived data to the existing structural data in the PDB. We have developed ...

 

Striped Smith–Waterman speeds database searches six times over other SIMD implementations

Bioinformatics, Vol. 23, No. 2. (15 January 2007), pp. 156-161, doi:10.1093/bioinformatics/btl582

Abstract

Motivation: The only algorithm guaranteed to find the optimal local alignment is the Smith–Waterman. It is also one of the slowest due to the number of computations required for the search. To speed up the algorithm, Single-Instruction Multiple-Data (SIMD) instructions have been used to parallelize the algorithm at the instruction level. ...

 

CloudBurst: highly sensitive read mapping with MapReduce

Bioinformatics, Vol. 25, No. 11. (1 June 2009), pp. 1363-1369, doi:10.1093/bioinformatics/btp236
posted to hadoop mapreduce nextgen by walshtp on 2009-11-02 21:38:05 (average rating 3.0), along with 53 people and 3 groups

Abstract

Motivation: Next-generation DNA sequencing machines are generating an enormous amount of sequence data, placing unprecedented demands on traditional single-processor read-mapping algorithms. CloudBurst is a new parallel read-mapping algorithm optimized for mapping next-generation sequence data to the human genome and other reference genomes, for use in a variety of biological analyses including SNP discovery, genotyping and personal genomics. It is modeled after the short read-mapping program RMAP, and reports either all alignments or the unambiguous best alignment for each read with any ...

 

Userscripts for the Life Sciences

BMC Bioinformatics, Vol. 8, No. 1. (2007), 487, doi:10.1186/1471-2105-8-487
posted to no-tag by walshtp on 2009-10-22 14:03:09, along with 25 people and 2 groups

Abstract

BACKGROUND: The web has seen an explosion of chemistry and biology related resources in the last 15 years: thousands of scientific journals, databases, wikis, blogs and resources are available with a wide variety of types of information. There is a huge need to aggregate and organise this information. However, the sheer number of resources makes it unrealistic to link them all in a centralised manner. Instead, search engines to find information in those resources flourish, and formal languages like Resource Description Framework ...

 

Breaking the hierarchy - a new cluster selection mechanism for hierarchical clustering methods

Algorithms for Molecular Biology, Vol. 4, No. 1. (2009), 12, doi:10.1186/1748-7188-4-12
posted to algorithm clustering by walshtp on 2009-10-19 21:11:22, along with 1 person

Abstract

BACKGROUND: Hierarchical clustering methods like Ward's method have been used since decades to understand biological and chemical data sets. In order to get a partition of the data set, it is necessary to choose an optimal level of the hierarchy by a so-called level selection algorithm. In 2005, a new kind of hierarchical clustering method was introduced by Palla et al. that differs in two ways from Ward's method: it can be used on data on which no full similarity matrix is ...

 

Algorithm::Evolutionary, a flexible Perl module for evolutionary computation

Soft Computing - A Fusion of Foundations, Methodologies and Applications, doi:10.1007/s00500-009-0504-3
posted to evolutionary_computation perl programming by walshtp on 2009-10-05 12:16:17

Abstract

This paper describes Algorithm::Evolutionary (A::E), a Perl module released under an open source license and designed for the exploration and exploitation of evolutionary algorithms. We describe the design decisions taken to enhance flexibility, how performance was improved by using several implementation tweaks, and what kind of design patterns were applied for its development. This work also tries to dispel the myth of low performance of scripting languages by comparing it with a state-of-the-art library (ECJ) written in Java. Besides, ...

 

Alignment Uncertainty and Genomic Analysis

Science, Vol. 319, No. 5862. (25 January 2008), pp. 473-476, doi:10.1126/science.1151532
posted to alignment_methods genomic_data genomics sequence_alignment by walshtp on 2009-09-29 14:16:02, along with 45 people and 2 groups

Abstract

The statistical methods applied to the analysis of genomic data do not account for uncertainty in the sequence alignment. Indeed, the alignment is treated as an observation, and all of the subsequent inferences depend on the alignment being correct. This may not have been too problematic for many phylogenetic studies, in which the gene is carefully chosen for, among other things, ease of alignment. However, in a comparative genomics study, the same statistical methods are applied repeatedly on thousands of genes, ...

 

Global Analysis of Cdk1 Substrate Phosphorylation Sites Provides Insights into Evolution

Science, Vol. 325, No. 5948. (25 September 2009), pp. 1682-1686, doi:10.1126/science.1172867

Abstract

To explore the mechanisms and evolution of cell-cycle control, we analyzed the position and conservation of large numbers of phosphorylation sites for the cyclin-dependent kinase Cdk1 in the budding yeast Saccharomyces cerevisiae. We combined specific chemical inhibition of Cdk1 with quantitative mass spectrometry to identify the positions of 547 phosphorylation sites on 308 Cdk1 substrates in vivo. Comparisons of these substrates with orthologs throughout the ascomycete lineage revealed that the position of most phosphorylation sites is not conserved in evolution; instead, ...

 

Fast embedding methods for clustering tens of thousands of sequences

Computational Biology and Chemistry, Vol. 32, No. 4. (August 2008), pp. 282-286, doi:10.1016/j.compbiolchem.2008.03.005
posted to clustering sequence by walshtp on 2009-09-26 14:28:16, along with 1 person and 2 groups

Abstract

Most sequence clustering methods require a full distance matrix to be computed between all pairs of sequences. This requires computer memory and time proportional to N^2 for N sequences. For small N or say up to 10000 or so, this can be accomplished in reasonable times for sequences of moderate length. For very large N, however, this becomes increasingly prohibitive. In this paper, we have tested variations on a class of published embedding methods that have been designed for clustering large ...

 

Upcoming challenges for multiple sequence alignment methods in the high-throughput era.

Bioinformatics (Oxford, England), Vol. 25, No. 19. (1 October 2009), pp. 2455-2465, doi:10.1093/bioinformatics/btp452
posted to alignment hts multiple ngs sequence by walshtp on 2009-09-26 13:23:48, along with 16 people and 2 groups

Abstract

This review focuses on recent trends in multiple sequence alignment tools. It describes the latest algorithmic improvements including the extension of consistency-based methods to the problem of template-based multiple sequence alignments. Some results are presented suggesting that template-based methods are significantly more accurate than simpler alternative methods. The validation of existing methods is also discussed at length with the detailed description of recent results and ...

 

Efficient SCOP-fold classification and retrieval using index-based protein substructure alignments

Bioinformatics, Vol. 25, No. 19. (1 October 2009), pp. 2559-2565, doi:10.1093/bioinformatics/btp474

Abstract

Motivation: To investigate structure-function relationships, life sciences researchers usually retrieve and classify proteins with similar substructures into the same fold. A manually constructed database, SCOP, is believed to be highly accurate; however, it is labor intensive. Another known method, DALI, is also precise but computationally expensive. We have developed an efficient algorithm, namely, index-based protein substructure alignment (IPSA), for protein-fold classification. IPSA constructs a two-layer indexing tree to quickly retrieve similar substructures in proteins and suggests possible folds by aligning these ...

 

ShortRead: a bioconductor package for input, quality assessment and exploration of high-throughput sequence data

Bioinformatics, Vol. 25, No. 19. (1 October 2009), pp. 2607-2608, doi:10.1093/bioinformatics/btp450
posted to bioconductor nextgen ngs r by walshtp on 2009-09-26 13:19:31, along with 28 people and 2 groups

Abstract

Summary: ShortRead is a package for input, quality assessment, manipulation and output of high-throughput sequencing data. ShortRead is provided in the R and Bioconductor environments, allowing ready access to additional facilities for advanced statistical analysis, data transformation, visualization and integration with diverse genomic resources. Availability and Implementation: This package is implemented in R and available at the Bioconductor web site; the package contains a ‘vignette’ outlining typical work flows. Contact: mtmorgan@fhcrc.org ...

 

Maximum likelihood genome assembly.

Journal of computational biology : a journal of computational molecular cell biology, Vol. 16, No. 8. (August 2009), pp. 1101-1116, doi:10.1089/cmb.2009.0047

Abstract

Whole genome shotgun assembly is the process of taking many short sequenced segments (reads) and reconstructing the genome from which they originated. We demonstrate how the technique of bidirected network flow can be used to explicitly model the double-stranded nature of DNA for genome assembly. By combining an algorithm for the Chinese Postman Problem on bidirected graphs with the construction of a bidirected de Bruijn graph, we are able to find the shortest double-stranded DNA sequence that contains a given set ...

 

Probing the “Dark Matter” of Protein Fold Space

Structure, Vol. 17, No. 9. (09 September 2009), pp. 1244-1252, doi:10.1016/j.str.2009.07.012

Abstract

We used a protein structure prediction method to generate a variety of folds as alpha-carbon models with realistic secondary structures and good hydrophobic packing. The prediction method used only idealized constructs that are not based on known protein structures or fragments of them, producing an unbiased distribution. Model and native fold comparison used a topology-based method as superposition can only be relied on in similar structures. When all the models were compared to a nonredundant set of all known structures, only ...

 

Cloud computing

Bioinformatics, Vol. 25, No. 12. (15 June 2009), pp. 1475-1475, doi:10.1093/bioinformatics/btp274
posted to cloudcomputing by walshtp  on 2009-09-16 21:50:32 ** along with 42 people and 3 groups aan abhishek_tiwari acesforum agomez andrea_bio antonio-pgarcia apaydin barry bertelsen bmajoros CoffeeCat cymacs druvus dswan dullhunk fergus fgibson flbarroso Gig77 Hanzhij irishoconnor jamiemcquay jfr Jingbo jmeppley JoseBrox kiekyon lmichan lxm1117 mfenner mh52 mikel_egana niallhaslam pekrau rdiaz reyez rossmounce rrbarb seb1 sobolevnrm tnhh unidodo Bioinformatics McCammon structural_bioinformatics


 

Fast Mapping of Short Sequences with Mismatches, Insertions and Deletions Using Index Structures

PLoS Comput Biol, Vol. 5, No. 9. (11 September 2009), e1000502, doi:10.1371/journal.pcbi.1000502
posted to genomics hts nextgen sequencing by walshtp  on 2009-09-16 21:06:28 ** along with 24 people and 4 groups agbiotec behindtherabbit cthachuk darian djkt druvus dswan epermal fenghezi heliopais humburg hzoltan jandot jmcq n00c natstreet nuin ongenetics orzenil polivares ppgardne RFMcC stubrown torfinnnome BergmanLab Bioinformatics iSEEM Journal picks

Abstract

With few exceptions, current methods for short read mapping make use of simple seed heuristics to speed up the search. Most of the underlying matching models neglect the necessity to allow not only mismatches, but also insertions and deletions. Current evaluations indicate, however, that very different error models apply to the novel high-throughput sequencing methods. While the most frequent error-type in Illumina reads are mismatches, reads produced by 454's GS FLX predominantly contain insertions and deletions (indels). Even though 454 sequencers ...

 

TagDust--a program to eliminate artifacts from next generation sequencing data.

Bioinformatics (Oxford, England), Vol. 25, No. 21. (1 November 2009), pp. 2839-2840, doi:10.1093/bioinformatics/btp527
posted to nextgen sequencing_technologies by walshtp  on 2009-09-16 07:29:25 ** along with 36 people and 3 groups accopeland antonkratz cantalapiedra chvlyl clearbluespring darian debasis djkt druvus fuadgwadry GeeSharpMinor guhjy GustavoLacerda heliopais humburg hzoltan idonaldson jandot jforment jgill junehlee justinhjohnson kamilkonowalik kshameer n00c NGS_Array_References nuin operon orzenil robertorun rschulz rtogawa seb1 sotacam stajich torfinnnome Bioinformatics Core Service Journal picks Mycology

Abstract

Next-generation parallel sequencing technologies produce large quantities of short sequence reads. Due to experimental procedures various types of artifacts are commonly sequenced alongside the targeted RNA or DNA sequences. Identification of such artifacts is important during the development of novel sequencing assays and for the downstream analysis of the sequenced libraries. ...

 

Pvclust: an R package for assessing the uncertainty in hierarchical clustering

Bioinformatics, Vol. 22, No. 12. (15 June 2006), pp. 1540-1542, doi:10.1093/bioinformatics/btl117

Abstract

Summary: Pvclust is an add-on package for a statistical software R to assess the uncertainty in hierarchical cluster analysis. Pvclust can be used easily for general statistical problems, such as DNA microarray analysis, to perform the bootstrap analysis of clustering, which has been popular in phylogenetic analysis. Pvclust calculates probability values (p-values) for each cluster using bootstrap resampling techniques. Two types of p-values are available: approximately unbiased (AU) p-value and bootstrap probability (BP) value. Multiscale bootstrap resampling is used for the ...

 

XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services

BMC Bioinformatics, Vol. 10, No. 1. (2009), 279, doi:10.1186/1471-2105-10-279
posted to bioinformaitcs cloud xmpp by walshtp  on 2009-09-09 16:11:05 ** along with 22 people and 2 groups abhishek_tiwari barriot brianb CameronNeylon CoffeeCat daforerog druvus dswan dullhunk egonw guhjy jfr joergkurtwegner jonalv kelvinramires malawski mchelen nuin operon pansapiens pekrau vagoskar Bioinformatics SciLifeLab Stockholm

Abstract

BACKGROUND: Life sciences make heavy use of the web for both data provision and analysis. However, the increasing amount of available data and the diversity of analysis tools call for machine accessible interfaces in order to be effective. HTTP-based Web service technologies, like the Simple Object Access Protocol (SOAP) and REpresentational State Transfer (REST) services, are today the most common technologies for this in bioinformatics. However, these methods have severe drawbacks, including lack of discoverability, and the inability for services to send ...

 

A quick guide to teaching R programming to computational biology students.

PLoS computational biology, Vol. 5, No. 8. (28 August 2009), e1000482, doi:10.1371/journal.pcbi.1000482
posted to bioinformatics programming r by walshtp  on 2009-09-01 15:44:46 ** along with 121 people and 7 groups AJCann ajw alebalbin amueller ansobol aprasad arjun_citeulike artaban421 asadrahman aslupe avilella ayansamanta Ayest barry bertelsen Borelli brianb bsamal cabbagesofdoom cbg cbonfil cgleaniz chrahn cisevol coffeerv daforerog daichi_saito darrenjw daveGerrard dfdong diamantis djkt dlizcano druvus dswan dullhunk egonw elzed fergus fgibson fishtank flbarroso fuenfgeld fxdm guhjy GuillaumeFilteau heathervincent heliopais hrogers hwright idonaldson irishoconnor ishmael jameswasmuth jasonn jclau jfr JoeBanks kaliczp klauso kou_jinsei krapnik kshameer livingthingdan ltitodem lwaldron lynnefox maren mawds mfenner mgaldino mrvaidya nanonan natstreet neils netzwerkerin nuin oinizan orzenil PabloMarin phoenixzxl pkonings polivares ppgardne provero psique ptrobajo randerr rdiaz RFMcC richardmcgee RobertSOakes Ronald80 rossmounce rvosa Scis0000002 scryrps sgsfak ShouyongPeng sigurdurthorjonsson silberbauer simonpowers skembel slack---line sriesenfeld srijit StephanMatthiesen Stew sujaikumar timhubbard TMichael tnhh tomhebbron VGreiff Vincent_Rouilly vprieto welliegirl xiaoheilong yochju zchen75 Zephyrus BergmanLab Bioinformatics Bioinformatics - CRUK BlaxterLab eLearning FAB-lab Journal picks
 

R/parallel - speeding up bioinformatics analysis with R

BMC Bioinformatics, Vol. 9, No. 1. (22 September 2008), 390, doi:10.1186/1471-2105-9-390
posted to programming r rparallel by walshtp  on 2009-08-21 12:08:47 ** along with 36 people and 1 group _bgnr amueller bsamal cassj chad_davis chrisamiller cisevol dakelley darian daveGerrard dposada druvus GeeSharpMinor guhjy humburg jandot jfr nailest neils nuin operon pauljohn32 pengchy phoenixzxl ptrobajo randerr rdiaz renatomilani robertlischke Scis0000002 shikin siebert sotacam talponer Yanno Zephyrus BergmanLab

Abstract

BACKGROUND: R is the preferred tool for statistical analysis of many bioinformaticians due in part to the increasing number of freely available analytical methods. Such methods can be quickly reused and adapted to each particular experiment. However, in experiments where large amounts of data are generated, for example using high-throughput screening devices, the processing time required to analyze data is often quite long. A solution to reduce the processing time is the use of parallel computing technologies. Because R does not support ...

 

p3d - Python module for structural bioinformatics

BMC Bioinformatics, Vol. 10, No. 1. (2009), 258, doi:10.1186/1471-2105-10-258
posted to bioinformatics programming python structural by walshtp  on 2009-08-21 12:06:19 ** along with 7 people and 1 group dgront hiec nickolay nuin operon oteri samuell Journal picks

Abstract

BACKGROUND: High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this the Python scripting language is the optimal choice since its philosophy is to write an understandable source code. RESULTS: p3d is an object oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three dimensional protein structure ...

 

Low Cost, Scalable Proteomics Data Analysis Using Amazon’s Cloud Computing Services and Open Source Search Algorithms

Journal of Proteome Research, Vol. 8, No. 6. (5 June 2009), pp. 3148-3153, doi:10.1021/pr800970z

Abstract

PMID: 19358578. We describe a system combining cloud computing and open source software that allows individual laboratories or users to create scalable virtual proteomics analysis clusters and have large-scale computational resources at their disposal at a very low cost without the investment in computational hardware or software licensing fees. We provide detailed step-by-step instructions on using these virtual proteomics analysis clusters at the Medical College of Wisconsin Proteomics Center Web site (http://proteomics.mcw.edu/vipdac). ...

 

Comparative Analysis of Protein Structure Alignments

BMC Structural Biology, Vol. 7 (Jul 2007), pp. 50-50, doi:10.1186/1472-6807-7-50
posted to alignment file-import-09-07-30 protein structure by walshtp on 2009-07-30 09:32:16 **

Note (first note only)

10.1186/1472-6807-7-50

 

An evaluation of automated homology modelling methods at low target template sequence similarity

Dalton and Jackson, Bioinformatics, Vol. 23, No. 15., p. 1901
posted to comparative_modelling file-import-09-07-30 by walshtp on 2009-07-30 09:32:16 **
 

StrBioLib: a Java library for development of custom computational structural biology applications

Chandonia, Bioinformatics, Vol. 23, No. 15., p. 2018
posted to file-import-09-07-30 structural_biology by walshtp on 2009-07-30 09:32:16 **
 

Structural footprinting in protein structure comparison: The impact of structural fragments

BMC Structural Biology, Vol. 7 (Aug 2007), pp. 53-53
posted to alignment file-import-09-07-30 by walshtp on 2009-07-30 09:32:16 **

Note (first note only)

10.1186/1472-6807-7-53

 

Novel leverage of structural genomics

Nat Biotech, Vol. 25 (Aug 2007), pp. 849-851
posted to file-import-09-07-30 rost by walshtp on 2009-07-30 09:32:15 **

Note (first note only)

10.1038/nbt0807-849

 

SISYPHUS--structural alignments for proteins with non-trivial relationships

Andreeva et al., Nucleic Acids Research, Vol. 35, Supplement 1, p. D253
posted to file-import-09-07-30 sisyphus by walshtp on 2009-07-30 09:32:15 **
 

SCOP: a structural classification of proteins database for the investigation of sequences and structures.

Journal of molecular biology., Vol. 247 (Apr 1995), pp. 536-540
posted to file-import-09-07-30 scop by walshtp on 2009-07-30 09:32:15 **

Note (first note only)

10.1006/jmbi.1995.0159

 

The ASTRAL compendium for protein structure and sequence analysis

Brenner et al., Nucleic Acids Research, Vol. 28, No. 1., p. 254
posted to classification file-import-09-07-30 protein structure by walshtp on 2009-07-30 09:32:15 **
Note: You may cite this page as: http://www.citeulike.org/user/walshtp

