## ✔ Relationships and Inflammation across the Lifespan: Social Developmental Pathways to Disease.

Social and Personality Psychology Compass, Vol. 5, No. 11. (November 2011), pp. 891-903, doi:10.1111/j.1751-9004.2011.00392.x
### Abstract

There are well documented links between close relationships and physical health, such that those who have supportive close relationships have lower rates of morbidity and mortality compared to those who do not. Inflammation is one mechanism that may help to explain this link. Chronically high levels of inflammation predict disease. Across the lifespan, people who have supportive close relationships have lower levels of systemic inflammation ...

## ✔ The Five Cardinal Signs of Inflammation: Calor, Dolor, Rubor, Tumor … and Penuria (Apologies to Aulus Cornelius Celsus, De medicina, c. A.D. 25)

The Journals of Gerontology Series A: Biological Sciences and Medical Sciences, Vol. 61, No. 10. (1 October 2006), pp. 1051-1052, doi:10.1093/gerona/61.10.1051
## ✔ Disruption of the gut microbiome as a risk factor for microbial infections

Current Opinion in Microbiology, Vol. 16, No. 2. (April 2013), pp. 221-227, doi:10.1016/j.mib.2013.03.009
## ✔ Quantitatively Different, yet Qualitatively Alike: A Meta-Analysis of the Mouse Core Gut Microbiome with a View towards the Human Gut Microbiome

PLoS ONE, Vol. 8, No. 5. (1 May 2013), e62578, doi:10.1371/journal.pone.0062578
### Abstract

A number of human diseases such as obesity and diabetes are associated with changes or imbalances in the gut microbiota (GM). Laboratory mice are commonly used as experimental models for such disorders. The introduction and dynamic development of next generation sequencing techniques have enabled detailed mapping of the GM of both humans and animal models. Nevertheless there is still a significant knowledge gap regarding the human and mouse common GM core and thus the applicability of the latter as an animal ...

## ✔ Analyses of the Stability and Core Taxonomic Memberships of the Human Microbiome

PLoS ONE, Vol. 8, No. 5. (6 May 2013), e63139, doi:10.1371/journal.pone.0063139
### Abstract

Analyses of the taxonomic diversity associated with the human microbiome continue to be an area of great importance. The study of the nature and extent of the commonly shared taxa (“core”), versus those less prevalent, establishes a baseline for comparing healthy and diseased groups by quantifying the variation among people, across body habitats and over time. The National Institutes of Health (NIH) sponsored Human Microbiome Project (HMP) has provided an unprecedented opportunity to examine and better define what constitutes the taxonomic ...
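
As a rough illustration of the "core" idea described in this abstract (taxa shared across most or all samples), the sketch below computes a prevalence-based core from presence/absence data. The function name `core_taxa`, the `min_prevalence` threshold, and the toy samples are all assumptions made for illustration; they are not taken from the paper, which may define the core differently.

```python
def core_taxa(samples, min_prevalence=1.0):
    """Return the taxa present in at least `min_prevalence` (a fraction) of samples.

    `samples` maps a sample name to the set of taxa observed in it.
    """
    if not samples:
        return set()
    counts = {}
    for taxa in samples.values():
        for taxon in set(taxa):
            counts[taxon] = counts.get(taxon, 0) + 1
    n_samples = len(samples)
    return {t for t, c in counts.items() if c / n_samples >= min_prevalence}


# Toy, made-up presence/absence data (hypothetical taxon labels).
samples = {
    "sample_1": {"taxon_A", "taxon_B", "taxon_C"},
    "sample_2": {"taxon_A", "taxon_B"},
    "sample_3": {"taxon_A", "taxon_C"},
}
strict_core = core_taxa(samples)          # taxa found in every sample
relaxed_core = core_taxa(samples, 0.6)    # taxa found in at least 60% of samples
```

Lowering the threshold trades strictness for robustness to undersampling; which cutoff is appropriate depends on sequencing depth and study design.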

## ✔ Visualization of ribosomal RNA operon copy number distribution

BMC Microbiology, Vol. 9, No. 1. (25 September 2009), 208, doi:10.1186/1471-2180-9-208
### Abstract

BACKGROUND: Results of microbial ecology studies using 16S rRNA sequence information can be deceiving due to differences in rRNA operon copy number and genome size of the detected organisms. It therefore will be useful for investigators to have a better understanding of how these two parameters differ in various organism types. In this study, the number of ribosomal operons and genome size were separately mapped onto a Bacterial phylogenetic tree. RESULTS: A representative Bacterial tree was constructed using 31 marker genes found in 578 ...
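
To make the copy-number issue concrete, here is a minimal sketch of how 16S read counts might be adjusted by rRNA operon copy number before computing relative abundances. The function name, the taxon labels, and the numbers are hypothetical; this is not the method of the paper, only one simple way the bias it describes could be corrected.

```python
def copy_number_corrected_abundance(read_counts, copy_numbers):
    """Divide each taxon's 16S read count by its rRNA operon copy number,
    then renormalize so the corrected values sum to 1."""
    cell_estimates = {t: read_counts[t] / copy_numbers[t] for t in read_counts}
    total = sum(cell_estimates.values())
    return {t: v / total for t, v in cell_estimates.items()}


# Hypothetical example: taxon_A carries 7 operon copies, taxon_B carries 1.
reads = {"taxon_A": 700, "taxon_B": 100}
copies = {"taxon_A": 7, "taxon_B": 1}

raw = {t: c / sum(reads.values()) for t, c in reads.items()}
corrected = copy_number_corrected_abundance(reads, copies)
```

In this toy case the raw read fractions suggest taxon_A dominates, while the corrected estimate puts the two taxa at equal cell abundance.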

## ✔ Short-Read Assembly of Full-Length 16S Amplicons Reveals Bacterial Diversity in Subsurface Sediments

PLoS ONE, Vol. 8, No. 2. (6 February 2013), e56018, doi:10.1371/journal.pone.0056018
### Abstract

In microbial ecology, a fundamental question relates to how community diversity and composition change in response to perturbation. Most studies have had limited ability to deeply sample community structure (e.g. Sanger-sequenced 16S rRNA libraries), or have had limited taxonomic resolution (e.g. studies based on 16S rRNA hypervariable region sequencing). Here, we combine the higher taxonomic resolution of near-full-length 16S rRNA gene amplicons with the economics and sensitivity of short-read sequencing to assay the abundance and identity of organisms that represent as ...

## ✔ Salivary Candida, caries and Candida in toothbrushes.

The Journal of Clinical Pediatric Dentistry, Vol. 37, No. 2. (2012), pp. 167-170
### Abstract

Candida species are common inhabitants of the normal oral microbiota. A few studies have found a relationship between high levels of Candida albicans in the oral cavity and high DMF scores. Toothbrushes can also be reservoirs of microorganisms; the proliferation of these microorganisms on a toothbrush could be a major factor in their distribution in the oral cavity. ...

## ✔ Estimation of bacterial diversity using next generation sequencing of 16S rDNA: a comparison of different workflows.

BMC Bioinformatics, Vol. 12, No. 1. (14 December 2011), 473, doi:10.1186/1471-2105-12-473
### Abstract

Next generation sequencing (NGS) enables a more comprehensive analysis of bacterial diversity from complex environmental samples. NGS data can be analysed using a variety of workflows. We test several simple and complex workflows, including frequently used as well as recently published tools, and report on their respective accuracy and efficiency under various conditions covering different sequence lengths, number of sequences and real world experimental data ...

## ✔ Exploring the dynamic core microbiome of plaque microbiota during head-and-neck radiotherapy using pyrosequencing.

PLoS ONE, Vol. 8, No. 2. (2013), doi:10.1371/journal.pone.0056343
### Abstract

Radiotherapy is the primary treatment modality used for patients with head-and-neck cancers, but inevitably causes microorganism-related oral complications. This study aims to explore the dynamic core microbiome of oral microbiota in supragingival plaque during the course of head-and-neck radiotherapy. Eight subjects aged 26 to 70 were recruited. Dental plaque samples were collected (over seven sampling time points for each patient) before and during radiotherapy. The ...

## ✔ TIGER: A Tuning-Insensitive Approach for Optimally Estimating Gaussian Graphical Models

(11 Sep 2012)
### Abstract

We propose a new procedure for estimating high dimensional Gaussian graphical models. Our approach is asymptotically tuning-free and non-asymptotically tuning-insensitive: it requires very few efforts to choose the tuning parameter in finite sample settings. Computationally, our procedure is significantly faster than existing methods due to its tuning-insensitive property. Theoretically, the obtained estimator is simultaneously minimax optimal for precision matrix estimation under different norms. Empirically, we illustrate the advantages of our method using thorough simulated and real examples. The R package bigmatrix implementing the proposed methods is available on the Comprehensive ...

## ✔ A Constrained L1 Minimization Approach to Sparse Precision Matrix Estimation

(10 Feb 2011)
### Abstract

A constrained L1 minimization method is proposed for estimating a sparse inverse covariance matrix based on a sample of $n$ iid $p$-variate random variables. The resulting estimator is shown to enjoy a number of desirable properties. In particular, it is shown that the rate of convergence between the estimator and the true $s$-sparse precision matrix under the spectral norm is $s\sqrt{\log p/n}$ when the population distribution has either exponential-type tails or polynomial-type tails. Convergence rates under the elementwise $L_\infty$ norm and Frobenius norm are also presented. In addition, ...
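
For orientation, the constrained L1 minimization mentioned in this abstract is usually written as the program below. The notation is an assumption, since the truncated abstract does not show it: $\Sigma_n$ denotes the sample covariance matrix, $\lambda_n$ a tuning parameter, $|\cdot|_1$ the elementwise $\ell_1$ norm, and $|\cdot|_\infty$ the elementwise maximum norm.

$$
\min_{\Omega \in \mathbb{R}^{p \times p}} |\Omega|_1
\quad \text{subject to} \quad
|\Sigma_n \Omega - I|_\infty \le \lambda_n .
$$

Because the objective and constraint separate by column, the program can be solved as $p$ independent linear programs, one per column of $\Omega$; the solution is then typically symmetrized.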

## ✔ The Dantzig selector: Statistical estimation when p is much larger than n

The Annals of Statistics, Vol. 35, No. 6. (December 2007), pp. 2313-2351, doi:10.1214/009053606000001523

## ✔ Square-Root Lasso: Pivotal Recovery of Sparse Signals via Conic Programming

Biometrika, Vol. 98, No. 4. (18 Dec 2011), pp. 791-806, doi:10.1093/biomet/asr043
### Abstract

We propose a pivotal method for estimating high-dimensional sparse linear regression models, where the overall number of regressors $p$ is large, possibly much larger than $n$, but only $s$ regressors are significant. The method is a modification of the lasso, called the square-root lasso. The method is pivotal in that it neither relies on knowledge of the standard deviation $σ$ nor needs to pre-estimate $σ$. Moreover, the method does not rely on normality or sub-Gaussianity of noise. It achieves near-oracle performance, attaining the convergence ...
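
As a reminder of how the modification differs from the ordinary lasso, the two objectives are commonly stated as follows. The symbols $y_i$, $x_i$, $\beta$, and $\lambda$ are assumed notation, not shown in the truncated abstract.

$$
\hat\beta_{\text{lasso}} \in \arg\min_{\beta}\; \frac{1}{n}\sum_{i=1}^{n} (y_i - x_i^\top \beta)^2 + \frac{\lambda}{n}\,\|\beta\|_1 ,
\qquad
\hat\beta_{\sqrt{\text{lasso}}} \in \arg\min_{\beta}\; \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - x_i^\top \beta)^2} + \frac{\lambda}{n}\,\|\beta\|_1 .
$$

Taking the square root of the average squared residual makes the scale of the loss proportional to $σ$, which is why a good choice of $\lambda$ no longer depends on $σ$.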

## ✔ L1 penalized LAD estimator for high dimensional linear

(28 Feb 2012)
## ✔ Pan-genome sequence analysis using Panseq: an online tool for the rapid analysis of core and accessory genomic regions

BMC Bioinformatics, Vol. 11, No. 1. (2010), 461, doi:10.1186/1471-2105-11-461
### Abstract

BACKGROUND: The pan-genome of a bacterial species consists of a core and an accessory gene pool. The accessory genome is thought to be an important source of genetic variability in bacterial populations and is gained through lateral gene transfer, allowing subpopulations of bacteria to better adapt to specific niches. Low-cost and high-throughput sequencing platforms have created an exponential increase in genome sequence data and an opportunity to study the pan-genomes of many bacterial species. In this study, we describe a new online ...

## ✔ BIGSdb: Scalable analysis of bacterial genome variation at the population level

BMC Bioinformatics, Vol. 11, No. 1. (10 December 2010), 595, doi:10.1186/1471-2105-11-595
### Abstract

BACKGROUND: The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner. RESULTS: The Bacterial Isolate Genome Sequence Database (BIGSdb) is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to ...

## ✔ Comparative typing of L. delbrueckii subsp. bulgaricus strains using multilocus sequence typing and RAPD–PCR

European Food Research and Technology, Vol. 233, No. 3. (28 June 2011), pp. 377-385, doi:10.1007/s00217-011-1526-5
### Abstract

Comparative typing analysis of 25 Lactobacillus delbrueckii subsp. bulgaricus strains, isolated from traditional yoghurts in Turkey, was performed by RAPD–PCR (randomly amplified polymorphic DNA–PCR) and MLST (multilocus sequence typing). RAPD–PCR analyses were performed using two primers; M13 and 1254. Primer 1254 produced better results than primer M13. The bands produced by primer 1254 were brighter and easier to interpret, and a higher number of bands were produced. In addition, clusters produced by primer 1254 were grouped according to the source of ...

## ✔ Pathogen typing in the genomics era: MLST and the future of molecular epidemiology

Infection, Genetics and Evolution, Vol. 16 (June 2013), pp. 38-53, doi:10.1016/j.meegid.2013.01.009
## ✔ Bayesian semi-supervised classification of bacterial samples using MLST databases

BMC Bioinformatics, Vol. 12, No. 1. (2011), 302, doi:10.1186/1471-2105-12-302
### Abstract

BACKGROUND: Worldwide effort on sampling and characterization of molecular variation within a large number of human and animal pathogens has led to the emergence of multi-locus sequence typing (MLST) databases as an important tool for studying the epidemiology and evolution of pathogens. Many of these databases are currently harboring several thousands of multi-locus DNA sequence types (STs) enriched with metadata over traits such as serotype, antibiotic resistance, host organism, etc., of the isolates. Curators of the databases have thus the possibility of ...

## ✔ Quake: quality-aware detection and correction of sequencing errors

Genome Biology, Vol. 11, No. 11. (2010), R116, doi:10.1186/gb-2010-11-11-r116

### Abstract

We introduce Quake, a program to detect and correct errors in DNA sequencing reads. Using a maximum likelihood approach incorporating quality values and nucleotide specific miscall rates, Quake achieves the highest accuracy on realistically simulated reads. We further demonstrate substantial improvements in de novo assembly and SNP detection after using Quake. Quake can be used for any size project, including more than one billion human reads, and is freely available as open source software from http://www.cbcb.umd.edu/software/quake. ...

## ✔ Error correction of high-throughput sequencing datasets with non-uniform coverage

Bioinformatics, Vol. 27, No. 13. (01 July 2011), pp. i137-i141, doi:10.1093/bioinformatics/btr208

### Abstract

Motivation: The continuing improvements to high-throughput sequencing (HTS) platforms have begun to unfold a myriad of new applications. As a result, error correction of sequencing reads remains an important problem. Though several tools do an excellent job of correcting datasets where the reads are sampled close to uniformly, the problem of correcting reads coming from drastically non-uniform datasets, such as those from single-cell sequencing, remains open. ...

## ✔ A novel method to discover fluoroquinolone antibiotic resistance (qnr) genes in fragmented nucleotide sequences

BMC Genomics, Vol. 13, No. 1. (11 December 2012), 695, doi:10.1186/1471-2164-13-695
### Abstract

BACKGROUND: Broad-spectrum fluoroquinolone antibiotics are central in modern health care and are used to treat and prevent a wide range of bacterial infections. The recently discovered qnr genes provide a mechanism of resistance with the potential to rapidly spread between bacteria using horizontal gene transfer. As for many antibiotic resistance genes present in pathogens today, qnr genes are hypothesized to originate from environmental bacteria. The vast amount of data generated by shotgun metagenomics can therefore be used to explore the diversity of ...

## ✔ Efficiency and Power as a Function of Sequence Coverage, SNP Array Density, and Imputation

PLoS Comput Biol, Vol. 8, No. 7. (12 July 2012), e1002604, doi:10.1371/journal.pcbi.1002604
### Abstract

High coverage whole genome sequencing provides near complete information about genetic variation. However, other technologies can be more efficient in some settings by (a) reducing redundant coverage within samples and (b) exploiting patterns of genetic variation across samples. To characterize as many samples as possible, many genetic studies therefore employ lower coverage sequencing or SNP array genotyping coupled to statistical imputation. To compare these approaches individually and in conjunction, we developed a statistical framework to estimate genotypes jointly from sequence reads, ...

## ✔ A simple and efficient Bayesian procedure for selecting dimensionality in multidimensional scaling

Journal of Multivariate Analysis, Vol. 107 (May 2012), pp. 200-209, doi:10.1016/j.jmva.2012.01.012
### Abstract

Multidimensional scaling (MDS) is a technique which retrieves the locations of objects in a Euclidean space (the object configuration) from data consisting of the dissimilarities between pairs of objects. An important issue in MDS is finding an appropriate dimensionality underlying these dissimilarities. In this paper, we propose a simple and efficient Bayesian approach for selecting dimensionality in MDS. For each column (attribute) vector of an MDS configuration, we assume a prior that is a mixture of the point mass at 0 ...

## ✔ Sparse Discriminant Analysis

Technometrics, Vol. 53, No. 4. (1 November 2011), pp. 406-413, doi:10.1198/tech.2011.08118
### Abstract

We consider the problem of performing interpretable classification in the high-dimensional setting, in which the number of features is very large and the number of observations is limited. This setting has been studied extensively in the chemometrics literature, and more recently has become commonplace in biological and medical applications. In this setting, a traditional approach involves performing feature selection before classification. We propose sparse discriminant analysis, a method for performing linear discriminant analysis with a sparseness criterion imposed such that classification ...

## ✔ Nonadaptive Explanations for Signatures of Partial Selective Sweeps in Drosophila

Molecular Biology and Evolution, Vol. 25, No. 6. (01 June 2008), pp. 1025-1042, doi:10.1093/molbev/msn007
### Abstract

A beneficial mutation that has nearly but not yet fixed in a population produces a characteristic haplotype configuration, called a partial selective sweep. Whether nonadaptive processes might generate similar haplotype configurations has not been extensively explored. Here, we consider 5 population genetic data sets taken from regions flanking high-frequency transposable elements in North American strains of Drosophila melanogaster, each of which appears to be consistent with the expectations of a partial selective sweep. We use coalescent simulations to explore whether incorporation ...

## ✔ Testing significance of features by lassoed principal components

The Annals of Applied Statistics, Vol. 2, No. 3. (September 2008), pp. 986-1012, doi:10.1214/08-aoas182
## ✔ A Recoding Method to Improve the Humoral Immune Response to an HIV DNA Vaccine

PLoS ONE, Vol. 3, No. 9. (15 September 2008), e3214, doi:10.1371/journal.pone.0003214
### Abstract

This manuscript describes a novel strategy to improve HIV DNA vaccine design. Employing a new information theory based bioinformatic algorithm, we identify a set of nucleotide motifs which are common in the coding region of HIV, but are under-represented in genes that are highly expressed in the human genome. We hypothesize that these motifs contribute to the poor protein expression of gag, pol, and env genes from the c-DNAs of HIV clinical isolates. Using this approach and beginning with a codon ...

## ✔ Postoperative Pain Following Foot and Ankle Surgery: A Prospective Study

Foot & Ankle International, Vol. 29, No. 11. (November 2008), pp. 1063-1068, doi:10.3113/fai.2008.1063
## ✔ Reassessing authorship of the Book of Mormon using delta and nearest shrunken centroid classification

Literary and Linguistic Computing, Vol. 23, No. 4. (01 December 2008), pp. 465-491, doi:10.1093/llc/fqn040
### Abstract

Mormon prophet Joseph Smith (1805–44) claimed that more than two-dozen ancient individuals (Nephi, Mormon, Alma, etc.) living from around 2200 BC to 421 AD authored the Book of Mormon (1830), and that he translated their inscriptions into English. Later researchers who analyzed selections from the Book of Mormon concluded that differences between selections supported Smith's claim of multiple authorship and ancient origins. We offer a new approach that employs two classification techniques: ‘delta’ commonly used to determine probable authorship and ‘nearest ...

## ✔ Hierarchical Maintenance of MLL Myeloid Leukemia Stem Cells Employs a Transcriptional Program Shared with Embryonic Rather Than Adult Stem Cells

Cell Stem Cell, Vol. 4, No. 2. (6 February 2009), pp. 129-140, doi:10.1016/j.stem.2008.11.015
### Abstract

The genetic programs that promote retention of self-renewing leukemia stem cells (LSCs) at the apex of cellular hierarchies in acute myeloid leukemia (AML) are not known. In a mouse model of human AML, LSCs exhibit variable frequencies that correlate with the initiating MLL oncogene and are maintained in a self-renewing state by a transcriptional subprogram more akin to that of embryonic stem cells (ESCs) than ...

## ✔ Covariance-regularized regression and classification for high dimensional problems

Journal of the Royal Statistical Society: Series B (Statistical Methodology), Vol. 71, No. 3. (20 June 2009), pp. 615-636, doi:10.1111/j.1467-9868.2009.00699.x
### Abstract

Summary.  We propose covariance-regularized regression, a family of methods for prediction in high dimensional settings that uses a shrunken estimate of the inverse covariance matrix of the features to achieve superior prediction. An estimate of the inverse covariance matrix is obtained by maximizing the log-likelihood of the data, under a multivariate normal model, subject to a penalty; it is then used to estimate coefficients for the regression of the response onto the features. We show that ridge regression, the lasso and ...

## ✔ A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis

Biostatistics, Vol. 10, No. 3. (01 July 2009), pp. 515-534, doi:10.1093/biostatistics/kxp008
### Abstract

We present a penalized matrix decomposition (PMD), a new framework for computing a rank-K approximation for a matrix. We approximate the matrix $X$ as $\hat{X} = \sum_{k=1}^{K} d_k u_k v_k^T$, where $d_k$, $u_k$, and $v_k$ minimize the squared Frobenius norm of $X - \hat{X}$, subject to penalties on $u_k$ and $v_k$. This results in a regularized version of the singular value decomposition. Of particular interest is the use of L1-penalties on $u_k$ and $v_k$, which yields a decomposition of X using sparse vectors. We show that when the ...
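
The idea of a sparse rank-one factor can be sketched with alternating soft-thresholded power iterations. This is a simplified illustration only, not the algorithm of the paper: it uses fixed soft-threshold levels (`delta_u`, `delta_v`, hypothetical parameters) rather than tuning them to satisfy an L1 bound, and all function names are assumptions.

```python
import math


def soft_threshold(vec, delta):
    """Shrink each entry toward zero by delta, setting small entries to 0."""
    return [math.copysign(max(abs(x) - delta, 0.0), x) for x in vec]


def normalize(vec):
    """Scale to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def sparse_rank1(X, delta_u, delta_v, n_iter=100):
    """Alternate u <- S(Xv), v <- S(X^T u), renormalizing after each step."""
    n_rows, n_cols = len(X), len(X[0])
    v = normalize([1.0] * n_cols)
    u = [0.0] * n_rows
    for _ in range(n_iter):
        Xv = [sum(X[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        u = normalize(soft_threshold(Xv, delta_u))
        Xtu = [sum(X[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        v = normalize(soft_threshold(Xtu, delta_v))
    # d = u^T X v, the scale of the rank-one term d * u * v^T
    d = sum(u[i] * X[i][j] * v[j] for i in range(n_rows) for j in range(n_cols))
    return d, u, v


# Toy matrix: one strong block (rows 0-1, column 0) plus small entries.
X = [
    [5.0, 0.1, 0.0],
    [5.0, 0.0, 0.1],
    [0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0],
]
d, u, v = sparse_rank1(X, delta_u=0.5, delta_v=0.5)
```

On this toy input the thresholding zeroes the entries of `u` and `v` that correspond to the small values, leaving a factor supported only on the strong block. Further factors could be obtained by subtracting `d * u * v^T` from `X` and repeating.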

## ✔ 3′-End Sequencing for Expression Quantification (3SEQ) from Archival Tumor Samples

PLoS ONE, Vol. 5, No. 1. (19 January 2010), e8768, doi:10.1371/journal.pone.0008768
### Abstract

Gene expression microarrays are the most widely used technique for genome-wide expression profiling. However, microarrays do not perform well on formalin fixed paraffin embedded tissue (FFPET). Consequently, microarrays cannot be effectively utilized to perform gene expression profiling on the vast majority of archival tumor samples. To address this limitation of gene expression microarrays, we designed a novel procedure (3′-end sequencing for expression quantification (3SEQ)) for gene expression profiling from FFPET using next-generation sequencing. We performed gene expression profiling by 3SEQ and ...

## ✔ Survival analysis with high-dimensional covariates

Statistical Methods in Medical Research, Vol. 19, No. 1. (01 February 2010), pp. 29-51, doi:10.1177/0962280209105024
### Abstract

In recent years, breakthroughs in biomedical technology have led to a wealth of data in which the number of features (for instance, genes on which expression measurements are available) exceeds the number of observations (e.g. patients). Sometimes survival outcomes are also available for those same observations. In this case, one might be interested in (a) identifying features that are associated with survival (in a univariate sense), and (b) developing a multivariate model for the relationship between the features and survival that ...

## ✔ Discovery of molecular subtypes in leiomyosarcoma through integrative molecular profiling

Oncogene, Vol. 29, No. 6. (09 November 2009), pp. 845-854, doi:10.1038/onc.2009.381
### Abstract

Discovery of molecular subtypes in leiomyosarcoma through integrative molecular profiling Oncogene advance online publication, November 9, 2009. doi:10.1038/onc.2009.381 Authors: A H Beck, C-H Lee, D M Witten, B C Gleason, B Edris, I Espinosa, S Zhu, R Li, K D Montgomery, R J Marinelli, R Tibshirani, T Hastie, D M Jablons, B P Rubin, C D Fletcher, R B West & M van de ...

## ✔ Erratum

Statistical Methods in Medical Research, Vol. 19, No. 2. (01 April 2010), pp. 200-200, doi:10.1177/0962280210366728
### Abstract

Daniela M. Witten and Robert Tibshirani. Survival analysis with high-dimensional covariates. Statistical Methods in Medical Research 2010; 19: 29–51 (DOI: 10.1177/0962280209105024). On page 36, Figure 1 is incorrect. Please note the following correction in the figure: The ‘Modified Cox’ and ‘LPC’ legends in the artwork were reversed. The correct figure is given below. ...