Measurements of the Impact of 3′ End Sequences on Gene Expression Reveal Wide Range and Sequence Dependent Effects
A full understanding of gene regulation requires an understanding of the contributions of the various regulatory regions to gene expression. Although it is well established that sequences downstream of the main promoter can affect expression, our understanding of the scale of this effect and how it is encoded in the DNA is limited. Here, to measure the effect of native S. cerevisiae 3′ end sequences on expression, we constructed a library of 85 fluorescent reporter strains that differ only in their 3′ end region. Notably, despite being driven by the same strong promoter, our library spans a continuous twelve-fold range of expression values. These measurements correlate with endogenous mRNA levels, suggesting that the 3′ end contributes to constitutive differences in mRNA levels. We used deep sequencing to map the 3′ UTR ends of our strains and show that determination of polyadenylation sites is intrinsic to the local 3′ end sequence. Following polyadenylation mapping with sequence analysis, we found that increased A/T content upstream of the main polyadenylation site correlates with higher expression, both in the library and genome-wide, suggesting that native genes differ in the encoded efficiency of 3′ end processing. Finally, we use single-cell fluorescence measurements at different promoter activation levels to show that 3′ end sequences modulate protein expression dynamics differently than promoters, by predominantly affecting the size of protein production bursts as opposed to the frequency at which these bursts occur. Altogether, our results lead to a more complete understanding of gene regulation by demonstrating that 3′ end regions have a unique and sequence-dependent effect on gene expression.
A basic question in gene expression is the relative contribution of different regulatory layers and genomic regions to the differences in protein levels. In this work we concentrated on the effect of 3′ end sequences.
For this, we constructed a library of yeast strains that differ only by a native 3′ end region integrated downstream of a reporter gene driven by a constant inducible promoter. Thus we could attribute all differences in reporter expression between the strains to the different 3′ end sequences. Interestingly, we found that despite being driven by the same strong, inducible promoter, our library spanned a wide and continuous range of expression levels of more than twelve-fold. As these measurements represent the sole effect of the 3′ end region, we quantify the contribution of these sequences to the variance in mRNA levels by comparing our measurements to endogenous mRNA levels. We then use sequence analysis to find a simple sequence signature that correlates with expression. In addition, single-cell analysis reveals that 3′ end-mediated differences in expression have noise dynamics distinct from those produced by different levels of promoter activation. Together, these findings lead to a more complete understanding of gene expression that also incorporates the effect of these regions.
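
To illustrate why a change in burst size and a change in burst frequency can be told apart from single-cell measurements, the following minimal sketch uses the common gamma-distribution approximation of bursty protein expression. This is an assumption made for illustration and not necessarily the model used in this work. In that approximation the mean is the product of burst frequency and burst size, while the noise (squared coefficient of variation) depends only on burst frequency. The function name and all numerical values are hypothetical.

```python
import math


def gamma_burst_stats(burst_frequency, burst_size):
    """Mean and noise (CV^2) of a gamma-distributed protein level.

    Assumed model (illustrative only): protein level ~ Gamma(shape=a, scale=b),
    where a is the mean number of bursts per protein lifetime and b is the
    mean number of proteins produced per burst.
    """
    mean = burst_frequency * burst_size
    variance = burst_frequency * burst_size ** 2
    cv2 = variance / mean ** 2  # simplifies to 1 / burst_frequency
    return mean, cv2


# Hypothetical baseline, then two ways to raise the mean twelve-fold.
baseline = gamma_burst_stats(2.0, 10.0)
larger_bursts = gamma_burst_stats(2.0, 120.0)  # burst size x12
more_bursts = gamma_burst_stats(24.0, 10.0)    # burst frequency x12

for label, (mean, cv2) in [
    ("baseline", baseline),
    ("12x burst size", larger_bursts),
    ("12x burst frequency", more_bursts),
]:
    print(f"{label:>20}: mean={mean:7.1f}  CV^2={cv2:.3f}")
```

Under these assumptions, both perturbations reach the same mean, but only the increase in burst frequency lowers the noise. A change in burst size shifts the mean while leaving CV^2 unchanged, which is the kind of signature that separates the two mechanisms.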