Quantifying the association between gene expressions and DNA-markers by penalized canonical correlation analysis.
Multiple changes at the DNA level underlie complex diseases. Identifying the genetic networks that are influenced by these changes might help in understanding the development of these diseases. Canonical correlation analysis can be used to associate gene expressions with DNA-markers and thus reveal sets of co-expressed and co-regulated genes and their associated DNA-markers. However, when the number of variables is large, as in microarray studies, the results can be difficult to interpret. Adapting the elastic net to canonical correlation analysis reduces the number of variables and makes interpretation easier. Moreover, due to the grouping effect of the elastic net, co-regulated and co-expressed genes cluster together. Additionally, our adaptation works well in situations where the number of variables far exceeds the number of subjects.
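To make the idea of a penalized canonical correlation analysis concrete, the following is a minimal sketch and not the exact algorithm of this work. It assumes a simplified variant in which the canonical weight vectors are updated alternately and shrunk by univariate soft-thresholding, the form the elastic net takes when its ridge penalty is very large. The function names (`soft_threshold`, `sparse_cca`), the penalty parameters `lam_u` and `lam_v`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(a, lam):
    """Shrink each entry of `a` toward zero by `lam`; small entries become exactly 0."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def normalize(v):
    """Scale `v` to unit length, leaving an all-zero vector unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def sparse_cca(X, Y, lam_u=0.8, lam_v=0.8, n_iter=100, tol=1e-8):
    """Illustrative sparse CCA for the first pair of canonical variates.

    X: (n, p) array, e.g. DNA-markers; Y: (n, q) array, e.g. gene expressions.
    Returns weight vectors u (p,), v (q,) and the correlation of X @ u and Y @ v.
    Assumes no column is constant. If the penalties are so large that all
    weights are zeroed, the returned correlation is undefined (nan).
    """
    # Standardize columns so the cross-product matrix holds correlations.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = X.T @ Y / X.shape[0]  # (p, q) cross-correlation matrix

    # Start from the leading right singular vector of C.
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[0]
    u = np.zeros(X.shape[1])

    for _ in range(n_iter):
        # Update each weight vector given the other, then shrink and rescale.
        u_new = normalize(soft_threshold(C @ v, lam_u))
        v_new = normalize(soft_threshold(C.T @ u_new, lam_v))
        change = np.linalg.norm(u_new - u) + np.linalg.norm(v_new - v)
        u, v = u_new, v_new
        if change < tol:
            break

    corr = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, corr

# Synthetic example with more variables than subjects (n=40, p=60, q=50).
# One hidden factor z drives the first 8 columns of both data sets.
rng = np.random.default_rng(0)
n, p, q, k = 40, 60, 50, 8
z = rng.normal(size=n)
X_demo = rng.normal(size=(n, p))
Y_demo = rng.normal(size=(n, q))
X_demo[:, :k] = z[:, None] + 0.3 * rng.normal(size=(n, k))
Y_demo[:, :k] = z[:, None] + 0.3 * rng.normal(size=(n, k))

u_demo, v_demo, corr_demo = sparse_cca(X_demo, Y_demo)
print("non-zero X weights:", np.flatnonzero(u_demo))
print("non-zero Y weights:", np.flatnonzero(v_demo))
print("canonical correlation:", corr_demo)
```

In this sketch the penalties control how many variables keep a non-zero weight, which is what makes the resulting canonical variates easier to interpret. In practice they would be tuned, for example by cross-validation, rather than fixed at an arbitrary value as here.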