The Duality of Clusters and Statistical Interactions
We contend that clusters of cases co-constitute statistical interactions among variables. Interactions among variables imply clusters of cases within which statistical effects differ. Regression coefficients may be productively viewed as sums of contributions across clusters of cases, and in this sense regression coefficients may be said to be “composed” of clusters of cases. We describe a four-step procedure that discovers interaction effects from clusters of cases in the data matrix, thereby aiding inductive model specification. We illustrate with two examples. The first is a reanalysis of data from a published study of the effect of social welfare policy extensiveness on poverty rates across 15 countries. The second uses General Social Survey data to predict four different dimensions of ego-network homophily. We find support for our contention that clusters of the rows of a data matrix may be exploited to discover statistical interactions among variables that improve model fit.
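The claim that regression coefficients are sums across clusters of cases can be checked numerically. The sketch below relies only on the ordinary least squares identity b = (X'X)^{-1} X'y = sum_k (X'X)^{-1} X_k' y_k, where X_k and y_k hold the rows assigned to cluster k. It is an illustration of that additivity, not the four-step procedure itself: the simulated data, the two-cluster partition, and the function name `cluster_contributions` are all assumptions introduced here.

```python
import numpy as np

def cluster_contributions(X, y, labels):
    """Split OLS coefficients into additive per-cluster parts.

    Uses b = (X'X)^{-1} X'y = sum_k (X'X)^{-1} X_k' y_k,
    where X_k, y_k are the rows belonging to cluster k.
    """
    XtX_inv = np.linalg.inv(X.T @ X)  # pooled (full-sample) cross-product inverse
    parts = {}
    for k in np.unique(labels):
        rows = labels == k
        parts[int(k)] = XtX_inv @ X[rows].T @ y[rows]
    return parts

# Simulated data (hypothetical): the slope of x differs between two clusters,
# which is exactly what an interaction between x and cluster membership means.
rng = np.random.default_rng(0)
n = 200
labels = rng.integers(0, 2, size=n)       # assumed two-cluster partition of cases
x = rng.normal(size=n)
slope = np.where(labels == 0, 0.5, 2.0)   # cluster-specific effect of x
y = 1.0 + slope * x + rng.normal(scale=0.1, size=n)
X = np.column_stack([np.ones(n), x])      # intercept and x

b_full = np.linalg.lstsq(X, y, rcond=None)[0]  # pooled OLS coefficients
parts = cluster_contributions(X, y, labels)    # one coefficient vector per cluster
b_sum = sum(parts.values())                    # contributions add back to b_full

print("pooled coefficients:   ", b_full)
print("sum of cluster parts:  ", b_sum)
```

The per-cluster vectors in `parts` are contributions to the pooled coefficients, not within-cluster regression estimates; they sum exactly (up to floating-point error) to the pooled fit for any partition of the rows.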