Weight estimation and significance testing for three focused statistics
Focused tests for clustering are designed to determine whether there is statistical evidence for raised incidence of some phenomenon around a prespecified location. The tests require a definition of what is meant by ‘around’ the location, and this is achieved by specifying weights associated with surrounding locations. Different weight specifications yield different levels of statistical significance, and because it is difficult to know how the weights should be defined, it is tempting to try several definitions in the hope of finding one that is highly significant. This, however, introduces the problem of multiple testing: if one tries often enough, one will eventually be able to reject even a true null hypothesis. This article describes approaches for adjusting the significance level when multiple tests, associated with varying definitions of the weights, are carried out. The approaches are developed for a local scan statistic, a maximum chi-square statistic, and a modified version of Stone’s statistic. An illustration is provided using leukemia data from central New York State.
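The multiple-testing issue described above can be made concrete with a small simulation. The following Python sketch is not the procedure developed in the article; it is a generic Monte Carlo adjustment, shown only to illustrate the idea. It assumes a hypothetical set of ten regions at increasing distance from the focus, equal populations, made-up case counts, and exponential-decay weights with four arbitrary decay scales. For each weight definition it computes a standardized weighted sum of counts, takes the maximum over definitions, and compares that maximum with its simulated null distribution, so that the search over weights is accounted for.

```python
import math
import random


def standardized_scores(counts, probs, weight_sets):
    """Return one standardized focused score per weight definition.

    Conditional on the total number of cases n, the counts are multinomial
    under the null, so sum_i w_i * O_i has mean n * sum(w * p) and
    variance n * (sum(w^2 * p) - (sum(w * p))^2).
    """
    n = sum(counts)
    scores = []
    for w in weight_sets:
        mean_w = sum(wi * pi for wi, pi in zip(w, probs))
        mean_w2 = sum(wi * wi * pi for wi, pi in zip(w, probs))
        observed = sum(wi * oi for wi, oi in zip(w, counts))
        sd = math.sqrt(n * (mean_w2 - mean_w ** 2))
        scores.append((observed - n * mean_w) / sd)
    return scores


def simulate_counts(n, probs, rng):
    """Allocate n cases to regions in proportion to probs (null hypothesis)."""
    counts = [0] * len(probs)
    for region in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[region] += 1
    return counts


def adjusted_p_value(counts, probs, weight_sets, n_sim=999, seed=0):
    """Monte Carlo p-value for the maximum score over all weight definitions."""
    rng = random.Random(seed)
    n = sum(counts)
    t_obs = max(standardized_scores(counts, probs, weight_sets))
    exceed = 0
    for _ in range(n_sim):
        sim = simulate_counts(n, probs, rng)
        if max(standardized_scores(sim, probs, weight_sets)) >= t_obs:
            exceed += 1
    return t_obs, (1 + exceed) / (n_sim + 1)


def naive_p_value(z):
    """One-sided normal-approximation p-value for a single score."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical illustration: all values below are invented for the example.
distances = list(range(1, 11))             # distance of each region from the focus
probs = [0.1] * 10                         # equal populations, so equal null shares
counts = [9, 7, 6, 5, 5, 4, 5, 4, 5, 5]    # made-up case counts
decay_scales = [1.0, 2.0, 4.0, 8.0]        # arbitrary candidate definitions of 'around'
weight_sets = [[math.exp(-d / tau) for d in distances] for tau in decay_scales]

scores = standardized_scores(counts, probs, weight_sets)
t_obs, p_adj = adjusted_p_value(counts, probs, weight_sets)
p_naive = min(naive_p_value(z) for z in scores)

print("scores per weight definition:", [round(z, 3) for z in scores])
print("smallest unadjusted p-value:", round(p_naive, 4))
print("adjusted p-value for the maximum:", round(p_adj, 4))
```

Reporting only the smallest unadjusted p-value ignores the fact that several weight definitions were tried. The adjusted value refers the best-looking score to the distribution of best-looking scores under the null, which is the general principle behind corrections of this kind.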