Evaluating the effectiveness of explanations for recommender systems
When recommender systems present items, these can be accompanied by explanatory information. Such explanations can serve seven aims: effectiveness, satisfaction, transparency, scrutability, trust, persuasiveness, and efficiency. These aims can be incompatible, so any evaluation needs to state which aim is being investigated and use appropriate metrics. This paper focuses particularly on effectiveness (helping users to make good decisions) and its trade-off with satisfaction. It provides an overview of existing work on evaluating effectiveness and the metrics used. It also highlights the limitations of the existing effectiveness metrics, in particular the effects of under- and overestimation and recommendation domain. In addition to this methodological contribution, the paper presents four empirical studies in two domains: movies and cameras. These studies investigate the impact of personalizing simple feature-based explanations on effectiveness and satisfaction. Both approximated and real effectiveness is investigated. Contrary to expectation, personalization was detrimental to effectiveness, though it may improve user satisfaction. The studies also highlighted the importance of considering opt-out rates and the underlying rating distribution when evaluating effectiveness.