When Is the Right Time to Refresh Knowledge Discovered from Data?
Knowledge discovery in databases (KDD) techniques have been extensively employed to extract knowledge from massive data stores to support decision making in a wide range of critical applications. Maintaining the currency of discovered knowledge over evolving data sources is a fundamental challenge faced by all KDD applications. This paper addresses the challenge from the perspective of deciding the right times to refresh knowledge. We define the knowledge-refreshing problem and model it as a Markov decision process. Based on the identified properties of the Markov decision process model, we establish that the optimal knowledge-refreshing policy is monotonically increasing in the system state within every appropriate partition of the state space. We further show that the problem of searching for the optimal knowledge-refreshing policy can be reduced to the problem of finding the optimal thresholds and propose a method for computing the optimal knowledge-refreshing policy. The effectiveness and the robustness of the computed optimal knowledge-refreshing policy are examined through extensive empirical studies addressing a real-world knowledge-refreshing problem. Our method can be applied to refresh knowledge for KDD applications that employ major data-mining models.