RAMOBoost: Ranked Minority Oversampling in Boosting
In recent years, learning from imbalanced data has attracted growing attention from both academia and industry due to the explosive growth of applications that use and produce imbalanced data. However, because of the complex characteristics of imbalanced data, many real-world solutions struggle to provide robust efficiency in learning-based applications. In an effort to address this problem, this paper presents Ranked Minority Oversampling in Boosting (RAMOBoost), which is a RAMO technique based on the idea of adaptive synthetic data generation in an ensemble learning system. Briefly, RAMOBoost adaptively ranks minority class instances at each learning iteration according to a sampling probability distribution that is based on the underlying data distribution, and can adaptively shift the decision boundary toward difficult-to-learn minority and majority class instances by using a hypothesis assessment procedure. Simulation analysis on 19 real-world datasets assessed over various metrics-including overall accuracy, precision, recall, F-measure, G-mean, and receiver operation characteristic analysis-is used to illustrate the effectiveness of this method.