Can traditional fault prediction models be used for vulnerability prediction?
Finding security vulnerabilities requires a different mindset than finding general faults in software—thinking like an attacker. Therefore, security engineers looking to prioritize security inspection and testing efforts may be better served by a prediction model that indicates security vulnerabilities rather than faults. At the same time, faults and vulnerabilities have commonalities that may allow development teams to use traditional fault prediction models and metrics for vulnerability prediction. The goal of our study is to determine whether fault prediction models can be used for vulnerability prediction or if specialized vulnerability prediction models should be developed when both models are built with traditional metrics of complexity, code churn, and fault history. We have performed an empirical study on a widely-used, large open source project, the Mozilla Firefox web browser, where 21% of the source code files have faults and only 3% of the files have vulnerabilities. Both the fault prediction model and the vulnerability prediction model provide similar ability in vulnerability prediction across a wide range of classification thresholds. For example, the fault prediction model provided recall of 83% and precision of 11% at classification threshold 0.6 and the vulnerability prediction model provided recall of 83% and precision of 12% at classification threshold 0.5. Our results suggest that fault prediction models based upon traditional metrics can substitute for specialized vulnerability prediction models. However, both fault prediction and vulnerability prediction models require significant improvement to reduce false positives while providing high recall.