“Automatic Detection and Correction of Web Application Vulnerabilities using Data Mining to Predict False Positives”

From Navigators

Ibéria Medeiros, Nuno Ferreira Neves, Miguel Correia

in Proceedings of the International World Wide Web Conference (WWW), Seoul, Korea, Apr. 2014.

Abstract: Web application security is an important problem in today's internet. A major cause of this status is that many programmers do not have adequate knowledge about secure coding, so they leave applications with vulnerabilities. An approach to solve this problem is to use source code static analysis to solve these bugs, but these tools are known to report many false positives that make hard the task of correcting the application. This paper explores the use of a hybrid of methods to detect vulnerabilities with less false positives. After an initial step that uses taint analysis to flag candidate vulnerabilities, our approach uses data mining to predict the existence of false positives. This approach reaches a trade-off between two apparently opposite approaches: humans coding the knowledge about vulnerabilities (for taint analysis) versus automatically obtaining that knowledge (with machine learning, for data mining). Given this more precise form of detection, we do automatic code correction by inserting fixes in the source code. The approach was implemented in the WAP tool and an experimental evaluation was performed with a large set of open source PHP applications.