WAP is a static analysis and data mining tool that detects and corrects input validation vulnerabilities in web applications written in PHP.
WAP detects the following vulnerabilities:
- SQL Injection (SQLI)
- Cross-site scripting (XSS)
- Remote File Inclusion (RFI)
- Local File Inclusion (LFI)
- Directory Traversal or Path Traversal (DT/PT)
- Source Code Disclosure (SCD)
- OS Command Injection (OSCI)
- PHP Code Injection
WAP analyses the source code to detect input validation vulnerabilities. It tracks malicious inputs inserted through entry points and verifies whether they reach sensitive sinks. After detection, the tool uses data mining to confirm whether the vulnerabilities are real or false positives. Finally, the real vulnerabilities are corrected by inserting fixes into the source code.
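To make the tracking step concrete, here is a minimal sketch of taint propagation over a toy straight-line program. It is not WAP's implementation: the `Stmt` class, the three statement kinds, and the `sampleProgram` helper are hypothetical names introduced for illustration, and the real tool works on an AST rather than a flat statement list. The entry points `$_GET`, `$_POST`, and `$_COOKIE` are PHP superglobals; the sink and sanitization functions mentioned in the comments are only examples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: a toy taint tracker, not WAP's actual analyzer.
class TaintSketch {
    // Hypothetical statement kinds for a straight-line program.
    enum Kind { ASSIGN, SANITIZE, SINK }

    // ASSIGN:   target = source
    // SANITIZE: target is cleaned in place (source unused)
    // SINK:     source is passed to a sensitive function (target unused)
    static final class Stmt {
        final Kind kind;
        final String target;
        final String source;
        final int line;

        Stmt(Kind kind, String target, String source, int line) {
            this.kind = kind;
            this.target = target;
            this.source = source;
            this.line = line;
        }
    }

    // Entry points whose values are treated as attacker-controlled.
    static final Set<String> ENTRY_POINTS =
            new HashSet<>(Arrays.asList("$_GET", "$_POST", "$_COOKIE"));

    // Returns the line numbers of sink calls that receive tainted data.
    static List<Integer> findTaintedSinks(List<Stmt> program) {
        Set<String> tainted = new HashSet<>(ENTRY_POINTS);
        List<Integer> flagged = new ArrayList<>();
        for (Stmt s : program) {
            switch (s.kind) {
                case ASSIGN:
                    // Taint flows from source to target; a clean source clears the target.
                    if (tainted.contains(s.source)) {
                        tainted.add(s.target);
                    } else {
                        tainted.remove(s.target);
                    }
                    break;
                case SANITIZE:
                    tainted.remove(s.target);
                    break;
                case SINK:
                    if (tainted.contains(s.source)) {
                        flagged.add(s.line);
                    }
                    break;
            }
        }
        return flagged;
    }

    // Hypothetical sample: line 3 is vulnerable, line 6 is protected by sanitization.
    static List<Stmt> sampleProgram() {
        return Arrays.asList(
                new Stmt(Kind.ASSIGN, "$id", "$_GET", 1),     // $id = $_GET['id'];
                new Stmt(Kind.ASSIGN, "$q", "$id", 2),        // $q = "... $id";
                new Stmt(Kind.SINK, null, "$q", 3),           // mysql_query($q);
                new Stmt(Kind.ASSIGN, "$name", "$_POST", 4),  // $name = $_POST['name'];
                new Stmt(Kind.SANITIZE, "$name", null, 5),    // $name = mysql_real_escape_string($name);
                new Stmt(Kind.SINK, null, "$name", 6));       // mysql_query($name);
    }

    public static void main(String[] args) {
        System.out.println(findTaintedSinks(sampleProgram())); // prints [3]
    }
}
```

The sketch flags only the sink on line 3, because the value reaching line 6 was sanitized first. A real analyzer also has to handle branches, loops, function calls, and includes, which this example leaves out.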
WAP is written in Java and is divided into three parts:
- Code Analyzer: composed of the tree generator and the taint analyzer. The tool integrates a lexer and a parser generated by ANTLR from a grammar and a tree grammar written for the PHP language. The tree generator uses the lexer and the parser to build the AST (Abstract Syntax Tree) for each PHP file. The taint analyzer performs the taint analysis by navigating the AST to detect potential vulnerabilities.
- False Positives Predictor: composed of a training data set, whose instances are classified as vulnerabilities or false positives, and the Logistic Regression machine learning algorithm. For each potential vulnerability detected by the code analyzer, this module records which of the attributes that characterize a false positive are present. The Logistic Regression algorithm then receives these attributes and classifies the instance as a false positive or a real vulnerability.
- Code Corrector: each real vulnerability is removed by correcting its source code. For the given type of vulnerability, this module selects the fix that removes it and marks the places in the source code where the fix will be inserted. The code is then corrected by inserting the fixes, and new files are created.
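
The classification step of the False Positives Predictor can be sketched as follows. This is a minimal illustration of logistic regression over binary attributes, not WAP's code: the three attributes, the weights, the bias, and the 0.5 threshold are all assumptions made up for the example, whereas in the tool the model is learned from the training data set.

```java
// Illustrative only: logistic regression over hypothetical binary attributes.
class FalsePositiveSketch {
    // Hypothetical attributes, in order:
    //   0: the input passes through a type check
    //   1: the input passes through a pattern check
    //   2: the input is concatenated directly into a query
    // Hypothetical weights; a real model would learn these from labelled instances.
    static final double[] WEIGHTS = {2.0, 1.5, -1.0};
    static final double BIAS = -1.0;

    // Probability that the candidate is a false positive: sigmoid(bias + sum of active weights).
    static double probabilityFalsePositive(boolean[] attributes) {
        if (attributes.length != WEIGHTS.length) {
            throw new IllegalArgumentException("expected " + WEIGHTS.length + " attributes");
        }
        double z = BIAS;
        for (int i = 0; i < WEIGHTS.length; i++) {
            if (attributes[i]) {
                z += WEIGHTS[i];
            }
        }
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Classifies as a false positive when the probability reaches 0.5 (assumed threshold).
    static boolean isFalsePositive(boolean[] attributes) {
        return probabilityFalsePositive(attributes) >= 0.5;
    }

    public static void main(String[] args) {
        boolean[] validated = {true, true, false};
        boolean[] unvalidated = {false, false, true};
        System.out.println(isFalsePositive(validated));   // prints true
        System.out.println(isFalsePositive(unvalidated)); // prints false
    }
}
```

Candidates classified as false positives would be discarded, and only the remaining ones would be passed to the Code Corrector.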