Prove Your WAF Works with Real Numbers
Quantitative benchmarks, compliance evidence, and regression tracking — data your auditors and leadership will accept.
Auditors and compliance frameworks (PCI DSS, SOC 2, ISO 27001) require evidence that your WAF blocks real attacks — not just a checkbox that says "WAF is enabled." WAFtester produces quantitative metrics (detection rate, F1 score, MCC, false positive rate) using the same statistical measures from machine learning and medical diagnostics. Run benchmarks on a schedule to track your security posture over time and catch regressions before your next audit.
The Problem
Auditors Want Evidence
"We have a WAF" isn't sufficient for PCI DSS, SOC 2, or ISO 27001. Auditors need proof that the WAF blocks real attacks — with test results and metrics.
No Baseline to Compare
Without benchmarks, you can't tell if a WAF vendor switch or rule change improved or degraded your security posture. You're flying blind.
"Blocked/Not Blocked" Isn't Enough
Binary pass/fail results hide important nuance. What's the false positive rate? How does detection vary across attack categories?
Metrics That Matter
WAFtester produces the same statistical measures used in machine learning and medical diagnostics — because WAF evaluation is a classification problem.
Detection Rate
95.8%
Percentage of malicious payloads the WAF correctly blocked. Also called True Positive Rate or Sensitivity. Higher is better.
False Positive Rate
0.24%
Percentage of legitimate requests the WAF incorrectly blocked. This metric measures how much user friction the WAF creates. Lower is better.
F1 Score
0.978
Harmonic mean of precision and recall. A single number (0-1) that balances detection against false positives. Above 0.95 is strong.
MCC
0.915
Matthews Correlation Coefficient (-1 to 1). The most balanced metric — accounts for all four quadrants of the confusion matrix. Above 0.9 is excellent.
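All four metrics fall out of the confusion-matrix counts that every benchmark reports. As a reference, here is a minimal Python sketch that reproduces the headline numbers from the baseline run shown in the workflow below:

```python
import math

# Confusion-matrix counts from the baseline benchmark below
tp, fn = 2728, 119  # malicious payloads: blocked / missed
tn, fp = 845, 2     # legitimate requests: passed / wrongly blocked

detection_rate = tp / (tp + fn)  # a.k.a. recall / sensitivity
fpr = fp / (fp + tn)             # false positive rate
precision = tp / (tp + fp)
f1 = 2 * precision * detection_rate / (precision + detection_rate)
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(f"Detection: {detection_rate:.1%} | FPR: {fpr:.2%}")  # 95.8% | 0.24%
print(f"F1: {f1:.3f} | MCC: {mcc:.3f}")                     # 0.978 | 0.915
```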
The Benchmarking Workflow
Run benchmarks on a schedule to track your WAF's security posture over time.
Establish a Baseline
Run a full benchmark to capture your starting metrics. Save the JSON output as your baseline.
$ waftester benchmark -u https://your-app.com -o baseline.json
[BENCH] Running detection accuracy benchmark...
True Positives: 2728 | False Negatives: 119
True Negatives: 845 | False Positives: 2
→ Detection Rate: 95.8% | FPR: 0.24%
→ F1: 0.978 | MCC: 0.915
Test After Rule Changes
After modifying WAF rules, adding a new vendor, or changing paranoia level — re-run the benchmark and compare.
$ waftester benchmark -u https://your-app.com -o post-change.json
→ Detection Rate: 97.1% | FPR: 0.31%
→ F1: 0.985 | MCC: 0.938
[NOTE] Detection improved +1.3%, FPR increased +0.07%
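To reproduce the comparison in the [NOTE] line yourself, diff the two JSON files. A minimal sketch, assuming the output exposes top-level metric fields (the field names here are illustrative, not a confirmed waftester schema):

```python
import json

def load(path):
    with open(path) as f:
        return json.load(f)

# Field names are assumptions -- inspect the JSON your
# waftester version actually writes and adjust accordingly.
base, post = load("baseline.json"), load("post-change.json")

for metric in ("detection_rate", "false_positive_rate", "f1", "mcc"):
    delta = post[metric] - base[metric]
    print(f"{metric}: {base[metric]:.3f} -> {post[metric]:.3f} ({delta:+.3f})")
```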
Generate Compliance Evidence
Export results in multiple formats for auditors, management, and your security dashboards.
$ waftester assess -u https://your-app.com -o evidence.html -o evidence.json -o evidence.csv
→ evidence.html (Auditor-friendly report)
→ evidence.json (Archive / API consumption)
→ evidence.csv (Spreadsheet / GRC platform import)
Automate Regression Checks
Schedule benchmarks weekly in CI/CD. Alert on detection rate drops or FPR spikes.
# Weekly benchmark in GitHub Actions
$ waftester benchmark -u https://your-app.com -o results.json
# Parse JSON, compare with baseline, alert if F1 drops below 0.95
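One way to turn that comment into an enforceable gate is a small script that exits nonzero when a threshold is breached, failing the CI job and triggering your alerting. A minimal sketch; the JSON field names and thresholds are assumptions to adapt to your actual schema and risk tolerance:

```python
import json
import sys

MIN_F1 = 0.95          # floor from your baseline policy
MAX_FPR_DRIFT = 0.001  # tolerate at most +0.1% FPR vs. baseline

with open("baseline.json") as f:
    base = json.load(f)
with open("results.json") as f:
    new = json.load(f)

failures = []
if new["f1"] < MIN_F1:
    failures.append(f"F1 {new['f1']:.3f} fell below {MIN_F1}")
if new["false_positive_rate"] - base["false_positive_rate"] > MAX_FPR_DRIFT:
    failures.append("FPR spiked vs. baseline")

if failures:
    print("WAF regression detected:", "; ".join(failures))
    sys.exit(1)  # nonzero exit fails the pipeline step
print("WAF benchmark within thresholds")
```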
Compliance Framework Mapping
WAFtester benchmarks map to requirements in common compliance frameworks.
| Framework | Requirement | WAFtester Evidence |
|---|---|---|
| PCI DSS 4.0 | Req 6.4.2 — WAF for public web apps | Detection rate, bypass count, attack category coverage |
| SOC 2 | CC6.1 — Logical access controls | WAF vendor detection, rule effectiveness metrics |
| ISO 27001 | A.14.1.2 — Securing application services | Benchmark reports, regression tracking over time |
| NIST 800-53 | SI-3 — Malicious code protection | F1/MCC scores, false positive analysis |
| OWASP Top 10 | A05:2021 — Security Misconfiguration | Per-category detection rates (SQLi, XSS, SSTI, etc.) |
Built for Security Operations
False Positive Testing
Dedicated fp command to verify findings aren't false positives. Critical for tuning WAF rules without breaking legitimate traffic.
Per-Category Breakdown
See detection rates for SQLi, XSS, SSTI, command injection, and 50+ categories individually. Know exactly where your WAF is weak.
Confusion Matrix
True positives, true negatives, false positives, false negatives — the full confusion matrix in every benchmark output.
Multiple Export Formats
HTML for auditors, CSV for spreadsheets, JSON for APIs, PDF for executives. One scan, every format your stakeholders need.
For quick benchmark commands, see the Assessment cheat sheet. To automate benchmarks in your pipeline, see the CI/CD integration guide. Full command walkthroughs are in the Examples Guide.
Ready to Try It?
One command to install. One command to scan. Real results in seconds.