Prove Your WAF Works with Real Numbers
Quantitative benchmarks, compliance evidence, and regression tracking — data your auditors and leadership will accept.
Auditors and compliance frameworks (PCI DSS, SOC 2, ISO 27001) require evidence that your WAF blocks real attacks — not just a checkbox that says "WAF is enabled." WAFtester produces quantitative metrics (detection rate, F1 score, MCC, false positive rate) using the same statistical measures from machine learning and medical diagnostics. Run benchmarks on a schedule to track your security posture over time and catch regressions before your next audit.
The Problem
Auditors Want Evidence
"We have a WAF" isn't sufficient for PCI DSS, SOC 2, or ISO 27001. Auditors need proof that the WAF blocks real attacks — with test results and metrics.
No Baseline to Compare
Without benchmarks, you can't tell if a WAF vendor switch or rule change improved or degraded your security posture. You're flying blind.
"Blocked/Not Blocked" Isn't Enough
Binary pass/fail results hide important nuance. What's the false positive rate? How does detection vary across attack categories?
Metrics That Matter
WAFtester produces the same statistical measures used in machine learning and medical diagnostics — because WAF evaluation is a classification problem.
Detection Rate
95.8%
Percentage of malicious payloads the WAF correctly blocked. Also called True Positive Rate or Sensitivity. Higher is better.
False Positive Rate
0.24%
Percentage of legitimate requests the WAF incorrectly blocked. This metric measures how much user friction the WAF creates. Lower is better.
F1 Score
0.978
Harmonic mean of precision and recall. A single number (0-1) that balances detection against false positives. Above 0.95 is strong.
MCC
0.915
Matthews Correlation Coefficient (-1 to 1). The most balanced metric — accounts for all four quadrants of the confusion matrix. Above 0.9 is excellent.
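All four metrics fall out of the confusion-matrix counts that every benchmark reports. As a reference, here is a minimal Python sketch that reproduces the headline numbers from the baseline run shown in the workflow below:

```python
import math

# Confusion-matrix counts from the baseline benchmark below
tp, fn = 2728, 119  # malicious payloads: blocked / missed
tn, fp = 845, 2     # legitimate requests: passed / wrongly blocked

detection_rate = tp / (tp + fn)  # a.k.a. recall / sensitivity
fpr = fp / (fp + tn)             # false positive rate
precision = tp / (tp + fp)
f1 = 2 * precision * detection_rate / (precision + detection_rate)
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(f"Detection: {detection_rate:.1%} | FPR: {fpr:.2%}")  # 95.8% | 0.24%
print(f"F1: {f1:.3f} | MCC: {mcc:.3f}")                     # 0.978 | 0.915
```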
The Benchmarking Workflow
Run benchmarks on a schedule to track your WAF's security posture over time.
Establish a Baseline
Run a full benchmark to capture your starting metrics. Save the JSON output as your baseline.
$ waftester benchmark -u https://your-app.com -o baseline.json
[BENCH] Running detection accuracy benchmark...
True Positives: 2728 | False Negatives: 119
True Negatives: 845 | False Positives: 2
→ Detection Rate: 95.8% | FPR: 0.24%
→ F1: 0.978 | MCC: 0.915
Test After Rule Changes
After modifying WAF rules, adding a new vendor, or changing paranoia level — re-run the benchmark and compare.
$ waftester benchmark -u https://your-app.com -o post-change.json
→ Detection Rate: 97.1% | FPR: 0.31%
→ F1: 0.985 | MCC: 0.938
[NOTE] Detection improved +1.3%, FPR increased +0.07%
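To reproduce the comparison in the [NOTE] line yourself, diff the two JSON files. A minimal sketch, assuming the output exposes top-level metric fields (the field names here are illustrative, not a confirmed waftester schema):

```python
import json

def load(path):
    with open(path) as f:
        return json.load(f)

# Field names are assumptions -- inspect the JSON your
# waftester version actually writes and adjust accordingly.
base, post = load("baseline.json"), load("post-change.json")

for metric in ("detection_rate", "false_positive_rate", "f1", "mcc"):
    delta = post[metric] - base[metric]
    print(f"{metric}: {base[metric]:.3f} -> {post[metric]:.3f} ({delta:+.3f})")
```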
Generate Compliance Evidence
Export results in multiple formats for auditors, management, and your security dashboards.
$ waftester assess -u https://your-app.com -o evidence.html -o evidence.json -o evidence.csv
→ evidence.html (Auditor-friendly report)
→ evidence.json (Archive / API consumption)
→ evidence.csv (Spreadsheet / GRC platform import)
Automate Regression Checks
Schedule benchmarks weekly in CI/CD. Alert on detection rate drops or FPR spikes.
# Weekly benchmark in GitHub Actions
$ waftester benchmark -u https://your-app.com -o results.json
# Parse JSON, compare with baseline, alert if F1 drops below 0.95
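One way to turn that comment into an enforceable gate is a small script that exits nonzero when a threshold is breached, failing the CI job and triggering your alerting. A minimal sketch; the JSON field names and thresholds are assumptions to adapt to your actual schema and risk tolerance:

```python
import json
import sys

MIN_F1 = 0.95          # floor from your baseline policy
MAX_FPR_DRIFT = 0.001  # tolerate at most +0.1% FPR vs. baseline

with open("baseline.json") as f:
    base = json.load(f)
with open("results.json") as f:
    new = json.load(f)

failures = []
if new["f1"] < MIN_F1:
    failures.append(f"F1 {new['f1']:.3f} fell below {MIN_F1}")
if new["false_positive_rate"] - base["false_positive_rate"] > MAX_FPR_DRIFT:
    failures.append("FPR spiked vs. baseline")

if failures:
    print("WAF regression detected:", "; ".join(failures))
    sys.exit(1)  # nonzero exit fails the pipeline step
print("WAF benchmark within thresholds")
```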
Compliance Framework Mapping
WAFtester benchmarks map to requirements in common compliance frameworks.
| Framework | Requirement | WAFtester Evidence |
|---|---|---|
| PCI DSS 4.0 | Req 6.4.2 — WAF for public web apps | Detection rate, bypass count, attack category coverage |
| SOC 2 | CC6.1 — Logical access controls | WAF vendor detection, rule effectiveness metrics |
| ISO 27001 | A.14.1.2 — Securing application services | Benchmark reports, regression tracking over time |
| NIST 800-53 | SI-3 — Malicious code protection | F1/MCC scores, false positive analysis |
| OWASP Top 10 | A05:2021 — Security Misconfiguration | Per-category detection rates (SQLi, XSS, SSTI, etc.) |
Built for Security Operations
False Positive Testing
Dedicated fp command to verify findings aren't false positives. Critical for tuning WAF rules without breaking legitimate traffic.
Per-Category Breakdown
See detection rates for SQLi, XSS, SSTI, command injection, and 50+ categories individually. Know exactly where your WAF is weak.
Confusion Matrix
True positives, true negatives, false positives, false negatives — the full confusion matrix in every benchmark output.
Multiple Export Formats
HTML for auditors, CSV for spreadsheets, JSON for APIs, PDF for executives. One scan, every format your stakeholders need.
For quick benchmark commands, see the Assessment cheat sheet. To automate benchmarks in your pipeline, see the CI/CD integration guide. Full command walkthroughs are in the Examples Guide.
Ready to Try It?
One command to install. One command to scan. Real results in seconds.