Hypotheses
How CodeWall uses hypotheses to drive targeted, methodical vulnerability discovery.
Hypotheses are the core reasoning mechanism behind CodeWall's penetration testing. Rather than running a static checklist of scans, CodeWall's agent formulates hypotheses about potential vulnerabilities in your target — then designs specific tests to confirm or reject each one.
What is a hypothesis?
A hypothesis is a structured statement about a suspected vulnerability. Each hypothesis includes:
| Field | Description |
|---|---|
| Statement | A clear, testable claim — e.g., "The /api/users endpoint is vulnerable to IDOR via predictable user IDs" |
| Severity | The expected impact if confirmed (Critical, High, Medium, Low) |
| Family | The vulnerability class — auth, injection, XSS, misconfiguration, memory-safety, etc. |
| Confidence | The agent's estimate of how likely the hypothesis is to be true (0–100%) |
| Rationale | Why the agent suspects this vulnerability exists |
| Preconditions | What must be true for the vulnerability to be exploitable |
| Proposed checks | The specific tests the agent plans to run |
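As a rough illustration, the fields above could be modelled as a small record type. The Python sketch below is not CodeWall's actual schema: the `Hypothesis` and `Severity` names, the field types, and the example severity, confidence, rationale, precondition, and check values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    """The four severity levels listed in the table (enum name is hypothetical)."""
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


@dataclass
class Hypothesis:
    """Hypothetical record mirroring the documented hypothesis fields."""
    statement: str
    severity: Severity
    family: str
    confidence: int  # percentage, 0 to 100
    rationale: str = ""
    preconditions: list[str] = field(default_factory=list)
    proposed_checks: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject confidence values outside the documented 0-100% range.
        if not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be between 0 and 100")


# Example built around the statement quoted in the table; every other
# value here is invented for illustration.
idor = Hypothesis(
    statement="The /api/users endpoint is vulnerable to IDOR via predictable user IDs",
    severity=Severity.HIGH,
    family="auth",
    confidence=70,
    rationale="User IDs in observed responses appear to be sequential integers",
    preconditions=["Attacker holds a valid low-privilege session"],
    proposed_checks=["Request another user's record using an adjacent ID"],
)
```

Validating the confidence range at construction time keeps malformed hypotheses out of any later ranking step.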
Why hypotheses matter
Traditional scanners work by firing hundreds of generic payloads and checking for known signatures. CodeWall works differently — it reasons about your application's specific architecture, technology stack, and behaviour to form targeted hypotheses.
This approach has several advantages:
- Fewer false positives — the agent only tests what it has reason to suspect, rather than spraying payloads
- Deeper coverage — hypothesis-driven testing catches logic flaws, business logic vulnerabilities, and chained attack paths that signature-based scanners miss
- Transparency — you can see exactly what the agent is thinking and why, not just a list of CVEs
- Efficiency — the agent spends its budget on the most promising attack vectors rather than exhaustive enumeration
The hypothesis lifecycle
- Formulation — during the analysis phase, the agent reviews reconnaissance data and formulates hypotheses about potential vulnerabilities
- Prioritisation — hypotheses are ranked by severity and confidence, so the most impactful and likely vulnerabilities are tested first
- Validation — during the validate phase, the agent runs the proposed checks for each hypothesis
- Outcome — each hypothesis is marked as verified (vulnerability confirmed), rejected (not exploitable), not tested (skipped due to budget or prerequisites), or error (test failed to execute)
Verified hypotheses become findings with full proof-of-concept evidence and remediation guidance.
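The prioritisation and outcome steps can be sketched in a few lines of Python. The source says only that hypotheses are ranked by severity and confidence, so ordering by severity first with confidence as a tie-breaker is an assumption, as are the dictionary shape, the function names, and the sample statements.

```python
from enum import Enum

# Hypothetical numeric ranks for the four documented severity levels.
SEVERITY_RANK = {"Critical": 4, "High": 3, "Medium": 2, "Low": 1}


class Outcome(Enum):
    """The four documented outcomes (enum name is hypothetical)."""
    VERIFIED = "verified"
    REJECTED = "rejected"
    NOT_TESTED = "not tested"
    ERROR = "error"


def prioritise(hypotheses: list[dict]) -> list[dict]:
    """Sort by severity, then confidence, highest first (assumed weighting)."""
    return sorted(
        hypotheses,
        key=lambda h: (SEVERITY_RANK[h["severity"]], h["confidence"]),
        reverse=True,
    )


def to_findings(results: list[tuple[dict, Outcome]]) -> list[dict]:
    """Keep only hypotheses whose outcome is verified."""
    return [h for h, outcome in results if outcome is Outcome.VERIFIED]


queue = [
    {"statement": "A", "severity": "Medium", "confidence": 90},
    {"statement": "B", "severity": "Critical", "confidence": 40},
    {"statement": "C", "severity": "Critical", "confidence": 75},
]
ordered = prioritise(queue)
print([h["statement"] for h in ordered])  # ['C', 'B', 'A']

results = [
    (ordered[0], Outcome.VERIFIED),
    (ordered[1], Outcome.REJECTED),
    (ordered[2], Outcome.NOT_TESTED),
]
findings = to_findings(results)
print([h["statement"] for h in findings])  # ['C']
```

A different weighting, such as multiplying severity rank by confidence, would be equally consistent with the description; the tuple key is simply the easiest to read.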
Adding your own hypotheses
You can inject your own hypotheses into a running test before the analysis phase completes. With approval gates enabled, you can also add them while the test is paused awaiting approval. This is useful for:
- Domain knowledge — you know your application better than any scanner. If you suspect a specific endpoint is vulnerable, tell the agent to test it
- Regression testing — add hypotheses for previously fixed vulnerabilities to verify they haven't regressed
- Compliance checks — inject hypotheses for specific compliance requirements your organisation must meet
- Red team scenarios — guide the agent toward specific attack paths you want validated
How to add a hypothesis
1. Navigate to your running test
2. Switch to the Hypotheses tab
3. Fill in the statement, severity, family, and optional rationale
4. Click Add Hypothesis
Your hypothesis is queued alongside the agent's own hypotheses and will be tested during the validate and exploit phases. The agent treats operator-submitted hypotheses with the same rigour as its own — designing specific test cases, executing them, and reporting the outcome.
When you can add hypotheses
You can add hypotheses while the test is in the preflight, recon, or analysis phase. If approval gates are enabled, you can also add them while the test is awaiting approval for the recon or analysis phase. Once the agent moves into the validate phase, the hypothesis list is locked and no further hypotheses can be added.
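The timing rule can be expressed as a small predicate. This is a sketch under stated assumptions, not CodeWall's implementation: the function name, its parameters, and the idea of tracking a separate "awaiting approval for" value are hypothetical. The sketch also assumes that a test awaiting approval for any phase other than recon or analysis does not accept new hypotheses, which the text does not state explicitly.

```python
from typing import Optional

# Phases in which the docs say hypotheses can be added.
OPEN_PHASES = {"preflight", "recon", "analysis"}
# Phases whose approval wait also accepts hypotheses when gates are enabled.
APPROVAL_WINDOWS = {"recon", "analysis"}


def can_add_hypothesis(
    phase: str,
    approval_gates_enabled: bool = False,
    awaiting_approval_for: Optional[str] = None,
) -> bool:
    """Return True if a hypothesis may be added in the given (hypothetical) state."""
    if awaiting_approval_for is not None:
        # While paused for approval, the pending phase decides the answer
        # and the current phase is ignored (an assumption of this sketch).
        return approval_gates_enabled and awaiting_approval_for in APPROVAL_WINDOWS
    return phase in OPEN_PHASES


print(can_add_hypothesis("recon"))                       # True
print(can_add_hypothesis("validate"))                    # False
print(can_add_hypothesis("preflight", True, "recon"))    # True
```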
Retesting hypotheses
If a hypothesis was marked as not tested or error, you can trigger a targeted retest directly from the Hypotheses tab. This launches a new, focused test that validates only that hypothesis, without re-running the full engagement.
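Retest eligibility follows directly from the outcome. The minimal Python sketch below filters a list for hypotheses that qualify; the dictionary shape, the `retestable` function name, and the sample statements are assumptions for illustration, and only the two outcome labels come from the text above.

```python
# Outcomes that the docs say can be retested.
RETESTABLE_OUTCOMES = {"not tested", "error"}


def retestable(hypotheses: list[dict]) -> list[dict]:
    """Return hypotheses whose outcome allows a targeted retest."""
    return [h for h in hypotheses if h["outcome"] in RETESTABLE_OUTCOMES]


# Illustrative data only.
completed = [
    {"statement": "A", "outcome": "verified"},
    {"statement": "B", "outcome": "not tested"},
    {"statement": "C", "outcome": "error"},
    {"statement": "D", "outcome": "rejected"},
]
print([h["statement"] for h in retestable(completed)])  # ['B', 'C']
```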

