Contextual AI AppSec Review
Map your application's real attack surface first. Then give that evidence to AI - instead of hoping it guesses correctly from raw code.
Why Generic AI Scans Fall Short
Feeding a codebase into an LLM and asking it to find vulnerabilities produces findings that are technically plausible but contextually wrong. The model doesn't know which inputs are stored, which routes are public-facing, or what your secrets are named - so it guesses. Most of what it returns is noise, and the real issues get buried in it.
This process is designed to minimize false positives. Before any AI analysis runs, we use custom scripts to build a structured, factual picture of what your application actually accepts, stores, and exposes. That evidence becomes the model's working context - not the source code in full, but a precise inventory of what matters for security.
What changes with structured context
When AI is given verified data about specific inputs, confirmed data flows, and mapped configuration exposure, it can reason about actual paths rather than theoretical ones. Findings reference real code locations. False positives drop significantly. Critical issues surface clearly instead of being diluted by generic warnings.
Generic Scan vs. Contextual Review
The same AI model produces very different results depending on what you give it.
| Aspect | Generic AI Scan | Contextual AI Review |
|---|---|---|
| Input to the model | Raw source code or file paths | Structured inventory of inputs, flows, and config exposure |
| Data flow awareness | Inferred - often incorrect | Verified programmatically before analysis |
| Finding specificity | Pattern-matched from training data | Tied to actual code locations and variable names |
| False positive rate | High - validation context is missing | Lower - model sees what's actually adjacent to each input |
| Secret detection | Pattern matching on naming conventions | Pattern matching plus runtime exposure analysis |
| Review vectors | Whatever the model prioritizes | OWASP Top 10 applied one at a time to structured data |
How It Works
Five sequential steps, each building on the output of the previous one. The first four are programmatic. The fifth is where AI analysis runs.
Build the ground truth
Custom scripts parse your backend schemas - Prisma config files, raw SQL migrations - to produce a master list of every defined database column. A separate scan reads the source code for every external data entry point: REST route parameters, JSON request bodies, GraphQL resolver arguments, and incoming webhooks.
The result is two inventories: what the database expects, and what the application accepts from the outside world.
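A minimal sketch of what this first step could look like, assuming a Prisma schema string and Express-style route source. The regexes and names here are illustrative, not the production tooling:

```python
import re

def extract_prisma_columns(schema: str) -> set:
    """Collect field names from every `model` block in a Prisma schema."""
    columns = set()
    for block in re.findall(r"model\s+\w+\s*{([^}]*)}", schema):
        for line in block.strip().splitlines():
            parts = line.split()
            # First token on a field line is the column name; skip attributes
            if parts and not parts[0].startswith("@"):
                columns.add(parts[0])
    return columns

def extract_express_inputs(source: str) -> set:
    """Collect external entry points read from req.body / req.params / req.query."""
    return set(re.findall(r"req\.(?:body|params|query)\.(\w+)", source))

schema = """
model User {
  id    Int    @id
  email String
  bio   String
}
"""
source = """
app.post('/profile', (req, res) => {
  const email = req.body.email;
  const theme = req.query.theme;
});
"""
print(extract_prisma_columns(schema))   # {'id', 'email', 'bio'}
print(extract_express_inputs(source))   # {'email', 'theme'}
```

Real schemas need a proper parser (comments, relations, multi-file setups), but the shape of the output - two flat inventories - is the point.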
→ Output: database column list, external input list
Classify every input
The two inventories are cross-referenced. Each external input is compared against the column list and placed into one of two categories:
- Persistent - matches a database column; data this application intends to store. Injection attacks and unauthorized writes live here.
- Volatile - no column match; data consumed in transit or reflected back to the client. Reflected vulnerabilities live here.
Knowing which category an input belongs to means the AI can be asked the right questions about it rather than a blanket query that treats all inputs the same way.
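The cross-referencing itself is simple set logic. A hedged sketch, reusing the illustrative inventories from the previous step:

```python
def classify_inputs(external_inputs: set, db_columns: set) -> dict:
    """Mark each external input as persistent (matches a DB column) or volatile."""
    return {
        name: "persistent" if name in db_columns else "volatile"
        for name in external_inputs
    }

columns = {"id", "email", "bio"}          # from the schema scan
inputs = {"email", "theme", "bio"}        # from the entry-point scan
print(classify_inputs(inputs, columns))
# {'email': 'persistent', 'theme': 'volatile', 'bio': 'persistent'}
```

In practice the match may need normalization (camelCase vs. snake_case, DTO field mapping), but the two-bucket output is what the later steps consume.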
→ Output: classified input inventory
Verify the data flows
Classification alone is not enough. For each identified input, the tooling extracts 30–40 lines of surrounding code - enough to see what actually happens to the value after it arrives. Pattern matching then looks for concrete storage operations within that window: ORM queries, raw SQL adapters, Redis writes, Kafka producer calls.
Two discrepancies are flagged automatically:
- A persistent input with no storage operation in its execution window
- A volatile input that appears in a database write
The goal is to replace assumptions about what the code does with verification of what it actually does.
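The discrepancy check can be sketched as pattern matching over each input's extracted code window. The storage patterns below are illustrative examples, not an exhaustive ruleset:

```python
import re

STORAGE_PATTERNS = [
    r"prisma\.\w+\.(?:create|update|upsert)",   # ORM writes
    r"\bINSERT\s+INTO\b",                       # raw SQL
    r"redis\.set\(",                            # Redis writes
    r"producer\.send\(",                        # Kafka producer calls
]

def has_storage_op(window: str) -> bool:
    return any(re.search(p, window, re.IGNORECASE) for p in STORAGE_PATTERNS)

def flag_discrepancies(flows):
    """flows: list of (input_name, classification, code_window) tuples."""
    flags = []
    for name, kind, window in flows:
        stored = has_storage_op(window)
        if kind == "persistent" and not stored:
            flags.append((name, "persistent input with no storage operation"))
        if kind == "volatile" and stored:
            flags.append((name, "volatile input that appears in a database write"))
    return flags

flows = [
    ("email", "persistent", "await prisma.user.update({ data: { email } })"),
    ("bio",   "persistent", "res.json({ bio })"),              # never stored
    ("theme", "volatile",   "db.query('INSERT INTO prefs ...', [theme])"),
]
for name, reason in flag_discrepancies(flows):
    print(name, "->", reason)
```

Both discrepancy types surface here: `bio` is classified persistent but its window never writes, and `theme` is classified volatile yet lands in an `INSERT`.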
→ Output: verified data flow map with code context
Map the configuration surface
Pattern matching scans for embedded environment variables, Vault PKI paths, hardware security module references, and financial credentials. Each extracted value is assessed for exposure risk based on naming conventions and where it ends up at runtime.
Two categories receive immediate flags:
- Framework-specific variables with client-side prefixes (e.g., NEXT_PUBLIC_) that contain backend secrets
- Hardcoded bearer strings or authentication tokens that appear directly in source without referencing an environment variable
Hardcoded secrets survive credential rotation and persist in version history. This step surfaces them explicitly before AI analysis begins.
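A simplified sketch of both flag categories. The name heuristics and token regex are assumptions for illustration; a real scanner would use entropy checks and a broader ruleset:

```python
import re

def scan_config_exposure(source: str) -> list:
    """Flag client-prefixed variables with secret-like names and hardcoded tokens."""
    findings = []
    # Category 1: NEXT_PUBLIC_ variables whose names suggest backend secrets
    for var in re.findall(r"NEXT_PUBLIC_\w+", source):
        if re.search(r"SECRET|TOKEN|PASSWORD|PRIVATE", var):
            findings.append(("client-exposed secret", var))
    # Category 2: bearer strings embedded directly in source
    for token in re.findall(r"Bearer\s+[A-Za-z0-9._\-]{20,}", source):
        findings.append(("hardcoded bearer token", token))
    return findings

source = '''
const url = process.env.NEXT_PUBLIC_API_URL;
const key = process.env.NEXT_PUBLIC_STRIPE_SECRET_KEY;
headers["Authorization"] = "Bearer sk_live_abcdefghijklmnopqrstuvwx";
'''
for kind, value in scan_config_exposure(source):
    print(kind, "->", value)
```

`NEXT_PUBLIC_API_URL` passes because a public URL is legitimately client-side; the secret key and the inline bearer string are flagged.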
→ Output: configuration exposure inventory
Run structured AI analysis
The three datasets - classified inputs, verified data flows, configuration exposure - become the model's working context. They are fed in logical chunks sized to the model's context window. The model does not receive the full codebase.
Analysis runs against specific review vectors rather than a broad "find vulnerabilities" prompt. For web applications, this means the OWASP Top 10 is applied one category at a time: broken access control, injection, insecure design, and so on. Each pass asks the model to reason about the structured data against that specific threat class.
Findings produced this way are tied to actual code locations, actual data flows, and actual configuration exposures. Each one can be followed from the entry point to the flagged destination using the extracted code context.
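The chunking and per-vector prompting can be sketched as follows. The size budget, vector list, and prompt wording are illustrative assumptions:

```python
def chunk_records(records: list, max_chars: int) -> list:
    """Greedily pack serialized records into chunks under a character budget."""
    chunks, current, size = [], [], 0
    for rec in records:
        if current and size + len(rec) > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(rec)
        size += len(rec)
    if current:
        chunks.append(current)
    return chunks

# Subset of OWASP Top 10 categories, applied one at a time
OWASP_VECTORS = ["Broken Access Control", "Injection", "Insecure Design"]

def build_prompts(records: list, max_chars: int = 200) -> list:
    prompts = []
    for vector in OWASP_VECTORS:
        for chunk in chunk_records(records, max_chars):
            prompts.append(
                f"Review the following verified inputs for {vector} only:\n"
                + "\n".join(chunk)
            )
    return prompts

records = [
    "email (persistent) -> prisma.user.update, src/routes/profile.ts:42",
    "theme (volatile) -> reflected in response, src/routes/profile.ts:47",
]
prompts = build_prompts(records)
print(len(prompts))  # both records fit one chunk, so one prompt per vector: 3
```

Because each prompt carries verified flow data scoped to a single threat class, the model reasons about one question at a time instead of free-associating over raw code.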
→ Output: finding report with code locations and data flow evidence
Reading the Output
The analysis will surface real problems. It will also surface findings that look alarming but aren't - because validation logic lives in middleware outside the extracted window, or because the AI reasoned incorrectly from incomplete information. Three questions sort them out.
Does the data path actually connect?
Trace the input from the entry point to the flagged destination. If a condition, type check, validation call, or permission gate sits between them, the finding needs further investigation before it can be confirmed.
Is the path reachable without authentication?
A vulnerability on an admin-only endpoint is serious. The same vulnerability on a public webhook is critical. Findings behind specific permission requirements carry lower urgency than findings on unauthenticated routes.
Does the fix already exist elsewhere in the codebase?
Check how similar inputs on similar routes are handled. If the flagged instance is the outlier, it is a real gap. If it matches how everything else is handled, the analysis likely missed shared handling code - a false positive.
What a Real Finding Looks Like
A confirmed finding has three characteristics. Any finding missing one of these warrants closer inspection before action.
Specific location
The finding names the file, function, or route where the input enters the application. Not a class of vulnerability in the abstract - a concrete place in the code.
Confirmed data path
The value can be traced from its entry point to the flagged destination using the extracted code context, with nothing between them that inspects, transforms, or restricts it.
No contradicting code nearby
The surrounding 30–40 lines contain no validation calls, middleware references, or type checks that would handle the input before it reaches the vulnerable operation.
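This check can itself be partially automated. A hedged sketch that scans an extracted window for common guard patterns; the pattern list is an illustrative assumption and would be tuned per stack:

```python
import re

VALIDATION_PATTERNS = [
    r"\bvalidate\w*\(",                 # generic validation calls
    r"\bsanitize\w*\(",
    r"\.parse\(|\.safeParse\(",         # schema validators such as zod
    r"requireAuth|isAuthenticated",     # permission gates in middleware
]

def has_contradicting_code(window: str) -> bool:
    """True if the extracted code window contains validation or auth handling."""
    return any(re.search(p, window) for p in VALIDATION_PATTERNS)

window = """
const body = profileSchema.parse(req.body);
await prisma.user.update({ data: body });
"""
print(has_contradicting_code(window))  # True: a schema parse guards the write
```

A `True` here downgrades the finding for manual review rather than dismissing it outright, since the guard may not cover the specific field that was flagged.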
What This Process Covers
The methodology is framework-agnostic. It works wherever the underlying patterns exist in the source.
Input sources
- REST route parameters
- JSON request bodies
- GraphQL resolver arguments
- Webhook payloads
- Query strings and headers
Storage targets
- ORM queries (Prisma, Sequelize, TypeORM)
- Raw SQL adapters
- Redis writes
- Kafka producer calls
- File system writes
Configuration types
- Environment variables
- Vault PKI paths
- HSM references
- Financial credentials
- Hardcoded auth tokens
Start with a Mapping Session
We walk you through the process the first time, build the necessary automation for your stack, and establish a reusable review workflow your team can run on each release.
Schedule a Mapping Session