Agentic Fraud Lab
When the fraud agent writes its own rules: inspired by CommBank's autonomous rule generation system.
Static rule-based fraud systems are slow by construction. By the time a senior analyst notices a new pattern, drafts a rule, runs it past compliance and pushes it to production, fraudsters have moved on. CommBank's agentic fraud system flipped this: 75% of card fraud rules are now agent-authored, cutting fraud losses by 20% in H1 FY26 vs. the prior period.
This lab walks through the 8-agent rule synthesis pipeline (Signal Mining → Hypothesis → Rule Drafting → Backtest → Compliance → Analyst Review → Staged Deploy → Decay Monitor) on 5 emerging fraud patterns. You see the actual rule DSL the agent drafts, the backtest it runs, the compliance findings, the human reviewer's decision, and the live deployment trace.
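The stage order above can be sketched as a simple sequential pipeline. This is a minimal illustration only: the `Trace` class, the `run_pipeline` function and the `gate` callback are hypothetical names, and the lab does not say how the real agents are orchestrated.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Stage names in the order given by the lab text.
STAGES = [
    "Signal Mining", "Hypothesis", "Rule Drafting", "Backtest",
    "Compliance", "Analyst Review", "Staged Deploy", "Decay Monitor",
]


@dataclass
class Trace:
    """Records which stages a pattern passed through (hypothetical structure)."""
    pattern: str
    steps: List[str] = field(default_factory=list)
    halted_at: Optional[str] = None


def run_pipeline(pattern: str,
                 gate: Callable[[str], bool] = lambda stage: True) -> Trace:
    """Run stages in order; stop at the first stage whose gate returns False."""
    trace = Trace(pattern)
    for stage in STAGES:
        if not gate(stage):
            trace.halted_at = stage  # e.g. a failed backtest or a compliance finding
            break
        trace.steps.append(stage)
    return trace


if __name__ == "__main__":
    ok = run_pipeline("voice-clone wire")
    print(ok.steps)
    blocked = run_pipeline("voice-clone wire", gate=lambda s: s != "Compliance")
    print(blocked.halted_at)
```

The point of the gate is that every stage can stop a rule before deployment, which is what makes a human review step and a compliance step meaningful in an otherwise automated flow.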
Reference: superml.dev/commbank-agentic-fraud-rule-generation-2026 · Architecture showcase only: no LLM, no DB. All numbers are illustrative of the pattern.
8-Agent Rule Synthesis Pipeline
Click Run Pipeline to trace how a fraud pattern becomes a deployed rule
Production Metrics: CommBank-style autonomous rule generation
| Metric | Value | Context |
|---|---|---|
| Daily signals analysed | 80M | cross-channel fusion |
| Fraud-loss reduction | −20% | H1 FY26 vs prior period |
| Rules agent-authored | 75% | of card fraud rules |
| Time-to-deploy a new rule | ~2 days | down from 6 weeks |
| Analyst hours saved | 12,400/mo | across 4 fraud teams |
| Auto-rollbacks triggered | 3.1% | rules that failed canary |
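The auto-rollback metric implies a staged promotion path. A minimal sketch, assuming the rule statuses that appear in this lab (shadow, canary, live) and a hypothetical `rolled_back` state for rules that fail a check; the promotion criteria themselves are not described on the page.

```python
# Promotion order assumed from the statuses shown in the lab.
NEXT_STAGE = {"shadow": "canary", "canary": "live"}


def advance(status: str, checks_passed: bool) -> str:
    """Promote a rule one stage, or roll it back if its checks failed.

    Rules that are already live (or in any other state) are left alone here;
    in this sketch their retirement is assumed to be a separate decay-monitoring
    concern rather than a deployment decision.
    """
    if status not in NEXT_STAGE:
        return status
    if not checks_passed:
        return "rolled_back"  # hypothetical state name
    return NEXT_STAGE[status]


if __name__ == "__main__":
    print(advance("shadow", True))
    print(advance("canary", False))
```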
Interactive Pattern → Rule Trace
Choose an emerging fraud pattern
AI-Enabled Social Engineering
growing · critical
Voice-cloned vishing → wire transfer
Scammer clones a relative's voice, calls victim from spoofed number, walks them through a "verification" wire transfer.
$142k/day loss · 14 victims/day · 14,320 signals
Threat narrative
Over the last 11 days, fraud analysts started receiving an unusual cluster of complaints: wires of $4–18k authorised by long-tenure customers, who afterwards reported that the caller sounded exactly like their adult child. Three different victims described being asked to read out an OTP "to cancel a fraudulent charge." This pattern does not match any existing rule: the wires are individually within velocity limits, the device is the customer's own, and step-up auth passes because the customer is on the phone.
Signal Mining Agent
Streams 80M signals/day. Flags suspicious clusters.
- Real-time stream: transactions, logins, device events, payee adds (80M signals/day)
- Recently-confirmed fraud labels from analyst queue
- Customer complaints + chargeback feed
- External intel: scam phone DBs, stolen card markets, breach notifications
Cluster
Voice-clone wire after recent inbound spoofed call
CLU-2026-0411
Similar to known archetype?
Closest archetype: traditional vishing (cosine 0.61), but 0.61 is far below the 0.85 reuse threshold
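The reuse check can be sketched as a cosine-similarity comparison against known archetype vectors. The 0.85 threshold comes from the text above; the function names and the feature vectors in the usage example are illustrative assumptions, since the lab does not show how clusters are embedded.

```python
import math

REUSE_THRESHOLD = 0.85  # threshold quoted in the lab text


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_archetype(cluster_vec, archetypes):
    """Return (name, score, reuse) for the most similar known archetype."""
    name, score = max(
        ((n, cosine(cluster_vec, v)) for n, v in archetypes.items()),
        key=lambda pair: pair[1],
    )
    return name, score, score >= REUSE_THRESHOLD


if __name__ == "__main__":
    # Made-up 2-D vectors chosen so the best match scores about 0.61.
    known = {
        "traditional vishing": [0.61, math.sqrt(1 - 0.61 ** 2)],
        "card testing": [0.0, 1.0],
    }
    print(closest_archetype([1.0, 0.0], known))
```

When `reuse` is false, as it is at 0.61, the cluster is treated as a new pattern that needs its own rule rather than a variant of an existing one.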
Channels
Mobile banking app, Phone banking
Geo spread
Sydney metro, Melbourne metro
Top deviating features
Example signal vectors (anonymised)
- TXN-08831: $9,400 wire to new payee 7 min after inbound call from spoofed family number
- TXN-08902: $14,200 wire, customer audibly on call during transaction (mic open)
- TXN-09011: $5,750, payee added 2 min before wire, OTP read aloud (inferred)
- TXN-09155: $11,800, caller number matched scam intel feed within 24h
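The signals above suggest the shape a drafted rule might take. The sketch below is written as a plain Python predicate rather than the lab's rule DSL, which is not reproduced on this page. All field names and threshold values (`max_call_gap_min`, `max_payee_age_min`, `min_amount`) are hypothetical and chosen only to be consistent with the examples.

```python
from dataclasses import dataclass


@dataclass
class WireEvent:
    """Hypothetical feature record for one outbound wire."""
    txn_id: str
    amount: float
    mins_since_inbound_call: float  # gap between the inbound call and the wire
    mins_since_payee_added: float   # how recently the payee was created
    caller_flagged: bool            # caller number spoofed or on a scam intel feed


def voice_clone_wire_rule(event: WireEvent,
                          max_call_gap_min: float = 15.0,
                          max_payee_age_min: float = 60.0,
                          min_amount: float = 4_000.0) -> bool:
    """Fire when a sizeable wire to a fresh payee closely follows a flagged call."""
    return (
        event.caller_flagged
        and event.amount >= min_amount
        and event.mins_since_inbound_call <= max_call_gap_min
        and event.mins_since_payee_added <= max_payee_age_min
    )


if __name__ == "__main__":
    # Amount and call gap follow TXN-08831; the payee age is an assumed value.
    print(voice_clone_wire_rule(WireEvent("TXN-08831", 9_400, 7, 5, True)))
```

No single condition is suspicious on its own, which is why the existing velocity and device rules miss the pattern; the rule fires only on the combination.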
Impact Analysis
Manual Detection vs. Agentic Rule Synthesis
Manual analyst-driven approach
SLOW
Time to detect: 32 days
Time to deploy rule: +14 days
Analyst hours: 96 hrs
$ leaked while waiting: $6.50M
Without the agent, this pattern would surface only after 30+ chargebacks were filed and a senior analyst noticed the cluster manually. Drafting and reviewing the rule by hand would take another 2 weeks. An estimated $6.5M would leak in the meantime, a real cost of doing fraud detection by hand.
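One way to reconcile these figures is to assume the pattern's $142k/day loss stays constant across the full manual cycle. The page does not state how the $6.5M estimate was derived, so the calculation below is an assumption that happens to land close to it.

```python
# Figures taken from the lab; the constant-loss-rate model is an assumption.
DAILY_LOSS = 142_000   # $/day for this pattern
DETECT_DAYS = 32       # manual time to detect
DEPLOY_DAYS = 14       # additional manual time to deploy a rule

total_days = DETECT_DAYS + DEPLOY_DAYS
leaked = DAILY_LOSS * total_days

if __name__ == "__main__":
    print(f"{total_days} days exposed, about ${leaked / 1e6:.2f}M leaked")
```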
Agentic rule synthesis
46 days → 4 hrs
Value delivered
The 8-agent pipeline mines, drafts, backtests, reviews and deploys a rule in hours, not weeks. The human analyst spends their time on the 10% that actually requires judgement: novel hypotheses, policy choices, hardship-routing decisions. Routine rule maintenance is automated end-to-end.
−20% fraud loss · 75% rules agent-authored · ~2d pattern → live
Active Rule Repository
9 total active rules · 7/9 agent-authored · $6.94M monthly $ saved · 1 sunset queued
| Rule | ID | Category | Author | Status | Precision | Recall | Triggers/d | $ saved/mo | Age |
|---|---|---|---|---|---|---|---|---|---|
| PayID mule chain ≥4 hops in 30 min | R-2026-0329 | Mule | agent | live | 0.81 | 0.74 | 119 | $2240k | 33d |
| Voice-clone wire after spoofed inbound call | R-2026-0341 | APP scam | agent | live | 0.92 | 0.71 | 84 | $1840k | 18d |
| Synthetic ID dormancy → drain 24h | R-2026-0337 | Synthetic ID | agent | live | 0.88 | 0.66 | 31 | $920k | 22d |
| IP-BIN country mismatch + e-commerce | R-2025-1184 | Card-not-present | analyst | live | 0.71 | 0.49 | 142 | $690k | 318d |
| BNPL stacking: 6+ accounts in 72h | R-2026-0312 | BNPL fraud | agent+analyst | live | 0.84 | 0.61 | 22 | $410k | 49d |
| OTP relay: fast keystroke pattern | R-2026-0359 | ATO | agent | canary | 0.86 | 0.55 | 41 | $380k | 6d |
| Refund-loop ring on premium electronics | R-2026-0301 | Chargeback ring | agent | live | 0.79 | 0.58 | 17 | $350k | 56d |
| High-MCC after card add via mobile | R-2025-0987 | Card-not-present | agent | sunsetting | 0.62 | 0.41 | 73 | $110k | 184d |
| Crypto on-ramp after dormant 90d | R-2026-0361 | ATO | agent | shadow | 0.83 | 0.62 | 28 | n/a | 3d |
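The repository summary can be recomputed from the rule rows. The sketch below copies the values from the table and adds an F1 score per rule as one possible single-number health signal; F1 is not something the lab itself reports, so treat it as an illustrative choice.

```python
# (rule_id, author, status, precision, recall, saved_k_per_month) from the table.
# The shadow rule has no savings figure, represented here as None.
RULES = [
    ("R-2026-0329", "agent", "live", 0.81, 0.74, 2240),
    ("R-2026-0341", "agent", "live", 0.92, 0.71, 1840),
    ("R-2026-0337", "agent", "live", 0.88, 0.66, 920),
    ("R-2025-1184", "analyst", "live", 0.71, 0.49, 690),
    ("R-2026-0312", "agent+analyst", "live", 0.84, 0.61, 410),
    ("R-2026-0359", "agent", "canary", 0.86, 0.55, 380),
    ("R-2026-0301", "agent", "live", 0.79, 0.58, 350),
    ("R-2025-0987", "agent", "sunsetting", 0.62, 0.41, 110),
    ("R-2026-0361", "agent", "shadow", 0.83, 0.62, None),
]


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total_saved_k = sum(saved for *_, saved in RULES if saved is not None)
agent_authored = sum(1 for _, author, *_ in RULES if author == "agent")
sunset_queued = sum(1 for _, _, status, *_ in RULES if status == "sunsetting")

if __name__ == "__main__":
    print(f"${total_saved_k / 1000:.2f}M saved/mo, "
          f"{agent_authored}/{len(RULES)} agent-authored, "
          f"{sunset_queued} sunset queued")
    for rule_id, _, status, p, r, _ in RULES:
        print(rule_id, status, round(f1(p, r), 2))
```

The rule queued for sunset has the lowest precision and recall of the agent-authored rules, which is consistent with a decay monitor retiring it, although the lab does not state the retirement criteria.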
The SuperML Take
The biggest unlock isn't that the agent can write rules; it's that the agent maintains them. Pattern mining → backtest → deploy → decay-monitor is a closed loop, and the agent runs that loop every day. Human analysts move out of the rule-authoring sweatshop and become policy reviewers and edge-case curators.