
Agentic Fraud Lab

When the fraud agent writes its own rules, inspired by CommBank's autonomous rule generation system.

Static rule-based fraud systems are slow by construction. By the time a senior analyst notices a new pattern, drafts a rule, runs it past compliance and pushes it to production, fraudsters have moved on. CommBank's agentic fraud system flipped this: 75% of card fraud rules are now agent-authored, cutting fraud losses by 20% in H1 FY26 vs. the prior period.

This lab walks through the 8-agent rule synthesis pipeline (Signal Mining → Hypothesis → Rule Drafting → Backtest → Compliance → Analyst Review → Staged Deploy → Decay Monitor) on 5 emerging fraud patterns. You see the actual rule DSL the agent drafts, the backtest it runs, the compliance findings, the human reviewer's decision, and the live deployment trace.
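The stage ordering above can be sketched as a simple sequential runner. This is a minimal illustration, not the lab's or CommBank's implementation: only the eight stage names come from the text, while `Trace`, `run_pipeline`, `passthrough` and the blocking Compliance handler are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Stage names come from the lab text; everything else here is hypothetical.
STAGES = [
    "Signal Mining", "Hypothesis", "Rule Drafting", "Backtest",
    "Compliance", "Analyst Review", "Staged Deploy", "Decay Monitor",
]

# A stage handler takes the pattern name and returns (ok, note).
Handler = Callable[[str], Tuple[bool, str]]


@dataclass
class Trace:
    pattern: str
    steps: List[Tuple[str, str]] = field(default_factory=list)
    halted_at: Optional[str] = None  # stage that stopped the run, if any


def passthrough(pattern: str) -> Tuple[bool, str]:
    """Placeholder handler used for stages without real logic."""
    return True, "placeholder: passed"


def run_pipeline(pattern: str, handlers: Dict[str, Handler]) -> Trace:
    """Run the stages in order and stop at the first one that rejects."""
    trace = Trace(pattern)
    for stage in STAGES:
        ok, note = handlers.get(stage, passthrough)(pattern)
        trace.steps.append((stage, note))
        if not ok:
            trace.halted_at = stage
            break
    return trace


# Usage: a made-up Compliance handler that blocks the rule.
trace = run_pipeline(
    "voice-clone wire after spoofed inbound call",
    {"Compliance": lambda p: (False, "blocked: made-up finding")},
)
for stage, note in trace.steps:
    print(f"{stage}: {note}")
print("halted at:", trace.halted_at)
```

Recording every step in a trace, including the one that halts the run, mirrors the lab's idea of showing each agent's output before the rule reaches deployment.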

Agentic AI · Autonomous Rule Generation · Pattern Mining · Human-in-the-Loop · Champion/Challenger · Rule Decay

Reference: superml.dev/commbank-agentic-fraud-rule-generation-2026 · Architecture showcase only: no LLM, no DB. All numbers are illustrative of the pattern.

8-Agent Rule Synthesis Pipeline

Signal Mining → Hypothesis → Rule Drafting → Backtest → Compliance → Analyst Review → Staged Deploy → Decay Monitor

Production Metrics: CommBank-style autonomous rule generation

  • 80M daily signals analysed (cross-channel fusion)
  • −20% fraud-loss reduction (H1 FY26 vs prior period)
  • 75% of card fraud rules agent-authored
  • ~2 days time-to-deploy for a new rule (down from 6 weeks)
  • 12,400 analyst hours saved per month (across 4 fraud teams)
  • 3.1% auto-rollbacks triggered (rules that failed canary)

Interactive Pattern โ†’ Rule Trace

Choose an emerging fraud pattern

AI-Enabled Social Engineering (growing · critical)

Voice-cloned vishing → wire transfer

Scammer clones a relative's voice, calls the victim from a spoofed number, and walks them through a "verification" wire transfer.

$142k/day loss · 14 victims/day · 14,320 signals

Threat narrative

Over the last 11 days, fraud analysts started receiving an unusual cluster of complaints: wires of $4–18k authorised by long-tenure customers, who afterwards reported that the caller sounded exactly like their adult child. Three different victims described being asked to read out an OTP "to cancel a fraudulent charge." This pattern does not match any existing rule: the wires are individually within velocity limits, the device is the customer's own, and step-up auth is passing because the customer is on the phone.

Signal Mining Agent (Agent 1 / 8)

Streams 80M signals/day. Flags suspicious clusters.

  • Real-time stream: transactions, logins, device events, payee adds (80M signals/day)
  • Recently confirmed fraud labels from analyst queue
  • Customer complaints + chargeback feed
  • External intel: scam phone DBs, stolen card markets, breach notifications

LLM role: GPT-4-class model summarises each cluster into plain-English pattern names like "voice-clone wire after spoofed inbound call"

Tools: Snowflake (cross-channel data) · sentence-transformers · HDBSCAN · Redis stream

Cluster: Voice-clone wire after recent inbound spoofed call (CLU-2026-0411)

Novelty: 89 · 14,320 signals · emerged 11d ago

Similar to known archetype?

Closest archetype: traditional vishing (cosine similarity 0.61), well below the 0.85 reuse threshold
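The archetype check can be sketched as a nearest-neighbour lookup under cosine similarity. This is a minimal sketch under stated assumptions: the 0.85 reuse threshold comes from the text, but the 3-dimensional vectors, the second archetype name and the helper names (`cosine`, `closest_archetype`) are invented for illustration. A real system would presumably compare learned embeddings.

```python
import math

REUSE_THRESHOLD = 0.85  # from the lab text


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_archetype(cluster_vec, archetypes):
    """Return (name, similarity, reuse?) for the most similar archetype."""
    name = max(archetypes, key=lambda n: cosine(cluster_vec, archetypes[n]))
    sim = cosine(cluster_vec, archetypes[name])
    return name, sim, sim >= REUSE_THRESHOLD


# Toy vectors, made up so the arithmetic is easy to follow.
archetypes = {
    "traditional vishing": [1.0, 0.0, 0.0],
    "card testing": [0.0, 1.0, 0.0],
}
cluster = [0.6, 0.0, 0.8]

name, sim, reuse = closest_archetype(cluster, archetypes)
print(name, round(sim, 2), "reuse existing rule" if reuse else "treat as novel")
```

With these toy vectors the closest archetype scores about 0.6, below the threshold, so the cluster is routed as a new pattern rather than mapped onto an existing rule family.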

Channels: Mobile banking app, Phone banking

Geo spread: Sydney metro, Melbourne metro

Top deviating features

Feature | In cluster | Baseline | Lift
time_since_inbound_call_min | 4–18 min | 0 calls before wire | +∞ (new feature)
caller_number_in_scam_intel_db | 74% of cases | 0.4% of all wires | +185×
wire_to_new_payee_added_today | 91% | 12% | +7.6×
customer_on_call_during_wire | 88% | 6% | +14.7×
session_otp_read_aloud_inferred | 63% (mic-pattern signal) | <0.1% | novel signal
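The multipliers in the feature list are consistent with a plain ratio of the in-cluster rate to the baseline rate. The sketch below reproduces three of them from the percentages shown. The `lift` helper is a hypothetical name, and treating a zero baseline as infinite lift is an assumption that matches the "+∞ (new feature)" entry.

```python
def lift(cluster_rate, baseline_rate):
    """Ratio of the in-cluster rate to the baseline rate."""
    if baseline_rate == 0:
        return float("inf")  # feature never seen in the baseline
    return cluster_rate / baseline_rate


# (in-cluster rate, baseline rate) as fractions, taken from the figures above.
features = {
    "caller_number_in_scam_intel_db": (0.74, 0.004),
    "wire_to_new_payee_added_today": (0.91, 0.12),
    "customer_on_call_during_wire": (0.88, 0.06),
}

lifts = {name: round(lift(c, b), 1) for name, (c, b) in features.items()}

# Rank features by how strongly they deviate from the baseline.
for name, value in sorted(lifts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: +{value}x")
```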

Example signal vectors (anonymised)

  • TXN-08831: $9,400 wire to new payee 7 min after inbound call from spoofed family number
  • TXN-08902: $14,200 wire, customer audibly on call during transaction (mic open)
  • TXN-09011: $5,750, payee added 2 min before wire, OTP read aloud (inferred)
  • TXN-09155: $11,800, caller number matched scam intel feed within 24h

Impact Analysis

Manual Detection vs. Agentic Rule Synthesis

Manual analyst-driven approach (SLOW)

  • Time to detect: 32 days
  • Time to deploy rule: +14 days
  • Analyst hours: 96 hrs
  • $ leaked while waiting: $6.50M

Without the agent, this pattern would surface only after 30+ chargebacks were filed and a senior analyst noticed the cluster manually. Drafting and reviewing the rule by hand would take another 2 weeks. An estimated $6.5M would leak in the meantime: the real cost of doing fraud detection by hand.

Agentic rule synthesis: 46 days → 4 hrs

Value delivered

The 8-agent pipeline mines, drafts, backtests, reviews and deploys a rule in hours, not weeks. The human analyst spends their time on the 10% that actually requires judgement: novel hypotheses, policy choices, hardship-routing decisions. Routine rule maintenance is automated end-to-end.

  • −20% fraud loss
  • 75% rules agent-authored
  • ~2d pattern → live
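The headline figures in this comparison can be reproduced with simple arithmetic. The sketch below assumes the $142k/day loss rate for this pattern stays constant over the whole manual window, which is a simplification. The variable names are invented for the example.

```python
# Figures from the cards above; the constant daily loss rate is an assumption.
DAILY_LOSS = 142_000   # $/day for this pattern
DETECT_DAYS = 32       # manual time to detect
DEPLOY_DAYS = 14       # manual time to deploy the rule
AGENT_HOURS = 4        # agentic pipeline duration

manual_days = DETECT_DAYS + DEPLOY_DAYS
manual_leak = DAILY_LOSS * manual_days
speedup = manual_days * 24 / AGENT_HOURS

print(manual_days)                    # 46
print(f"${manual_leak / 1e6:.2f}M")   # $6.53M
print(f"{speedup:.0f}x")              # 276x
```

Under that assumption the manual path leaks about $6.5M, in line with the figure quoted for the analyst-driven approach.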

Active Rule Repository

  • 9 total active rules
  • 7/9 agent-authored
  • $6.94M monthly $ saved
  • 1 sunset queued

Rule (ID) | Category | Author | Status | Precision | Recall | Triggers/d | $ saved/mo | Age | Trend
PayID mule chain ≥4 hops in 30 min (R-2026-0329) | Mule | agent | live | 0.81 | 0.74 | 119 | $2240k | 33d | →
Voice-clone wire after spoofed inbound call (R-2026-0341) | APP scam | agent | live | 0.92 | 0.71 | 84 | $1840k | 18d | →
Synthetic ID dormancy → drain 24h (R-2026-0337) | Synthetic ID | agent | live | 0.88 | 0.66 | 31 | $920k | 22d | ↑
IP-BIN country mismatch + e-commerce (R-2025-1184) | Card-not-present | analyst | live | 0.71 | 0.49 | 142 | $690k | 318d | ↓
BNPL stacking: 6+ accounts in 72h (R-2026-0312) | BNPL fraud | agent+analyst | live | 0.84 | 0.61 | 22 | $410k | 49d | →
OTP relay: fast keystroke pattern (R-2026-0359) | ATO | agent | canary | 0.86 | 0.55 | 41 | $380k | 6d | ↑
Refund-loop ring on premium electronics (R-2026-0301) | Chargeback ring | agent | live | 0.79 | 0.58 | 17 | $350k | 56d | ↓
High-MCC after card add via mobile (R-2025-0987) | Card-not-present | agent | sunsetting | 0.62 | 0.41 | 73 | $110k | 184d | ↓
Crypto on-ramp after dormant 90d (R-2026-0361) | ATO | agent | shadow | 0.83 | 0.62 | 28 | - | 3d | ↑
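The repository summary figures can be recomputed from the rule rows above, and the same data can drive a simple decay check. In the sketch below the rule values are transcribed from the repository, with trend arrows encoded as "up", "flat" or "down". The `Rule` type, the `decay_candidates` helper and the 0.65 precision floor are hypothetical. The page does not state what policy the Decay Monitor actually applies.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    rule_id: str
    author: str
    status: str
    precision: float
    recall: float
    saved_k: Optional[int]  # $ saved per month in thousands; None if not reported
    trend: str              # "up", "flat" or "down"


RULES = [
    Rule("R-2026-0329", "agent", "live", 0.81, 0.74, 2240, "flat"),
    Rule("R-2026-0341", "agent", "live", 0.92, 0.71, 1840, "flat"),
    Rule("R-2026-0337", "agent", "live", 0.88, 0.66, 920, "up"),
    Rule("R-2025-1184", "analyst", "live", 0.71, 0.49, 690, "down"),
    Rule("R-2026-0312", "agent+analyst", "live", 0.84, 0.61, 410, "flat"),
    Rule("R-2026-0359", "agent", "canary", 0.86, 0.55, 380, "up"),
    Rule("R-2026-0301", "agent", "live", 0.79, 0.58, 350, "down"),
    Rule("R-2025-0987", "agent", "sunsetting", 0.62, 0.41, 110, "down"),
    Rule("R-2026-0361", "agent", "shadow", 0.83, 0.62, None, "up"),
]

# Summary figures: rules authored by the agent alone, and total monthly savings.
agent_authored = sum(r.author == "agent" for r in RULES)
total_saved_k = sum(r.saved_k or 0 for r in RULES)

# Hypothetical decay policy: declining trend plus precision under a made-up floor.
PRECISION_FLOOR = 0.65


def decay_candidates(rules):
    """Return IDs of rules that the made-up policy would queue for sunset."""
    return [r.rule_id for r in rules
            if r.trend == "down" and r.precision < PRECISION_FLOOR]


print(f"{agent_authored}/{len(RULES)} agent-authored")
print(f"${total_saved_k / 1000:.2f}M saved per month")
print("sunset candidates:", decay_candidates(RULES))
```

With this illustrative floor, the only candidate is R-2025-0987, the rule the repository already lists as sunsetting. A different threshold would flag a different set.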

The SuperML Take

The biggest unlock isn't that the agent can write rules; it's that the agent maintains them. Pattern mining → backtest → deploy → decay-monitor is a closed loop, and the agent runs that loop every day. Human analysts move from a rule-authoring sweatshop to policy review and edge-case curation.