PLATFORM

Plant the mistake. Then prove your product caught it.

The scenario engine is the difference between 'we tested it' and 'we measured it'. You define the failure; Fictix injects it and keeps the ground truth; your product gets a score.

Why ground truth is the whole game

Run a detector on production books and you learn what it flagged — never what it missed, because nobody labelled reality. An evaluation without ground truth is a vibe. The scenario engine inverts that: you decide the truth by planting it, so every run yields real precision and recall, not anecdotes.
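To make the arithmetic concrete, here is a minimal sketch of how precision and recall fall out once the planted set is known. The transaction IDs and the `Score` and `score_findings` names are hypothetical illustrations, not Fictix's actual manifest format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    true_positives: int   # planted and flagged
    false_positives: int  # flagged but never planted
    false_negatives: int  # planted but missed

    @property
    def precision(self) -> float:
        flagged = self.true_positives + self.false_positives
        # Convention chosen for this sketch: no flags at all scores 0.0.
        return self.true_positives / flagged if flagged else 0.0

    @property
    def recall(self) -> float:
        planted = self.true_positives + self.false_negatives
        return self.true_positives / planted if planted else 0.0


def score_findings(planted: set[str], flagged: set[str]) -> Score:
    """Compare what the detector flagged against what was planted."""
    return Score(
        true_positives=len(planted & flagged),
        false_positives=len(flagged - planted),
        false_negatives=len(planted - flagged),
    )


# Hypothetical transaction IDs, for illustration only.
planted = {"txn-101", "txn-102", "txn-103", "txn-104"}
flagged = {"txn-101", "txn-102", "txn-103", "txn-900"}

score = score_findings(planted, flagged)
print(f"precision={score.precision:.2f} recall={score.recall:.2f}")
```

Without the `planted` set, `false_negatives` cannot be computed at all, which is why a production-only run can never yield recall.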

The eight needle families

A needle is a known mistake hidden in otherwise coherent books. Fictix ships eight families, each with a detectable signature and a recorded location:
Needle	What it looks like in the data
Same bill twice	One vendor invoice paid on two different dates / two journal entries
Name mix-ups	Vendor or customer recorded under near-duplicate names that should consolidate
Ghost worker	Payroll run for an employee with no offer, no start event, no activity
Wrong category	Expense booked to a GL account inconsistent with its vendor and history
Missing info	Required fields blank on otherwise valid documents (memo, class, date)
Wrong date	Transaction posted to the wrong period: a revenue/expense recognition trap
Books don't match	Sub-ledger total diverges from the GL / bank feed by a planted delta
Looks like fraud	Structured pattern: round-dollar runs, threshold-hugging, off-hours edits
An edit-history layer can also plant after-the-fact changes, so audit-trail products have something real to find. Needles compose: one company can carry many, across systems, at controlled intensity.
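As an illustration of what a "detectable signature" means for one family, the sketch below shows the kind of check a product under test might run for "Same bill twice". It is not Fictix code. The row layout, vendor names, dates, and amounts are assumptions made up for the example; only the invoice number echoes the CLI example on this page.

```python
from collections import defaultdict
from datetime import date

# Hypothetical payment rows; field names are assumptions for this sketch.
payments = [
    {"vendor": "Acme Supply", "invoice": "INV-2287", "paid_on": date(2024, 3, 4), "amount": 1250.00},
    {"vendor": "Acme Supply", "invoice": "INV-2287", "paid_on": date(2024, 3, 13), "amount": 1250.00},
    {"vendor": "Northwind", "invoice": "INV-0412", "paid_on": date(2024, 3, 6), "amount": 480.00},
]


def find_duplicate_payments(rows):
    """Return {(vendor, invoice): sorted payment dates} for invoices paid on more than one date."""
    by_invoice = defaultdict(list)
    for row in rows:
        by_invoice[(row["vendor"], row["invoice"])].append(row["paid_on"])
    return {
        key: sorted(dates)
        for key, dates in by_invoice.items()
        if len(set(dates)) > 1
    }


duplicates = find_duplicate_payments(payments)
for (vendor, invoice), dates in duplicates.items():
    gap_days = (dates[-1] - dates[0]).days
    print(f"{vendor} {invoice} paid {len(dates)} times, {gap_days} days apart")
```

A real detector would also need to handle near-duplicate invoice numbers and partial payments. The point here is only that each family leaves a pattern a program can look for.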

Author scenarios in plain language

You don't hand-write JSON. Describe the failure; Fictix injects it and writes a manifest — what was planted, where, and the expected finding. From the CLI:
fictix scenario add "pay invoice INV-2287 twice, 9 days apart"
fictix scenario add "ghost employee on the March payroll run"
fictix scenario list      # shows planted needles + ground-truth refs
fictix advance 30d        # move time so the trap matures
fictix assert --recall 0.9 --precision 0.8   # grade your detector

The `assert` step fails CI if your product's findings miss the recall/precision thresholds, so detection becomes a build gate, not a quarterly review.
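The gating logic described above can be sketched in a few lines. This is an illustration of the idea, not how `fictix assert` is implemented. The `gate` and `exit_code` helpers and the sample scores are assumptions; the thresholds mirror the CLI example.

```python
def gate(precision: float, recall: float,
         min_precision: float, min_recall: float) -> list[str]:
    """Return human-readable failures; an empty list means the gate passes."""
    failures = []
    if recall < min_recall:
        failures.append(f"recall {recall:.2f} < required {min_recall:.2f}")
    if precision < min_precision:
        failures.append(f"precision {precision:.2f} < required {min_precision:.2f}")
    return failures


def exit_code(failures: list[str]) -> int:
    """Map gate failures to a process exit status: non-zero fails the build."""
    return 1 if failures else 0


# Hypothetical scores from one run, checked against the thresholds above.
failures = gate(precision=0.85, recall=0.75, min_precision=0.8, min_recall=0.9)
for failure in failures:
    print("FAIL:", failure)
print("exit code:", exit_code(failures))
```

In a CI job, a wrapper script would pass the result of `exit_code` to `sys.exit`, so a missed threshold stops the pipeline the same way a failing unit test does.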

Test matrices and regression

Group scenarios into a test matrix that must pass before release. Because the company is deterministic, last week's run and this week's run differ only by your code, so a score delta is a real regression: trackable over time, attributable to a commit, and filable as a ticket that points at the exact transaction you planted.
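Because the inputs are held fixed, regression detection reduces to comparing per-scenario scores between two runs. A minimal sketch follows, assuming each run is summarized as a mapping from scenario name to recall. The scenario names, scores, and the `find_regressions` helper are hypothetical.

```python
def find_regressions(baseline: dict[str, float],
                     current: dict[str, float],
                     tolerance: float = 0.0) -> dict[str, float]:
    """Return scenarios whose score dropped by more than `tolerance`, mapped to the delta."""
    regressions = {}
    for scenario, old_score in baseline.items():
        # A scenario missing from the current run is treated as a score of 0.0.
        new_score = current.get(scenario, 0.0)
        delta = new_score - old_score
        if delta < -tolerance:
            regressions[scenario] = round(delta, 4)
    return regressions


# Hypothetical per-scenario recall from two runs against the same company.
last_week = {"duplicate-payment": 1.0, "ghost-worker": 0.8, "wrong-period": 0.6}
this_week = {"duplicate-payment": 1.0, "ghost-worker": 0.5, "wrong-period": 0.7}

regressions = find_regressions(last_week, this_week)
for scenario, delta in regressions.items():
    print(f"{scenario}: recall changed by {delta:+.2f}")
```

Improvements (such as `wrong-period` here) are ignored on purpose, since the check exists to block releases on drops, not to report gains.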

Turn up the pressure

Chaos and fraud are dials, not booleans. Raise intensity to stress a model toward its failure point; lower it to find the floor where it stops catching anything. Pair with the living simulation so needles appear mid-stream, not just at t=0.

Start with a snapshot. Make it live when you're ready.

Generate your first company →

Questions

How does Fictix know if my product is right?

It plants every anomaly itself and records a manifest of what and where, so it computes precision and recall on your product's findings per scenario — misses are explicit, not inferred.

Do I write code or config to design a scenario?

No. You describe the anomaly in plain language via the dashboard or `fictix scenario add`; Fictix injects it into the books and tracks ground truth.

Can scenario results gate a release?

Yes. `fictix assert` enforces recall/precision thresholds and fails CI when they are missed, and a test matrix can be required to pass before shipping.

Can multiple anomalies coexist in one company?

Yes — needles compose across systems and time at controlled intensity, so you can simulate realistic, messy books rather than one isolated bug.
