Triage Metrics

Accuracy Overview

Reviewed Tickets: 1
Category Accuracy: 100%
Severity Accuracy: 100%
Route Accuracy: 100%
Weighted Accuracy: 100%

Weighted Accuracy = Category 40% + Severity 40% + Route 20%. Based on 1 reviewed ticket.
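The weighting can be sketched in a few lines of Python. This is a minimal illustration of the stated 40/40/20 formula, not the dashboard's actual implementation; the names `WEIGHTS` and `weighted_accuracy` are assumptions made for the example.

```python
# Weights as stated on the dashboard: Category 40%, Severity 40%, Route 20%.
WEIGHTS = {"category": 0.4, "severity": 0.4, "route": 0.2}


def weighted_accuracy(accuracies: dict) -> float:
    """Combine per-field accuracies (each 0.0 to 1.0) into one weighted score."""
    return sum(WEIGHTS[field] * accuracies[field] for field in WEIGHTS)


# The single reviewed ticket matched the reviewer on all three fields.
score = weighted_accuracy({"category": 1.0, "severity": 1.0, "route": 1.0})
print(f"{score:.0%}")  # 100%
```

Because category and severity carry 40% each, a miss on either costs twice as much as a miss on route.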

Accuracy by Category

Weighted accuracy per category (Category 40% + Severity 40% + Route 20%). Override rate shows how often the reviewer disagreed with the AI.

Category | Tickets | Category Acc. | Severity Acc. | Route Acc. | Weighted Acc. | Override Rate
Login/Auth | 1 | 100% | 100% | 100% | 100% | 0%

Governance — Forced Review

Total Triage Runs: 5
Forced Review: 0
Forced Review %: 0%

Gating Reason | Count
High Severity (P1/P2) | 0
Sensitive Category | 0
Low Confidence | 0
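
One possible reading of the three gating reasons is sketched below in Python. The actual gating rules are not shown on this page: the set of sensitive categories, the confidence threshold, and the names `TriageResult` and `gating_reasons` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical values: the dashboard does not state which categories are
# sensitive or where the low-confidence cutoff sits.
SENSITIVE_CATEGORIES = {"Billing & Subscription"}
LOW_CONFIDENCE_THRESHOLD = 0.6


@dataclass
class TriageResult:
    category: str
    severity: str      # "P1" through "P4"
    confidence: float  # 0.0 to 1.0


def gating_reasons(result: TriageResult) -> list:
    """Return every gating reason that applies; an empty list means no forced review."""
    reasons = []
    if result.severity in {"P1", "P2"}:
        reasons.append("High Severity (P1/P2)")
    if result.category in SENSITIVE_CATEGORIES:
        reasons.append("Sensitive Category")
    if result.confidence < LOW_CONFIDENCE_THRESHOLD:
        reasons.append("Low Confidence")
    return reasons


print(gating_reasons(TriageResult("Integration", "P3", 0.85)))  # []
```

Returning a list rather than a single flag lets one run count toward several gating reasons at once.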

Confidence Calibration

When the AI reports a confidence level, how accurate is it actually? A well-calibrated model should show accuracy close to its stated confidence. Based on 1 reviewed ticket.

Confidence 0.80-0.90 (1 ticket): 100% actual vs. 85% expected (+15pp)

pp = percentage points. Green = well-calibrated (within 5pp). Amber = slightly over-confident. Red = significantly over-confident.
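
The gap for a confidence bucket can be sketched as follows. This assumes the expected value is the bucket midpoint, which matches the 85% shown for the 0.80-0.90 bucket but is not confirmed by the page; the function name `calibration_gap_pp` is hypothetical. Only the 5pp "well-calibrated" band is stated above, so the sketch does not attempt the amber/red split.

```python
def calibration_gap_pp(bucket_low: float, bucket_high: float,
                       correct: int, total: int) -> float:
    """Return actual accuracy minus expected confidence, in percentage points."""
    expected = (bucket_low + bucket_high) / 2  # assumption: bucket midpoint
    actual = correct / total
    return round((actual - expected) * 100, 1)


# The 0.80-0.90 bucket holds one reviewed ticket, and it was correct.
gap = calibration_gap_pp(0.80, 0.90, correct=1, total=1)
well_calibrated = abs(gap) <= 5  # the "within 5pp" band from the legend

print(f"{gap:+.0f}pp, well-calibrated: {well_calibrated}")  # +15pp, well-calibrated: False
```

A positive gap means the model was more accurate than its stated confidence; a negative gap means it was over-confident.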

Prompt Version Tracking

Accuracy breakdown by prompt version and model. Use this to compare performance across prompt iterations.

Prompt Version | Model | Tickets Triaged | Reviewed | Category Acc. | Severity Acc. | Route Acc. | Weighted Acc.
triage-v1 | deepseek-chat | 5 | 1 | 100% | 100% | 100% | 100%

Ticket Distribution

Breakdown of AI-triaged tickets by category, severity, and route. Based on 5 triaged tickets.

Category

Login/Auth: 2 (40%)
Billing & Subscription: 1 (20%)
Integration: 1 (20%)
Performance / Reliability: 1 (20%)

Severity

P2: 3 (60%)
P4: 1 (20%)
P3: 1 (20%)

Route

Support L2: 3 (60%)
Support L1: 1 (20%)
Billing: 1 (20%)
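
A breakdown like the ones above can be produced with a simple tally. The sketch below is an illustration only: the function name `distribution` is an assumption, and the per-ticket list is reconstructed from the severity counts shown, since the page does not list individual tickets.

```python
from collections import Counter


def distribution(values):
    """Return (label, count, percent) tuples, most frequent label first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(label, n, round(100 * n / total)) for label, n in counts.most_common()]


# Illustrative input matching the severity counts for the 5 triaged tickets.
severities = ["P2", "P2", "P2", "P4", "P3"]
for label, n, pct in distribution(severities):
    print(f"{label}: {n} ({pct}%)")
```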

Confusion Matrices

Rows = AI predicted · Columns = Human final decision. Diagonal = correct predictions (highlighted).

Category

Predicted ↓ / Actual → | Login/Auth
Login/Auth | 1

Severity

Predicted ↓ / Actual → | P2
P2 | 1

Route

Predicted ↓ / Actual → | Support L1
Support L1 | 1
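
The row and column convention used in these matrices (rows are the AI prediction, columns are the human final decision) can be sketched as a tally of (predicted, actual) pairs. This is a minimal illustration; the names `confusion_matrix` and `accuracy` are assumptions, not the dashboard's code.

```python
from collections import Counter


def confusion_matrix(pairs):
    """Count (predicted, actual) pairs; the key order mirrors row, then column."""
    return Counter(pairs)


def accuracy(matrix):
    """Share of reviewed tickets on the diagonal, where predicted equals actual."""
    total = sum(matrix.values())
    correct = sum(n for (predicted, actual), n in matrix.items() if predicted == actual)
    return correct / total if total else 0.0


# The one reviewed ticket: the AI predicted P2 and the reviewer kept P2.
matrix = confusion_matrix([("P2", "P2")])
print(matrix[("P2", "P2")], accuracy(matrix))  # 1 1.0
```

Off-diagonal keys such as ("P3", "P2") would show where the reviewer overrode the AI.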