Precision, recall and F1 for testers: what actually matters
A plain-English guide to the ML metrics that show up on AI testing exams — and why accuracy can lie to you on imbalanced data.
If you test AI systems, you need a working intuition for precision, recall and F1 — and when each one matters.
Why accuracy can mislead
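On an imbalanced dataset, a model that always predicts the majority class can score very high accuracy while missing every minority case. If 95% of the examples are negative, that model is 95% accurate and still has a recall of zero on the positive class.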
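To make the effect concrete, here is a minimal sketch in plain Python. The dataset is invented for illustration (950 negatives, 50 positives), and the helper names `accuracy` and `recall` are hypothetical, not taken from any particular library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model found."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    # Assumption: report 0.0 when there are no actual positives at all.
    return tp / (tp + fn) if (tp + fn) else 0.0


# Illustrative data: 950 majority-class (0) and 50 minority-class (1) examples.
y_true = [0] * 950 + [1] * 50

# A "model" that always predicts the majority class.
y_pred = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, y_pred)
baseline_recall = recall(y_true, y_pred)

print(f"accuracy: {baseline_accuracy:.2f}")  # accuracy: 0.95
print(f"recall:   {baseline_recall:.2f}")    # recall:   0.00
```

The accuracy figure looks strong, but the recall shows that none of the 50 minority cases were detected.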
Frequently asked
When should I prefer F1 over accuracy?
When the classes are imbalanced and the rare class matters. F1 is the harmonic mean of precision and recall on the positive class, so it is not inflated by the large number of correctly classified majority-class examples the way accuracy is.
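As a rough sketch of how the three metrics relate, the function below computes them from confusion-matrix counts. The function name `precision_recall_f1`, the zero-division convention, and the example counts are all assumptions made for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 for the positive class.

    tp: true positives, fp: false positives, fn: false negatives.
    Assumption: any metric with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical results on 1000 examples, 50 of them positive.
tp, fp, fn, tn = 30, 10, 20, 940

model_accuracy = (tp + tn) / (tp + fp + fn + tn)
model_precision, model_recall, model_f1 = precision_recall_f1(tp, fp, fn)

print(f"accuracy:  {model_accuracy:.2f}")   # accuracy:  0.97
print(f"precision: {model_precision:.2f}")  # precision: 0.75
print(f"recall:    {model_recall:.2f}")     # recall:    0.60
print(f"F1:        {model_f1:.2f}")         # F1:        0.67
```

Note that true negatives (`tn`) appear only in the accuracy calculation. In this example accuracy is 0.97 while F1 is about 0.67, which reflects the 20 missed positives and 10 false alarms more honestly.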
Mike K
ISTQB-Certified Tester, ExamCaliber Editorial Team
Part of the ExamCaliber editorial team. Every ExamCaliber question and rationale is written and reviewed by hand against the current syllabus — never scraped from exam dumps.