Accuracy measures how reliably an AI system gets things right. In classification tasks, it is typically the percentage of correct predictions out of total predictions. For generative tasks, accuracy is harder to measure mechanically and usually involves human review or reference-based scoring.
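For the classification case, the calculation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `accuracy` helper and the ticket labels are hypothetical, and it assumes each prediction is compared to its reference label by exact match.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that exactly match their reference labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical ticket-routing labels, invented for illustration only.
predicted = ["billing", "support", "billing", "sales"]
actual = ["billing", "support", "sales", "sales"]

print(f"{accuracy(predicted, actual):.0%}")  # 3 of 4 correct -> 75%
```

Exact-match comparison is what makes this mechanical; generative output rarely has a single correct string, which is why it usually needs human review or reference-based scoring instead.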
Accuracy is not the same as confidence. An AI can sound completely sure of itself and still be wrong. In enterprise settings, this distinction has real consequences: a misclassified ticket goes to the wrong team, a wrong extraction corrupts a record, a bad decision in a loan workflow creates financial or compliance risk.
An invoice extraction agent tested on 500 invoices that correctly identifies vendor name, amount, and due date on 470 of them has 94% accuracy on those fields (470 / 500). That number gives teams a concrete baseline to act on, and breaking it down by field shows where to focus improvement efforts.
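The invoice figure can be reproduced with a short Python sketch that scores both the all-fields result and each field separately. Everything here is an assumption made for illustration: the `extraction_accuracy` helper, the field keys, and the synthetic data. The source only states that 470 of 500 invoices were fully correct, so the split of the 30 failures between due date and amount is invented.

```python
FIELDS = ("vendor_name", "amount", "due_date")  # hypothetical field keys


def extraction_accuracy(predictions, references, fields=FIELDS):
    """Return (all-fields accuracy, per-field accuracy) using exact-match comparison."""
    if len(predictions) != len(references) or not references:
        raise ValueError("need two non-empty lists of equal length")
    total = len(references)
    pairs = list(zip(predictions, references))
    # Share of invoices where one given field matches the reference.
    per_field = {
        f: sum(p.get(f) == r.get(f) for p, r in pairs) / total for f in fields
    }
    # Share of invoices where every field matches the reference.
    all_fields = sum(
        all(p.get(f) == r.get(f) for f in fields) for p, r in pairs
    ) / total
    return all_fields, per_field


# Synthetic test set: 500 reference invoices, with 30 predictions made wrong.
references = [
    {"vendor_name": f"Vendor {i}", "amount": 100 + i, "due_date": "2024-01-31"}
    for i in range(500)
]
predictions = [dict(r) for r in references]
for p in predictions[:20]:
    p["due_date"] = "2024-02-28"  # invented: 20 wrong due dates
for p in predictions[20:30]:
    p["amount"] = -1  # invented: 10 wrong amounts

overall, per_field = extraction_accuracy(predictions, references)
print(f"all fields correct: {overall:.0%}")  # 470 of 500 -> 94%
for name, score in per_field.items():
    print(f"{name}: {score:.0%}")
```

In this invented split the due date is the weakest field, which is the kind of signal a single overall percentage hides.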