Tokens are the chunks of text that an AI model reads and writes. A token is roughly a word or part of a word; most models process text as a sequence of tokens rather than as individual characters or complete words.
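To make "a word or part of a word" concrete, here is a minimal sketch of a toy tokenizer. It is not any real model's tokenizer: the vocabulary `TOY_VOCAB` and the function `toy_tokenize` are hypothetical, and the greedy longest-match rule is a simplification. Production tokenizers typically learn their vocabularies from data.

```python
# Hypothetical vocabulary, chosen only for illustration.
TOY_VOCAB = {"token", "iz", "ation", "the", "cat", " "}


def toy_tokenize(text: str, vocab: set[str] = TOY_VOCAB) -> list[str]:
    """Split text by greedy longest match against vocab.

    Characters that match no vocabulary piece become single-character tokens.
    """
    tokens = []
    max_len = max(len(piece) for piece in vocab)
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab or length == 1:
                tokens.append(piece)
                i += length
                break
    return tokens


print(toy_tokenize("tokenization"))  # ['token', 'iz', 'ation']
print(toy_tokenize("the cat"))       # ['the', ' ', 'cat']
```

With this vocabulary, a long word is split into several pieces while short common words stay whole, which is why a token count usually differs from a word count.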
The number of tokens in the input and output affects cost, latency, and whether the content fits within the model's context window. Understanding token usage matters in practice for anyone working with language model APIs: longer documents and more detailed instructions consume more tokens, and systems must be designed to stay within the limit on what the model can process in a single call.
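A rough budget check along these lines can be sketched as follows. Everything here is an assumption made for illustration: the function names are hypothetical, the default of 4 characters per token is only a common rule of thumb for English text, the context window is assumed to cover input and output together, and prices are assumed to be quoted per 1,000 tokens. For exact counts, use the tokenizer that belongs to the model in question.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate a token count from character length (heuristic, not exact)."""
    if not text:
        return 0
    return max(1, math.ceil(len(text) / chars_per_token))


def fits_in_context(prompt: str, max_output_tokens: int, context_window: int) -> bool:
    """Check whether estimated input tokens plus reserved output tokens fit.

    Assumes the context window covers input and output together.
    """
    return estimate_tokens(prompt) + max_output_tokens <= context_window


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Estimate the cost of one call; both prices are supplied by the caller."""
    return (input_tokens / 1000) * input_price_per_1k \
        + (output_tokens / 1000) * output_price_per_1k


# Example with made-up numbers: a 400-character prompt and a 150-token window.
prompt = "a" * 400
print(estimate_tokens(prompt))           # 100
print(fits_in_context(prompt, 50, 150))  # True
print(fits_in_context(prompt, 51, 150))  # False
```

Reserving output tokens up front, rather than checking only the prompt, keeps a long input from leaving too little room for the response.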