Transformer

What is a Transformer?

The transformer is the model architecture that underlies most modern large language models. It introduced the attention mechanism, which allows the model to understand how words in a sequence relate to each other regardless of their distance — rather than processing text strictly left to right.

Why did the Transformer change AI?

Transformer-based models significantly outperformed previous architectures on language tasks and have become the standard foundation for enterprise AI applications. The attention mechanism is what allows these models to maintain coherent understanding across long passages of text, handle complex grammatical relationships, and generate contextually appropriate responses — capabilities that earlier architectures struggled to deliver reliably.