Definition
The context window is the maximum number of tokens a language model can read and process in a single inference. By 2026, frontier models offer 128K to 2M+ token windows (Gemini 2.5 leads on size). Larger windows enable longer documents, multi-document analysis, and longer agent traces, but quality often degrades on tasks that require deep recall from the middle of very long contexts (the 'lost in the middle' phenomenon).
Example
GPT-5 supports 1M+ tokens. Claude Opus 4 supports 1M tokens. Gemini 2.5 Pro supports 2M tokens.
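Before sending a long input, it helps to check whether it plausibly fits the target model's window, leaving headroom for the response. A minimal sketch, assuming a rough heuristic of ~4 characters per token (real tokenizers vary by model and language; use the provider's tokenizer for exact counts):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the common ~4 chars/token heuristic.

    This is an approximation only; actual counts depend on the
    model's tokenizer and the text's language and formatting.
    """
    return max(1, len(text) // 4)


def fits_window(text: str, window: int = 128_000, reserve: int = 4_096) -> bool:
    """Check that the input plus a reserved output budget fits the window.

    `window` and `reserve` are illustrative defaults, not any
    provider's official limits.
    """
    return estimate_tokens(text) + reserve <= window
```

For example, a 400,000-character document estimates to ~100,000 tokens, which fits a 128K window with a 4K reserve but not a 64K window.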
When to use
Plan input size against the model's window, leaving headroom for the response. Test recall at the middle of long contexts before relying on it.
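The mid-context recall test above is often run as a 'needle in a haystack' probe: plant a known fact at several depths in filler text and check whether the model can retrieve it. A minimal sketch, where `model` is a stand-in callable (your actual API client) and the filler, needle, and question are hypothetical test fixtures:

```python
def build_prompt(needle: str, depth: float, question: str, n_filler: int = 200) -> str:
    """Insert `needle` at a relative `depth` (0.0 = start, 1.0 = end)
    inside generated filler sentences, then append the question."""
    filler = [f"Filler sentence number {i} with no useful content." for i in range(n_filler)]
    filler.insert(int(depth * n_filler), needle)
    return " ".join(filler) + "\n\n" + question


def recall_at_depths(model, needle: str, question: str, answer: str,
                     depths=(0.0, 0.25, 0.5, 0.75, 1.0)) -> dict:
    """Return {depth: True/False} for whether the model's reply
    contains the expected answer at each insertion depth."""
    return {d: answer in model(build_prompt(needle, d, question)) for d in depths}
```

A dip in recall at depths around 0.5 relative to 0.0 and 1.0 is the 'lost in the middle' signature; run the probe at your real input lengths before committing to a long-context design.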