The Full Definition
The context window is the maximum amount of text a language model can handle in a single call, measured in tokens (one token is roughly ¾ of a word in English). It covers both the prompt you provide and the response the model generates. Modern models range from 8K tokens (around 6,000 words) to 1M+ tokens (around 750,000 words) of context. Within that window, the model can reason across everything it sees; anything that does not fit must be summarized, retrieved selectively, or split into chunks.
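The budget described above can be sketched in a few lines of Python. This is a rough illustration, not a real tokenizer: estimate_tokens and fits_in_window are hypothetical helper names, and the ¾-word-per-token ratio is only the rule of thumb from the definition. For exact counts, use the tokenizer that ships with the model you are calling.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb for English text; real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Estimate the token count of English text from its word count."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def fits_in_window(prompt_tokens: int, max_response_tokens: int, window: int) -> bool:
    """The prompt and the generated response share a single window."""
    return prompt_tokens + max_response_tokens <= window


prompt = "word " * 6000                       # stand-in for a 6,000-word prompt
prompt_tokens = estimate_tokens(prompt)       # 6000 / 0.75 = 8000 estimated tokens

# An 8K window leaves no room for a 500-token reply; a 128K window does.
print(fits_in_window(prompt_tokens, 500, 8_192))
print(fits_in_window(prompt_tokens, 500, 128_000))
```

The point of the check is that the response counts against the same limit as the prompt, so a prompt that nearly fills the window leaves the model little or no room to answer.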
Why It Matters
Context window size is a fundamental architectural constraint. It dictates whether you can put an entire contract directly into the prompt or need RAG to pull in only the relevant passages. Bigger context windows are powerful but expensive: long-context calls cost more because you pay for every token you send, and quality can degrade for material in the middle of very long contexts ("lost in the middle").
How This Shows Up in Practice
A team tried to drop a 400-page corporate document into a 1M-token model and ask questions about it. It worked, but each query cost $4 and took 30 seconds. Switching to RAG over chunks of the same document brought the cost to $0.02 per query and latency to under a second, with equal answer quality.
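To make "RAG over chunks" concrete, here is a minimal Python sketch of the two steps involved: splitting a document into overlapping chunks, then sending only the best-matching chunks to the model. It is not the team's actual pipeline. The function names chunk_words and top_chunks are hypothetical, and the keyword-overlap scoring is a deliberately simple stand-in for the embedding-based search a production system would more likely use.

```python
import re


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into overlapping chunks of at most chunk_size words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


def top_chunks(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Rank chunks by how many distinct question words they contain (toy scoring)."""
    query_words = set(re.findall(r"\w+", question.lower()))

    def score(chunk: str) -> int:
        return len(query_words & set(re.findall(r"\w+", chunk.lower())))

    return sorted(chunks, key=score, reverse=True)[:k]


# Tiny made-up document, with small chunk sizes so the example stays readable.
document = (
    "Payment is due within thirty days of invoice. "
    "Either party may terminate the agreement with ninety days notice. "
    "Confidential information must be returned on termination."
)
chunks = chunk_words(document, chunk_size=10, overlap=2)
best = top_chunks("How many days notice to terminate?", chunks, k=1)
print(best[0])
```

Only the selected chunks go into the prompt, which is why the per-query token count, and with it cost and latency, drops so sharply compared with sending the whole document.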
Common Questions
Is a bigger context window always better?
No. A bigger window only helps if you fill it, and filling it has costs: price grows roughly linearly with the number of tokens you send, very long prompts can suffer from "lost in the middle" effects, and they rarely outperform a well-designed RAG system on retrieval-style tasks. Match context size to actual need.
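The roughly linear cost scaling can be illustrated with a little arithmetic. The Python sketch below assumes flat per-token pricing on the input side only; the function name query_cost, the price, and both token counts are made up for illustration and do not reflect any provider's actual price list.

```python
def query_cost(input_tokens: int, price_per_million_tokens: float) -> float:
    """Cost of one call under flat per-token pricing (input tokens only)."""
    return input_tokens / 1_000_000 * price_per_million_tokens


PRICE = 3.00  # hypothetical USD per million input tokens

full_document = query_cost(500_000, PRICE)  # entire document in every prompt
retrieved = query_cost(2_500, PRICE)        # a few retrieved chunks instead

print(f"full document:    ${full_document:.4f} per query")
print(f"retrieved chunks: ${retrieved:.4f} per query")
print(f"ratio:            {full_document / retrieved:.0f}x")
```

Because cost tracks the tokens sent rather than the size of the window on offer, trimming the prompt to what the question actually needs is usually the most direct way to cut spend.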
What's a token?
A token is the unit of text a model reads, roughly ¾ of a word in English. "Tokenization" is the step that splits text into tokens before the model processes it.
Related Terms
Large Language Model (LLM)
A neural network trained on massive amounts of text to predict the next token — the foundation of modern AI assistants, agents, and generative systems.
Retrieval-Augmented Generation (RAG)
A technique that grounds an LLM's output in a specific document corpus by retrieving relevant context before generation.
Transformer
The neural network architecture, based on attention, that powers nearly every modern LLM and many state-of-the-art image and multimodal models.