The Full Definition
A large language model (LLM) is a neural network, usually built on the transformer architecture, trained on very large text corpora (hundreds of billions to trillions of tokens) to predict the next token in a sequence. From this single objective emerges a model capable of reading, writing, summarizing, reasoning about, and translating natural language. Modern LLMs (GPT-4, Claude, Llama, Gemini, Mistral) range from a few billion to over a trillion parameters and form the foundation of most contemporary AI applications that work with language.
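To make "predict the next token" concrete, here is a deliberately tiny sketch. It is not a transformer and not how any production LLM is built: it only counts which word follows which in a made-up corpus and returns the most frequent follower. The function names and the corpus are illustrative assumptions.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the training text."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, token):
    """Return the most frequent follower of `token`, or None if none was seen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Toy corpus (an assumption for illustration); whitespace split stands in for a real tokenizer.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once
```

A real LLM replaces the lookup table with a neural network that scores every token in its vocabulary given the whole preceding context, but the training objective is the same kind of next-token prediction.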
Why It Matters
LLMs are the engine. Most AI systems that read or write natural language (assistants, agents, copilots, summarizers, classifiers) are built around an LLM. Choosing the right one for the task, weighing cost, latency, capability, and deployment constraints, is one of the most consequential architectural decisions in any AI build.
How This Shows Up in Practice
For a customer support agent: a small model handles intent classification cheaply; a larger model handles the actual conversation when nuance matters; a fine-tuned model handles the policy-specific responses where consistency is critical. The architecture chooses the right LLM for each step.
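The routing idea above can be sketched in a few lines of Python. Everything named here is a hypothetical placeholder: the model identifiers, the `POLICY_INTENTS` set, and the keyword matcher that stands in for the small classification model. A real system would call actual model endpoints at each step.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    model: str   # hypothetical model identifier, not a real product name
    reason: str

# Assumed set of intents where policy-consistent wording matters.
POLICY_INTENTS = {"refund", "cancellation", "warranty"}

def classify_intent(message: str) -> str:
    """Stand-in for the small, cheap classifier model (plain keyword match here)."""
    text = message.lower()
    for intent in sorted(POLICY_INTENTS):
        if intent in text:
            return intent
    return "general"

def choose_route(message: str) -> Route:
    """Pick which model should answer, based on the classified intent."""
    intent = classify_intent(message)
    if intent in POLICY_INTENTS:
        # Consistency is critical: send to the fine-tuned model.
        return Route("fine-tuned-policy-model", f"policy intent: {intent}")
    # Nuanced, open-ended conversation: send to the larger model.
    return Route("large-general-model", "open-ended conversation")

print(choose_route("I want a refund for my order"))
print(choose_route("My app feels slow lately, any ideas?"))
```

Keeping the routing decision in one small function makes it easy to log why a model was chosen and to swap models later without touching the rest of the agent.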
Common Questions
GPT vs Claude vs Llama — which is best?
It depends on the task, and rankings shift with each model release. Claude is often favored for long-context reasoning and nuanced writing; GPT-4 for broad reasoning and tool use; Llama and Mistral for cost-efficient self-hosted deployment. Many production deployments use more than one model.
Can I run an LLM on my own infrastructure?
Yes — open-weight models like Llama, Mistral, and Qwen run on standard GPU infrastructure or even on CPUs (at lower speed). Self-hosting is the right path when data sensitivity or cost-at-scale dictates.
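A rough way to judge whether a model fits on your hardware is to estimate the memory its weights need: parameter count times bytes per parameter. The sketch below covers weights only; it ignores the key-value cache, activations, and runtime overhead, so treat the numbers as a lower bound. The 7-billion-parameter size is an illustrative assumption, not a specific model.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

# Compare common precisions for a hypothetical 7-billion-parameter model.
for bits in (16, 8, 4):
    print(f"7B parameters at {bits}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
```

Lower-precision (quantized) weights are a large part of why smaller open-weight models can run on a single consumer GPU or on a CPU.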
Related Terms
Transformer
The attention-based neural network architecture that powers nearly all modern LLMs, many image models, and much of today's state-of-the-art AI.
Fine-Tuning
The process of further training a pretrained language model on a specific dataset to specialize its behavior, style, or domain.
Context Window
The maximum amount of text — measured in tokens — that an LLM can consider at once when generating a response.
Hallucination
When a language model produces confident-sounding output that is factually incorrect or unsupported by its inputs.
Want to put this to work?
A free process audit maps where large language models (LLMs), and the rest of the modern AI stack, actually move the needle in your business.