Definition

Vector Database

A specialized database that stores and searches high-dimensional numerical representations of text, images, or other data — enabling semantic similarity search.

The Full Definition

A vector database stores embeddings — numerical representations of text, images, or other data produced by a machine learning model — and supports fast nearest-neighbor search across millions or billions of these vectors. Where a traditional database matches on exact terms ("find rows where state = CA"), a vector database matches on semantic similarity ("find the documents most semantically similar to this query"), even when no exact words overlap.

Why It Matters

Vector search is the retrieval engine that makes RAG, semantic search, recommendation, and clustering possible at scale. Without a purpose-built vector database, retrieval becomes the bottleneck of any AI system that touches more than a few thousand documents.

How This Shows Up in Practice

A medical practice indexes its clinical protocols, insurance policies, and prior patient communications in a vector database. When a patient messages a question, the system instantly retrieves the most semantically relevant guidance — even if the patient phrased the question completely differently than any document.

Common Questions

Which vector database should we use?

For most production deployments: Pinecone, Weaviate, Qdrant, or pgvector (Postgres extension) cover the practical range. The right choice depends on scale, hosting preference, and how much existing Postgres infrastructure you have.

Do we need a vector database for small datasets?

Below ~10,000 documents, in-memory search libraries (FAISS, Chroma) often suffice. Above that, a managed vector database simplifies operations significantly.

Related Terms

Want to put this to work?

A free process audit maps where vector database — and the rest of the modern AI stack — actually move the needle in your business.

Survey My Business