Vector Databases for AI: pgvector vs Pinecone vs Weaviate
AI applications need to store and search embeddings — high-dimensional vectors that represent text, images, and other data as numerical arrays. Traditional databases can't efficiently search across millions of 1536-dimensional vectors to find the most similar ones. Vector databases are purpose-built for this: they use specialized indexes (HNSW, IVFFlat) to find nearest neighbors in milliseconds. This guide compares three leading options — pgvector, Pinecone, and Weaviate — with working code examples and practical selection criteria.
How Vector Search Works
When you generate an embedding from text using a model like OpenAI’s text-embedding-3-small, you get a 1536-dimensional float array. Semantically similar texts produce vectors that are close together in this high-dimensional space. Vector search finds the K nearest vectors to a query vector — essentially answering “what’s most similar to this?”
The challenge is scale. Brute-force comparison against every vector is O(n) — fine for 10,000 vectors, unusable for 10 million. Vector databases solve this with Approximate Nearest Neighbor (ANN) algorithms that trade a small accuracy loss (typically 95-99% recall) for orders-of-magnitude speed improvements. HNSW (Hierarchical Navigable Small World) graphs are the most popular: they build a multi-layer graph structure that enables sub-millisecond search across millions of vectors.
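To make the O(n) baseline concrete, here is the brute-force search that ANN indexes like HNSW are built to avoid — a minimal NumPy sketch with toy 64-dimensional "embeddings" (the corpus size, dimensions, and random data are illustrative, not from any real model):

```python
import numpy as np

def brute_force_knn(query: np.ndarray, corpus: np.ndarray, k: int = 5):
    """Return indices and scores of the k corpus vectors most cosine-similar to query."""
    # Normalize so a dot product equals cosine similarity
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    sims = c @ q                  # one comparison per corpus vector: O(n)
    top = np.argsort(-sims)[:k]   # highest similarity first
    return top, sims[top]

rng = np.random.default_rng(0)
corpus = rng.normal(size=(10_000, 64))                 # 10k toy "embeddings"
query = corpus[42] + rng.normal(scale=0.01, size=64)   # slightly perturbed copy of vector 42
idx, scores = brute_force_knn(query, corpus, k=3)
print(idx[0])  # vector 42 should come back as the nearest neighbor
```

Every query touches all 10,000 vectors; at 10 million vectors and production query rates, this linear scan is exactly the cost an HNSW index amortizes away.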
Moreover, production AI systems need more than just vector search. They need metadata filtering (find similar vectors BUT only in category X), hybrid search (combine vector similarity with keyword matching), and real-time updates (add new vectors without rebuilding the entire index). Each database handles these requirements differently.
pgvector: Vector Search in PostgreSQL
pgvector adds vector operations to PostgreSQL — the database you probably already run. No new infrastructure, no new operational knowledge, no data synchronization between your relational data and vector store. Your vectors live alongside your regular tables with full SQL support.
-- Enable pgvector extension
CREATE EXTENSION vector;
-- Create table with vector column
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT NOT NULL,
    category TEXT,
    embedding vector(1536),  -- OpenAI embedding dimension
    created_at TIMESTAMP DEFAULT NOW()
);
-- Create HNSW index for fast similarity search
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 200);
-- Semantic search with metadata filtering
SELECT id, title, content,
       1 - (embedding <=> '[0.1, 0.2, ...]'::vector) AS similarity
FROM documents
WHERE category = 'engineering'
  AND created_at > '2025-01-01'
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;
-- Hybrid search: combine semantic + full-text
SELECT id, title,
       (0.7 * (1 - (embedding <=> '[0.1, 0.2, ...]'::vector))) +
       (0.3 * ts_rank(to_tsvector('english', content),
                      plainto_tsquery('english', 'kubernetes scaling')))
       AS combined_score
FROM documents
WHERE to_tsvector('english', content) @@ plainto_tsquery('english', 'kubernetes scaling')
ORDER BY combined_score DESC
LIMIT 10;

pgvector strengths: zero additional infrastructure, full SQL support, ACID transactions, compatibility with your existing PostgreSQL tooling (backups, replication, monitoring), and metadata filtering that is just WHERE clauses.

pgvector weaknesses: performance degrades past roughly 5 million vectors, HNSW index builds are memory-intensive, there is no built-in sharding for horizontal scaling, and concurrent updates to HNSW indexes can be slower than in purpose-built systems.
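One detail worth internalizing from the queries above: pgvector's `<=>` operator returns cosine *distance*, which is why the SQL wraps it in `1 - (...)` to report a similarity. The relationship in plain Python (toy 3-dimensional vectors, purely illustrative):

```python
import math

def cosine_distance(a, b):
    """Cosine distance, matching the semantics of pgvector's <=> operator."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

a = [1.0, 0.0, 0.0]
b = [1.0, 0.0, 0.0]  # identical to a
c = [0.0, 1.0, 0.0]  # orthogonal to a

print(cosine_distance(a, b))      # identical vectors → distance 0.0
print(1 - cosine_distance(a, c))  # orthogonal vectors → similarity 0.0
```

Ordering by `embedding <=> query` ascending therefore returns the most similar rows first, which is why the ORDER BY uses the raw distance while the SELECT list reports `1 - distance`.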
Pinecone: Managed Vector Database
Pinecone is a fully managed vector database — no infrastructure to manage, no indexes to tune, no scaling to configure. You send vectors through an API and query them. It’s optimized for production workloads with features like namespaces, metadata filtering, and sparse-dense hybrid search.
from pinecone import Pinecone, ServerlessSpec
# Initialize Pinecone
pc = Pinecone(api_key="your-api-key")
# Create index
pc.create_index(
    name="documents",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)
index = pc.Index("documents")
# Upsert vectors with metadata
index.upsert(vectors=[
    {
        "id": "doc-1",
        "values": embedding_vector,  # 1536-dim float list
        "metadata": {
            "title": "Kubernetes Scaling Guide",
            "category": "engineering",
            "author": "jane",
            "date": 20260115  # numeric (YYYYMMDD) so range filters work
        }
    }
], namespace="engineering-docs")
# Query with metadata filter
results = index.query(
    vector=query_embedding,
    top_k=10,
    namespace="engineering-docs",
    filter={
        "category": {"$eq": "engineering"},
        "date": {"$gte": 20250101}  # $gte/$lte compare numbers, not strings
    },
    include_metadata=True
)
for match in results.matches:
    print(f"{match.id}: {match.score:.3f} - {match.metadata['title']}")

Pinecone strengths: zero operational overhead, consistently low query latency at scale, usage-based serverless pricing, excellent metadata filtering, and built-in namespaces for multi-tenancy.

Pinecone weaknesses: vendor lock-in (proprietary, no self-hosting), keyword-style hybrid search requires supplying your own sparse vectors (no built-in BM25), cost can become significant at high query volumes, limited query flexibility compared to SQL, and data must leave your infrastructure.
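In practice, upserts are sent in batches rather than one request per vector or one giant request. A minimal batching helper — pure Python; the batch size of 100 is an illustrative assumption, not a Pinecone-mandated limit:

```python
from typing import Iterator, List

def batched(items: List[dict], batch_size: int = 100) -> Iterator[List[dict]]:
    """Yield successive slices of `items`, each of at most `batch_size` entries."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Toy payload: 250 zero vectors standing in for real embeddings
vectors = [{"id": f"doc-{i}", "values": [0.0] * 1536} for i in range(250)]

# In real code each batch would go to index.upsert(vectors=batch, namespace=...)
batch_sizes = [len(batch) for batch in batched(vectors)]
print(batch_sizes)  # → [100, 100, 50]
```

Batching keeps individual requests small enough to avoid payload limits while still amortizing per-request overhead across many vectors.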
Weaviate: Open-Source with Built-in ML
Weaviate is an open-source vector database with built-in vectorization — it can generate embeddings automatically using integrated ML models. You can send raw text and Weaviate handles the embedding generation, storage, and search in one step. Additionally, it supports GraphQL queries, hybrid search (vector + BM25), and multi-modal search (text + images).
import weaviate
import weaviate.classes as wvc
# Connect to Weaviate
client = weaviate.connect_to_local()
# Create collection with built-in vectorizer
documents = client.collections.create(
    name="Document",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(
        model="text-embedding-3-small"
    ),
    properties=[
        wvc.config.Property(name="title", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="content", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="category", data_type=wvc.config.DataType.TEXT),
    ]
)
# Add data — Weaviate generates embeddings automatically
documents.data.insert_many([
    {"title": "K8s Scaling", "content": "Guide to scaling...",
     "category": "engineering"},
    {"title": "ML Pipeline", "content": "Building production...",
     "category": "data-science"}
])
# Hybrid search: vector similarity + BM25 keyword matching
results = documents.query.hybrid(
    query="kubernetes autoscaling best practices",
    alpha=0.7,  # 70% vector, 30% keyword
    filters=wvc.query.Filter.by_property("category").equal("engineering"),
    limit=10,
    return_metadata=wvc.query.MetadataQuery(score=True)
)
for obj in results.objects:
    print(f"{obj.properties['title']}: {obj.metadata.score:.3f}")

Weaviate strengths: open-source with a self-hosting option, built-in vectorization (no separate embedding pipeline), native hybrid search (vector + BM25), a GraphQL API, multi-modal support, and an active community.

Weaviate weaknesses: higher operational complexity than pgvector (separate infrastructure to run), memory-intensive for large datasets, a less mature managed offering than Pinecone, and built-in vectorization that adds latency to inserts.
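The `alpha` parameter in the hybrid query controls how much weight the vector score gets relative to the BM25 keyword score. Weaviate's actual score fusion is more involved (it normalizes scores before combining them), but a simplified weighted blend conveys the idea — this sketch is a conceptual model, not Weaviate's exact formula:

```python
def hybrid_score(vector_score: float, keyword_score: float, alpha: float) -> float:
    """Simplified hybrid blend: alpha=1.0 → pure vector, alpha=0.0 → pure keyword."""
    return alpha * vector_score + (1 - alpha) * keyword_score

# Document A: strong semantic match, weak keyword overlap
# Document B: weak semantic match, strong keyword overlap
a = hybrid_score(vector_score=0.9, keyword_score=0.2, alpha=0.7)
b = hybrid_score(vector_score=0.3, keyword_score=0.95, alpha=0.7)
print(round(a, 3), round(b, 3))  # → 0.69 0.495
```

With `alpha=0.7`, the semantically similar document wins even though the keyword-heavy one has a much higher BM25 score; lowering alpha toward 0 would flip that ordering.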
Choosing the Right Vector Database
Choose pgvector when: you already use PostgreSQL, your vector count is under 5 million, you need ACID transactions alongside vector search, you want to avoid new infrastructure, or you need complex SQL queries joining vector results with relational data. It’s the simplest path for most applications starting with AI features.
Choose Pinecone when: you want zero operational burden, you need consistent performance at 10M+ vectors, you’re building a SaaS product with multi-tenant vector isolation, or your team doesn’t have infrastructure expertise. The managed experience is genuinely excellent.
Choose Weaviate when: you need hybrid search (vector + keyword), you want built-in embedding generation, you prefer open-source with self-hosting control, you need multi-modal search, or you want GraphQL for complex queries. Weaviate is the most feature-rich open-source option.
Related Reading:
- RAG Architecture Patterns for Production
- Prompt Engineering for Production AI
- Apache Iceberg Data Lakehouse Guide
The right vector database depends on your scale, infrastructure preferences, and feature needs. Start with pgvector if you already run PostgreSQL — it handles most use cases up to about 5 million vectors with zero new infrastructure. Move to Pinecone or Weaviate when you need dedicated vector-database performance, advanced features like built-in vectorization or managed scaling, or more vectors than PostgreSQL can handle efficiently.