Latent Space: The AI Engineer Podcast
Episode: ⚡️The Rise and Fall of the Vector DB Category
Date: May 1, 2025
Guest: Joe Christian Burgum
Host(s): Latent.Space
Episode Overview
This “lightning pod” episode features Joe Christian Burgum, a veteran of search systems and AI infrastructure, discussing the rapid emergence—and equally swift decline—of the “vector database” (vector DB) as a standalone infrastructure category. The conversation traces the origins of vector DB hype in the AI community, why it dominated early RAG (Retrieval Augmented Generation) discussions, and the market and technical convergence that has eroded the category’s distinctiveness. It offers historical context, first-hand insight into embedding-based search, and practical guidance for developers and companies navigating today’s retrieval, RAG, and search infrastructure landscape.
Key Discussion Points and Insights
1. Joe’s Background and Motivation for the Vector DB Piece
[01:00]
- Joe recaps his 20+ years in search, from Yahoo and FAST Search to building with neural search and embeddings.
- The "ChatGPT moment" in Nov 2022 saw a surge of developers building ‘search’ and ‘RAG’ thanks to OpenAI guides, forging a strong (perhaps unnatural) link between retrieval and vector embeddings.
- Joe was motivated to clarify misconceptions and document this sudden boom and bust:
"The separate infrastructure category is dying, right?...I’m not saying the companies are dying, but the category is dying." — Joe [03:47]
2. The Dramatic Arc: Rise (and Fall) of Vector DBs
[03:03-04:30]
- Pinecone, early leader, quickly gained massive traction and funding but experienced an equally rapid cool-down in perception and personnel.
- Pinecone and others began repositioning to focus more on developers than enterprises.
- The vector DB category is contracting because nearly all modern databases and traditional search engines (Elasticsearch, Solr, Vespa, Postgres+PGVector) now offer vector search capabilities.
- "Now you have Vector search in almost any DB technology nowadays...the category is dying." — Joe [03:37]
3. From 'Vector Database' to 'Search Engine'
[05:10]
- Joe advocates returning to "search" as the key abstraction, integrating vector embeddings behind the scenes rather than as a distinct layer or product.
- There’s a convergence of feature sets—traditional search systems now offer dense/sparse retrieval options, and RAG is fundamentally about connecting reasoning models (LLMs) to generalized search tools.
- "I want to call these new companies...search engines. That’s a more natural abstraction for connecting AI with knowledge." — Joe [04:29]
4. Role of Embeddings, Industry Mainstreaming, and Limitations
[08:31]
- Embeddings are not new—big tech used them for years—but APIs from OpenAI made them mainstream for all devs.
- Embeddings are important but insufficient: you also need metadata, freshness, authority, and other signals—pure cosine similarity is not enough.
- "It’s not only about similarity searches in this kind of embedding space...you need something more...freshness, authority, other signals that really play into web search." — Joe [09:12]
- Hopes for more specialized and multimodal embeddings (see: domain-specific and visual-language embeddings).
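Joe’s point that pure cosine similarity is not enough can be sketched by blending embedding similarity with freshness and authority signals. This is an illustrative pure-Python sketch, not code from the episode; the weight values, the 30-day half-life, and the document fields are assumptions chosen for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def score(doc, query_vec, now, half_life_days=30.0,
          w_sim=0.7, w_fresh=0.2, w_auth=0.1):
    """Blend embedding similarity with freshness and authority signals.

    Weights and half-life are illustrative, not tuned values.
    """
    sim = cosine(doc["embedding"], query_vec)
    age_days = (now - doc["published"]) / 86400.0
    # Freshness decays by half every `half_life_days`.
    freshness = math.exp(-math.log(2) * age_days / half_life_days)
    return w_sim * sim + w_fresh * freshness + w_auth * doc["authority"]

# Two hypothetical docs with identical embeddings but different ages:
now = 1_000_000_000.0  # epoch seconds, for the illustration
fresh_doc = {"embedding": [0.9, 0.1], "published": now - 1 * 86400, "authority": 0.5}
stale_doc = {"embedding": [0.9, 0.1], "published": now - 365 * 86400, "authority": 0.5}
```

With identical embeddings, the fresher document outranks the stale one—exactly the kind of signal a similarity-only system would miss.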
5. Classic Search, Hybrid, and Trade-Offs
[10:27-12:03]
- If you already use Postgres (with PGVector), it can be sufficient for reasonable scale and vector workloads.
- "PGVector is doing more in the capabilities of vector search than some of the real vector database players." — Joe [10:47]
- At large scale or when business depends on search quality, a dedicated search engine is still recommended.
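One common way to implement the hybrid search discussed here—combining a classic keyword ranking with an embedding-based ranking—is Reciprocal Rank Fusion (RRF). This minimal pure-Python sketch is engine-agnostic; the document ids are hypothetical and k=60 is the conventional constant from the RRF literature.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: merge several ranked lists of doc ids.

    rankings: list of lists, each ordered best-first.
    k: damping constant; 60 is the conventional default.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # A doc gains 1/(k + rank + 1) from each list it appears in.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["d3", "d1", "d7"]   # e.g. BM25 order (hypothetical ids)
vector_hits  = ["d1", "d5", "d3"]   # e.g. embedding nearest neighbours
fused = rrf_fuse([keyword_hits, vector_hits])
```

Because "d1" ranks highly in both lists, it wins the fused ranking—RRF rewards agreement between retrieval methods without requiring their raw scores to be comparable.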
6. Architecture and Practical Guidance for RAG/Search
[13:17-16:00]
- Joe lays out a practical development sequence:
- Clean and prep your data (PDFs, etc.)
- Baseline with BM25 (classic keyword matching)
- Explore hybrid search (combine keyword and embedding)
- Add re-ranking layer if latency and cost allow ("You can stitch this together...with multiple APIs depending on your budget." — Joe [15:21])
- Batch/offline processing is best for most workloads; don't over-architect for always-on, low-latency serving unless necessary.
- Embedding API calls can become a bottleneck at scale—local inference may be required.
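The BM25 baseline Joe recommends can be sketched in pure Python. This is an illustrative, self-contained implementation of the standard BM25 scoring formula—not any particular engine's—with the conventional defaults k1=1.5 and b=0.75; the toy corpus is hypothetical.

```python
import math
from collections import Counter

def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score each tokenized document in `corpus` against `query` with BM25."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    df = Counter()  # document frequency per term
    for doc in corpus:
        df.update(set(doc))
    scores = []
    for doc in corpus:
        tf = Counter(doc)
        dl = len(doc)
        s = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
            s += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * dl / avgdl))
        scores.append(s)
    return scores

corpus = [
    "vector databases store embeddings".split(),
    "classic keyword search with bm25".split(),
    "postgres supports vector search".split(),
]
scores = bm25_scores("keyword search".split(), corpus)
```

From this baseline, the sequence above would swap in (or fuse with) embedding retrieval, then add re-ranking on top if latency and budget allow.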
7. What Should Live ‘In the Database’?
[17:23]
- Joe is not bullish on running ML models, embedding, or agentic inference inside the database (e.g., via Postgres ML).
- Prefers clear separation of infrastructure components; wary of monolithic, opaque, all-in-one DB systems.
- "I don’t believe in the developer experience of writing huge SQL statements for transforming data, then embedding it...I tend to want more control." — Joe [17:45]
8. The RAG ≠ Vector DB Misconception, and RAG’s Future
[18:45-21:12]
- Joe clarifies: the demise of the vector DB category does not mean RAG is dead. Augmenting AI with retrieval/search will remain relevant.
- Longer LLM context windows change the calculus for some applications (you can now fit small datasets in context!), but real-world use still requires retrieval for large corpora.
- "RAG is definitely not dead. Augmenting AI with retrieval or search is still going to be relevant for a very long time." — Joe [19:11]
9. Knowledge Graphs, Graph RAG, and Tech Obsession
[22:00-24:07]
- Graph RAG/knowledge graphs: Potentially valuable in some cases, but the real issue is building the graph (not the DB tech itself).
- Don’t always conflate new retrieval methods with specific infrastructure products; conceptually, you can do graph exploration with standard search tools.
- "People get caught up in some specific technology all the time...You jump from some concept into some technology." — Joe [22:23]
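Joe's point that graph exploration doesn't require a graph database can be illustrated with a toy sketch: treat co-mentioned entities as edges and "traverse" by issuing follow-up queries to an ordinary search function. All document ids, entities, and the search stub below are hypothetical.

```python
# Toy corpus: each doc mentions entities; co-mentions act as graph edges.
DOCS = {
    "d1": {"text": "Alice founded Acme",  "entities": {"Alice", "Acme"}},
    "d2": {"text": "Acme acquired Beta",  "entities": {"Acme", "Beta"}},
    "d3": {"text": "Beta builds widgets", "entities": {"Beta"}},
    "d4": {"text": "Unrelated news",      "entities": {"Gamma"}},
}

def search(entity):
    """Stand-in for any search engine: return docs mentioning the entity."""
    return {doc_id for doc_id, d in DOCS.items() if entity in d["entities"]}

def graph_expand(seed_entity, hops=2):
    """Multi-hop 'graph RAG' using only search: follow co-mentioned entities."""
    frontier, seen_docs, seen_entities = {seed_entity}, set(), set()
    for _ in range(hops):
        next_frontier = set()
        for entity in frontier - seen_entities:
            seen_entities.add(entity)
            for doc_id in search(entity):
                seen_docs.add(doc_id)
                next_frontier |= DOCS[doc_id]["entities"]
        frontier = next_frontier
    return seen_docs

hits = graph_expand("Alice", hops=2)
```

Each extra hop widens the retrieved set—the graph structure lives in the data and the traversal loop, not in any special database technology.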
10. Embedding Model Innovation, Business, and the Road Ahead
[24:30-25:55]
- The next big wave may come from domain-specific embeddings (e.g., legal, health).
- Visual-language models could remove complex OCR steps and produce better multimodal representations.
- There are business challenges (API pricing, compute cost, and sustainability) even for technically superior models.
Notable Quotes & Memorable Moments
- On category collapse: "I’m not saying the companies are dying, but the category is dying." — Joe [03:47]
- On search vs. vector DBs: "That’s a more natural abstraction for connecting AI with knowledge...the arguments for doing RAG—the natural concept is search." — Joe [04:29]
- On application advice: "A very strong baseline is the classical BM25 algorithm...it gives you that baseline...Then you can start looking at embedding models...then, if you can afford it, add a re-ranking layer." — Joe [15:00]
- On mixing AI/ML infra with databases: "I don’t believe in the developer experience of writing these huge SQL statements for transforming data from this and then embedding it and then writing it back and expressing this in the databases." — Joe [17:45]
- On hype and cycles: "Now we have 10 million token longer context and you have the same cycle repeat every time." — Joe [19:39]
- On infinite context windows vs. retrieval: "Some parts of it is still not...relevant now because we have longer context windows. But...already 36 million tokens...You’re not going to load all of that for a single query." — Joe [21:27]
Timeline of Important Segments
- [01:00] — Joe’s background in search and motivation for analyzing vector DBs
- [03:03] — Pinecone, funding, and the explosive rise/fall of vector DBs
- [04:29] — Why “search” should be the abstraction for RAG/Augmented AI
- [08:31] — Embeddings: importance, limitations, and misconceptions
- [10:27] — Practical use: Postgres + PGVector, convergence in capabilities
- [13:17] — Guidance for building RAG/search workflows
- [17:23] — Discussion on putting ML/embedding logic inside databases
- [18:45] — Vector DBs vs. RAG: clarifying the “RAG is dead” misconception
- [21:56] — Thoughts on knowledge graphs and Graph RAG
- [24:30] — The future for embedding models and business challenges
Final Thoughts & Resources
- Joe encourages listeners to connect with him on X (@joebergen).
- The AI community’s “high signal” conversations often happen there, “without which we wouldn’t have this meeting.” — Joe [26:43]
- The episode offers a balanced, historical, and pragmatic look at an infrastructure hype cycle, emphasizing enduring concepts over passing product categories.
For more show notes and resources visit:
https://www.latent.space
