When (Not) to Use Vector DB

AiNEWS2025 by AiNEWS2025
2025-12-16
in Machine Learning

Vector databases solve a real problem, and in many cases they are the right choice for RAG systems. But here’s the thing: just because you’re using embeddings doesn’t mean you need a vector database.

We’ve seen a growing trend where every RAG implementation starts by plugging in a vector DB. That might make sense for large-scale, persistent knowledge bases, but it’s not always the most efficient path, especially when your use case is more dynamic or time-sensitive.

At Planck, we utilize embeddings to enhance LLM-based systems. However, in one of our real-world applications, we opted to avoid a vector database and instead used a simple key-value store, which turned out to be a much better fit.

Before I dive into that, let’s explore a simple, generalized version of our scenario to explain why.

Foo Example

Let’s imagine a simple RAG-style system. A user uploads a few text files, maybe some reports or meeting notes. We split those files into chunks, generate embeddings for each chunk, and use those embeddings to answer questions. The user asks a handful of questions over the next few minutes, then leaves. At that point, both the files and their embeddings are useless and can be safely discarded.

In other words, the data is ephemeral, the user will ask only a few questions, and we want to answer them as fast as possible.

Now pause for a second and ask yourself:

Where should I store these embeddings?


Most people’s instinct is: “I have embeddings, so I need a vector database.” But think about what’s actually happening behind that abstraction. When you send embeddings to a vector DB, it doesn’t just “store” them. It builds an index that speeds up similarity searches. That indexing work is where a lot of the magic comes from, and also where a lot of the cost lives.

In a long-lived, large-scale knowledge base, this trade-off makes perfect sense: you pay an indexing cost once (or incrementally as data changes), and then spread that cost over millions of queries. In our Foo example, that’s not what’s happening. We are doing the opposite: constantly adding small, one-off batches of embeddings, answering a tiny number of queries per batch, and then throwing everything away.

So the real question is not “should I use a vector database?” but “is the indexing work worth it?” To answer that, we can look at a simple benchmark.

Benchmarking: No-Index Retrieval vs. Indexed Retrieval


This section is more technical. We’ll look at Python code and explain the underlying algorithms. If the exact implementation details aren’t relevant to you, feel free to skip ahead to the Results section.

We want to compare two systems:

  1. No indexing at all: we keep the embeddings in memory and scan them directly on every query.
  2. A vector database, where we pay an indexing cost upfront to make each query faster.

First, consider the “no vector DB” approach. When a query comes in, we compute similarities between the query embedding and all stored embeddings, then select the top-k. That’s just K-Nearest Neighbors without any index.

import numpy as np

def run_knn(embeddings: np.ndarray, query_embedding: np.ndarray, top_k: int) -> np.ndarray:
    sims = embeddings @ query_embedding
    return sims.argsort()[-top_k:][::-1]

The code uses the dot product as a proxy for cosine similarity (assuming normalized vectors) and sorts the scores to find the best matches. It literally just scans all vectors and picks the nearest ones.
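As a side note, the full sort is not strictly needed to pick the top-k. Below is a minimal sketch of a variant that uses NumPy's partial sort; the function name run_knn_partition is hypothetical and not part of the original benchmark, and it still scans every vector exactly as the version above does.

```python
import numpy as np

def run_knn_partition(embeddings: np.ndarray, query_embedding: np.ndarray, top_k: int) -> np.ndarray:
    """Brute-force top-k by dot product, using a partial sort instead of a full sort."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    sims = embeddings @ query_embedding
    top_k = min(top_k, sims.shape[0])
    # argpartition places the top_k largest scores in the last top_k slots, unordered.
    candidates = np.argpartition(sims, -top_k)[-top_k:]
    # Only those top_k candidates are then sorted, best match first.
    return candidates[np.argsort(sims[candidates])[::-1]]

# Usage with synthetic, normalized vectors (illustrative data only).
rng = np.random.default_rng(0)
vectors = rng.random((1000, 64), dtype=np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
print(run_knn_partition(vectors, vectors[42], top_k=5))
```

Whether this is measurably faster than a full argsort depends on the number of embeddings and on k, so it is worth timing on your own data before adopting it.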

Now, let’s look at what a vector DB typically does. Under the hood, most vector databases rely on an approximate nearest neighbor (ANN) index. ANN methods trade a bit of accuracy for a large boost in search speed, and one of the most widely used algorithms for this is HNSW (Hierarchical Navigable Small World). We’ll use the hnswlib library to simulate the index behavior.

import numpy as np
import hnswlib

def create_hnsw_index(embeddings: np.ndarray, num_dims: int) -> hnswlib.Index:
    index = hnswlib.Index(space='cosine', dim=num_dims)
    index.init_index(max_elements=embeddings.shape[0])
    index.add_items(embeddings)
    return index

def query_hnsw(index: hnswlib.Index, query_embedding: np.ndarray, top_k: int) -> np.ndarray:
    labels, distances = index.knn_query(query_embedding, k=top_k)
    return labels[0]

To see where the trade-off lands, we can generate some random embeddings, normalize them, and measure how long each step takes:

import time
import numpy as np
import hnswlib
from tqdm import tqdm

def run_benchmark(num_embeddings: int, num_dims: int, top_k: int, num_iterations: int) -> None:
    print(f"Benchmarking with {num_embeddings} embeddings of dimension {num_dims}, retrieving top-{top_k} nearest neighbors.")

    knn_times: list[float] = []
    index_times: list[float] = []
    hnsw_query_times: list[float] = []

    for _ in tqdm(range(num_iterations), desc="Running benchmark"):
        embeddings = np.random.rand(num_embeddings, num_dims).astype('float32')
        embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
        query_embedding = np.random.rand(num_dims).astype('float32')
        query_embedding = query_embedding / np.linalg.norm(query_embedding)

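        # perf_counter is a monotonic, high-resolution clock, better suited to short timings than time.time
        start_time = time.perf_counter()
        run_knn(embeddings, query_embedding, top_k)
        knn_times.append((time.perf_counter() - start_time) * 1e3)

        # Index construction is timed separately from the query it enables
        start_time = time.perf_counter()
        vector_db_index = create_hnsw_index(embeddings, num_dims)
        index_times.append((time.perf_counter() - start_time) * 1e3)

        start_time = time.perf_counter()
        query_hnsw(vector_db_index, query_embedding, top_k)
        hnsw_query_times.append((time.perf_counter() - start_time) * 1e3)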

    print(f"BENCHMARK RESULTS (averaged over {num_iterations} iterations)")
    print(f"[Naive KNN] Average search time without indexing: {np.mean(knn_times):.2f} ms")
    print(f"[HNSW Index] Average index construction time: {np.mean(index_times):.2f} ms")
    print(f"[HNSW Index] Average query time with indexing: {np.mean(hnsw_query_times):.2f} ms")

run_benchmark(num_embeddings=50000, num_dims=1536, top_k=5, num_iterations=20)

Results

In this example, we use 50,000 embeddings with 1,536 dimensions (matching OpenAI’s text-embedding-3-small) and retrieve the top-5 neighbors. The exact results will vary with different configs, but the pattern we care about is the same.

I encourage you to run the benchmark with your own numbers. It’s the best way to see how the trade-offs play out in your specific use case.

On average, the naive KNN search takes 24.54 milliseconds per query. Building the HNSW index for the same embeddings takes around 277 seconds. Once the index is built, each query takes about 0.47 milliseconds.

From this, we can estimate the break-even point. The difference between naive KNN and indexed queries is 24.07 ms per query. That implies you need 11,510 queries before the time saved on each query compensates for the time spent building the index.
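The arithmetic behind that estimate can be sketched as follows. After q queries the naive scan has cost q times the scan time, while the indexed path has cost the build time plus q times the indexed query time, so the two are equal when q is the build time divided by the per-query saving. The function name break_even_queries is hypothetical, and the inputs are the rounded timings quoted above, so the result lands near 11,500 rather than exactly on the figure in the text.

```python
def break_even_queries(index_build_ms: float, knn_query_ms: float, indexed_query_ms: float) -> float:
    """Number of queries after which building the index costs less in total than scanning."""
    saved_per_query_ms = knn_query_ms - indexed_query_ms
    if saved_per_query_ms <= 0:
        # If the indexed query is not faster, the index never pays for itself.
        return float("inf")
    return index_build_ms / saved_per_query_ms

# Rounded timings from the benchmark run described above:
# 24.54 ms per naive query, 0.47 ms per indexed query, about 277 s to build the index.
queries = break_even_queries(index_build_ms=277_000, knn_query_ms=24.54, indexed_query_ms=0.47)
print(f"Break-even after about {queries:,.0f} queries")
```

Plugging in your own benchmark output gives the threshold for your workload; if your expected number of queries per batch of embeddings is far below it, the index is pure overhead.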

Generated using the benchmark code: A graph comparing naive KNN and indexed search efficiency

Furthermore, even with different values for the number of embeddings and top-k, the break-even point remains in the thousands of queries and stays within a fairly narrow range. You don’t get a scenario where indexing starts to pay off after just a few dozen queries.

Generated using the benchmark code: A graph showing break-even points for various embedding counts and top-k settings (image by author)

Now compare that to the Foo example. A user uploads a small set of files and asks a few questions, not thousands. The system never reaches the point where the index pays off. Instead, the indexing step simply delays the moment when the system can answer the first question and adds operational complexity.

For this sort of short-lived, per-user context, the simple in-memory KNN approach is not only easier to implement and operate, but it is also faster end-to-end.

If in-memory storage is not an option, either because the system is distributed or because we need to preserve the user’s state for a few minutes, we can use a key-value store like Redis. We can store a unique identifier for the user’s request as the key and store all the embeddings as the value.
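Here is a minimal sketch of what that could look like. It assumes `client` behaves like a redis-py `redis.Redis` instance (with the default byte responses) and uses only its `set(name, value, ex=...)` and `get(name)` calls; the key prefix, the helper names, and the 10-minute TTL are illustrative choices, not the setup described in this article.

```python
import io
from typing import Optional

import numpy as np

def pack_embeddings(embeddings: np.ndarray) -> bytes:
    """Serialize an array to bytes in .npy format, which records dtype and shape."""
    buffer = io.BytesIO()
    np.save(buffer, embeddings, allow_pickle=False)
    return buffer.getvalue()

def unpack_embeddings(payload: bytes) -> np.ndarray:
    """Inverse of pack_embeddings."""
    return np.load(io.BytesIO(payload), allow_pickle=False)

def store_embeddings(client, request_id: str, embeddings: np.ndarray, ttl_seconds: int = 600) -> None:
    # Assumption: `client` is redis.Redis-like. `ex` sets an expiry in seconds,
    # so the short-lived embeddings are dropped without any explicit cleanup.
    client.set(f"embeddings:{request_id}", pack_embeddings(embeddings), ex=ttl_seconds)

def load_embeddings(client, request_id: str) -> Optional[np.ndarray]:
    # Returns None when the key is missing or has already expired.
    payload = client.get(f"embeddings:{request_id}")
    return None if payload is None else unpack_embeddings(payload)
```

Once loaded, the array can be passed straight to the brute-force search, for example run_knn(load_embeddings(client, request_id), query_embedding, top_k), so there is no indexing step between storing the embeddings and answering the first question.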

This gives us a lightweight, low-complexity solution that’s well-suited to our use case of short-lived, low-query contexts.

Real-World Example: Why We Chose a Key-Value Store


At Planck, we answer insurance-related questions about businesses. A typical request begins with a business name and address, and then we retrieve real-time data about that specific business, including its online presence, registrations, and other public records. This data becomes our context, and we use LLMs and algorithms to answer questions based on it.

The important bit is that every time we get a request, we generate a fresh context. We’re not reusing existing data: it’s fetched on demand and remains relevant for a few minutes at most.

If you think back to the earlier benchmark, this pattern should already be triggering your “this is not a vector DB use case” sensor.

Every time we receive a request, we generate fresh embeddings for short-lived data that we’ll likely query only a few hundred times. Indexing those embeddings in a vector DB adds unnecessary latency. In contrast, with Redis, we can immediately store the embeddings and run a quick similarity search in the application code with almost no indexing delay.

That’s why we chose Redis instead of a vector database. While vector DBs are excellent at handling large volumes of embeddings and supporting fast nearest-neighbor queries, they introduce indexing overhead, and in our case, that overhead is not worth it.

In Conclusion

If you need to store millions of embeddings and support high-query workloads across a shared corpus, a vector DB would be a better fit. And yes, there are definitely use cases out there that truly need and benefit from a vector DB.

But just because you’re using embeddings or building a RAG system doesn’t mean you should default to a vector DB.

Each database technology has its strengths and trade-offs. The best choice begins with a deep understanding of your data and use case, rather than mindlessly following the trend.

So, the next time you need to choose a database, pause for a moment and ask: am I choosing the right one based on objective trade-offs, or am I just going with the trendiest, shiniest choice?
