How to Context Engineer to Optimize Question Answering Pipelines

by AiNEWS2025
2025-09-06
in Machine Learning


Context engineering is one of the most relevant topics in machine learning today, which is why I’m writing my third article on the topic. My goal is both to broaden my understanding of engineering contexts for LLMs and to share that knowledge through my articles.

In today’s article, I’ll discuss how to improve the context you feed into your LLMs for question answering. Usually, this context is based on retrieval augmented generation (RAG). However, in today’s ever-shifting environment, this approach should be updated.

The co-founder of Chroma (a vector database provider) tweeted that RAG is dead. I don’t fully agree that we won’t use RAG anymore, but his tweet highlights how there are different options for filling the context of your LLM.

You can also read my previous context engineering articles:

  1. Basic Context engineering techniques
  2. Advanced context engineering techniques

Why you should care about context engineering

First, let me highlight three key points for why you should care about context engineering:

  • Better output quality by avoiding context rot. Fewer unnecessary tokens increase output quality. You can read more details about it in this article
  • Lower cost (unnecessary tokens cost money, so don’t send them)
  • Higher speed (fewer tokens = faster response times)

These are three core metrics for most question answering systems. The output quality is naturally of utmost priority, considering users will not want to use a low-performing system.

Furthermore, price should always be a consideration, and if you can lower it (without too much engineering cost), it’s a simple decision to do so. Lastly, a faster question answering system provides a better user experience. You don’t want users waiting several seconds for a response when ChatGPT responds much faster.

The traditional question-answering approach

Traditional, in this sense, means the most common question answering approach in systems built after the release of ChatGPT. This system is traditional RAG, which works as follows:

  1. Fetch the most relevant documents to the user’s question, using vector similarity retrieval
  2. Feed relevant documents along with a question into an LLM, and receive a response
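The two steps above can be sketched in a few lines. This is a minimal illustration, not a production setup: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and `retrieve` and `build_prompt` are hypothetical helper names. The final LLM call is left out, since it depends on the provider you use.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Step 1: return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_docs: list[str]) -> str:
    """Step 2 (first half): assemble the prompt that would be sent to an LLM."""
    context = "\n\n".join(context_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


documents = [
    "The warranty covers battery defects for two years.",
    "Our office is closed on public holidays.",
    "Battery replacements are free under warranty.",
]
question = "Is the battery covered by warranty?"
top_docs = retrieve(question, documents, k=2)
prompt = build_prompt(question, top_docs)
```

In a real system, `embed` would call an embedding model and the document vectors would be precomputed and stored in a vector database rather than recomputed per query.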

Considering its simplicity, this approach works incredibly well. Interestingly, the same holds for another traditional approach: BM25 has been around since 1994 and was, for example, recently utilized by Anthropic when they introduced Contextual Retrieval, which shows how effective even simple information retrieval techniques are.
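To make the BM25 mention concrete, here is a compact sketch of the scoring function. It assumes the commonly used non-negative IDF variant and the typical default parameters `k1=1.5` and `b=0.75`; the tokenizer and the function names are illustrative choices, not something taken from Anthropic's setup.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"\w+", text.lower())


def bm25_scores(query: str, documents: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with BM25."""
    docs = [tokenize(d) for d in documents]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates with k1; b normalizes for document length.
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


documents = [
    "The warranty covers battery defects for two years.",
    "Our office is closed on public holidays.",
    "Battery replacements are free under warranty.",
]
scores = bm25_scores("battery warranty", documents)
```

Both matching documents contain each query term once, so the shorter one scores higher because of the length normalization.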

However, you can still vastly improve your question answering system by updating your RAG using some techniques I’ll describe in the next section.

Improving RAG context fetching

Even though RAG works relatively well, you can likely achieve better performance by introducing the techniques I’ll discuss in this section. The techniques I describe here all focus on improving the context you feed to the LLM. You can improve this context with two main approaches:

  1. Use fewer tokens on irrelevant context (for example, removing irrelevant documents or using less material from the less relevant ones)
  2. Add documents that are relevant

Thus, you should focus on achieving at least one of the points above. In terms of precision and recall, the two approaches do the following, respectively:

  1. Increase precision (at the cost of recall)
  2. Increase recall (at the cost of precision)

This is a tradeoff you must make while working on context engineering your question answering system.
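The tradeoff is easy to see with numbers. The sketch below computes precision and recall for a narrow and a wide retrieval over a made-up set of document IDs; the IDs and the `precision_recall` helper are purely illustrative.

```python
def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    """Precision: share of retrieved docs that are relevant.
    Recall: share of relevant docs that were retrieved."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


relevant = {"doc1", "doc2", "doc3", "doc4"}

# Narrow retrieval: few documents, all of them relevant.
narrow = {"doc1", "doc2"}
# Wide retrieval: more relevant documents found, but also more noise.
wide = {"doc1", "doc2", "doc3", "doc5", "doc6", "doc7"}

narrow_precision, narrow_recall = precision_recall(narrow, relevant)
wide_precision, wide_recall = precision_recall(wide, relevant)
```

Here the narrow retrieval has precision 1.0 and recall 0.5, while the wide retrieval has precision 0.5 and recall 0.75.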

Reducing the number of irrelevant tokens

In this section, I highlight three main approaches to reduce the number of irrelevant tokens you feed into the LLM’s context:

  • Reranking
  • Summarization
  • Prompting GPT

When fetching documents from vector similarity search, they are returned in order of most relevant to least relevant, given the vector similarity score. However, this similarity score might not accurately represent which documents are most relevant.

Reranking

You can thus use a reranking model, for example, Qwen reranker, to reorder the document chunks. You can then choose to only keep the top X most relevant chunks (according to the reranker), which should remove some irrelevant documents from your context.
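A rough sketch of the reranking step follows. The `score_fn` argument is where a real reranking model would be plugged in; `toy_score` is only a placeholder based on word overlap so the example runs without any model, and all names are hypothetical.

```python
from typing import Callable


def rerank(
    query: str,
    chunks: list[str],
    score_fn: Callable[[str, str], float],
    top_x: int,
) -> list[str]:
    """Reorder chunks by the reranker's score and keep only the top X."""
    ordered = sorted(chunks, key=lambda c: score_fn(query, c), reverse=True)
    return ordered[:top_x]


def toy_score(query: str, chunk: str) -> float:
    """Placeholder for a reranking model: fraction of query words found in the chunk."""
    query_words = set(query.lower().split())
    chunk_words = set(chunk.lower().split())
    return len(query_words & chunk_words) / len(query_words)


# Chunks in the order returned by vector similarity search.
chunks = ["holiday opening hours", "battery warranty terms", "battery recycling guide"]
kept = rerank("battery warranty", chunks, toy_score, top_x=2)
```

After reranking, the unrelated first chunk is dropped and the two battery-related chunks remain, with the best match first.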

Summarization

You can also choose to summarize documents, reducing the number of tokens used per document. You can, for example, keep the full document from the top 10 most similar documents fetched, summarize documents ranked from 11-20, and discard the rest.

This approach will increase the likelihood that you keep the full context from relevant documents, while at least maintaining some context (the summary) from documents that are less likely to be relevant.
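The tiering described above could look roughly like this. The `summarize` argument would normally be an LLM call; `truncate_summary` is a placeholder that simply keeps the first few words, and the function names and defaults are assumptions for illustration.

```python
from typing import Callable


def build_tiered_context(
    ranked_docs: list[str],
    summarize: Callable[[str], str],
    full_n: int = 10,
    summary_n: int = 10,
) -> list[str]:
    """Keep the top documents in full, summarize the next tier, discard the rest."""
    full = ranked_docs[:full_n]
    summarized = [summarize(d) for d in ranked_docs[full_n:full_n + summary_n]]
    return full + summarized


def truncate_summary(doc: str, max_words: int = 5) -> str:
    """Placeholder summarizer: keep the first words. A real system would call an LLM."""
    return " ".join(doc.split()[:max_words])


# 25 documents, already ordered from most to least similar.
ranked_docs = [f"document {i} " + "word " * 20 for i in range(1, 26)]
context = build_tiered_context(ranked_docs, truncate_summary)
```

With 25 ranked documents, the resulting context holds 10 full documents and 10 short summaries, and the last 5 documents are dropped.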

Prompting GPT

Lastly, you can also ask GPT whether each fetched document is relevant to the user query. For example, if you fetch 15 documents, you can make 15 individual LLM calls to judge whether each document is relevant. You then discard the documents that are deemed irrelevant. Keep in mind that these LLM calls need to be parallelized to keep response time within an acceptable limit.
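One way to parallelize these relevance checks is a thread pool, since LLM calls are I/O-bound. In the sketch below, `judge` stands in for the per-document LLM call; `keyword_judge` is a placeholder so the example runs offline, and both names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def filter_relevant(
    query: str,
    documents: list[str],
    judge: Callable[[str, str], bool],
    max_workers: int = 8,
) -> list[str]:
    """Run one relevance judgment per document in parallel and keep the relevant ones."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the input documents.
        verdicts = list(pool.map(lambda d: judge(query, d), documents))
    return [d for d, keep in zip(documents, verdicts) if keep]


def keyword_judge(query: str, document: str) -> bool:
    """Placeholder for an LLM call that answers: is this document relevant to the query?"""
    return any(word in document.lower() for word in query.lower().split())


documents = ["Battery warranty terms", "Holiday schedule", "Warranty claims form"]
relevant_docs = filter_relevant("warranty", documents, keyword_judge)
```

With a real LLM behind `judge`, total latency is roughly that of the slowest single call rather than the sum of all calls, as long as `max_workers` is at least the number of documents.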

Adding relevant documents

Before or after removing irrelevant documents, you should also ensure you include the relevant ones. I include two main approaches in this subsection:

  • Better embedding models
  • Searching through more documents (at the cost of lower precision)

Better embedding models

To find the best embedding models, you can go to the HuggingFace embedding model leaderboard, where Gemini and Qwen are in the top 3 as of the writing of this article. Updating your embedding model is usually a cheap way to fetch more relevant documents. This is because computing and storing embeddings is usually cheap, for example, when embedding through the Gemini API and storing vectors in Pinecone.

Search more documents

Another (relatively simple) approach to fetch more relevant documents is to fetch more documents in general. Fetching more documents naturally increases the probability that you add relevant ones. However, you have to balance this against avoiding context rot and keeping the number of irrelevant documents to a minimum. Every unnecessary token in an LLM call is, as mentioned earlier, likely to:

  • Reduce output quality
  • Increase cost
  • Lower speed

These are all crucial aspects of a question-answering system.
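One simple way to balance fetching more documents against these costs is to cap the context with a token budget. The sketch below is an assumption-laden illustration: it counts whitespace-separated words as a rough proxy for tokens (real tokenizers count differently), and `fit_to_budget` is a hypothetical helper.

```python
def fit_to_budget(ranked_docs: list[str], max_tokens: int) -> list[str]:
    """Add documents in ranked order until the next one would exceed the budget."""
    selected: list[str] = []
    used = 0
    for doc in ranked_docs:
        cost = len(doc.split())  # rough proxy for the token count
        if used + cost > max_tokens:
            break
        selected.append(doc)
        used += cost
    return selected


ranked_docs = ["a b c", "d e", "f g h i", "j"]
selected = fit_to_budget(ranked_docs, max_tokens=6)
```

The loop stops at the first document that does not fit, so lower-ranked documents never displace higher-ranked ones.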

Agentic search approach

I’ve discussed agentic search approaches in previous articles, for example, when I discussed Scaling your AI Search. However, in this section, I’ll dive deeper into setting up an agentic search, which replaces some or all of the vector retrieval step in your RAG.

The first step is that the user asks a question about a given set of data, for example, a set of documents. You then set up an agentic system consisting of an orchestra agent and a list of sub-agents.

This figure highlights an orchestra system of LLM agents. The main agent receives the user query and assigns tasks to subagents. Image by ChatGPT.

This is an example of the pipeline the agents would follow (though there are many ways to set it up).

  1. Orchestra agent tells two subagents to iterate over all document filenames and return relevant documents
  2. Relevant documents are fed back to the orchestra agent, which again releases a subagent to each of the relevant documents, to fetch subparts (chunks) of the document that are relevant to the user’s question. These chunks are then fed back to the orchestra agent
  3. The orchestra agent answers the user’s question, given the provided chunks
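The three steps above can be sketched as plain function calls. Every agent here is a stub: in a real system `pick_files`, `pick_chunks`, and `answer` would each be an LLM call with its own system prompt. All names, the hyphenated filenames, and the keyword matching are assumptions made so the example runs offline.

```python
import re
from typing import Callable


def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def agentic_search(
    question: str,
    documents: dict[str, str],
    pick_files: Callable[[str, list[str]], list[str]],
    pick_chunks: Callable[[str, str], list[str]],
    answer: Callable[[str, list[str]], str],
) -> str:
    # Step 1: two subagents each scan half of the filenames.
    names = list(documents)
    mid = len(names) // 2
    relevant_files = pick_files(question, names[:mid]) + pick_files(question, names[mid:])
    # Step 2: one subagent per relevant document extracts the relevant chunks.
    chunks: list[str] = []
    for name in relevant_files:
        chunks.extend(pick_chunks(question, documents[name]))
    # Step 3: the orchestra agent answers from the collected chunks.
    return answer(question, chunks)


def pick_files(question: str, filenames: list[str]) -> list[str]:
    """Stub subagent: keep filenames sharing a word with the question."""
    return [f for f in filenames if words(question) & words(f)]


def pick_chunks(question: str, document: str) -> list[str]:
    """Stub subagent: treat each line as a chunk and keep matching lines."""
    return [line for line in document.split("\n") if words(question) & words(line)]


def answer(question: str, chunks: list[str]) -> str:
    """Stub orchestra agent: a real one would prompt an LLM with the chunks."""
    return f"Answer based on {len(chunks)} chunk(s): " + " ".join(chunks)


documents = {
    "warranty-policy.txt": "The warranty length is two years.\nShipping takes five days.",
    "office-hours.txt": "We open at nine.",
    "returns.txt": "Returns need a receipt.",
    "warranty-faq.txt": "Claims need a receipt.\nWarranty covers the battery.",
}
result = agentic_search("warranty length", documents, pick_files, pick_chunks, answer)
```

In practice, the subagent calls in steps 1 and 2 would run in parallel, and the cost grows with the number of documents each subagent has to read.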

Alternatively, you could store document embeddings and replace step one with vector similarity between the user question and each document.

This agentic approach has upsides and downsides.

Upsides:

  • Better chance of fetching relevant chunks than with traditional RAG
  • More control over the retrieval system. You can update system prompts, etc., while traditional RAG is relatively static with its embedding similarities

Downside:

  • Higher cost, since the agents make many more LLM calls than a single retrieval step

In my opinion, building such an agent-based retrieval system is a super powerful approach that can lead to amazing results. The consideration you have to make when building such a system is whether the increased quality you’ll (likely) see is worth the increase in cost.

Other context engineering aspects

In this article, I’ve mainly covered context engineering for the documents we fetch in a question answering system. However, there are also other aspects you should be aware of, mainly:

  • The system/user prompt you are using
  • Other information fed into the prompt

The prompt you write for your question answering system should be precise, structured, and free of irrelevant information. You can read many other articles on the topic of structuring prompts, and you can typically ask an LLM to improve these aspects of your prompt.

Sometimes, you also feed other information into your prompt. A common example is feeding in metadata, for example, data covering information about the user, such as:

  • Name
  • Job role
  • What they usually search for
  • etc

Whenever you add such information, you should always ask yourself:

Does adding this information help my question answering system answer the question?

Sometimes the answer is yes, other times it’s no. The most important part is that you made a rational decision on whether the information is needed in the prompt. If you can’t justify having this information in the prompt, it should usually be removed.
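One way to enforce that decision in code is an explicit allow-list of metadata fields, so nothing reaches the prompt unless you have justified it. This is a minimal sketch; the field names and the `build_system_prompt` helper are hypothetical.

```python
def build_system_prompt(base: str, metadata: dict[str, str], allowed_fields: set[str]) -> str:
    """Append only the metadata fields that were explicitly justified."""
    lines = [f"{key}: {value}" for key, value in metadata.items() if key in allowed_fields and value]
    if not lines:
        return base
    return base + "\n\nUser information:\n" + "\n".join(lines)


metadata = {"name": "Ada", "job_role": "Lawyer", "favorite_color": "blue"}
system_prompt = build_system_prompt(
    "You answer questions about internal legal documents.",
    metadata,
    allowed_fields={"job_role"},  # only the job role was judged useful here
)
```

Adding a field then requires a deliberate change to `allowed_fields`, which is a natural point to ask whether the information actually helps.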

Conclusion

In this article, I have discussed context engineering for your question answering system, and why it’s important. Question answering systems usually consist of an initial step to fetch relevant information. The focus of this step should be to reduce the number of irrelevant tokens to a minimum, while also including as many relevant pieces of information as possible.
