Retrieval Augmented Generation (RAG) has become a crucial technique for improving the accuracy and relevance of AI-generated responses. The effectiveness of RAG heavily depends on the quality of context provided to the large language model (LLM), which is typically retrieved from vector stores based on user queries. The relevance of this context directly impacts the model’s ability to generate accurate and contextually appropriate responses.
One effective way to improve context relevance is through metadata filtering, which allows you to refine search results by pre-filtering the vector store based on custom metadata attributes. By narrowing down the search space to the most relevant documents or chunks, metadata filtering reduces noise and irrelevant information, enabling the LLM to focus on the most relevant content.
In some use cases, particularly those involving complex user queries or a large number of metadata attributes, manually constructing metadata filters can become challenging and potentially error-prone. To address these challenges, you can use LLMs to create a robust solution. This approach, which we call intelligent metadata filtering, uses tool use (also known as function calling) to dynamically extract metadata filters from natural language queries. Function calling allows LLMs to interact with external tools or functions, enhancing their ability to process and respond to complex queries.
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI. One of its key features, Amazon Bedrock Knowledge Bases, allows you to securely connect FMs to your proprietary data using a fully managed RAG capability and supports powerful metadata filtering capabilities.
In this post, we explore an innovative approach that uses LLMs on Amazon Bedrock to intelligently extract metadata filters from natural language queries. By combining the capabilities of LLM function calling and Pydantic data models, you can dynamically extract metadata from user queries. This approach can also enhance the quality of retrieved information and responses generated by the RAG applications.
This approach not only addresses the challenges of manual metadata filter construction, but also demonstrates how you can use Amazon Bedrock to create more effective and user-friendly RAG applications.
Metadata filtering is a powerful feature that allows you to refine search results by pre-filtering the vector store based on custom metadata attributes. This approach narrows down the search space to the most relevant documents or passages, reducing noise and irrelevant information. For a comprehensive overview of metadata filtering and its benefits, refer to Amazon Bedrock Knowledge Bases now supports metadata filtering to improve retrieval accuracy.
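To make the idea of pre-filtering concrete, the following is a minimal sketch in plain Python. It does not call Amazon Bedrock: the chunks, the `score` values (standing in for vector similarity), and the helper names `matches` and `search` are all hypothetical. It only illustrates that the metadata filter is applied first, and ranking by similarity happens among the chunks that pass.

```python
from typing import Any, Dict, List

# Hypothetical chunks; "score" stands in for vector similarity to the query.
CHUNKS: List[Dict[str, Any]] = [
    {"text": "Galactic Siege", "metadata": {"genres": "Strategy", "year": 2024}, "score": 0.71},
    {"text": "Castle Tactics", "metadata": {"genres": "Strategy", "year": 2019}, "score": 0.90},
    {"text": "Turbo Racer", "metadata": {"genres": "Racing", "year": 2024}, "score": 0.85},
]


def matches(metadata, genre=None, min_year=None):
    """Return True if the chunk metadata satisfies every supplied condition."""
    if genre is not None and metadata.get("genres") != genre:
        return False
    if min_year is not None and metadata.get("year", 0) < min_year:
        return False
    return True


def search(chunks, top_k=2, genre=None, min_year=None):
    """Pre-filter on metadata, then rank the survivors by similarity score."""
    candidates = [c for c in chunks if matches(c["metadata"], genre, min_year)]
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    return [c["text"] for c in ranked[:top_k]]


unfiltered = search(CHUNKS)
filtered = search(CHUNKS, genre="Strategy", min_year=2023)
print(unfiltered)  # ['Castle Tactics', 'Turbo Racer']
print(filtered)    # ['Galactic Siege']
```

Without the filter, the two highest-scoring chunks are returned even though neither is a recent strategy game. With the filter, only the chunk that satisfies both conditions reaches the ranking step.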
In RAG applications, the accuracy and relevance of generated responses heavily depend on the quality of the context provided to the LLM. This context, typically retrieved from the knowledge base based on user queries, directly impacts the model’s ability to generate accurate and contextually appropriate outputs.
To evaluate the effectiveness of a RAG system, we focus on three key metrics:
By implementing dynamic metadata filtering, you can significantly improve these metrics, leading to more accurate and relevant RAG responses. Let’s explore how to implement this approach using Amazon Bedrock and Pydantic.
In this section, we illustrate the flow of the dynamic metadata filtering solution using the tool use (function calling) capability. The following diagram illustrates the high-level RAG architecture with dynamic metadata filtering.

The process consists of the following steps:
This architecture uses the power of tool use for intelligent metadata extraction from a user’s query, combined with the robust RAG capabilities of Amazon Bedrock Knowledge Bases. The key innovation lies in Step 2, where the LLM is used to dynamically interpret the user’s query and extract relevant metadata for filtering. This approach allows for more flexible and intuitive querying, because users can express their information needs in natural language without having to manually specify metadata filters.
The subsequent steps (3–4) follow a more standard RAG workflow, but with the added benefit of using the dynamically generated metadata filter to improve the relevance of retrieved documents. This combination of intelligent metadata extraction and traditional RAG techniques results in more accurate and contextually appropriate responses to user queries.
Before proceeding with this tutorial, make sure you have the following in place:
In the following sections, we explore how to implement dynamic metadata filtering using the tool use feature in Amazon Bedrock and Pydantic for data validation.
Tool use is a powerful feature in Amazon Bedrock that allows models to access external tools or functions to enhance their response generation capabilities. When you send a message to a model, you can provide definitions for one or more tools that could potentially help the model generate a response. If the model determines it needs a tool, it responds with a request for you to call the tool, including the necessary input parameters.
In our example, we use Amazon Bedrock to extract entities like genre and year from natural language queries about video games. For a query like “A strategy game with cool graphics released after 2023?”, it extracts “strategy” (genre) and “2023” (year). These extracted entities are then used to dynamically construct metadata filters that retrieve only relevant games from the knowledge base. This allows flexible, natural language querying with precise metadata filtering.
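As a rough illustration of what a tool request looks like, the following sketch parses a hand-written dictionary shaped like a Converse API response. Nothing here calls Amazon Bedrock: the `toolUseId`, the assistant text, and the extracted values are made up, and `find_tool_input` is a hypothetical helper name. The tool name `extract_entities` matches the tool defined later in this post.

```python
from typing import Any, Dict, Optional

# Hand-written example shaped like a Converse API response that requests a tool.
# The toolUseId and the extracted values are made up for illustration.
mock_response: Dict[str, Any] = {
    "stopReason": "tool_use",
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {"text": "I will extract the entities from the query."},
                {
                    "toolUse": {
                        "toolUseId": "example-id-123",
                        "name": "extract_entities",
                        "input": {"entities": [{"genre": "Strategy", "year": "2023"}]},
                    }
                },
            ],
        }
    },
}


def find_tool_input(response: Dict[str, Any], tool_name: str) -> Optional[Dict[str, Any]]:
    """Return the input of the first toolUse block with the given name, if any."""
    for block in response.get("output", {}).get("message", {}).get("content", []):
        tool_use = block.get("toolUse")
        if tool_use and tool_use.get("name") == tool_name:
            return tool_use.get("input")
    return None


tool_input = find_tool_input(mock_response, "extract_entities")
print(tool_input)
```

The content list can mix plain text blocks with tool requests, which is why the helper checks each block for a `toolUse` key instead of assuming a fixed position.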
First, set up your environment with the necessary imports and Boto3 clients:
import json
from typing import List, Optional

import boto3
from pydantic import BaseModel, validator

region = "us-east-1"

# Runtime client for model inference (Converse API)
bedrock = boto3.client("bedrock-runtime", region_name=region)
# Runtime client for knowledge base retrieval, pinned to the same Region
bedrock_agent_runtime = boto3.client("bedrock-agent-runtime", region_name=region)

MODEL_ID = "<add-model-id>"
kb_id = "<Your-Knowledge-Base-ID>"
For this solution, you use Pydantic models to validate and structure the extracted entities:
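class Entity(BaseModel):
    # Defaults of None keep both fields optional under Pydantic v1 and v2
    genre: Optional[str] = None
    year: Optional[str] = None


class ExtractedEntities(BaseModel):
    entities: List[Entity]

    # Pydantic v1-style validator; runs on the raw dicts before they become Entity objects
    @validator('entities', pre=True)
    def remove_duplicates(cls, entities):
        unique_entities = []
        seen = set()
        for entity in entities:
            # Sorted items give a hashable key that ignores dict key order
            entity_tuple = tuple(sorted(entity.items()))
            if entity_tuple not in seen:
                seen.add(entity_tuple)
                unique_entities.append(dict(entity_tuple))
        return unique_entities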
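The deduplication logic in the validator can be tried in isolation. The following sketch repeats the same idea as a standalone function using only the standard library, so it runs without Pydantic; the function name `remove_duplicate_entities` and the sample data are hypothetical.

```python
from typing import Dict, List, Optional

EntityDict = Dict[str, Optional[str]]


def remove_duplicate_entities(entities: List[EntityDict]) -> List[EntityDict]:
    """Drop repeated entity dicts while preserving first-seen order."""
    unique_entities = []
    seen = set()
    for entity in entities:
        # Sorting the items makes the key independent of dict insertion order.
        entity_key = tuple(sorted(entity.items()))
        if entity_key not in seen:
            seen.add(entity_key)
            unique_entities.append(dict(entity_key))
    return unique_entities


raw = [
    {"genre": "Strategy", "year": "2023"},
    {"year": "2023", "genre": "Strategy"},  # same entity, different key order
    {"genre": "Racing", "year": "2021"},
]
deduplicated = remove_duplicate_entities(raw)
print(deduplicated)
```

Because dictionaries are not hashable, each entity is converted to a sorted tuple of its items before being added to the `seen` set. This assumes all values are hashable, which holds for the string fields used here.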
Next, define a tool for entity extraction with basic instructions and use it with Amazon Bedrock. Tailor the tool and field descriptions to your use case, because the model relies on them to decide what to extract:
tools = [
    {
        "toolSpec": {
            "name": "extract_entities",
            "description": "Extract named entities from the text. If you are not 100% sure of the entity value, use 'unknown'.",
            "inputSchema": {
                "json": {
                    "type": "object",
                    "properties": {
                        "entities": {
                            "type": "array",
                            "items": {
                                "type": "object",
                                "properties": {
                                    "genre": {"type": "string", "description": "The genre of the game. The first letter is uppercase."},
                                    "year": {"type": "string", "description": "The year when the game was released."}
                                },
                                "required": ["genre", "year"]
                            }
                        }
                    },
                    "required": ["entities"]
                }
            }
        }
    }
]


def extract_entities(text):
    # Send the query together with the tool definition so the model can request the tool
    response = bedrock.converse(
        modelId=MODEL_ID,
        inferenceConfig={
            "temperature": 0,
            "maxTokens": 4000
        },
        toolConfig={"tools": tools},
        messages=[{"role": "user", "content": [{"text": text}]}]
    )

    # Look for the extract_entities tool request in the model's reply
    json_entities = None
    for content in response['output']['message']['content']:
        if "toolUse" in content and content['toolUse']['name'] == "extract_entities":
            json_entities = content['toolUse']['input']
            break

    if json_entities:
        # parse_obj is the Pydantic v1-style call (model_validate in v2)
        return ExtractedEntities.parse_obj(json_entities)
    else:
        print("No entities found in the response.")
        return None
Create a function to construct the metadata filter based on the extracted entities:
def construct_metadata_filter(extracted_entities):
    if not extracted_entities or not extracted_entities.entities:
        return None

    # Only the first extracted entity is used to build the filter
    entity = extracted_entities.entities[0]
    conditions = []

    if entity.genre and entity.genre != 'unknown':
        conditions.append({
            "equals": {
                "key": "genres",
                "value": entity.genre
            }
        })

    if entity.year and entity.year != 'unknown':
        conditions.append({
            "greaterThanOrEquals": {
                "key": "year",
                "value": int(entity.year)
            }
        })

    if not conditions:
        return None
    # The Retrieve API documents andAll as taking at least two conditions,
    # so a single condition is returned on its own
    if len(conditions) == 1:
        return conditions[0]
    return {"andAll": conditions}
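The filter construction above calls `int(entity.year)`, which raises `ValueError` if the model returns anything other than digits (for example, "after 2023"). The following is a minimal sketch of a more defensive conversion, assuming release years are four-digit numbers. `parse_year` is a hypothetical helper and not part of the original solution.

```python
import re
from typing import Optional


def parse_year(value: Optional[str]) -> Optional[int]:
    """Return the first standalone four-digit number in the string, or None."""
    if not value or value == "unknown":
        return None
    match = re.search(r"\b(\d{4})\b", value)
    return int(match.group(1)) if match else None


print(parse_year("2023"))        # 2023
print(parse_year("after 2023"))  # 2023
print(parse_year("unknown"))     # None
print(parse_year("last year"))   # None
```

A helper like this could replace the direct `int(...)` call, with the year condition skipped whenever it returns `None`.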
Finally, create a main function to process the query and retrieve results:
def process_query(text):
    extracted_entities = extract_entities(text)
    metadata_filter = construct_metadata_filter(extracted_entities)

    # Only pass a filter when one was extracted; boto3 rejects a None filter value
    vector_search_config = {}
    if metadata_filter:
        vector_search_config["filter"] = metadata_filter

    response = bedrock_agent_runtime.retrieve(
        knowledgeBaseId=kb_id,
        retrievalConfiguration={
            "vectorSearchConfiguration": vector_search_config
        },
        retrievalQuery={
            'text': text
        }
    )
    return response


# Example usage
text = "A strategy game with cool graphics released after 2023"
result = process_query(text)

# Print results
for game in result.get('retrievalResults', []):
    print(f"Title: {game.get('content').get('text').split(':')[0].split(',')[-1].replace('score ','')}")
    print(f"Year: {game.get('metadata').get('year')}")
    print(f"Genre: {game.get('metadata').get('genres')}")
    print("---")
This implementation uses the tool use feature in Amazon Bedrock to dynamically extract entities from user queries. It then uses these entities to construct metadata filters, which are applied when retrieving results from the knowledge base.
The key advantages of this approach include:
When implementing dynamic metadata filtering, it’s important to consider and handle edge cases. In this section, we discuss some ways you can address them.
If the tool use process fails to extract metadata from the user query due to an absence of filters or errors, you have several options:
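# Option 1: fall back to retrieval without a metadata filter
if not metadata_filter:
    response = bedrock_agent_runtime.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={'text': text}
    )

# Option 2: apply a predefined default filter
# (a single condition is passed on its own, without an andAll wrapper)
default_filter = {"greaterThanOrEquals": {"key": "year", "value": 2020}}
metadata_filter = metadata_filter or default_filter

# Option 3: return an error message and ask the user for more detail
# (this snippet belongs inside a function such as process_query)
if not metadata_filter:
    return {
        "error": "I'm sorry, but I couldn't understand the specific details of your request. Could you please provide more information about the type of game or the release year you're interested in?"
    }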
Returning an error message in this way makes sure that only queries with clear, extractable metadata are processed, potentially reducing errors and improving overall response quality.
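These options can also be combined. The following sketch tries the filtered search first and falls back to an unfiltered search when no filter was extracted. As an additional assumption beyond the options above, it also falls back when the filter matches nothing. The names `retrieve_with_fallback` and `fake_retrieve` are hypothetical, and `fake_retrieve` is a local stand-in for the knowledge base call so the example runs without AWS access.

```python
from typing import Any, Callable, Dict, List, Optional

RetrieveFn = Callable[[str, Optional[Dict[str, Any]]], List[Dict[str, Any]]]


def retrieve_with_fallback(retrieve_fn: RetrieveFn, text: str,
                           metadata_filter: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Try filtered retrieval first; fall back to unfiltered retrieval if needed."""
    if metadata_filter:
        results = retrieve_fn(text, metadata_filter)
        if results:
            return {"results": results, "used_filter": True}
    # No filter was extracted, or the filter matched nothing: search without it.
    return {"results": retrieve_fn(text, None), "used_filter": False}


def fake_retrieve(text, metadata_filter):
    """Hypothetical stand-in for the knowledge base call; ignores the query text."""
    games = [{"title": "Castle Tactics", "year": 2019}]
    if metadata_filter is None:
        return games
    min_year = metadata_filter["greaterThanOrEquals"]["value"]
    return [g for g in games if g["year"] >= min_year]


year_filter = {"greaterThanOrEquals": {"key": "year", "value": 2023}}
outcome = retrieve_with_fallback(fake_retrieve, "strategy games after 2023", year_filter)
print(outcome)
```

Returning a `used_filter` flag lets the calling code tell the user when results were not restricted by their stated criteria. Whether an empty filtered result should trigger a fallback at all depends on the application.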
The dynamic approach introduces an additional FM call to extract metadata, which will increase both cost and latency. To mitigate this, consider the following:
After you’ve finished experimenting with this solution, it’s crucial to clean up your resources to avoid unnecessary charges. For detailed cleanup instructions, see Amazon Bedrock Knowledge Bases now supports metadata filtering to improve retrieval accuracy. These steps will guide you through deleting your knowledge base, vector database, AWS Identity and Access Management (IAM) roles, and sample datasets, making sure that you don’t incur unexpected costs.
By implementing dynamic metadata filtering using Amazon Bedrock and Pydantic, you can significantly enhance the flexibility and power of RAG applications. This approach allows for more intuitive querying of knowledge bases, leading to improved context recall and more relevant AI-generated responses.
As you explore this technique, remember to balance the benefits of dynamic filtering against the additional computational costs. We encourage you to try this method in your own RAG applications and share your experiences with the community.
For additional resources, refer to the following:
Happy building with Amazon Bedrock!
Mani Khanuja is a Tech Lead – Generative AI Specialists, author of the book Applied Machine Learning and High-Performance Computing on AWS, and a member of the Board of Directors for Women in Manufacturing Education Foundation Board. She leads machine learning projects in various domains such as computer vision, natural language processing, and generative AI. She speaks at internal and external conferences such as AWS re:Invent, Women in Manufacturing West, YouTube webinars, and GHC 23. In her free time, she likes to go for long runs along the beach.
Ishan Singh is a Generative AI Data Scientist at Amazon Web Services, where he helps customers build innovative and responsible generative AI solutions and products. With a strong background in machine learning and natural language processing, Ishan specializes in developing safe and responsible AI systems that drive business value. Outside of work, he enjoys playing competitive volleyball, exploring local bike trails, and spending time with his wife and dog, Beau.