Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies through a single API, along with capabilities to build generative AI applications with security, privacy, and responsible AI.
Batch inference in Amazon Bedrock is for larger workloads where immediate responses aren’t critical. With a batch processing approach, organizations can analyze substantial datasets efficiently, with significant cost advantages: you can benefit from a 50% reduction in pricing compared to the on-demand option. This makes batch inference particularly valuable for handling extensive data to get inference from Amazon Bedrock FMs.
As organizations scale their use of Amazon Bedrock FMs for large-volume data processing, implementing effective monitoring and management practices for batch inference jobs becomes an important focus area for optimization. This solution demonstrates how to implement automated monitoring for Amazon Bedrock batch inference jobs using AWS serverless services such as AWS Lambda, Amazon DynamoDB, and Amazon EventBridge, reducing operational overhead while maintaining reliable processing of large-scale batch inference workloads. Through a practical example in the financial services sector, we show how to build a production-ready system that automatically tracks job status, provides real-time notifications, and maintains audit records of processing activities.
Consider a scenario where a financial services company manages millions of customer interactions and data points, including credit histories, spending patterns, and financial preferences. This company recognized the potential of using advanced AI capabilities to deliver personalized product recommendations at scale. However, processing such massive datasets in real time isn’t always necessary or cost-effective.
The solution presented in this post uses batch inference in Amazon Bedrock with automated monitoring to process large volumes of customer data efficiently using the following architecture.

This architecture workflow includes the following steps:
inference_results folder.This automated monitoring solution for Amazon Bedrock batch inference offers several key benefits:
To implement this solution, you must have the following:
Additionally, make sure to deploy the solution in an AWS Region that supports batch inference.
For this solution, we provide an AWS CloudFormation template that sets up the services included in the architecture, to enable repeatable deployments. This template creates the following resources:
To deploy the CloudFormation template, complete the following steps:


create-jsonl-file in its name.

finance-product-recommender-v1 prompt to see the prompt template input for the FM.
The job progresses through different statuses: Submitted, Validating, In Progress, and lastly Completed, or Failed. You can leave this page and check the status after a few hours.
The EventBridge rule will automatically trigger the second Lambda function with event-bridge-trigger in its name on completion of the job. This function will add an entry in the DynamoDB table named bedrock_batch_job_status with details of the execution, as shown in the following screenshot.

This DynamoDB table functions as a state manager for Amazon Bedrock batch inference jobs, tracking the lifecycle of each request. The columns of the table are logically divided into the following categories:
job_arn, job_name) – These columns provide the unique identifier and a human-readable name for each batch inference request, serving as the primary keys or core attributes for tracking.StartTime, EndTime, last_processed_timestamp, TotalDuration) – This category captures the temporal aspects and the overall progression of the job, allowing for monitoring of its current state, start/end times, and total processing duration. last_processed_timestamp is crucial for understanding the most recent activity or checkpoint.TotalRecordCount, ProcessedRecordCount, SuccessRecordCount, ErrorRecordCount) – These metrics provide granular insights into the processing efficiency and outcome of the batch job, highlighting data volume, successful processing rates, and error occurrences.InputTokenCount, OutputTokenCount) – Specifically designed for cost analysis, these columns track the consumption of tokens, which is a direct factor in Amazon Bedrock pricing, enabling accurate resource usage assessment.InputLocation, OutputLocation) – These columns link the inference job to its source and destination data within Amazon S3, maintaining traceability of the data involved in the batch processing.Complete the following steps to open the output file and view the recommendations for each customer generated by the FM:
model-invocation-job/, as illustrated in the following screenshot.
manifest contains information like number of tokens, number of successful records, and number of errors.The example in the following screenshot shows several recommended products and why the FM chose this product for the specific customer.

To optimize or enhance your monitoring solution, consider the following best practices:
inputTokenCount + outputTokenCount) / jobDuration)The cost estimate of running this solution one time is less than $1. The estimate for batch inference jobs considers Anthropic’s Claude 3.5 sonnet V2 model. Refer to Model pricing details for batch job pricing of other models on Amazon Bedrock.
If you no longer need this automated monitoring solution, follow these steps to delete the resources it created to avoid additional costs:
This automatically deletes the deployed stack and the resources created.
In this post, we demonstrated how a financial services company can use an FM to process large volumes of customer records and get specific data-driven product recommendations. We also showed how to implement an automated monitoring solution for Amazon Bedrock batch inference jobs. By using EventBridge, Lambda, and DynamoDB, you can gain real-time visibility into batch processing operations, so you can efficiently generate personalized product recommendations based on customer credit data. The solution addresses key challenges in managing batch inference operations:
This automated monitoring approach significantly enhances the ability to process large amounts of financial data using batch inference for Amazon Bedrock. This solution offers a scalable, efficient, and cost-effective approach to do batch inference for a variety of use cases, such as generating product recommendations, identifying fraud patterns, or analyzing financial trends in bulk, with the added benefit of real-time operational visibility.
Durga Prasad is a Senior Consultant at AWS, specializing in the Data and AI/ML. He has over 17 years of industry experience and is passionate about helping customers design, prototype, and scale Big Data and Generative AI applications using AWS native and open-source tech stacks.
Chanpreet Singh is a Senior Consultant at AWS with 18+ years of industry experience, specializing in Data Analytics and AI/ML solutions. He partners with enterprise customers to architect and implement cutting-edge solutions in Big Data, Machine Learning, and Generative AI using AWS native services, partner solutions and open-source technologies. A passionate technologist and problem solver, he balances his professional life with nature exploration, reading, and quality family time.
Manuel Rioux est fièrement propulsé par WordPress