Keeping an up-to-date asset inventory with real devices deployed in the field can be a challenging and time-consuming task. Many electricity providers use manufacturer’s labels as key information to link their physical assets within asset inventory systems. Computer vision can be a viable solution to speed up operator inspections and reduce human errors by automatically extracting relevant data from the label. However, building a standard computer vision application capable of managing hundreds of different types of labels can be a complex and time-consuming endeavor.
In this post, we present a solution using generative AI and large language models (LLMs) to alleviate the time-consuming and labor-intensive tasks required to build a computer vision application, enabling you to immediately start taking pictures of your asset labels and extract the necessary information to update the inventory using AWS services like AWS Lambda, Amazon Bedrock, Amazon Titan, Anthropic’s Claude 3 on Amazon Bedrock, Amazon API Gateway, AWS Amplify, Amazon Simple Storage Service (Amazon S3), and Amazon DynamoDB.
LLMs are large deep learning models that are pre-trained on vast amounts of data. They are capable of understanding and generating human-like text, making them incredibly versatile tools with a wide range of applications. This approach harnesses the image understanding capabilities of Anthropic’s Claude 3 model to extract information directly from photographs taken on-site, by analyzing the labels present in those field images.
The AI-powered asset inventory labeling solution aims to streamline the process of updating inventory databases by automatically extracting relevant information from asset labels through computer vision and generative AI capabilities. The solution uses various AWS services to create an end-to-end system that enables field technicians to capture label images, extract data using AI models, verify the accuracy, and seamlessly update the inventory database.
The following diagram illustrates the solution architecture.
The workflow consists of the following steps:
Let’s look at the key components of the solution in more detail.
The mobile app component plays a crucial role in this AI-powered asset inventory labeling solution. It serves as the primary interface for field technicians on their tablets or mobile devices to capture and upload images of asset labels using the device’s camera. The implementation of the mobile app includes an authentication mechanism that will allow access only to authenticated users. It’s also built using a serverless approach to minimize recurring costs and have a highly scalable and robust solution.
The mobile app has been built using the following services:
The AI data extraction service is designed to extract critical information, such as manufacturer name, model number, and serial number from images of asset labels.
To enhance the accuracy and efficiency of the data extraction process, the service employs a knowledge base comprising sample label images and their corresponding data fields. This knowledge base serves as a reference guide for the AI model, enabling it to learn and generalize from labeled examples to new label formats effectively. The knowledge base is stored as vector embeddings in a high-performance vector database: Meta’s FAISS (Facebook AI Similarity Search), hosted on Amazon S3.
Embeddings are dense numerical representations that capture the essence of complex data like text or images in a vector space. Each data point is mapped to a vector or ordered list of numbers, where similar data points are positioned closer together. This embedding space allows for efficient similarity calculations by measuring the distance between vectors. Embeddings enable machine learning (ML) models to effectively process and understand relationships within complex data, leading to improved performance on various tasks like natural language processing and computer vision.
The following diagram illustrates an example workflow.
The vector embeddings are generated using Amazon Titan, a powerful embedding generation service, which converts the labeled examples into numerical representations suitable for efficient similarity searches. The workflow consists of the following steps:
In this phase, the AWS services used are:
Data verification plays a crucial role in maintaining the accuracy and reliability of the extracted data before updating the asset inventory database and is included in the mobile app.
The workflow consists of the following steps:
Data verification uses the following AWS services:
This component assesses the discrepancy of automatically identified data by the AI data extraction service and the final data approved by the field operator and computes the difference. If the difference is below a configured threshold, the data is sent to update the inventory database; otherwise a human review process is engaged:
The backend review component uses the following AWS services:
The inventory database component plays a crucial role in storing and managing the verified asset data in a scalable and efficient manner. Amazon DynamoDB, a fully managed NoSQL database service from AWS, is used for this purpose. DynamoDB is a serverless, scalable, and highly available key-value and document database service. It’s designed to handle massive amounts of data and high traffic workloads, making it well-suited for storing and retrieving large-scale inventory data.
The verified data from the AI extraction and human verification processes is ingested into the DynamoDB table. This includes data with high confidence from the initial extraction, as well as data that has been reviewed and corrected by human reviewers.
The knowledge base update component enables continuous improvement and adaptation of the generative AI models used for asset label data extraction:
The knowledge base update process makes sure that the solution remains adaptive and continuously improves its performance over time, reducing the likelihood of unseen label examples and the involvement of subject matter experts to correct the extracted data.
This component uses the following AWS services:
This section explains how users interact with the system and how data flows between different components of the solution. We’ll examine each key component’s role in the process, from initial user access through data verification and storage.
The end user accesses the mobile app using the browser included in the handheld device. The application URL to access the mobile app is available after you have deployed the frontend application. Using the browser on a handheld device or your PC, browse to the application URL address, where a login window will appear. Because this is a demo environment, you can register on the application by following the automated registration workflow implemented through Amazon Cognito and choosing Create Account, as shown in the following screenshot.

During the registration process, you must provide a valid email address that will be used to verify your identity, and define a password. After you’re registered, you can log in with your credentials.
After authentication is complete, the mobile app appears, as shown in the following screenshot.

The process to use the app is the following:
This module uses Anthropic’s Claude 3 FM, a multimodal system capable of processing both images and text. To extract relevant data, we employ a prompt technique that uses samples to guide the model’s output. Our prompt includes two sample images along with their corresponding extracted text. The model identifies which sample image most closely resembles the one we want to analyze and uses that sample’s extracted text as a reference to determine the relevant information in the target image.
We use the following prompt to achieve this result:
{
"role": "user",
"content": [
{
"type": "text",
"text": "first_sample_image:",
},
{
"type": "image",
"source": {
"type": "base64",
"media_type": "image/jpeg",
"data": first_sample_encoded_image,
},
},
{
"type": "text",
"text": "target_image:",
},
{
"type": "image",
"source": {
"type": "base64",
"media_type": "image/jpeg",
"data": encoded_image,
},
},
{"type": "text",
"text": f"""
answer the question using the following example as reference.
match exactly the same set of fields and information as in the provided example.
<example>
analyze first_sample_image and answer with a json file with the following information: Model, SerialN, ZOD.
answer only with json.
Answer:
{first_sample_answer}
</example>
<question>
analyze target_image and answer with a json file with the following information: Model, SerialN, ZOD.
answer only with json.
Answer:
</question>
"""},
],
}
In the preceding code, first_sample_encoded_image and first_sample_answer are the reference image and expected output, respectively, and encoded_image contains the new image that has to be analyzed.
After the image is processed by the AI data extraction service, the control goes back to the mobile app:

This process is implemented using Amazon A2I:

To keep this solution as simple as possible while covering the required capability, we selected DynamoDB as our inventory database. This is a no SQL database, and we will store data in a table with the following information:
DynamoDB offers an on-demand pricing model that allows costs to directly depend on actual database usage.
The knowledge base database is stored as two files in an S3 bucket:
To be able to minimize race conditions when updating the database, a single Lambda function is configured as the consumer of the SQS queue. The Lambda function extracts the information about the link to the reference image and the metadata, certified by the back-office operator, updates both files, and stores the new version in the S3 bucket.
In the following sections, we create a seamless workflow for field data collection, AI-powered extraction, human validation, and inventory updates.
You need the following prerequisites before you can proceed with solution. For this post, we use the us-east-1 Region. You will also need an AWS Identity and Access Management (IAM) user with administrative privileges to deploy the required components and a development environment with access to AWS resources already configured.
For the development environment, you can use an Amazon Elastic Compute Cloud (Amazon EC2) instance (choose select at least a t3.small instance type in order to be able to build the web application) or use a development environment of your own choice. Install Python 3.9 and install and configure AWS Command Line Interface (AWS CLI).
You will also need to install the Amplify CLI. Refer to Set up Amplify CLI for more information.
The next step is to enable the models used in this workshop in Amazon Bedrock. To do this, complete the following steps:

A new window will list the requested models.
The next step is to create an Amazon SageMaker Ground Truth private labeling workforce that will be used to perform back-office activities. If you don’t already have a private labeling workforce in your account, you can create one following these steps:



git clone https://github.com/aws-samples/Build_a_computer_vision_based_asset_inventory_app_with_low_no_training
cd Build_a_computer_vision_based_asset_inventory_app_with_low_no_training
bash build_lambda_layer.sh
You should get an output similar to the following screenshot.

Save theLAMBDA_LAYER_VERSION_ARN for later use.
You are now ready to deploy the backend infrastructure and frontend application.
The backend is deployed using AWS CloudFormation to build the following components:
Download the CloudFormation template, then complete the following steps:

asset-inventory).
Wait until the CloudFormation stack creation process is complete; it will take about 15–20 minutes. You can then view the stack details.

Note the values on the Outputs tab. You will use the output data later to complete the configuration of the frontend application.

In this section, you will build the web application that is used by the on-site operator to collect a picture of the labels, submit it to the backend services to extract relevant information, validate or correct returned information, and submit the validated or corrected information to be stored in the asset inventory.
The web application uses React and will use the Amplify JavaScript Library.
Amplify provides several products to build full stack applications:
In this example, you have already created the needed services with the CloudFormation template, so the Amplify CLI will deploy the application on the Amplify provided hosting service.
git clone https://github.com/aws-samples/Build_a_computer_vision_based_asset_inventory_app_with_low_no_training
cd Build_a_computer_vision_based_asset_inventory_app_with_low_no_training
cd webapp
ln -s $HOME/.aws/credentials $HOME/.aws/config
amplify init
After issuing this command, the Amplify CLI will ask you for some parameters.
amplifyconfiguration.js.template (you can find it in folder webapp/src) with the information collected from the output of the CloudFormation stack and save as amplifyconfiguration.js. This file tells Amplify which is the correct endpoint to use to interact with the backend resources created for this application. The information required is as follows:
amplify hosting add
The Amplify CLI will ask you which type of hosting service you prefer and what type of deployment.
npm install
amplify publish
Y.At the end of the deployment phase, Amplify will return the public URL of the web application, similar to the following:
...
Find out more about deployment here:
https://cra.link/deployment
Zipping artifacts completed.
Deployment complete!
https://dev.xxx.amplifyapp.com
Now you can use your browser to connect to the application using the provided URL.
To delete the resources used to build this solution, complete the following steps:
amplify delete
At the end of the deletion process, you should not see the entry related to asset-inventory on the list of stacks.
aws lambda delete-layer-version —layer-name asset-inventory-blog —version-number 1
aws delete-workteam —workteam-name <the name you defined when you created the workteam>
In this post, we presented a solution that incorporates various AWS services to handle image storage (Amazon S3), mobile app development (Amplify), AI model hosting (Amazon Bedrock using Anthropic’s Claude), data verification (Amazon A2I), database (DynamoDB), and vector embeddings (Amazon Bedrock using Amazon Titan Multimodal Embeddings). It creates a seamless workflow for field data collection, AI-powered extraction, human validation, and inventory updates.
By taking advantage of the breadth of AWS services and integrating generative AI capabilities, this solution dramatically improves the efficiency and accuracy of asset inventory management processes. It reduces manual labor, accelerates data entry, and maintains high-quality inventory records, enabling organizations to optimize asset tracking and maintenance operations.
You can deploy this solution and immediately start collecting images of your assets to build or update your asset inventory.

Federico D’Alessio is an AWS Solutions Architect and joined AWS in 2018. He is currently working in the Power and Utility and Transportation market. Federico is cloud addict and when not at work, he tries to reach clouds with his hang glider.

Leonardo Fenu is a Solutions Architect, who has been helping AWS customers align their technology with their business goals since 2018. When he is not hiking in the mountains or spending time with his family, he enjoys tinkering with hardware and software, exploring the latest cloud technologies, and finding creative ways to solve complex problems.

Elisabetta Castellano is an AWS Solutions Architect focused on empowering customers to maximize their cloud computing potential, with expertise in machine learning and generative AI. She enjoys immersing herself in cinema, live music performances, and books.

Carmela Gambardella is an AWS Solutions Architect since April 2018. Before AWS, Carmela has held various roles in large IT companies, such as software engineer, security consultant and solutions architect. She has been using her experience in security, compliance and cloud operations to help public sector organizations in their transformation journey to the cloud. In her spare time, she is a passionate reader, she enjoys hiking, traveling and playing yoga.
Manuel Rioux est fièrement propulsé par WordPress