Live streaming has been gaining immense popularity in recent years, attracting an ever-growing number of viewers and content creators across various platforms. From gaming and entertainment to education and corporate events, live streams have become a powerful medium for real-time engagement and content consumption. However, as the reach of live streams expands globally, language barriers and accessibility challenges have emerged, limiting the ability of viewers to fully comprehend and participate in these immersive experiences.
Recognizing this need, we have developed a Chrome extension that harnesses the power of AWS AI and generative AI services, including Amazon Bedrock, an AWS managed service to build and scale generative AI applications with foundation models (FMs). This extension aims to revolutionize the live streaming experience by providing real-time transcription, translation, and summarization capabilities directly within your browser.
With this extension, viewers can seamlessly transcribe live streams into text, enabling them to follow along with the content even in noisy environments or when listening to audio is not feasible. Moreover, the extension’s translation capabilities open up live streams to a global audience, breaking down language barriers and fostering more inclusive participation. By offering real-time translations into multiple languages, viewers from around the world can engage with live content as if it were delivered in their first language.
In addition, the extension’s capabilities extend beyond mere transcription and translation. Using the advanced natural language processing and summarization capabilities of FMs available through Amazon Bedrock, the extension can generate concise summaries of the content being transcribed in real time. This innovative feature empowers viewers to catch up with what is being presented, making it simpler to grasp key points and highlights, even if they have missed portions of the live stream or find it challenging to follow complex discussions.
In this post, we explore the approach behind building this powerful extension and provide step-by-step instructions to deploy and use it in your browser.
The solution is powered by two AWS AI services, Amazon Transcribe and Amazon Translate, along with Amazon Bedrock, a fully managed service that allows you to build generative AI applications. The solution also uses Amazon Cognito user pools and identity pools for managing authentication and authorization of users, Amazon API Gateway REST APIs, AWS Lambda functions, and an Amazon Simple Storage Service (Amazon S3) bucket.
After deploying the solution, you can access the following features:
Live transcription is available in the more than 50 languages currently supported by Amazon Transcribe streaming (including Chinese, English, French, German, Hindi, Italian, Japanese, Korean, Brazilian Portuguese, Spanish, and Thai), while translation is available in the more than 75 languages currently supported by Amazon Translate.
The following diagram illustrates the architecture of the application.

The solution workflow includes the following steps:
In the following sections, we walk through how to deploy the Chrome extension and the underlying backend resources and set up the extension, then we demonstrate using the extension in a sample use case.
For this walkthrough, you should have the following prerequisites:
The first step consists of deploying an AWS Cloud Development Kit (AWS CDK) application that automatically provisions and configures the required AWS resources, including:
Complete the following steps to deploy the AWS CDK application:
git clone https://github.com/aws-samples/aws-transcribe-translate-summarize-live-streams-in-browser.git
cd aws-transcribe-translate-summarize-live-streams-in-browser
In the cdk/bin/config.json file, populate the following configuration variables:
{
"prefix": "aaa123",
"aws_region": "us-west-2",
"bedrock_region": "us-west-2",
"bucket_name": "summarization-test",
"bedrock_model_id": "anthropic.claude-3-sonnet-20240229-v1:0"
}
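Before deploying, it can help to confirm that the file contains every expected key. The following is a minimal sketch, not part of the project: the findConfigProblems helper and the REQUIRED_KEYS list are hypothetical names, and the only assumption is that each of the five variables shown above must be a non-empty string.

```javascript
// Hypothetical helper: checks that a parsed cdk/bin/config.json object
// contains the five keys shown above, each as a non-empty string.
const REQUIRED_KEYS = [
  "prefix",
  "aws_region",
  "bedrock_region",
  "bucket_name",
  "bedrock_model_id",
];

function findConfigProblems(config) {
  const problems = [];
  for (const key of REQUIRED_KEYS) {
    const value = config[key];
    if (typeof value !== "string" || value.trim() === "") {
      problems.push(`"${key}" is missing or empty`);
    }
  }
  return problems;
}

// Same sample values as in the configuration above.
const sample = {
  prefix: "aaa123",
  aws_region: "us-west-2",
  bedrock_region: "us-west-2",
  bucket_name: "summarization-test",
  bedrock_model_id: "anthropic.claude-3-sonnet-20240229-v1:0",
};

console.log(findConfigProblems(sample)); // empty list for the sample values
console.log(findConfigProblems({ ...sample, bucket_name: "" })); // reports bucket_name
```

In practice you would pass the result of JSON.parse on the file contents instead of the inline sample object.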
The template launches in the us-east-2 AWS Region by default. To launch the solution in a different Region, change the aws_region parameter accordingly. Make sure to select a Region in which all the AWS services in scope (Amazon Transcribe, Amazon Translate, Amazon Bedrock, Amazon Cognito, API Gateway, Lambda, Amazon S3) are available.
The Region used for bedrock_region can be different from aws_region because you might have access to Amazon Bedrock models in a Region different from the Region where you want to deploy the project.
By default, the project uses Anthropic’s Claude 3 Sonnet as a summarization model; however, you can use a different model by changing the bedrock_model_id in the configuration file. For the complete list of model IDs, see Amazon Bedrock model IDs. When selecting a model for your deployment, don’t forget to check that the desired model is available in your preferred Region; for more details about model availability, see Model support by AWS Region.
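To illustrate what changing bedrock_model_id implies, the following sketch builds the parameters for an Amazon Bedrock InvokeModel call using the Anthropic Claude Messages request format. It is an illustration under assumptions, not the project's actual backend code: the buildSummaryRequest name, the prompt wording, and the maxTokens default are hypothetical. Note that this body format applies to Anthropic Claude 3 models; other model families expect a different request body.

```javascript
// Hypothetical sketch: builds InvokeModel parameters for a Claude 3 model.
// The prompt text and the 512-token default are illustrative assumptions.
function buildSummaryRequest(transcript, language, modelId, maxTokens = 512) {
  const prompt =
    `Summarize the following live stream transcript in ${language}. ` +
    `Keep the summary concise and focus on the key points.\n\n` +
    `<transcript>\n${transcript}\n</transcript>`;

  return {
    modelId,
    contentType: "application/json",
    accept: "application/json",
    body: JSON.stringify({
      anthropic_version: "bedrock-2023-05-31",
      max_tokens: maxTokens,
      messages: [
        { role: "user", content: [{ type: "text", text: prompt }] },
      ],
    }),
  };
}

const request = buildSummaryRequest(
  "Welcome everyone. Today we cover three topics.",
  "English",
  "anthropic.claude-3-sonnet-20240229-v1:0"
);
console.log(request.modelId);
console.log(JSON.parse(request.body).messages[0].role);
```

The returned object only describes the request; sending it requires an AWS SDK client configured for the Region set in bedrock_region.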
npx cdk bootstrap aws://{targetAccountId}/{targetRegion}
Change to the cdk sub-directory, install dependencies, and deploy the stack by running the following commands:
cd cdk
npm i
npx cdk deploy
Wait for AWS CloudFormation to finish the stack creation.
You need to use the CloudFormation stack outputs to connect the frontend to the backend. After the deployment is complete, you have two options.
The preferred option is to use the provided postdeploy.sh script to automatically copy the cdk configuration parameters to a configuration file by running the following command, still in the /cdk folder:
./scripts/postdeploy.sh
Alternatively, you can copy the configuration manually:
On the AWS CloudFormation console, open the AwsStreamAnalysisStack stack and note the values listed among its outputs.
Complete the following steps to get the extension ready for transcribing, translating, and summarizing live streams:
In the src/config.js file, based on how you chose to collect the CloudFormation stack outputs, follow the appropriate step:
If you used the postdeploy.sh script, check that the values in the src/config.js file have been automatically updated with the corresponding values.
If you copied the configuration manually, populate the src/config.js file with the values you noted. Use the following format:
const config = {
"aws_project_region": "{aws_region}", // The same you have used as aws_region in cdk/bin/config.json
"aws_cognito_identity_pool_id": "{CognitoIdentityPoolId}", // From CloudFormation outputs
"aws_user_pools_id": "{CognitoUserPoolId}", // From CloudFormation outputs
"aws_user_pools_web_client_id": "{CognitoUserPoolClientId}", // From CloudFormation outputs
"bucket_s3": "{BucketS3Name}", // From CloudFormation outputs
"bedrock_region": "{bedrock_region}", // The same you have used as bedrock_region in cdk/bin/config.json
"api_gateway_id": "{APIGatewayId}" // From CloudFormation outputs
};
Take note of the CognitoUserPoolId, which will be needed in a later step to create a new user.
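If you fill in src/config.js manually, a quick check for leftover placeholders can catch copy-and-paste mistakes. The following sketch is hypothetical and not part of the project: findUnfilledPlaceholders and apiBaseUrl are illustrative names, the example values are made up, and the "prod" stage name is an assumption (it is the default stage for REST APIs created with the AWS CDK, but the project may use a different one).

```javascript
// Hypothetical helpers; the values below are made-up examples, not real resource IDs.

// Returns the keys whose value is not a string or still looks like "{Placeholder}".
function findUnfilledPlaceholders(config) {
  return Object.keys(config).filter(
    (key) => typeof config[key] !== "string" || /^\{.*\}$/.test(config[key])
  );
}

// Builds the invoke URL of an API Gateway REST API from its ID and Region.
// The stage name is an assumption; adjust it to match the deployed stage.
function apiBaseUrl(config, stage = "prod") {
  return `https://${config.api_gateway_id}.execute-api.${config.aws_project_region}.amazonaws.com/${stage}`;
}

const exampleConfig = {
  aws_project_region: "us-west-2",
  aws_cognito_identity_pool_id: "us-west-2:00000000-0000-0000-0000-000000000000",
  aws_user_pools_id: "us-west-2_EXAMPLE",
  aws_user_pools_web_client_id: "exampleclientid",
  bucket_s3: "example-bucket-name",
  bedrock_region: "us-west-2",
  api_gateway_id: "{APIGatewayId}", // deliberately left unfilled
};

console.log(findUnfilledPlaceholders(exampleConfig)); // lists api_gateway_id
console.log(apiBaseUrl({ ...exampleConfig, api_gateway_id: "abc123" }));
```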
Change to the aws-transcribe-translate-summarize-live-streams-in-browser directory with a command similar to the following:
cd ~/aws-transcribe-translate-summarize-live-streams-in-browser
npm i
npm run build
In Chrome, open chrome://extensions/.
Make sure that developer mode is enabled by toggling the icon on the top right corner of the page.
aws-transcribe-translate-summarize-live-streams-in-browser.
Create a new user in the Amazon Cognito user pool identified by the CognitoUserPoolId value noted from the CloudFormation stack outputs.
See a walkthrough of Steps 4-6 in the animated image below. For additional details, refer to Creating a new user in the AWS Management Console.

Now that the extension is set up, you can interact with it by completing these steps:
You’re now ready to experiment with the extension.
Content on the Translation tab will appear with a few seconds of delay compared to what you see on the Transcription tab. When transcribing speech in real time, Amazon Transcribe incrementally returns a stream of partial results until it generates the final transcription for a speech segment. This Chrome extension has been implemented to translate text only after a final transcription result is returned.
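The behavior described above can be sketched as follows. Amazon Transcribe streaming marks each result with an IsPartial flag, so one way to translate only final segments is to filter on that flag. This is a minimal illustration, not the extension's actual code: the function names and the sample events are hypothetical, and the translate callback stands in for a call to Amazon Translate.

```javascript
// Returns the transcript text of every final (non-partial) result in an event.
function extractFinalTranscripts(transcriptEvent) {
  const results = transcriptEvent?.Transcript?.Results ?? [];
  return results
    .filter((r) => r.IsPartial === false && r.Alternatives?.length > 0)
    .map((r) => r.Alternatives[0].Transcript);
}

// Hypothetical handler: forwards only final segments to the translate callback.
function handleTranscriptEvent(transcriptEvent, translate) {
  for (const text of extractFinalTranscripts(transcriptEvent)) {
    translate(text);
  }
}

// Made-up sample events shaped like Amazon Transcribe streaming output.
const partialEvent = {
  Transcript: {
    Results: [{ IsPartial: true, Alternatives: [{ Transcript: "Welcome to the" }] }],
  },
};
const finalEvent = {
  Transcript: {
    Results: [
      { IsPartial: false, Alternatives: [{ Transcript: "Welcome to the live stream." }] },
    ],
  },
};

const sentToTranslation = [];
handleTranscriptEvent(partialEvent, (text) => sentToTranslation.push(text));
handleTranscriptEvent(finalEvent, (text) => sentToTranslation.push(text));
console.log(sentToTranslation); // only the final segment is forwarded
```

Waiting for final results avoids translating text that Amazon Transcribe may still revise, at the cost of the short delay noted above.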
See the extension in action in the video below.
If you receive the error “Extension has not been invoked for the current page (see activeTab permission). Chrome pages cannot be captured.”, check the following:
If you can’t get the summary of the live stream, make sure you have stopped the recording and then request the summary. You can’t change the language of the transcript and summary after the recording has started, so remember to choose it appropriately before you start the recording.
When you’re done with your tests, to avoid incurring future charges, delete the resources created during this walkthrough by deleting the CloudFormation stack:
Before deleting the AwsStreamAnalysisStack stack, note the CognitoUserPoolId and CognitoIdentityPoolId values among the CloudFormation stack outputs, which will be needed in the following step.
Because the Amazon Cognito resources won’t be automatically deleted, delete them manually:
Delete the user pool and identity pool identified by the CognitoUserPoolId and CognitoIdentityPoolId values previously retrieved in the CloudFormation stack outputs.
In this post, we showed you how to deploy a code sample that uses AWS AI and generative AI services to access features such as live transcription, translation, and summarization. You can follow the steps we provided to start experimenting with the browser extension.
To learn more about how to build and scale generative AI applications, refer to Transform your business with generative AI.
Luca Guida is a Senior Solutions Architect at AWS; he is based in Milan and he supports independent software vendors in their cloud journey. With an academic background in computer science and engineering, he started developing his AI/ML passion at university; as a member of the natural language processing and generative AI community within AWS, Luca helps customers be successful while adopting AI/ML services.
Chiara Relandini is an Associate Solutions Architect at AWS. She collaborates with customers from diverse sectors, including digital native businesses and independent software vendors. After focusing on ML during her studies, Chiara supports customers in using generative AI and ML technologies effectively, helping them extract maximum value from these powerful tools.
Arian Rezai Tabrizi is an Associate Solutions Architect based in Milan. She supports enterprises across various industries, including retail, fashion, and manufacturing, on their cloud journey. Drawing from her background in data science, Arian assists customers in effectively using generative AI and other AI technologies.