IT teams face mounting challenges as they manage increasingly complex infrastructure and applications, often spending countless hours manually identifying operational issues, troubleshooting problems, and performing repetitive maintenance tasks. This operational burden diverts valuable technical resources from innovation and strategic initiatives. Artificial intelligence for IT operations (AIOps) presents a transformative solution, using AI to automate operational workflows, detect anomalies, and resolve incidents with minimal human intervention. Organizations can optimize their operational efficiency while maintaining security as they manage their infrastructure and applications.
You can use Amazon Q Developer CLI and Model Context Protocol (MCP) servers to build AIOps solutions that reduce manual effort through natural language interactions. Amazon Q Developer can help developers and IT professionals with many of their tasks, from coding, testing, and deploying to troubleshooting, performing security scanning and fixes, modernizing applications, optimizing AWS resources, and creating data engineering pipelines. MCP extends these capabilities by enabling Amazon Q to connect with custom tools and services through a standardized interface, allowing for more sophisticated operational automations.
In this post, we discuss how to implement a low-code no-code AIOps solution that helps organizations monitor, identify, and troubleshoot operational events while maintaining their security posture. We show how these technologies work together to automate repetitive tasks, streamline incident response, and enhance operational efficiency across your organization.
This is the third post in a series on AIOps using generative AI services on AWS. Refer to the following two posts for building AIOps using Amazon Bedrock and Amazon Q Business:
MCP servers act like a universal connector for AI models, enabling them to interact with external systems, fetch live data, and integrate with various tools seamlessly. This helps Amazon Q provide more contextually relevant assistance by accessing the information it needs in real time. The following architecture diagram illustrates how you can use a single configuration file, mcp.json, to configure MCP servers in Amazon Q Developer CLI to connect to external systems.
The workflow consists of the following steps:
mcp.json file.
In this post, we show how to use Amazon Q Developer CLI to address the following operational issues:
Complete the following prerequisites before you start setting up the demo:
MCP configuration in Amazon Q Developer CLI is managed through JSON files. You will configure the Amazon Bedrock Knowledge Base Retrieval MCP Server. At the time of writing, only the stdio transport is supported in Amazon Q Developer CLI.
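To make the stdio transport more concrete: an MCP client such as Amazon Q Developer CLI starts the server as a child process and exchanges JSON-RPC 2.0 messages over its standard input and output, one JSON message per line. The following Python sketch only illustrates how such a message could be framed. The helper name frame_request is hypothetical, and the initialize handshake that real clients perform first is omitted.

```python
import json


def frame_request(request_id, method, params=None):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    # With default settings json.dumps emits no raw newlines,
    # so each message occupies exactly one line on the stream.
    return json.dumps(message) + "\n"


if __name__ == "__main__":
    # "tools/list" is the MCP method a client uses to discover a server's tools.
    print(frame_request(1, "tools/list"), end="")
```

You do not need to write this yourself when using Amazon Q Developer CLI; the sketch is only meant to show what travels between the CLI and an MCP server.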
Amazon Q Developer CLI supports two levels of MCP configuration:
Global configuration is stored in ~/.aws/amazonq/mcp.json and applies to all workspaces
Workspace configuration is stored in .amazonq/mcp.json and is specific to the current workspace
For this post, we use the workspace configuration, but you can use either one.
Create .amazonq/mcp.json in your workspace folder with the following content:
{
  "mcpServers": {
    "awslabs.bedrock-kb-retrieval-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.bedrock-kb-retrieval-mcp-server@latest"],
      "env": {
        "AWS_PROFILE": "your-profile-name",
        "AWS_REGION": "your-region",
        "FASTMCP_LOG_LEVEL": "ERROR",
        "KB_INCLUSION_TAG_KEY": "name=aiops-knowledge-base",
        "BEDROCK_KB_RERANKING_ENABLED": "false"
      },
      "disabled": false,
      "autoApprove": []
    }
  }
}
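Before starting Amazon Q Developer CLI, it can help to sanity-check the configuration file. The following Python sketch checks only the structure used in this post's example (a top-level mcpServers object whose entries have a command, an args list, and string-valued env entries). The function name check_mcp_config and the rules it applies are assumptions derived from the example, not an official schema.

```python
import json
from pathlib import Path


def check_mcp_config(text):
    """Return a list of problems found in an mcp.json document (empty if none)."""
    problems = []
    config = json.loads(text)
    servers = config.get("mcpServers")
    if not isinstance(servers, dict) or not servers:
        return ["missing or empty 'mcpServers' object"]
    for name, server in servers.items():
        if not isinstance(server.get("command"), str):
            problems.append(f"{name}: 'command' must be a string")
        if not isinstance(server.get("args", []), list):
            problems.append(f"{name}: 'args' must be a list")
        for key, value in server.get("env", {}).items():
            if not isinstance(value, str):
                problems.append(f"{name}: env {key} must be a string")
            elif value != value.strip():
                # Stray spaces in a profile or Region name are easy to miss.
                problems.append(f"{name}: env {key} has stray whitespace")
    return problems


if __name__ == "__main__":
    path = Path(".amazonq/mcp.json")  # workspace-level configuration
    if path.exists():
        for problem in check_mcp_config(path.read_text()):
            print(problem)
```

An empty result means only that the file is shaped like the example; it does not confirm that the profile, Region, or knowledge base tag exist in your account.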
See the AWS MCP Servers GitHub repository for an updated list of available MCP servers.
q login
Run q and then run /tools to validate that the Amazon Bedrock Knowledge Base Retrieval MCP server is configured.
Tool permissions have two possible states:
By default, this tool will not be trusted.
5. Run /tools trust awslabsbedrock_kb_retrieval_mcp_server___QueryKnowledgeBases to trust the MCP server.
6. Run the /tools command again to validate it.
Deploy the following AWS CloudFormation template to create the AWS resources that you will use to test AIOps. You can deploy this template in either the us-east-1 or us-west-2 AWS Region. You can deploy it in other Regions by updating the applicable AMI IDs in the template. This template will deploy two EC2 instances and four S3 buckets (three for the test scenarios and one for the knowledge base).
This CloudFormation template is for demo purposes only and not meant for production usage.
AWSTemplateFormatVersion: '2010-09-09'
Description: >-
  This template creates the necessary AWS resources which will be used to test AIOps using
  Amazon Q Developer CLI with MCP server integration.
Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups:
      - Label:
          default: Network
        Parameters:
          - SecurityGroupIngressCidrIp
      - Label:
          default: General
        Parameters:
          - Prefix
    ParameterLabels:
      SecurityGroupIngressCidrIp:
        default: Security group ingress CIDR IP
Parameters:
  Prefix:
    Type: String
    Description: Unique name prefix for resources that are created by the stack.
    ConstraintDescription: >-
      must not start with a dash, and must only contain lowercase a-z, digits,
      and a dash.
    AllowedPattern: ^[a-z0-9][a-z0-9-]+$
    MinLength: 1
    MaxLength: 30
    Default: aiops-qdevcli
  SecurityGroupIngressCidrIp:
    Type: String
    Description: >-
      IPv4 address in CIDR format for allowed incoming traffic to the EC2 instance. Defaults to allowing all IPs.
    ConstraintDescription: >-
      must be in the form x.x.x.x/s, where x is 0-255, and s is 0-32.
    AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(/([0-9]|[1-2][0-9]|3[0-2]))$
    Default: 0.0.0.0/0
Resources:
  # AIOps Amazon S3 bucket1
  AIOpsQDeveloperCliS3Bucket1:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: Private
      BucketName:
        Fn::Sub: ${Prefix}-bucket1-${AWS::AccountId}
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
  # AIOps Amazon S3 bucket2
  AIOpsQDeveloperCliS3Bucket2:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: Private
      BucketName:
        Fn::Sub: ${Prefix}-bucket2-${AWS::AccountId}
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
  # AIOps Amazon S3 bucket3
  AIOpsQDeveloperCliS3Bucket3:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: Private
      BucketName:
        Fn::Sub: ${Prefix}-bucket3-${AWS::AccountId}
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
  # AIOps Knowledgebase S3 bucket
  AIOpsQDeveloperKBS3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: Private
      BucketName:
        Fn::Sub: ${Prefix}-kb-${AWS::AccountId}
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
  # AIOps VPC resources
  AIOpsQDeveloperCliVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperCliVPC
  AIOpsQDeveloperCliSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      CidrBlock: 10.0.1.0/24
      VpcId:
        Ref: AIOpsQDeveloperCliVPC
      AvailabilityZone: !Select
        - 0
        - !GetAZs
          Ref: 'AWS::Region'
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperCliSubnet1
  AIOpsQDeveloperCliSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      CidrBlock: 10.0.3.0/24
      VpcId:
        Ref: AIOpsQDeveloperCliVPC
      AvailabilityZone: !Select
        - 1
        - !GetAZs
          Ref: 'AWS::Region'
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperCliSubnet2
  AIOpsQDeveloperIGW:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperIGW
  AIOpsQDeveloperCliVPCGatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      InternetGatewayId:
        Ref: AIOpsQDeveloperIGW
      VpcId:
        Ref: AIOpsQDeveloperCliVPC
  AIOpsQDeveloperCliRT:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId:
        Ref: AIOpsQDeveloperCliVPC
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperCliRT
  AIOpsRoute:
    Type: AWS::EC2::Route
    DependsOn:
      - AIOpsQDeveloperCliVPCGatewayAttachment
    Properties:
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId:
        Ref: AIOpsQDeveloperIGW
      RouteTableId:
        Ref: AIOpsQDeveloperCliRT
  AIOpsQDeveloperCliSubnetRouteTableAssociation1:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId:
        Ref: AIOpsQDeveloperCliRT
      SubnetId:
        Ref: AIOpsQDeveloperCliSubnet1
  AIOpsQDeveloperCliSubnetRouteTableAssociation2:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId:
        Ref: AIOpsQDeveloperCliRT
      SubnetId:
        Ref: AIOpsQDeveloperCliSubnet2
  AIOpsQDeveloperCliSG1:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: >-
        Allows incoming traffic on port 22 and denies all outgoing traffic.
      SecurityGroupEgress:
        - Description: Denies all outgoing traffic.
          IpProtocol: -1
          CidrIp: 0.0.0.0/32
      SecurityGroupIngress:
        - Description: Allows incoming TCP traffic on port 22.
          IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp:
            Ref: SecurityGroupIngressCidrIp
      VpcId:
        Ref: AIOpsQDeveloperCliVPC
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperCliSG1
  AIOpsQDeveloperCliSG2:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: >-
        Allows incoming traffic on port 5080 and denies all outgoing traffic.
      SecurityGroupEgress:
        - Description: Denies all outgoing traffic.
          IpProtocol: -1
          CidrIp: 0.0.0.0/32
      SecurityGroupIngress:
        - Description: Allows incoming TCP traffic on port 5080.
          IpProtocol: tcp
          FromPort: 5080
          ToPort: 5080
          CidrIp:
            Ref: SecurityGroupIngressCidrIp
        - Description: Allows incoming TCP traffic on port 22.
          IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp:
            Ref: SecurityGroupIngressCidrIp
      VpcId:
        Ref: AIOpsQDeveloperCliVPC
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperCliSG2
  EC2KeyPair:
    Type: AWS::EC2::KeyPair
    Properties:
      KeyName:
        Fn::Sub: ${Prefix}-keypair-${AWS::AccountId}
  # EC2 instance to demo high CPU Utilization AIOps
  EC2InstanceHighCPUUtilDemo:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.micro
      KeyName: !Ref EC2KeyPair
      ImageId: !FindInMap [RegionMap, !Ref 'AWS::Region', AL2023]
      NetworkInterfaces:
        - AssociatePublicIpAddress: true
          DeviceIndex: 0
          SubnetId: !Ref AIOpsQDeveloperCliSubnet1
          GroupSet:
            - !Ref AIOpsQDeveloperCliSG1
      Tags:
        - Key: Name
          Value:
            Fn::Sub: ${Prefix}-high-cpu-util
  # EC2 instance to demo unwanted open port detection AIOps
  EC2InstanceOpenPortDemo:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.micro
      KeyName: !Ref EC2KeyPair
      ImageId: !FindInMap [RegionMap, !Ref 'AWS::Region', AL2023]
      NetworkInterfaces:
        - AssociatePublicIpAddress: true
          DeviceIndex: 0
          SubnetId: !Ref AIOpsQDeveloperCliSubnet1
          GroupSet:
            - !Ref AIOpsQDeveloperCliSG2
      Tags:
        - Key: Name
          Value:
            Fn::Sub: ${Prefix}-open-port-demo
  CPUUtilizationAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName:
        Fn::Sub: ${Prefix}-EC2-Instance-CPU-Utilization
      AlarmDescription: Alarm when server CPU exceeds 70%
      ComparisonOperator: GreaterThanThreshold
      EvaluationPeriods: 1
      MetricName: CPUUtilization
      Namespace: AWS/EC2
      Period: 60
      Statistic: Average
      Threshold: 70.0
      ActionsEnabled: false
      Dimensions:
        - Name: InstanceId
          Value: !Ref EC2InstanceHighCPUUtilDemo
      Unit: Percent
Mappings:
  RegionMap:
    us-east-1:
      AL2023: ami-085ad6ae776d8f09c
    us-west-2:
      AL2023: ami-0005ee01bca55ab66
Outputs:
  AIOpsQDeveloperCliS3Bucket1:
    Description: S3 bucket created for testing AIOps
    Value:
      Ref: AIOpsQDeveloperCliS3Bucket1
  AIOpsQDeveloperCliS3Bucket2:
    Description: S3 bucket created for testing AIOps
    Value:
      Ref: AIOpsQDeveloperCliS3Bucket2
  AIOpsQDeveloperCliS3Bucket3:
    Description: S3 bucket created for testing AIOps
    Value:
      Ref: AIOpsQDeveloperCliS3Bucket3
  AIOpsQDeveloperKBS3Bucket:
    Description: S3 bucket created for testing AIOps
    Value:
      Ref: AIOpsQDeveloperKBS3Bucket
  EC2InstanceHighCPUUtilDemo:
    Description: EC2 instance for testing AIOps
    Value:
      Ref: EC2InstanceHighCPUUtilDemo
  EC2InstanceOpenPortDemo:
    Description: EC2 instance for testing AIOps
    Value:
      Ref: EC2InstanceOpenPortDemo
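CloudFormation rejects parameter values that do not match a parameter's AllowedPattern, so it can save a failed deployment to check your values first. The following Python sketch mirrors the two patterns from the template's Parameters section. The function names are hypothetical, and the CIDR pattern is written here with the dot escaped so that it matches only a literal dot between octets.

```python
import re

# Patterns mirrored from the template's AllowedPattern values.
PREFIX_PATTERN = re.compile(r"^[a-z0-9][a-z0-9-]+$")
OCTET = r"([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])"
CIDR_PATTERN = re.compile(rf"^({OCTET}\.){{3}}{OCTET}(/([0-9]|[1-2][0-9]|3[0-2]))$")


def valid_prefix(value):
    """Check the Prefix parameter: the pattern plus the 1-30 length limits."""
    return 1 <= len(value) <= 30 and PREFIX_PATTERN.fullmatch(value) is not None


def valid_cidr(value):
    """Check the SecurityGroupIngressCidrIp parameter (x.x.x.x/s form)."""
    return CIDR_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    print(valid_prefix("aiops-qdevcli"), valid_cidr("0.0.0.0/0"))
```

For anything beyond a demo, consider passing your own IP address as a /32 CIDR instead of the 0.0.0.0/0 default, which the template's parameter description notes allows all IPs.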
Validate that the template deployed two EC2 instances, which are in Running state.
Additionally, validate that the template created three S3 buckets with the names aiops-qdevcli-bucketX-<your-AWS-account-Id> and one bucket with the name aiops-qdevcli-kb-<your-AWS-account-Id> in your selected Region.
Upload the sample high CPU utilization runbook to the aiops-qdevcli-kb-<your-AWS-account-Id> bucket. Create a knowledge base pointing to the bucket, and note the knowledge base ID to use in the first example use case.
In this use case, you introduce CPU stress in one of the EC2 instances and then use Amazon Q Developer CLI to identify and remediate it.
Connect to the aiops-qdevcli-high-cpu-util instance using EC2 Instance Connect.
Install stress-ng:
sudo dnf install stress-ng
stress-ng --cpu 1 --timeout 3600s
You must wait approximately 10 minutes for the Amazon CloudWatch alarm to get triggered.
aiops-qdevcli-high-cpu-util instance is currently in Alarm state.
Amazon Q Developer CLI autocorrects the errors that it encounters while running the commands.
Watch the following video for more details.
Because foundation models (FMs) are inherently nondeterministic, the responses you receive from Amazon Q Developer CLI might not be exactly the same as those shown in the demo.
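The CPUUtilizationAlarm resource in the template uses the GreaterThanThreshold operator with a threshold of 70 and a single 60-second evaluation period. The following Python sketch is a simplified model of that rule, not CloudWatch's actual evaluation logic (which also handles missing data and other settings). The function name alarm_state is hypothetical.

```python
def alarm_state(datapoints, threshold=70.0, evaluation_periods=1):
    """Simplified GreaterThanThreshold check over the most recent datapoints."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    # Strictly greater than: a reading of exactly 70.0 does not breach.
    return "ALARM" if all(value > threshold for value in recent) else "OK"


if __name__ == "__main__":
    # Average CPU percentages, oldest first (illustrative values only).
    print(alarm_state([12.5, 14.0, 99.8]))
```

The alarm can only react to datapoints it has received. With basic monitoring, EC2 publishes CPUUtilization at five-minute intervals, which is one plausible reason the alarm takes several minutes to change state after stress-ng starts.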
In this use case, you will simulate an accidental security issue by unblocking public access for one of the buckets and then use Amazon Q Developer CLI to identify and remediate the issue.
Choose one of the aiops-qdevcli-xxxx buckets, and on the Permissions tab, choose Edit and change Block all public access to Off.
Watch the following video for more details.
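The check behind this use case can be stated compactly: a bucket is fully protected only when all four flags from the template's PublicAccessBlockConfiguration are true. The following Python sketch shows that check on a plain dictionary with the same keys. The function name is hypothetical, and retrieving the configuration from AWS is intentionally left out.

```python
# The four flags set in the template's PublicAccessBlockConfiguration.
REQUIRED_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_public_access_blocks(configuration):
    """Return the Block Public Access flags that are not set to True."""
    return [flag for flag in REQUIRED_FLAGS if configuration.get(flag) is not True]


if __name__ == "__main__":
    # Illustrative input resembling a bucket with public access unblocked.
    unblocked = {flag: False for flag in REQUIRED_FLAGS}
    print(missing_public_access_blocks(unblocked))
```

An empty list means the bucket matches the template's original settings; any returned flag names are the ones to turn back on during remediation.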
In this use case, you will use Amazon Q Developer CLI to identify the EC2 instance that has a specific port open and then close the port.
The aiops-qdevcli-open-port-demo instance has port 5080 open for all inbound TCP connections. This is an unwanted security risk that you want to identify and remediate.
Watch the following video for details.
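As a rough illustration of what identifying the open port involves, the following Python sketch scans a list of ingress rules for one that exposes a given TCP port to 0.0.0.0/0. The rule dictionaries follow the IpPermissions shape returned by the EC2 DescribeSecurityGroups API. The function name open_to_world is hypothetical, and only IPv4 ranges are considered.

```python
def open_to_world(permissions, port):
    """Return True if any ingress rule exposes `port` over TCP to 0.0.0.0/0."""
    for rule in permissions:
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            covers_port = True  # "-1" means all protocols and all ports
        elif protocol == "tcp":
            covers_port = rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535)
        else:
            covers_port = False
        world = any(r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", []))
        if covers_port and world:
            return True
    return False


if __name__ == "__main__":
    # Ingress rules shaped like AIOpsQDeveloperCliSG2 with the default CIDR.
    sg2 = [
        {"IpProtocol": "tcp", "FromPort": 5080, "ToPort": 5080,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ]
    print(open_to_world(sg2, 5080))
```

Remediation then amounts to revoking the matching rule, which is the step Amazon Q Developer CLI performs for you in this use case.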
Properly decommissioning provisioned AWS resources is an important best practice to optimize costs and enhance security posture after concluding proofs of concept and demonstrations. Complete the following steps to delete the resources created in your AWS account:
aiops-qdevcli-kb-xxx bucket.
As an alternative, try the preceding steps using natural language queries in Amazon Q Developer CLI.
Delete the .amazonq/mcp.json file from your workspace folder to remove the MCP configuration for Amazon Q Developer CLI.
In this post, we showed how Amazon Q Developer CLI interprets natural language queries, automatically converts them into appropriate commands, and identifies the necessary tools for execution. The solution’s intelligent error-handling capabilities analyze logs and perform auto-corrections, minimizing manual intervention. By implementing Amazon Q Developer CLI, you can enhance your team’s operational efficiency, reduce human errors, and manage complex environments more effectively through a conversational interface.
We encourage you to explore additional use cases and share your feedback with us. For more information on Amazon Q Developer CLI and AWS MCP servers, refer to the following resources:
Biswanath Mukherjee is a Senior Solutions Architect at Amazon Web Services. He works with large strategic customers of AWS by providing them technical guidance to migrate and modernize their applications on AWS Cloud. With his extensive experience in cloud architecture and migration, he partners with customers to develop innovative solutions that leverage the scalability, reliability, and agility of AWS to meet their business needs. His expertise spans diverse industries and use cases, enabling customers to unlock the full potential of the AWS Cloud.
Upendra V is a Senior Solutions Architect at Amazon Web Services, specializing in Generative AI and cloud solutions. He helps enterprise customers design and deploy production-ready Generative AI workloads, implement Large Language Models (LLMs) and Agentic AI systems, and optimize cloud deployments. With expertise in cloud adoption and machine learning, he enables organizations to build and scale AI-driven applications efficiently.