Every organization has access to the same foundation models. The real competitive advantage comes from customizing them with your proprietary data and domain expertise. But getting there is complex, even for experienced teams. It requires mastering fine-tuning techniques like Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning Verifiable Rewards (RLVR), navigating fragmented APIs and model-specific data formats, designing rigorous evaluations, and managing months-long experiment cycles.
Amazon SageMaker AI now offers an agentic experience that changes this. Developers describe their use case using natural language, and the AI coding agent streamlines the entire journey, from use case definition and data preparation through technique selection, evaluation, and deployment. Purpose-built agent skills deliver specialized expertise on fine-tuning applied to your specific use case, data transformation to required formats, quality evaluation using LLM-as-a-Judge metrics, and flexible deployment to Amazon Bedrock or SageMaker AI endpoints. Agent skills for model customization not only boost productivity but also decrease token usage. All generated code is fully editable, producing reusable artifacts that integrate seamlessly into existing workflows.
What makes this experience truly powerful is agent Skills for model customization. They are pre-built, modular instruction sets that encode deep AWS and data science expertise across the entire customization lifecycle. When you describe your use case, the AI coding agent activates the relevant skills, guiding it through data preparation and validation, technique selection, hyperparameter configuration, model evaluation, and deployment. Skills provide specialized knowledge about SageMaker AI APIs, ML workflows, best practices, and common patterns, enabling your coding agent to provide more accurate, SageMaker AI-specific guidance, generating ready-to-run notebooks at each step. Skills are fully customizable, so you can modify them to match your team’s workflows, governance standards, and tooling preferences, enabling reproducible organizational best practices, a common challenge with general-purpose coding assistants.
JupyterLab in SageMaker AI includes an integrated agentic development environment support through ACP. By default, Kiro, Amazon’s AI software development agent, is pre-configured in the chat panel, providing AI-powered code completion, debugging assistance, and interactive coding support directly within your JupyterLab environment. When you use coding agents in SageMaker AI JupyterLab, the space automatically loads relevant SageMaker AI model customization Skills into your agent’s context.
Additionally, you can configure other Agent Communication Protocol (ACP) compatible coding agents of your choice, such as Claude Code, giving you flexibility to work with the tools that best fit your workflow. ACP-compatible agents can benefit from the same SageMaker AI Skills integration when used within SageMaker AI JupyterLab. While this example shows the integration with JupyterLab, you can also use remote access to your own IDE outside of JupyterLab.
Before starting this tutorial, you must have the following prerequisites:
The SageMaker AI agent skills are built conforming with the Agent Skills open format. The agent-guided model customization workflows are powered by nine modular skills that cover the full customization lifecycle:
| Skill Name | Phase | Description |
| Use Case Specification | Configuration | Structured discovery to define business problem, users, and success criteria |
| Planning | Discovery | Generates a dynamic, multi-step customization plan tailored to your use case |
| Fine-tuning Setup | Configuration, Training | Selects base model from SageMaker AI Hub and recommends technique (SFT, DPO, or RLVR) |
| Dataset Evaluation | Evaluation, Training | Validates dataset format and schema before training |
| Dataset Transformation | Data Engineering | Converts between ML data formats (OpenAI chat, SageMaker AI, Hugging Face, Amazon Nova) |
| Fine-tuning | Training | Generates training notebooks for SageMaker AI serverless fine-tuning |
| Model Evaluation | Evaluation | Configures LLM-as-Judge evaluation with built-in and custom metrics |
| Model Deployment | Deployment | Determines deployment pathway (SageMaker AI endpoint or Bedrock) and generates code |
The coding agent (Kiro, Claude Code, Cursor, etc.) provides the conversational interface while the SageMaker AI Skills orchestrate the workflow. When you interact with your coding agent, it activates the relevant skills. This allows you to call SageMaker AI APIs, access S3 data sources, and interact with model registries through AWS-provided MCP servers. Jupyter notebooks are generated for you that execute each step of the process into existing ML pipelines.
The model customization skills currently support three fine-tuning techniques and recommend the right one during the planning phase based on your use case.
| Technique | Description | Best For |
| SFT (Supervised Fine-Tuning) | Trains on input/output pairs | Task-specific behavior: instruction following, format compliance, domain-adapted responses |
| DPO (Direct Preference Optimization) | Trains on preferred vs. rejected outputs | Aligning tone, style, and subjective preferences to match human judgement |
| RLVR (Reinforcement Learning with Verifiable Rewards) | Trains using code-based reward functions | Tasks where correctness can be programmatically verified |
For this solution, you’ll fine-tune a small language model (SLM) on the FreedomIntelligence/medical-o1-reasoning-SFT dataset to build a clinical reasoning model that walks through medical cases step-by-step before providing a diagnosis. This demonstrates how fine-tuning can specialize a general-purpose model for domain-specific reasoning tasks. If you’d like to try a different use case, SageMaker AI provides a library of sample datasets across techniques like SFT, DPO, and RLVR that you can use as a starting point.
In this post, we’ll start with using Kiro and switch to Claude Code as our coding agent. To keep using Kiro, move to the Planning Phase section, or move to the next section to see how to use Claude Code in JupyterLab.
Kiro requires authentication before you can use it. The chat panel will guide you through the authentication process.


The first time you use Kiro in a space, it will ask you to login. To login, follow the instructions provided by the chat, or follow here:
kiro-cli login --use-device-flowSelect one of the 3 login options in the terminal:

SageMaker AI Studio supports implementing additional coding agents using Agent Control Protocol (ACP). Example agents that support ACP include:
View the JupyterLab user guide for more information on installation steps.
To use Claude Code:
npm install -g @zed-industries/claude-agent-acprestart-jupyter-server or by restarting the space via the Studio UI. Please note, this will result in any unsaved work or in memory state (like active kernels) being lost.
Claude Code can be used with most Anthropic subscriptions including configuring Claude Code with Amazon Bedrock on Amazon SageMaker AI Studio. To configure Claude Code to use Claude through Amazon Bedrock follow the prerequisites in the Claude code guide, enabling Bedrock model access and providing your execution role access to bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream. Then, create the following file to configure Claude Code to use Bedrock.
~/.claude/settings.json:
{
"env": {
"CLAUDE_CODE_USE_BEDROCK": "1"
}
}
Upon receiving the user prompt, the coding agent doesn’t immediately begin executing tasks. It enters a planning phase where it identifies and activates the skills necessary to complete the job. In the process, the agent generates a workflow which users can review and modify. From the initial prompt, the agent recognizes two relevant skill domains and activates both the planning skill for structuring the overall workflow and the finetuning-setup skill for configuring the training job. Before generating any code, the agent asks targeted questions about dataset readiness and use case details to inform its technique and evaluation metrics recommendations.
With multiple model families and fine-tuning techniques available, choosing the right approach for your specific use case can be challenging. The agent analyzes your dataset structure and task requirements to provide tailored model and technique recommendations, helping you avoid costly trial-and-error cycles. SageMaker AI supports serverless customization across Amazon Nova, GPT-OSS, Llama, Qwen, and DeepSeek family of models. For this use case, we chose Qwen3-0.6B because it is cost-effective to train and deploy while being sufficient for domain-specific tasks like medical reasoning.
The model’s loss will show a steady decrease during training, showing it successfully learned to provide step-by-step clinical reasoning before reaching diagnoses. For a deeper look at the full metric set and per-step breakdowns, we can view more in the MLflow app.

Once training completes, we need to measure how well the fine-tuned model performs compared to the base model. The agent recommends an evaluation approach based on our use case, or we can specify the metrics we care about, such as accuracy on held-out medical reasoning questions or reward score improvement over the base model. It then generates a notebook in SageMaker AI Studio JupyterLab that runs the evaluation against an evaluation dataset and reports the results, so we can validate the model’s performance. Evaluation results are also distributed to MLflow for comparisons before moving to deployment.
With evaluation complete, the final step is deploying the fine-tuned model for inference. The agent walks us through deployment options across SageMaker AI and Bedrock through Bedrock Custom Model Import, depending on our latency, scaling, and integration requirements. It then generates a notebook in JupyterLab that provisions the endpoint and runs a sample inference request, so we can validate whether the deployed model is ready to serve predictions.
The skills included with SageMaker AI cover common fine-tuning workflows, but you can also customize existing skills or author new ones to match your organization’s standards and tooling. For example, you might extend the model-evaluation skill to include domain-specific metrics or add a new skill for a custom deployment target. Skills are defined in simple markdown files in the ~/.kiro/skills directory, making them easy to author, version-control, and share across your organization.
In this post, we walked through the model customization lifecycle using SageMaker AI agent skills. Starting from a single natural language prompt, the agent planned the workflow, configured and ran a SFT fine-tuning job on Qwen3-0.6B, evaluated the results with metrics tailored to our use case, and deployed the fine-tuned model. The agentic model customization experience in Amazon SageMaker AI is available today. You can get started in minutes. Simply launch a JupyterLab space in SageMaker Studio with Kiro and Agent Skills pre-configured, or bring the same Skills into your preferred IDE from GitHub. Describe your use case in natural language, and let the agent guide you from data preparation through evaluation and deployment.
What once required months of specialized ML work and deep knowledge can now be completed in days. The expertise is encoded. The workflow is guided. And the code is yours. Get started today by visiting the GitHub repository for the SageMaker AI agent skills plugin and step-by-step guide. Review the documentation to see how SageMaker AI serverless model customization with agent skills can accelerate your path from idea to production models.
Manuel Rioux est fièrement propulsé par WordPress