Amazon SageMaker Unified Studio represents the evolution toward unifying the entire data, analytics, and artificial intelligence and machine learning (AI/ML) lifecycle within a single, governed environment. As organizations adopt SageMaker Unified Studio to unify their data, analytics, and AI workflows, they encounter new challenges around scaling, automation, isolation, multi-tenancy, and continuous integration and delivery (CI/CD). Scaling AI initiatives across teams and accounts introduces operational overhead, and proper isolation and multi-tenancy are essential for security and governance. Automating workflows and embedding CI/CD practices can be challenging in a unified environment, especially when balancing collaboration with strict resource boundaries.
This post presents architectural strategies and a scalable framework that helps organizations manage multi-tenant environments, automate consistently, and embed governance controls as they scale their AI initiatives with SageMaker Unified Studio.
This is a multi-part series, where we guide you through automation and artificial intelligence operations (AIOps) in SageMaker Unified Studio:
For a comprehensive overview of the capabilities of SageMaker Unified Studio and its user experience, refer to An integrated experience for all your data and AI with Amazon SageMaker Unified Studio.
Scaling AIOps automation across the enterprise requires a robust, multi-account AWS architecture. This approach enhances security, enables effective resource isolation, and supports the scalability needs of modern organizations. Our solution uses shared services—such as project templating, integrated CI/CD, data governance, ML pipeline automation, model promotion, and approval workflows—to streamline end-to-end AI/ML operations.
In this section, we introduce the high-level architecture and outline the workflow steps for implementing this solution. Subsequent sections will dive deeper into each architectural concept and component. This architecture is supported with practical code samples for building a reference implementation, which we discuss in Part 2.
The multi-account architecture involves several key user roles, each contributing to different stages of the AI/ML workflow. The following generic personas are commonly found in such environments; actual titles and responsibilities might differ in your organization.
The following figure illustrates the multi-account architecture spanning AI shared services, enterprise line of business (LOB) accounts, and the governance account.

The following workflow walks you through the end-to-end ML operations across a multi-account architecture, starting from initial project creation through development, testing, and production deployment, with built-in governance controls at each stage. We recommend using Amazon EventBridge for the events mentioned in this workflow.
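The event-driven handoffs in this workflow can be wired up with a custom EventBridge rule. The following sketch shows one way to do this with boto3; the event source, detail type, and stage fields are illustrative placeholders for events your CI/CD workflow would publish, not events that SageMaker Unified Studio emits natively:

```python
import json


def build_promotion_rule_pattern(source: str = "custom.aiops.cicd") -> str:
    """Return an EventBridge event pattern (as JSON) that matches
    model-promotion events published by a CI/CD workflow.
    The source and detail-type values are illustrative placeholders."""
    pattern = {
        "source": [source],
        "detail-type": ["ModelPromotionApproved"],
        "detail": {"targetStage": ["test", "prod"]},
    }
    return json.dumps(pattern)


def create_promotion_rule(rule_name: str = "aiops-model-promotion") -> None:
    """Create the rule on the default event bus (requires AWS credentials)."""
    import boto3  # imported here so the pattern helper stays usable offline

    events = boto3.client("events")
    events.put_rule(
        Name=rule_name,
        EventPattern=build_promotion_rule_pattern(),
        State="ENABLED",
    )
```

You would then attach a target (for example, an AWS Lambda function in the destination account) to the rule so promotion events trigger the next workflow stage.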
In the following sections, we discuss the account structure and key components of the architecture in more detail.
Based on AWS Well-Architected best practices, we recommend implementing a multi-account architecture for your AIOps solution. This approach is aligned with the guidance in the whitepaper Organizing Your AWS Environment Using Multiple Accounts. Our architecture consists of the following specialized accounts:
The workflow steps described along with the architecture diagram in the preceding section assume the governance account is implemented. Before proceeding, an administrator must establish the SageMaker Unified Studio domain within the governance account. This initial setup configures the foundational components required to support the solution and its users. For setting up the SageMaker Unified Studio domain in the governance account, and configuring the necessary components, refer to Foundational blocks of Amazon SageMaker Unified Studio: An admin’s guide to implement unified access to all your data, analytics, and AI.
Amazon SageMaker Catalog is an enterprise-wide business catalog for publishing and consuming assets used in data and AI pipelines. With it, you can securely discover and access approved data and models using semantic search over generative AI–created metadata. We use structured AWS Glue tables to demonstrate the AI pipelines. If you need access to unstructured data, you can use the S3 Object collections feature supported in SageMaker Catalog as described in the documentation.
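Catalog discovery can also be scripted. The following sketch uses the `search` operation of the boto3 `datazone` client, which backs SageMaker Catalog; the domain ID and query are placeholders, and you should verify the operation's parameters against your SDK version:

```python
def build_asset_search(domain_id: str, query: str, max_results: int = 10) -> dict:
    """Build a request for the DataZone `search` API that backs
    SageMaker Catalog, scoped to published assets such as AWS Glue tables."""
    return {
        "domainIdentifier": domain_id,
        "searchScope": "ASSET",
        "searchText": query,
        "maxResults": max_results,
    }


def search_catalog(domain_id: str, query: str) -> list:
    """Run the search against the live service (requires AWS credentials)."""
    import boto3  # imported here so the request builder stays usable offline

    datazone = boto3.client("datazone")
    response = datazone.search(**build_asset_search(domain_id, query))
    return response.get("items", [])
```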
In SageMaker Unified Studio, a project creates a collaborative boundary within a domain where teams can work together on specific business use cases. Users can use projects to create, manage, and share data and resources in an isolated environment. When establishing a project, users can select from preconfigured project profiles (such as classical regression, LLM fine-tuning, or Retrieval Augmented Generation (RAG) applications) that align with their specific use case. These profiles must be configured in advance by administrators based on user requirements.
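Project creation can also be automated through the DataZone API that underlies SageMaker Unified Studio. The following is a minimal sketch; the domain, project, and profile identifiers are placeholders, and the `projectProfileId` parameter requires a recent boto3 version:

```python
def build_project_request(domain_id: str, name: str, profile_id: str) -> dict:
    """Build a create-project request. The project profile ID would
    reference a profile preconfigured by an administrator (for example,
    an LLM fine-tuning or RAG profile)."""
    return {
        "domainIdentifier": domain_id,
        "name": name,
        "description": f"Workspace for the {name} use case",
        "projectProfileId": profile_id,
    }


def create_project(domain_id: str, name: str, profile_id: str) -> str:
    """Create the project and return its ID (requires AWS credentials)."""
    import boto3  # imported here so the request builder stays usable offline

    datazone = boto3.client("datazone")
    response = datazone.create_project(
        **build_project_request(domain_id, name, profile_id)
    )
    return response["id"]
```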
Projects function as workspaces with isolation boundaries enforced through AWS Identity and Access Management (IAM) resource tagging. However, there are important design considerations when implementing project isolation:
Our AIOps architecture illustrates SageMaker Unified Studio Project A spanning three accounts (DEV, TEST, and PROD) within a line of business, representing the different software development lifecycle (SDLC) stages. Although it might be technically possible to consolidate all SDLC stages within a single account, we strongly recommend maintaining account-level separation between non-prod (dev, test) and prod environments. This separation enhances security, improves scalability, and increases reliability for your AI workloads.
For test and production environments, creating SageMaker Unified Studio projects is optional if you don’t require access to deployed artifacts through the SageMaker Unified Studio portal. As an alternative, you can use AWS APIs to provision access for your IAM principals in these environments and use observability tools for visualization and monitoring.
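One way to provision such access programmatically is through project memberships on the underlying DataZone API. In this sketch, the domain ID, project ID, and principal ARN are placeholders, and the designation value should be checked against the current API reference:

```python
def build_membership_request(domain_id: str, project_id: str,
                             principal_arn: str) -> dict:
    """Build a request granting an IAM principal contributor access to a
    project, so deployed artifacts can be reached without the portal."""
    return {
        "domainIdentifier": domain_id,
        "projectIdentifier": project_id,
        "designation": "PROJECT_CONTRIBUTOR",
        "member": {"userIdentifier": principal_arn},
    }


def grant_project_access(domain_id: str, project_id: str,
                         principal_arn: str) -> None:
    """Apply the membership (requires AWS credentials)."""
    import boto3  # imported here so the request builder stays usable offline

    boto3.client("datazone").create_project_membership(
        **build_membership_request(domain_id, project_id, principal_arn)
    )
```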
In the next section, we explore strategies for effectively mapping projects with multi-tenancy to accounts while considering SDLC stages and isolation requirements.
Multi-tenancy refers to the architecture where a single environment serves multiple, isolated teams, LOBs, or organizations (tenants), making sure each tenant’s data and resources are securely separated while using shared resources for efficiency. This approach enhances scalability, cost-effectiveness, and resource utilization while maintaining data privacy and security for each tenant.
In the AIOps architecture described earlier, we chose to segment the SDLC environment at the LOB level to facilitate multi-tenancy for AI projects. However, you can choose the granularity that best fits your solution’s multi-tenancy requirements. At a minimum, we recommend separating development and production projects by AWS accounts to enforce a high security bar for your AI workloads. We also recommend hosting services shared across projects in a shared account, such as the AI shared services account.
In the following diagram, we depict how to implement multi-tenancy for data producer use cases across multiple SDLC accounts with a single project workspace per use case. For each use case project (for example, Project D), we create only one Project D-Dev workspace. After promotion of those project resources from dev into test or prod, they are deployed and executed in those environments. Those deployed project resources can be accessed through observability tools from the shared services account for troubleshooting.

The high-level workflow steps for the preceding architecture are as follows:
In the following diagram, we depict how to implement multi-tenancy for data consumer ML use cases across multiple SDLC accounts with multiple projects. Typically, data scientists need production-grade data to build and train models. Therefore, in the following diagram, we depict only access to production data using the PROD domain unit of the Data Catalog. For each use case project (for example, Project A), we create three SDLC project workspaces using SDLC stage qualifiers (dev, test, prod). After promotion of those project resources from dev into test or prod, they are deployed and executed in those environments. Those deployed project resources are accessible from their connected project stage workspaces.

The high-level workflow steps for the architecture are as follows:
SageMaker Unified Studio supports seamless integration with external Git repositories, such as GitHub and GitLab, through its Git connections feature. When a new project is created, the selected Git connection is used to automatically set up a corresponding project repository, so users can manage code, collaborate, and perform Git operations directly within the SageMaker Unified Studio environment. As a best practice, we recommend providing a template or seed code tailored to the specific use case (such as classical regression, LLM fine-tuning, or RAG applications) during project setup to accelerate development and drive standardization.
For organizing CI/CD workflows, there are two primary approaches: using dedicated build and deploy folders within a single project repository, or adopting a separate repository for each stage. The separate repository approach, which offers clearer change controls and straightforward management of promotion workflows, will be covered in detail in Part 2 of this series.
In this first part of our series, we laid the groundwork for scaling AI and analytics with SageMaker Unified Studio by addressing architectural strategies for automation, multi-account promotion, and multi-tenant isolation. These foundational elements help organizations balance innovation with robust security and governance, so teams can collaborate efficiently while maintaining clear boundaries and compliance.
Although security, governance, scale, and performance are key considerations for an AIOps architecture, you must also balance that with cost considerations. We shared strategies for addressing these key considerations using best practices for multi-account, multi-tenancy, and project-based governance to move your project from development to production.
We discussed how you can modernize data pipelines, accelerate AI development, and enforce regulatory compliance, across these personas:
In Part 2, we guide you through hands-on implementation, demonstrating how each persona can collaborate seamlessly, from project creation to production deployment. Dive in and discover how your teams can unlock the full potential of SageMaker Unified Studio.
Ram Vittal is a GenAI/ML Specialist SA at AWS. He has over three decades of experience architecting and building distributed, hybrid, and cloud applications. He is passionate about building secure, scalable, reliable GenAI/ML solutions to help customers with their cloud adoption and optimization journey to improve their business outcomes. In his spare time, he rides his motorcycle and walks with his sheep-a-doodle!
Sandeep Raveesh is a GenAI Specialist Solutions Architect at AWS. He works with customers through their AIOps journey across model training, generative AI applications like agents, and scaling generative AI use cases. He also focuses on go-to-market strategies, helping AWS build and align products to solve industry challenges in the generative AI space. You can find Sandeep on LinkedIn.
Koushik Konjeti is a Senior Solutions Architect at Amazon Web Services. He has a passion for aligning architectural guidance with customer goals, ensuring solutions are tailored to their unique requirements. Outside of work, he enjoys playing cricket and tennis.
Vaibhav Sabharwal is a Senior Solutions Architect with Amazon Web Services (AWS) based out of New York. He is passionate about learning new cloud technologies and assisting customers in building cloud adoption strategies, designing innovative solutions, and driving operational excellence. As a member of the Financial Services Technical Field Community at AWS, he actively contributes to the collaborative efforts within the industry.