
Anthropic’s October research, which showed an AI model reproducing a real intrusion, drew mixed reactions. Some questioned the framing and others questioned the intent, but few platform teams found the result surprising. Many already expect a significant security adjustment as AI workloads grow.
AI systems are scaling faster than the security properties of the infrastructure they depend on. This gap becomes more visible as models become more capable and more widely deployed. For teams that want to reduce risk rather than wait for failure modes to appear, the right place to begin is the infrastructure layer. Modern deployment stacks still assume cooperative workloads that share kernels, drivers and accelerators. Those assumptions do not hold in adversarial settings.
Understanding where isolation breaks is the foundation for building safer AI systems. So what’s broken today?
Containers are not isolation boundaries, and multi-tenancy increases exposure
Containers became popular because they make packaging easy and predictable. Developers can bundle everything an application needs and run it anywhere without rebuilding. That convenience is separate from isolation: a container does not create a strong boundary. It brings its own user space, but it still relies on a shared host kernel. If an attacker reaches that kernel or exploits a kernel flaw, every other container on that host becomes part of the same failure domain.
This risk increases in multi-tenant environments. When workloads from different teams or customers share the same container infrastructure, they also share the same kernel. If one container is compromised and the attacker can access kernel memory, the attacker can observe or interfere with other workloads. Secrets, inference outputs and model weights become visible. For AI systems that process sensitive data, this creates real exposure.
The risk grows further when containers manage access to GPUs. GPU runtimes and drivers pass through the kernel in complex ways. They involve shared memory, IPC surfaces and device-level calls that expand the attack surface. If the kernel is the only enforcement point, then using shared hardware means the entire stack becomes part of the trusted computing base. That is a large amount of code to trust. Code will always have vulnerabilities, so relying on a large surface makes accidental exposure more likely.
The core issue is simple. The container boundary is not the isolation boundary. It can define how software is packaged but not how faults are contained. To protect AI workloads, the enforcement point has to move below the kernel.
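The shared-kernel failure domain can be sketched with a toy model. This is illustrative only: the class and method names below are invented for this sketch, not a real container API.

```python
# Toy model of a shared-kernel container host. Illustrative only:
# these names are invented for the sketch, not a real container API.

class Container:
    def __init__(self, name, secret):
        self.name = name
        self.secret = secret  # stand-in for API keys or model weights

class SharedKernelHost:
    """Every container on the host trusts the same kernel."""
    def __init__(self):
        self.containers = []

    def run(self, container):
        self.containers.append(container)

    def compromise_kernel(self):
        # A single kernel flaw exposes every workload on the host:
        # the container boundary is packaging, not isolation.
        return {c.name: c.secret for c in self.containers}

host = SharedKernelHost()
host.run(Container("tenant-a", "weights-a"))
host.run(Container("tenant-b", "weights-b"))

leaked = host.compromise_kernel()
assert set(leaked) == {"tenant-a", "tenant-b"}  # blast radius: everything
```

The point of the model is the shape of `compromise_kernel`: one fault at the shared enforcement point returns every tenant's data at once.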
VM isolation restores clean boundaries and reduces attack surface
Virtual machines provide a more reliable isolation point because each VM has its own kernel. When the trust boundary is placed at the VM layer, a kernel flaw in one workload does not affect its neighbors. This reduces the shared attack surface and separates tenants more effectively. Lightweight virtual machines make this practical at scale. They preserve the container workflow while adding a protective boundary around each workload.
A microVM can run a container inside a minimal, tightly controlled environment. By using a microVM with a container runtime, developers keep the same packaging and deployment model they rely on today, and the system gains a boundary that does not depend on a shared kernel. This removes an entire class of cross-container risks. It also lets operators reduce how much code sits on the isolation boundary.
Hypervisors written in memory-safe languages help even more. More than half of vulnerabilities in low-level systems come from memory safety issues. Using a memory-safe implementation eliminates many of these faults by construction. A smaller and safer hypervisor means a smaller trusted computing base. This aligns with the goal of reducing how much of the system must be trusted.
The principle is straightforward. Isolation should rely on the smallest set of components that can enforce it consistently. Virtual machines provide that boundary, and microVMs make it practical to use that boundary for container based workflows.
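The contrast with the shared-kernel case can be made concrete with a second toy model. Again, the names are invented for illustration; real microVM runtimes such as Firecracker or Kata Containers work very differently under the hood.

```python
# Toy model of per-workload microVM isolation. Illustrative only:
# class and method names are invented for this sketch.

class MicroVM:
    """Each workload gets its own guest kernel; the VM is the trust boundary."""
    def __init__(self, name, secret):
        self.name = name
        self.secret = secret  # stand-in for API keys or model weights

    def compromise_guest_kernel(self):
        # A kernel flaw inside one VM exposes only that VM's workload.
        # The neighboring tenants sit behind their own kernels.
        return {self.name: self.secret}

vms = [MicroVM("tenant-a", "weights-a"),
       MicroVM("tenant-b", "weights-b")]

leaked = vms[0].compromise_guest_kernel()
assert leaked == {"tenant-a": "weights-a"}  # fault contained to one tenant
```

Same attack, different return value: the compromise yields one tenant's data instead of all of them, which is exactly what "limiting blast radius" means.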
GPUs and confidential computing require isolation closer to the hardware
GPUs introduce additional challenges for multi-tenant AI systems. They were designed for throughput, not for separation between untrusted workloads. Many GPUs do not clear memory between jobs. Residual data can remain in device memory long after a workload finishes. Timing behavior and resource allocation patterns can reveal information about what other tenants are doing. This becomes more problematic as multi-tenant inference becomes common.
Containers that share a GPU also share the driver stack. This creates new paths for observation or interference. Without strong isolation around the accelerator, a compromise in one tenant can expose data processed by another. For AI workloads that operate on sensitive information, this is not acceptable.
Confidential computing helps by encrypting data while it is in use. It reduces how much of the host operating system must be trusted. When combined with VM-based isolation, confidential computing limits what even the hypervisor can see of the data. It shrinks the trusted computing base and limits the impact of a compromise.
The result is a more predictable environment where GPU workloads can run without assuming that all tenants are cooperative or honest. The boundary shifts closer to the hardware and becomes easier to reason about under pressure.
Limiting blast radius is the heart of de-risking AI
Improving AI security is not about adding more controls. It is about reducing how much of the system must be trusted. Clean boundaries matter. Smaller attack surfaces matter. Predictable failure domains matter. Containers alone cannot provide these guarantees. VM-based isolation restores the separation containers lack. Memory-safe hypervisors reduce the risk of kernel compromise. Confidential computing shrinks the trusted computing base. GPU isolation prevents leakage at the accelerator layer.
The industry shifted from single large machines to distributed container platforms to meet demand. That shift improved agility but introduced new forms of exposure. The next evolution is to combine the flexibility of containers with the protection of strong isolation. If AI is going to be deployed across sensitive environments, the infrastructure must tolerate adversarial pressure without exposing the workloads it runs. The goal is to limit what any vulnerability can do and ensure that faults remain contained. That is the path toward de-risking AI in a way that scales with its adoption.
