Working toward AIOps maturity? It’s never too early (or late) for platform engineering

Published: June 28th, 2024

Until about two years ago, many enterprises were experimenting with isolated proofs of concept or managing limited AI projects, with results that often had little impact on the company’s overall financial or operational performance. Few companies were making big bets on AI, and even fewer executive leaders lost their jobs when AI initiatives didn’t pan out.

Then came the GPUs and LLMs.

All of a sudden, enterprises in all industries found themselves in an all-out effort to position AI, both traditional and generative, at the core of as many business processes as possible, with as many employee- and customer-facing AI applications in as many geographies as they could manage concurrently. They’re all trying to get to market ahead of their competitors. Still, most are finding that the informal operational approaches they took to their modest AI initiatives are ill-equipped to support distributed AI at scale.

They need a different approach.

Platform Engineering Must Move Beyond the Application Development Realm

Meanwhile, in DevOps, platform engineering is reaching critical mass. Gartner predicts that 80% of large software engineering organizations will establish platform engineering teams by 2026 – up from 45% in 2022. As organizations scale, platform engineering becomes essential to creating a more efficient, consistent, and scalable process for software development and deployment. It also helps improve overall productivity and creates a better employee experience.

The rise of platform engineering for application development, coinciding with the rise of AI at scale, presents a massive opportunity. A helpful paradigm has already been established: Developers appreciate platform engineering for the simplicity these solutions bring to their jobs, abstracting away the peripheral complexities of provisioning infrastructure, tools, and frameworks they need to assemble their ideal dev environments; operations teams love the automation and efficiencies platform engineering introduces on the ops side of the DevOps equation; and the executive suite is sold on the return the broader organization is seeing on its platform engineering investment.

Potential for similar outcomes exists within the organization’s AI operations (AIOps). Enterprises with mature AIOps can have hundreds of AI models in development and production at any time. In fact, according to a new study of 1,000 IT leaders and practitioners conducted by S&P Global and commissioned by Vultr, the respondents’ enterprises have, on average, 158 AI models in development or production concurrently, and the vast majority of these organizations expect that number to grow very soon.

When bringing AIOps to a global scale, enterprises need an operating model that provides the agility and resiliency to support AI operations of that magnitude. Without a tailored approach to AIOps, they risk a perfect storm of inefficiency and delays and, ultimately, the loss of revenue, first-to-market advantages, and even crucial talent due to the impact on the machine learning (ML) engineer experience.

Fortunately, platform engineering can do for AIOps what it already does for traditional DevOps.

The time is now for platform engineering purpose-built for AIOps

Even though platform engineering for DevOps is an established paradigm, a platform engineering solution for AIOps must be purpose-built; enterprises can’t take a platform engineering solution designed for DevOps workflows and retrofit it for AI operations. The requirements of AIOps at scale are vastly different, so the platform engineering solution must be built from the ground up to address those particular needs.

Platform engineering for AIOps must support mature AIOps workflows, which can vary slightly between companies. However, distributed enterprises should deploy a hub-and-spoke operating model that generally comprises the following steps:

Initial AI model development and training on proprietary company data by a centralized data science team working in an established AI Center of Excellence
Containerization of proprietary models and storage in private model registries to make all models accessible across the enterprise
Distribution of models to regional data center locations where local data science teams fine-tune models on local data
Deployment and monitoring of models to deliver inference in edge environments

In addition to enabling the self-serve provisioning of the infrastructure and tooling preferred by each ML engineer in the AI Center of Excellence and the regional data center locations, platform engineering solutions built for distributed AIOps automate and simplify the workflows of this hub-and-spoke operating model.
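The four steps of the hub-and-spoke model can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular platform: the `ModelArtifact` class, the function names, the in-memory `registry` dictionary, and the region names are all hypothetical stand-ins for real training jobs, container registries, and deployment tooling.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelArtifact:
    """Hypothetical record of a model as it moves from hub to spokes."""
    name: str
    version: str
    region: str = "hub"   # "hub" stands for the central AI Center of Excellence
    history: tuple = ()   # ordered log of workflow steps applied so far


def _advance(model, step, **changes):
    # Return a new artifact with the step appended to its history.
    return replace(model, history=model.history + (step,), **changes)


def train_centrally(name, version):
    # Step 1: initial development and training by the central team.
    return ModelArtifact(name, version, history=("trained@hub",))


def containerize_and_register(model, registry):
    # Step 2: package the model and store it in a private registry
    # (here just a dict keyed by name and version).
    registered = _advance(model, "registered")
    registry[(model.name, model.version)] = registered
    return registered


def fine_tune_regionally(model, region):
    # Step 3: a regional team fine-tunes the shared model on local data.
    return _advance(
        model,
        f"fine-tuned@{region}",
        region=region,
        version=f"{model.version}-{region}",
    )


def deploy_to_edge(model):
    # Step 4: deploy the regional variant for inference at the edge.
    return _advance(model, f"deployed@{model.region}-edge")


def run_hub_and_spoke(name, version, regions):
    registry = {}
    base = containerize_and_register(train_centrally(name, version), registry)
    return [deploy_to_edge(fine_tune_regionally(base, r)) for r in regions]


if __name__ == "__main__":
    for artifact in run_hub_and_spoke("churn-model", "1.0", ["eu-west", "ap-south"]):
        print(artifact.version, "->", " / ".join(artifact.history))
```

Keeping each step as a separate function mirrors the idea that a platform team can automate every stage independently while each ML engineer only sees a single self-serve entry point.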

Mature AI involves more than just operational and business efficiencies. It must also include responsible end-to-end AI practices. The ethics of AI underpin public trust. As with any new technological innovation, improper management of privacy controls, data, or biases can harm adoption (user and business growth) and generate increased governmental scrutiny.

The EU AI Act, passed by the European Parliament in March 2024, is the most notable legislation to date governing the commercial use of AI. It’s likely only the start of new regulations to address short- and long-term risks. Staying ahead of regulatory requirements is essential not only for remaining in compliance but also for protecting business dealings, which can be affected around the globe when a company falls out of compliance. As part of the right platform engineering strategy, responsible AI practices can identify and mitigate risks through:

Automating workflow checks that look for bias and verify ethical AI practices
Creating a responsible AI “red” team to test and validate models
Deploying observability tooling and infrastructure to provide real-time monitoring
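As one illustration of what an automated bias check in a workflow might look like, the sketch below compares positive-prediction rates across groups and blocks a release when the gap is too wide. The article does not prescribe a metric; the demographic parity gap used here, the function names, and the 0.1 threshold are all assumptions chosen for the example, and a real pipeline would likely combine several fairness measures.

```python
from collections import defaultdict


def positive_rates(predictions, groups):
    """Share of positive (1) predictions per group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(predictions, groups):
    """Largest difference in positive rate between any two groups."""
    rates = positive_rates(predictions, groups)
    if not rates:
        raise ValueError("no predictions to evaluate")
    return max(rates.values()) - min(rates.values())


def bias_gate(predictions, groups, max_gap=0.1):
    """Return (passed, gap); max_gap is an illustrative threshold."""
    gap = demographic_parity_gap(predictions, groups)
    return gap <= max_gap, gap


if __name__ == "__main__":
    preds = [1, 0, 1, 1, 0, 0, 1, 0]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    passed, gap = bias_gate(preds, groups)
    print("passed" if passed else "blocked", f"gap={gap:.2f}")
```

A check like this could run as a required step before a model is promoted from the registry to regional teams, with failures routed to the red team for review.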

Platform engineering also future-proofs enterprise AI operations

As AI growth and the resulting demands on enterprise resources compound, IT leaders must align their global IT architecture with an operating model designed to accommodate distributed AI at scale. Doing so is the only way to prepare data science and AIOps teams for success.

Purpose-built platform engineering solutions enable IT teams to meet business needs and operational requirements while providing companies with a strategic advantage. These solutions also help organizations scale their operations and governance, ensuring compliance and alignment with responsible AI practices.

There is no better approach to scaling AI operations. It’s never too early (or late) to build platform engineering solutions to pave your company’s path to AI maturity.

Article Tags

AIOps, platform engineering, vultr

About Kevin Cochrane

Kevin Cochrane is CMO at Vultr
