How Spectre and Meltdown hardware flaws alter IT Ops playbook

Published: January 19th, 2018

On New Year’s Eve, most IT Ops pros had never heard of Spectre and Meltdown. Within two days, the latest vulnerabilities became a key IT management priority for 2018. Now IT Ops pros are coming to terms with the range of systems they may need to patch and the potential impact the fixes might have on the performance of business-critical infrastructure.

Unlike most vulnerabilities that have dominated IT, Spectre and Meltdown exploit a weakness in hardware designed decades ago to improve system performance. The revelation also dispels presumptions that in-memory data and virtual machines can’t leak.

The existence and scope of Spectre and Meltdown surfaced unexpectedly in a report published by The Register on Jan. 2, a week before Project Zero, a security research team organized by Google, planned to announce the vulnerabilities. The flaws were discovered last summer by Jann Horn, a 22-year-old cybersecurity researcher at Google, and the Project Zero team drafted papers covering the three variants (two for Spectre and one for Meltdown).

In addition to Google, Project Zero coordinated with Amazon Web Services (AWS), Microsoft, IBM, Oracle, Red Hat and SUSE, among other key software players. Likewise, Project Zero worked with the major providers of hardware, including Intel, AMD and ARM, and a swath of systems vendors, as the industry scrambled to patch core systems before the release of the technical documentation authored by Horn and published by Google.

At the heart of the discovery is the potential for side-channel analysis techniques to enable unauthorized access to secure data by exploiting what’s known as “speculative execution,” a capability in processors that enables high performance. To date, there are no known breaches exploiting Spectre and Meltdown, but now that the flaws are public, experts are advising organizations with sensitive data on accessible systems to patch them.

Despite initial misinformation when news of Spectre and Meltdown surfaced earlier this month, experts say the risk is not expected to impact most PC and smartphone users because these are read-only vulnerabilities. An intruder can only read information such as passwords and cryptographic keys but can’t spread or execute malware or ransomware. Presuming PC and smartphone users keep their devices patched, the risk is minimal, according to experts.

The headache for IT Ops pros is the potential for Spectre and Meltdown to be used to gather sensitive data from servers, storage, networks (even CDNs), virtual machines and, notably, public cloud providers, which host multi-tenant infrastructure. Among unpatched systems, the most vulnerable are those not properly secured using encryption and privilege access controls. The problem is that applying the patches can lead to varying levels of performance degradation because of increased CPU utilization brought on by a variety of factors, including the impact on speculative execution.

IT Ops pros and infosec teams are in uncharted territory on several fronts. Unlike most vulnerabilities that target weaknesses in software, Spectre and Meltdown exploit weaknesses that were presumed safe havens. “Security that was believed to be in place to separate data used by one application from being accessed by another may be compromised,” according to a blog post by Chad Erbe, a professional services architect at BeyondTrust, whose software manages privilege access to systems. “All of this is taking place at the hardware level, but the flaws are at the software level. Essentially, we have a physical security issue in the virtual world.”

Thomas LaRock, head geek at performance management vendor SolarWinds, said that because the flaws allow access to the kernel memory, any user process can access it. “So, any host is at the mercy of the guests,” LaRock said. “This applies to VMs and containers. Nothing is safe. That’s different than saying this is high risk, as you still need the bad actors to have access to the systems.”

To make the problem go away entirely would require replacing every potentially vulnerable processor, said Morey Haber, VP of technology at BeyondTrust. Given that every Intel CPU manufactured since 1995 (except Atom and Itanium) is potentially vulnerable, according to Project Zero’s FAQ created and posted by the Graz University of Technology in Austria, the cost would be untenable. Also, many older systems run on hardware that is no longer manufactured, yet host legacy applications that can’t run on newer systems.

“The only viable short-term mitigation is to patch our operating systems and hypervisors at the lowest level (kernel) to prevent inappropriate memory calls that can leak information from an application or virtual machine,” Haber said. BMC Software executives Sean Berry, a solution evangelist in BMC’s Data Center Automation and Cloud group, and Shawn Jaques, the company’s director of marketing, explained in a blog post how to balance the tradeoff between patching systems and addressing performance.

“The pervasiveness of the vulnerability across servers, devices and operating systems is nearly unprecedented,” they noted. “Since this vulnerability impacts a feature that improves performance, there is a potential significant performance hit from applying the software patches. The real challenge for most organizations is to effectively apply their patching process across a multitude of tools and teams to correct the systems before hackers start to exploit the now-public vulnerabilities.”

SolarWinds’ LaRock agreed, and added the following advice:

Assess your risk. If your server is isolated from intrusion (no browser, etc.), and not sharing the memory of other servers (so, not a guest with others on a host), then you have lower risk.
Assess the importance of the workload and/or data. Not every system is top secret or mission critical.
Gather inventory details: know what chips you have, which OS, what versions of database software, etc.
Build a patching plan, using the above details to help you prioritize.
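LaRock's four steps can be sketched as a small prioritization script. This is only an illustration under stated assumptions: the inventory fields, the example hosts and the scoring weights are all hypothetical and do not come from SolarWinds or any vendor tool.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """One inventory record; every field name here is hypothetical."""
    name: str
    cpu: str
    os: str
    shared_tenancy: bool       # guest sharing a physical host with other guests
    runs_untrusted_code: bool  # e.g. browsers or user-supplied workloads
    criticality: int           # 1 (low) to 3 (mission critical)


def risk_score(host: Host) -> int:
    """Combine exposure and workload importance into one illustrative score."""
    exposure = 1
    if host.shared_tenancy:
        exposure += 2  # arbitrary weight chosen for the example
    if host.runs_untrusted_code:
        exposure += 2  # arbitrary weight chosen for the example
    return exposure * host.criticality


def patch_plan(inventory: list[Host]) -> list[Host]:
    """Return hosts ordered from most to least urgent to patch."""
    return sorted(inventory, key=risk_score, reverse=True)


# Made-up inventory used only to exercise the functions above.
inventory = [
    Host("db-isolated", "Xeon E5", "RHEL 7", False, False, 3),
    Host("vdi-pool", "Xeon E5", "Windows 10", True, True, 2),
    Host("build-box", "Core i7", "Ubuntu 16.04", False, True, 1),
]

for host in patch_plan(inventory):
    print(f"{host.name}: score {risk_score(host)}")
```

In practice the weights would reflect an organization's own risk assessment; the point is simply that inventory data plus a consistent scoring rule yields a defensible patch order.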

The ones with the most at stake are the cloud providers, LaRock noted, because they have agreed-upon SLAs. LaRock’s colleague, Mike Heffner, a co-founder and lead engineer for Librato, now a SolarWinds company, noted a degradation in performance of the instances running on AWS in the weeks leading up to the disclosure of Spectre and Meltdown. In his original blog post about 10 days ago, Heffner documented the impact based on charts of a Python service worker tier in late December, where CPU utilization rose approximately 25 percent. Likewise, on Jan. 3, he noted that Cassandra tiers saw similar spikes, at one point reaching as high as 45 percent. In an update late last week, Heffner reported a marked reduction in CPU utilization to pre-patch levels.

It’s not surprising that the cloud providers would be on top of their game, considering they must provide the extra capacity or take a financial hit. “The cloud providers have agreed upon SLAs already in place, they will need to find a way to scale,” LaRock said. “If they suddenly had a bump in 30 percent CPU utilization due to customer usage, they would find a way to make it work. And not every server is being impacted with performance issues as far as I know.”
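To put the scaling question in concrete terms, here is a rough capacity calculation. It assumes the reported increases are relative (25 percent more CPU work than before) and that load spreads evenly across identical instances. Neither assumption is confirmed by Heffner's post, and the tier size and baseline utilization are invented for the example.

```python
def instances_needed(current_instances: int,
                     current_util_pct: int,
                     increase_pct: int,
                     target_util_pct: int) -> int:
    """Estimate how many instances keep average CPU at or below the target.

    All utilization figures are whole percentages so the arithmetic stays
    exact. Assumes demand rises by increase_pct relative to today's load.
    """
    demand = current_instances * current_util_pct * (100 + increase_pct)
    capacity_per_instance = 100 * target_util_pct
    return -(-demand // capacity_per_instance)  # ceiling division on integers


# Hypothetical tier: 20 instances averaging 60 percent CPU before patching,
# with the goal of staying at 60 percent afterward.
for bump in (25, 30, 45):
    print(f"{bump} percent more CPU work -> {instances_needed(20, 60, bump, 60)} instances")
```

Under these assumptions the instance count grows in direct proportion to the added CPU work, which is why a provider bound by SLAs has to absorb the cost as extra capacity.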

While LaRock believes the problem will be short-lived, others say it’ll remain an issue indefinitely. “Expect this one to linger for a long time,” wrote Forrester Research principal analyst Jeff Pollard. “Thankfully, microcode fixes are available, but those fixes are being distributed by hardware manufacturers. That is a challenge; although enterprise organizations with support contracts can overcome it, for end-user systems it is a nightmare. The development, distribution, and installation of these patches will never end. On systems that don’t get patched, it means that information is at risk.”
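For teams tracking which Linux systems remain unpatched, one possible check is sketched below. It assumes a kernel recent enough to report mitigation status under /sys/devices/system/cpu/vulnerabilities, an interface the article itself does not mention; older kernels and other operating systems will not have it, in which case the function returns an empty result. The function names are hypothetical.

```python
from pathlib import Path

# Directory where newer Linux kernels report per-vulnerability status.
SYSFS_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(directory: Path = SYSFS_DIR) -> dict[str, str]:
    """Map each vulnerability file name to the status line the kernel reports."""
    if not directory.is_dir():
        return {}  # interface not present on this system
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(directory.iterdir())
        if entry.is_file()
    }


def unmitigated(status: dict[str, str]) -> list[str]:
    """Return the names whose status line starts with 'Vulnerable'."""
    return [name for name, line in status.items() if line.startswith("Vulnerable")]


status = read_mitigation_status()
if not status:
    print("Kernel does not expose mitigation status here")
for name, line in status.items():
    print(f"{name}: {line}")
```

A check like this only reports what the running kernel believes; it does not confirm that microcode or hypervisor updates from the hardware vendor have been applied.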

The Spectre and Meltdown FAQ published by Project Zero includes a list of, and links to, vulnerability advisories and documentation from more than 30 top software and hardware providers, which include access to the latest patches.

Article Tags

IT ops, itsm, performance, security

About Jeffrey Schwartz

Jeffrey Schwartz has covered all aspects of IT, from datacenter, networking, storage, cloud and end-user client computing infrastructure, to software development, collaboration and services for nearly three decades. Most recently he was editor-in-chief of Redmond magazine, where he also had roles with sister publications Virtualization Review, Application Development Trends and Visual Studio Magazine, among others. Follow him on Twitter @JeffreySchwartz.

