The chaos engineering company Gremlin has updated its reliability testing suite with new features like custom test suites, reliability scores, and enterprise-wide dashboards. The new enhancements are designed to provide site reliability engineers with ways to customize their reliability standards and measure progress on those standards. Admins can now create their own test suites to … continue reading
Economic uncertainty, cloud-native technology, and demands for data sovereignty will reshape the cloud in 2023, according to the Predictions 2023: Cloud Computing report by Forrester. Despite the economic uncertainty that lies ahead, cloud-native is projected to go mainstream as companies plan to freeze investment in legacy systems and invest more in technologies such as Kubernetes. … continue reading
Over the past seven years, we’ve seen Kubernetes become the de facto platform for building modern applications. With this shift, application architectures have become increasingly distributed, dynamic, and modular. As a byproduct, logging data has exploded – depending on the company, anywhere from one terabyte to multiple petabytes of data can be generated each day. … continue reading
Lightstep announced that it is creating a differentiated portfolio for app development with the general availability of Lightstep Incident Response. The new solution will enable developers and site reliability engineers (SREs) to reduce downtime through the integration of service context and automation for responding to incidents, such as a software bug, power outage, or down … continue reading
New Relic announced the general availability of a new infrastructure monitoring solution that helps DevOps, SRE and ITOps teams isolate offending infrastructure components and view all related telemetry — including logs, events, and alerts — in context. The new solution aims to tackle the three key issues that surround infrastructure: the complexity of infrastructure, handling … continue reading
Although companies have been adopting cloud native technologies at higher rates than ever before, 85% of companies have yet to fully cross the chasm to adoption when it comes to Kubernetes and cloud native. However, they are quickly moving in that direction. Nearly 78% of companies in Canonical’s Kubernetes and Cloud Native Operations Report had … continue reading
Just over half of SREs (53%) said that the number one cloud application monitoring challenge is unified visibility across the stack. Organizations are looking toward AI and machine learning to solve these problems, but adoption of AIOps is slow. This is according to the 2021 SRE Report that was conducted by the digital experience monitoring … continue reading
A new study found that the increased need for process automation and SREs has been fueled by companies’ increase in digital transformation initiatives as well as remote and hybrid work policies. Industries have seen a 90% increase in customer-affecting issues and 68% of businesses reported an increased cost of downtime since the pandemic began. Now, … continue reading
xMatters’ new adaptive incident management feature advancements provide increased automation across each stage of the incident management lifecycle – diagnosis and collaboration, resolution and post-incident learning. An increase in the number of change-related incidents and the furious speed of new software releases demand more automation be applied across the incident management lifecycle to accelerate actions. … continue reading
Puppet introduced the public beta of Relay, an event-driven automation platform that automates across any cloud infrastructure, tools and APIs that developers, DevOps engineers, and SREs currently manage manually. “Without a way to manage and automate the flood of events and hundreds of APIs developers use – time, money and mental capital are being … continue reading
In order to bring more effective operational practices, DevOps and site reliability engineering (SRE) teams need to go through a culture change within the organization. Red Hat held its virtual summit this week where it talked about how to reinvent IT Ops as SRE. According to the company, change can happen by automating processes and … continue reading
Portshift has announced the release of Kubei, an open-source Kubernetes runtime vulnerabilities scanner tool. According to the company, while there are a lot of options already out there, not all scanners are the same and differ by the number of feeds they consume, updates they produce and information they provide. ‘All tools, however, require some … continue reading