AI data cloud company Snowflake today said it has signed a definitive agreement to acquire Observe, a leader in AI-powered observability. The acquisition will integrate Observe’s platform directly into Snowflake, expanding Snowflake’s capabilities in the IT operations management market, which Gartner research values at more than $50 billion. The combination of Observe’s AI Site … continue reading
Managing cloud-native environments has never been harder. Modern Site Reliability Engineering (SRE) teams are buried under a flood of telemetry, incidents, and constantly changing infrastructure. For years, the playbook was simple: add more dashboards, collect more metrics, write better runbooks, and automate what you can. But as systems scale, even the best-run teams are hitting a … continue reading
Mezmo, the active telemetry platform for AI agents, today launched its AI SRE (Site Reliability Engineering) agent for root cause analysis ahead of KubeCon North America. The company’s secret sauce is context engineering, which supercharges AI agents with unmatched speed and precision. “We’ve built the fastest and most performant AI SRE in the world – … continue reading
Integrating AI into your workflows inherently creates risk, but it can also be the key to managing risk. No one understands the art of risk mitigation better than site reliability engineers, or SREs. From incident management to operational toil, SRE teams are built to handle the unpredictable. Now, with AI stepping into the picture, those … continue reading
The chaos engineering company Gremlin has updated its reliability testing suite with new features like custom test suites, reliability scores, and enterprise-wide dashboards. The new enhancements are designed to provide site reliability engineers with ways to customize their reliability standards and measure progress on those standards. Admins can now create their own test suites to … continue reading
Economic uncertainty, cloud-native technology, and demands for data sovereignty will reshape the cloud in 2023, according to the Predictions 2023: Cloud Computing report by Forrester. Despite the economic uncertainty that lies ahead, cloud-native is projected to go mainstream as companies plan to freeze investment in legacy systems and invest more in technologies such as Kubernetes. … continue reading
Over the past seven years, we’ve seen Kubernetes become the de facto platform for building modern applications. With this shift, application architectures have become increasingly distributed, dynamic, and modular. As a byproduct, logging data has exploded – depending on the company, anywhere from one terabyte to multiple petabytes of data can be generated each day. … continue reading
Lightstep announced that it is creating a differentiated portfolio for app development with the general availability of Lightstep Incident Response. The new solution will enable developers and site reliability engineers (SREs) to reduce downtime through the integration of service context and automation for responding to incidents, such as a software bug, power outage, or down … continue reading
New Relic announced the general availability of a new infrastructure monitoring solution that helps DevOps, SRE and ITOps teams isolate offending infrastructure components and view all related telemetry — including logs, events, and alerts — in context. The new solution aims to tackle the three key issues that surround infrastructure: the complexity of infrastructure, handling … continue reading
Although companies have been adopting cloud native technologies at higher rates than ever before, 85% of companies have yet to fully cross the chasm to adoption when it comes to Kubernetes and cloud native. However, they are quickly moving in that direction. Nearly 78% of companies in Canonical’s Kubernetes and Cloud Native Operations Report had … continue reading
Just over half of SREs (53%) said that the number one cloud application monitoring challenge is unified visibility across the stack. Organizations are looking toward AI and machine learning to solve these problems, but adoption of AIOps is slow. This is according to the 2021 SRE Report that was conducted by the digital experience monitoring … continue reading
A new study found that the increased need for process automation and SREs has been fueled by companies’ increase in digital transformation initiatives as well as remote and hybrid work policies. Industries have seen a 90% increase in customer-affecting issues and 68% of businesses reported an increased cost of downtime since the pandemic began. Now, … continue reading