This week’s highlighted open-source project is Litmus, a platform for running chaos engineering tests that identify weaknesses in infrastructure so they can be addressed before they become real issues. The project is made up of a control plane called chaos-center, which allows you to construct, schedule, and visualize chaos workflows, and an … continue reading
The chaos engineering company Gremlin has updated its reliability testing suite with new features like custom test suites, reliability scores, and enterprise-wide dashboards. The new enhancements are designed to provide site reliability engineers with ways to customize their reliability standards and measure progress on those standards. Admins can now create their own test suites to … continue reading
The mean time to resolve (MTTR), the industry gold standard for success and efficiency, proves to be an inaccurate metric for success. The 2021 VOID report by chaos engineering company Verica set out to draw conclusions about how to tackle software-based failures, but due to the distribution of data, the company found MTTR wasn’t a … continue reading
There may never be a more perfect time to experiment with chaos engineering than right now, in 2020, while many IT teams and their end customers continue to work remotely as COVID-19 rages on. There’s been much written about chaos engineering, particularly its impact on DevOps organizations. Chaos engineering (CE) entails experimenting on a system … continue reading
Since Google released its Site Reliability Engineering (SRE) book in 2016, the field has gained widespread attention. However, adopting SRE as defined by Google is not as applicable to most organizations as it may seem, according to Sanjeev Sharma, a principal analyst of Accelerated Strategies, who spoke at Catchpoint’s “SRE from Home” virtual event last … continue reading
Chaos engineering company Gremlin is introducing an easier way for DevOps teams to start using chaos engineering. According to the company, chaos engineering is an approach to detecting failures before outages happen. “With the ongoing migration to microservice, serverless, and cloud environments, we believe the industry has answered ‘why do chaos engineering,’ and has begun … continue reading
Gremlin announced the release of Application Level Fault Injection (ALFI) earlier this week, introducing application-level failure injection and support for serverless environments to its Failure-as-a-Service platform offerings, alongside its successful Series B funding of $18 million led by Redpoint Ventures. Gremlin was launched a year ago by former Amazon and Netflix developers and was designed … continue reading