Lightstep Announces Root Cause Analysis in Three Clicks
Lightstep, the leading provider of distributed tracing and observability software for organizations adopting microservices and serverless, announced major updates to its observability solution to help developers speed up root cause analysis and simplify incident response. With the introduction of log analysis and “Top Changes”, development teams can zero in on a single line of code and identify the cause of a regression in under a minute.
“Microservices and serverless architectures make it extremely difficult for developers to quickly assess the impact of a regression and isolate the root cause. Whether it’s due to our reliance on tribal knowledge, a lack of context, technology fatigue or red herrings that distract us from looking at the right data, this is a roadblock many developers are all too familiar with. By adding log search and aggregation, and building on our automated intelligence solutions, we’re uniquely positioned to allow any developer working on a deployment to side-step this issue and quickly pinpoint the root cause of a regression in under one minute,” said Katia Bazzi, Senior Software Engineer, Lightstep.
This update builds on Lightstep’s Service Health feature by introducing logs as part of Lightstep’s telemetry data set. As an essential part of the root cause analysis workflow, log search and aggregation help developers pinpoint a regression to a single line of code – allowing them to use the context of traces to paint a full picture of what’s changed.
With this update, Lightstep customers can:
- Identify the most frequently occurring logs in an error or latency regression
- Search across logs to narrow down the root cause
- Investigate logs along the critical path to understand the root cause of a latency spike
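As a rough illustration of the first capability, the sketch below counts which log messages occur most often in a set of logs collected during a regression. It is not Lightstep's implementation or API: the function name `most_frequent_logs` and the sample messages are hypothetical, and a real system would likely group messages by pattern rather than by exact text.

```python
from collections import Counter


def most_frequent_logs(log_messages, top_n=3):
    """Return the top_n most common log messages with their counts."""
    return Counter(log_messages).most_common(top_n)


# Hypothetical log messages attached to traces from a regression window.
regression_logs = [
    "timeout calling inventory-service",
    "cache miss for key user-profile",
    "timeout calling inventory-service",
    "retry budget exhausted",
    "timeout calling inventory-service",
    "cache miss for key user-profile",
]

for message, count in most_frequent_logs(regression_logs):
    print(f"{count}x {message}")
```

Sorting by frequency puts the timeout message first, which is the kind of signal that would narrow an investigation to one dependency.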
In addition, Lightstep’s automated intelligence algorithms surface the operations that have experienced the greatest changes during a specific time period, whether in real time or during a deployment that occurred hours ago. “Top Changes” identifies which error rates, latency, throughput or other service level indicators (SLIs) have changed the most, enabling teams to streamline investigations and rapidly resolve incidents.
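The general idea of ranking operations by how much an SLI changed between two time windows can be sketched as follows. This is an assumption-laden illustration, not the algorithm behind “Top Changes”: the function `top_changes`, the use of relative change as the ranking metric, and the sample latency figures are all hypothetical.

```python
def top_changes(baseline, current, top_n=3):
    """Rank operations by absolute relative change in an SLI between two windows."""
    changes = []
    for operation, before in baseline.items():
        after = current.get(operation)
        if after is None or before == 0:
            # Skip operations missing from the current window or with a zero baseline.
            continue
        relative_change = (after - before) / before
        changes.append((operation, relative_change))
    # Largest swings first, regardless of direction.
    changes.sort(key=lambda item: abs(item[1]), reverse=True)
    return changes[:top_n]


# Hypothetical average latency per operation, in milliseconds.
baseline_latency_ms = {"checkout": 120.0, "search": 80.0, "login": 50.0}
current_latency_ms = {"checkout": 300.0, "search": 84.0, "login": 45.0}

for operation, change in top_changes(baseline_latency_ms, current_latency_ms):
    print(f"{operation}: {change:+.0%}")
```

The same comparison could be run on error rates or throughput by swapping in those values for the two windows.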