Friday, June 17, 2022

Show HN: Root Cause as a Service – Never dig through logs again https://ift.tt/R0oaMJc

Hey folks, Larry, Ajay and Rod here! We address the age-old, painful problem of digging through logs to find the root cause when a problem occurs. No one likes searching through logs, so we spent a few years analyzing hundreds of real-world incidents to understand how humans troubleshoot in logs. Then we built a solution that automatically finds the same root cause indicators a human would otherwise have to search for manually. We call it Root Cause as a Service (RCaaS). RCaaS works with any app and does not require manual training or rules. Our foundational thoughts and more details can be found here: https://ift.tt/1JTbwuz.

Obviously, everyone is skeptical when they hear about RCaaS. We encourage you to try it yourself, but we also have a really strong validation point. One of our customers performed a study using 192 actual customer incidents from 4 different products and found that Zebrium correctly identified the root cause indicators in the logs in over 95% of the incidents; see https://ift.tt/T0X8RIq.

For those who are interested, this is actually our second Show HN post; our first was last June: https://ift.tt/jFZmx6E. The link in that post points to our current home page, but our initial comment was, "We're excited to share Zebrium's autonomous incident detection software". At the time, our focus was on a tool that used unsupervised ML to automatically detect any kind of new or unknown software incident. We had done a lot of customer testing and were achieving > 90% detection accuracy in catching almost any kind of problem. But what we underestimated was just how high the bar is for incident detection. If someone is going to hook you up to a pager, then even an occasional false positive is enough for a user to start cursing your product! And users quickly forget the times when your product saved their bacon by catching problems that they would otherwise have missed.
But late last year we had a huge aha moment! Most customers already have monitoring tools in place that are really good at detecting problems, but what they don't have is an automated way to find the root cause. So we built some really elegant integrations for Datadog, New Relic, Elastic, Grafana, Dynatrace, AppDynamics and ScienceLogic (with more to come via our open APIs), so that when there's a problem, you see details of the root cause directly on your monitoring dashboard. Here's a 2-minute demo of what it looks like: https://youtu.be/t83Egs5l8ok. You're welcome to sign up for a free trial at https://www.zebrium.com, and we'd love to hear your questions and feedback.

June 17, 2022 at 04:55PM
