Silencing the Noise: Redis' Journey to Efficient Incident Management With Squadcast

As a Global Cloud Operations Manager, replacing any monitoring system could be an immense effort involving a lot of concerns. Our deployment with Squadcast has been a great experience from A to Z. The team has been constantly responsive and helpful throughout the deployment and adoption process. Having such a partnership gives me the peace of mind I need to be 100% sure no alerts go unnoticed. Since the implementation of Squadcast, we’ve managed to reduce the number of incoming alerts from tens of thousands to hundreds, thanks to the flexible deduplication mechanism. CloudOps work has never been more organized and for the first time, we are able to use the Postmortem feature as a true knowledge base for repeating alerts. Squadcast brings simplicity and flexibility and has a direct effect on decreasing alert fatigue and increasing awareness.

Avner Yaacov
Senior Manager of Cloud Operations

About Redis
Industry: Storage, Streaming & Messaging
Location: Mountain View, California

Redis is the world's leading real‑time data platform. It is an open-source, in-memory data store used as a database, cache, streaming engine, and message broker. Redis makes apps faster by creating a data foundation for a real-time world.

Because developers rely on Redis in real time, uptime is critical. Any downtime or outage has to be addressed immediately by alerting the engineers. Redis also offers SLAs that guarantee 99.999% availability for select deployments, which corresponds to roughly five minutes of downtime per year.
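
To make the availability figure concrete, the yearly downtime budget implied by a target can be worked out with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical and it assumes a 365-day year. Under those assumptions, 99.999% availability leaves about 5.26 minutes of downtime per year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the maximum yearly downtime, in minutes, allowed by an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


# Compare a few common availability targets.
for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_budget_minutes(target):.2f} minutes/year")
```

With a budget this small, a single alert that goes unnoticed for a few minutes can consume the entire yearly allowance.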

Task at Hand
Prior to Squadcast, the CloudOps and DevOps teams relied on email-based alerting to monitor most of their services. As their systems grew, handling alerts and managing incidents via email became challenging.

Beyond the difficulty of handling alerts effectively, the tool they used for alerting on high-severity incidents did not offer the flexibility to merge duplicate incidents the way they wanted. As a result, they began evaluating other tools on the market. They chose Squadcast as the best alternative for their use case because the platform meets the necessary security and compliance requirements, including the EU-US Privacy Shield, ISO 27001, SOC2 Type II, and GDPR. Since adopting Squadcast, they have improved not only Incident Response but also their overall Reliability.

Challenges and Solutions
Each challenge below is followed by the solution Redis adopted with Squadcast.

Rudimentary Alert Management System: Before introducing Squadcast, Redis used an email-based alerting system. Every alert generated by their Monitoring Tools landed in their inboxes and demanded acknowledgement. A single incident could produce up to 100 repetitive emails, creating significant noise. With many servers in operation, this flood of email quickly became unmanageable.

Streamlined Alert Management With Deduplication Rules: Upon integrating Squadcast and its Deduplication Rules, Redis was able to significantly reduce alert noise. With Deduplication Rules, Incidents are systematically organized and similar incidents are grouped together, granting easy access to related Incidents when necessary. This not only minimized the influx of redundant Incidents but also fostered a more structured approach to alert management.
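
The general idea behind deduplication can be sketched in a few lines. This is a minimal illustration, not Squadcast's implementation or API: the alert fields (`service`, `check`, `time`), the 30-minute window, and the `deduplicate` function are all assumptions made for the example. Alerts that share a key and arrive close together are folded into one incident instead of each raising a new one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    key: tuple            # (service, check) pair the alerts were grouped on
    first_seen: datetime
    last_seen: datetime
    count: int = 1        # number of alerts folded into this incident


def deduplicate(alerts, window=timedelta(minutes=30)):
    """Group alerts with the same (service, check) that arrive within
    `window` of the previous alert for that key (hypothetical rule)."""
    latest_by_key = {}
    incidents = []
    for alert in sorted(alerts, key=lambda a: a["time"]):
        key = (alert["service"], alert["check"])
        current = latest_by_key.get(key)
        if current is not None and alert["time"] - current.last_seen <= window:
            # Same problem, still ongoing: merge into the existing incident.
            current.count += 1
            current.last_seen = alert["time"]
        else:
            # New problem, or the old one went quiet long enough: open a new incident.
            current = Incident(key, alert["time"], alert["time"])
            latest_by_key[key] = current
            incidents.append(current)
    return incidents


# 100 repeated alerts for one failing check, plus one unrelated alert.
start = datetime(2024, 1, 1, 0, 0)
alerts = [
    {"service": "db-node-1", "check": "memory", "time": start + timedelta(minutes=i)}
    for i in range(100)
]
alerts.append({"service": "db-node-2", "check": "latency", "time": start})

incidents = deduplicate(alerts)
print(len(alerts), "alerts ->", len(incidents), "incidents")
```

In this toy data set, the 100 repeated alerts collapse into a single incident that still records how many alerts it absorbed, which mirrors the "up to 100 repetitive emails" scenario described above.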

Absence of Quantifiable Metrics: When Redis relied on emails for managing incident actions, they faced a significant void in metrics. Without a structured system, there was no way to gauge when an incident was acknowledged or when a response was initiated, leaving them in the dark regarding Incident Management efficacy.

Implementation of MTTA and MTTR Metrics With Incident Analytics: The Incident Analytics feature introduced Mean Time to Acknowledge (MTTA) and Mean Time to Resolve (MTTR) metrics, which became a game-changer for the Redis operations team. Decisions had previously been based on intuition; these metrics provided concrete data on which incidents affected the team most and how many incidents they were encountering. Understanding the true volume of incidents left them better positioned to determine staffing needs. This led to a more structured ramp-up of their operations and a more professional approach to Incident Management.
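
As a rough illustration of what these two metrics measure, the sketch below computes them from incident timestamps. It is not how Squadcast calculates them: the record fields (`triggered`, `acknowledged`, `resolved`) and the `mean_minutes` helper are hypothetical, the sample data is invented, and MTTR is assumed here to run from trigger to resolution.

```python
from datetime import datetime
from statistics import mean


def mean_minutes(incidents, start_field, end_field):
    """Average elapsed minutes between two timestamps, skipping incidents
    where the end timestamp is missing (e.g. not yet resolved)."""
    durations = [
        (incident[end_field] - incident[start_field]).total_seconds() / 60
        for incident in incidents
        if incident.get(end_field) is not None
    ]
    return mean(durations) if durations else None


# Invented sample data for illustration.
incidents = [
    {
        "triggered": datetime(2024, 1, 1, 10, 0),
        "acknowledged": datetime(2024, 1, 1, 10, 4),
        "resolved": datetime(2024, 1, 1, 10, 40),
    },
    {
        "triggered": datetime(2024, 1, 1, 12, 0),
        "acknowledged": datetime(2024, 1, 1, 12, 10),
        "resolved": datetime(2024, 1, 1, 13, 0),
    },
]

mtta = mean_minutes(incidents, "triggered", "acknowledged")
mttr = mean_minutes(incidents, "triggered", "resolved")
print(f"MTTA: {mtta:.1f} min, MTTR: {mttr:.1f} min")
```

Neither number can be derived from an email inbox, because emails carry no record of when someone acknowledged or resolved the underlying problem.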

Fragmented Toolset: Prior to Squadcast, Redis found itself navigating through scattered tools for operational efficiency. The primary concern was their reliance on Confluence for managing runbooks, which often led to decentralization and potential inefficiencies.

Centralized Management With Runbooks Integrated into Squadcast: Transitioning to Squadcast transformed Redis' approach to runbook management. By consolidating runbooks within Squadcast, they achieved a unified, streamlined process, eliminating the pitfalls associated with scattered tooling. This transition not only enhanced accessibility but also promoted operational consistency and efficiency.

Lost Incident Histories in Email-Based Alert System: With the email alerting system that Redis previously employed, there was a gap in tracking and documenting actions taken for specific incidents. This limitation was particularly challenging for newcomers, who found it nearly impossible to trace the steps undertaken during past incidents and understand the organization's response patterns.

Incident Documentation With Incident Notes: Redis adopted the approach of using Incident Notes within Squadcast as their Postmortems. As a result, anyone in the organization, whether a new recruit or an existing employee, can readily access historical data and the actions taken during past incidents. This greatly reduces toil across the organization.

Key Benefits
Easy Setup & Outstanding Support

With the assistance of Squadcast's Support and Sales teams, Redis experienced a seamless migration. By offering in-depth insight into Squadcast's capabilities, these teams not only saved Redis valuable time but also ensured a smooth transition to the platform.

Transition to Efficient Incident Management

Where basic alert systems once inundated teams with emails, requiring manual acknowledgment, Squadcast has now streamlined the Incident Management process. With a remarkable reduction of 130,000 email alerts, Squadcast has effectively mitigated alert noise, ensuring teams can focus on what truly matters.

SRE (Site Reliability Engineering) Adoption

Squadcast's features, including Runbooks and Incident Notes, serve as pillars in establishing and nurturing a strong SRE culture at Redis.

Copyright © Squadcast Inc. 2017-2024