The path to becoming a successful SRE lies in continuous learning. There is a plethora of great open source projects out there for SREs and DevOps engineers, each with new and exciting implementations and often tackling unique challenges. These open source projects do the heavy lifting so you can do your job more easily.
In this blog we look at some of the top, most sought-after open source projects in the areas of monitoring, deployment and maintenance. Among the projects we have covered are those that simulate network traffic and those that let you model unpredictable (chaotic) events to develop dependable systems.
And, while you are at it, we thought we could help a little more by providing some essential DevOps and SRE reading suggestions as well for all you tech folks out there.
We hope this keeps you good company.
Cloudprober is an active monitoring application that helps you spot failures before your customers do. It uses an "active" monitoring model to check that your components are operating as intended. It runs probes proactively, for instance to verify that your frontends can reach your backends. Similarly, a probe can verify that your on-premise systems can actually reach your in-cloud VMs. This approach makes it easy to monitor your applications' configurations independently of how they are implemented, and lets you pin down what is broken in your system.
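To make the idea of an active probe concrete, here is a minimal Python sketch. It is not Cloudprober's API or configuration format; `ProbeResult`, `run_probe` and the two check functions are hypothetical names used only for illustration. Each probe runs a check on purpose, rather than waiting for a failure to be reported, and records success and latency.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    name: str
    success: bool
    latency_s: float
    error: str = ""


def run_probe(name: str, check: Callable[[], None]) -> ProbeResult:
    """Run one active check and record whether it passed and how long it took."""
    start = time.monotonic()
    try:
        check()  # a check signals failure by raising an exception
        return ProbeResult(name, True, time.monotonic() - start)
    except Exception as exc:
        return ProbeResult(name, False, time.monotonic() - start, str(exc))


def frontend_reaches_backend() -> None:
    # Hypothetical check; a real one might open a TCP connection or send an HTTP request.
    pass


def onprem_reaches_cloud_vm() -> None:
    # Hypothetical check that simulates an unreachable VM.
    raise ConnectionError("connection timed out")


if __name__ == "__main__":
    probes = [
        ("frontend->backend", frontend_reaches_backend),
        ("onprem->cloud-vm", onprem_reaches_cloud_vm),
    ]
    for probe_name, probe_check in probes:
        print(run_probe(probe_name, probe_check))
```

In a real setup the probes would run on a schedule and export their results as metrics, so a failing path shows up before a customer reports it.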
Features:
Cloud Operations Sandbox is an open source tool that lets practitioners learn about Google's Service Reliability Engineering practices and apply them to their own cloud services using the Cloud Operations suite (formerly Stackdriver). It is based on Hipster Shop, a cloud-native microservices demo application. Note: This requires a Google Cloud account.
Features:
A Kubernetes utility that lets you observe the versions of the images currently running in a cluster. It can also display those image versions in a table on a Grafana dashboard.
Features:
Istio is an open platform for connecting microservices, monitoring traffic flow between them, enforcing policies and aggregating telemetry data in a standardised way. Istio's control plane provides an abstraction layer over the underlying cluster management platform, such as Kubernetes.
Features:
Checkov is a static code analysis tool for Infrastructure-as-Code. It scans cloud infrastructure defined in Terraform, CloudFormation, Kubernetes, Serverless or ARM templates, and detects security and compliance misconfigurations.
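The kind of check such a scanner performs can be sketched in a few lines of Python. This is not Checkov's implementation, and the policy IDs and resource fields below (`EXAMPLE_1`, `acl`, `encrypted`) are made up for illustration. The sketch assumes the infrastructure definitions have already been parsed into plain dictionaries.

```python
from typing import Callable, Dict, List, Tuple

Resource = Dict[str, object]
# (hypothetical policy id, description, predicate that returns True when the resource passes)
Policy = Tuple[str, str, Callable[[Resource], bool]]

POLICIES: List[Policy] = [
    ("EXAMPLE_1", "Storage bucket must not be publicly readable",
     lambda r: r.get("acl") != "public-read"),
    ("EXAMPLE_2", "Storage bucket must have encryption enabled",
     lambda r: r.get("encrypted") is True),
]


def scan(resources: Dict[str, Resource]) -> List[Tuple[str, str]]:
    """Return a (resource name, policy id) pair for every failed check."""
    failures = []
    for name, resource in resources.items():
        for policy_id, _description, passes in POLICIES:
            if not passes(resource):
                failures.append((name, policy_id))
    return failures


if __name__ == "__main__":
    parsed = {
        "logs": {"acl": "private", "encrypted": True},
        "assets": {"acl": "public-read", "encrypted": False},
    }
    for resource_name, policy_id in scan(parsed):
        print(f"FAILED {policy_id} on {resource_name}")
```

Because the scan works on the definitions rather than on running infrastructure, misconfigurations can be caught in review or CI before anything is deployed.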
Features:
Cloud-Native Chaos Engineering
Litmus is a cloud-native chaos engineering toolkit. It provides tools to orchestrate chaos on Kubernetes and help SREs discover weaknesses in their deployments. SREs use Litmus to run chaos experiments first in a staging environment and eventually in production to uncover bugs and vulnerabilities. Fixing these weaknesses leads to improved system resilience.
Features:
Locust is an easy-to-use, scriptable and flexible performance testing tool. You define the behaviour of your users in regular Python code instead of using a clunky UI or a domain-specific language. This makes Locust extensible and developer-friendly.
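To illustrate what "user behaviour as plain Python" can look like, here is a small standard-library sketch. It deliberately does not use Locust's own classes or decorators; `ShopUser`, `weighted_tasks` and `simulate` are hypothetical names, and the "requests" are only logged rather than sent.

```python
import random
from typing import Callable, List, Tuple


class ShopUser:
    """Hypothetical simulated user; each method stands in for one request."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def view_home(self) -> None:
        self.log.append("GET /")

    def view_cart(self) -> None:
        self.log.append("GET /cart")

    def weighted_tasks(self) -> List[Tuple[Callable[[], None], int]]:
        # Visiting the home page is weighted three times as heavily as viewing the cart.
        return [(self.view_home, 3), (self.view_cart, 1)]


def simulate(user: ShopUser, steps: int, seed: int = 0) -> List[str]:
    """Pick `steps` tasks at random according to their weights and run them."""
    rng = random.Random(seed)
    tasks, weights = zip(*user.weighted_tasks())
    for _ in range(steps):
        rng.choices(tasks, weights=weights, k=1)[0]()
    return user.log


if __name__ == "__main__":
    print(simulate(ShopUser(), steps=10))
```

Since the behaviour is ordinary code, it can use loops, conditionals and helper libraries, which is the flexibility a UI-driven or DSL-based tool tends to lack.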
Features:
Prometheus, a Cloud Native Computing Foundation project, is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions and displays the results. When specified conditions are met, it can trigger alerts.
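The collect-then-evaluate loop can be sketched in Python as follows. This is a simplified illustration, not how Prometheus is implemented: the parser only handles unlabelled `name value` samples without timestamps, and the metric names, rule names and thresholds are invented for the example.

```python
from typing import Dict, List, Tuple


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse simple 'name value' lines, skipping comments and blank lines."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, value = line.rsplit(" ", 1)
        samples[name] = float(value)
    return samples


def evaluate_rules(samples: Dict[str, float],
                   rules: Dict[str, Tuple[str, float]]) -> List[str]:
    """Return the names of the rules whose metric exceeds its threshold."""
    firing = []
    for alert, (metric, threshold) in rules.items():
        if samples.get(metric, 0.0) > threshold:
            firing.append(alert)
    return firing


# Hypothetical scrape output and alert rules.
SCRAPED = """\
# HELP http_errors_total Hypothetical error counter
# TYPE http_errors_total counter
http_errors_total 42
queue_depth 3
"""
RULES = {
    "HighErrorCount": ("http_errors_total", 10),
    "QueueBacklog": ("queue_depth", 100),
}

if __name__ == "__main__":
    print(evaluate_rules(parse_metrics(SCRAPED), RULES))
```

In Prometheus itself, rules are written as PromQL expressions and alerts are routed through a separate Alertmanager component; the sketch only shows the overall flow of scraping, evaluating and alerting.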
Features:
Kube-monkey is an implementation of Netflix's Chaos Monkey for Kubernetes clusters. It randomly deletes Kubernetes pods, which encourages the development of failure-resilient services and validates them at the same time.
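The core idea, picking pods at random and terminating them, can be sketched as below. This is not kube-monkey's code: the sketch assumes an opt-in model, the label name `chaos-enabled` is hypothetical, and instead of calling the Kubernetes API it only returns the names of the pods that would be deleted.

```python
import random
from typing import Dict, List

OPT_IN_LABEL = "chaos-enabled"  # hypothetical label name, not kube-monkey's


def pick_victims(pods: List[Dict], kill_count: int, rng: random.Random) -> List[str]:
    """Choose up to `kill_count` pods at random among those that opted in."""
    eligible = [
        pod["name"]
        for pod in pods
        if pod.get("labels", {}).get(OPT_IN_LABEL) == "true"
    ]
    return rng.sample(eligible, min(kill_count, len(eligible)))


if __name__ == "__main__":
    cluster = [
        {"name": "web-1", "labels": {OPT_IN_LABEL: "true"}},
        {"name": "web-2", "labels": {OPT_IN_LABEL: "true"}},
        {"name": "web-3", "labels": {OPT_IN_LABEL: "true"}},
        {"name": "db-1", "labels": {}},  # never selected: it has not opted in
    ]
    print(pick_victims(cluster, kill_count=2, rng=random.Random()))
```

If the service keeps serving traffic while the selected pods are gone, its redundancy and restart behaviour have been exercised under realistic conditions.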
Features:
PowerfulSeal injects failures into Kubernetes clusters, helping you detect problems as early as possible. It lets you write scenarios that describe complete chaos experiments.
Features:
The great benefit of open source technologies is their extensible nature. You can add features to the tool if required to better fit your custom architecture. These open source projects have extensive support documentation and a community of users. As microservice architecture is slated to dominate the cloud computing space, reliable tools to monitor and troubleshoot these instances are sure to become part of every developer's arsenal.
You can also find more such awesome DevOps and SRE open source projects here. Meanwhile, we’d love to hear from you about other projects and tools that should make this list! Leave us a comment or reach out via DM on Twitter and let us know your thoughts.