Should You Run a Database on Kubernetes?

Historically, stateful workloads were run outside container orchestrators. Platforms built on top of orchestrators like Kubernetes were not yet proficient at dealing with data. But these systems have matured and now offer features that allow stateful workloads to be managed efficiently. If you want to find out how databases can be run on Kubernetes, what mechanisms it offers, and what types of databases and data are best suited for it, check this out....

November 17, 2021

How to Measure System Reliability

Originally published on Cprime. As businesses grow, new requirements arise for teams. The technology ecosystem becomes ever more complex, and it is important to understand each change and how it affects the overall system, as well as the service provided to users, who have high expectations. They expect systems to be up, responsive, fast, consistent, and reliable. System Reliability: Reliability for systems means that a system is doing what its users need it to do....

November 11, 2021

How to improve your influence as an SRE

Originally published on Squadcast. Balancing fast-paced business requirements with the demands of keeping production services stable is not an easy task. SRE is an opinionated implementation of DevOps and is defined by Ben Sloss, VP of Engineering at Google, as “what happens when you ask a software engineer to design an operations function”. It even comes with a completely free manual and workbook. Although SRE aims to be a “prescription” on how to run complex systems the right way, reliability can mean different things in different contexts....

November 10, 2021

Troubleshooting Kubernetes FailedAttachVolume and FailedMount

You’ve been working with Kubernetes and sometimes your stateful workloads fail. You regularly get FailedAttachVolume and FailedMount errors and you have no idea why. If you want to find out what’s going on and how you can go about fixing them quickly, check this out.

November 9, 2021

From Zero to SRE

Originally published on Squadcast. Traditionally, developing applications and running them in production were seen as completely separate worlds, usually the focus and concern of different teams. This kind of separation gives birth to the proverbial wall that separates development and operations, where developers “throw” their code over the wall and expect operations teams to run and manage it in production. This results in teams having different and conflicting goals: development teams prioritize building and shipping new features while operations teams focus on system stability, where code changes are seen as potential threats....

September 14, 2021