DevOps Journey Part 4 – Site Reliability Engineering

This entry is part 4 of 6 in the series DevOps Journey

We’ve talked about why stuff breaks in production when it worked fine in development (see “Works for me”), how Continuous Integration / Continuous Deployment (CI/CD) helps with failure detection, and how to set up a production-like staging environment in which to test your features. At this point in the journey, we act as consultants…


DevOps Journey Part 3 – Staging Environment

This entry is part 3 of 6 in the series DevOps Journey

We’ve already talked about two parts of the journey: “Works for me” and Continuous Integration / Continuous Deployment (CI/CD). This time, we’ll talk about the importance of an effective staging environment in which to test a bunch of stuff before it goes into production. The main things we do in our staging environments are testing QA…


DevOps Journey Part 2 – CI/CD

This entry is part 2 of 6 in the series DevOps Journey

In a client’s DevOps journey, we are shooting for an efficient, stable, and reactive development-to-production workflow in their infrastructure. Let’s talk about the Continuous Integration / Continuous Deployment (CI/CD) phase of the journey! Failure IS an option! It’s in your best interest to fail and FAIL FAST. What? No, that’s not a confused marketing strategy to simply get…


Optimizing Grafana and Prometheus rendering performance using Trickster

Trickster is a reverse proxy cache for the Prometheus HTTP API (v1) that dramatically accelerates dashboard rendering times for any series queried from Prometheus. See our previous post, Why we Love Grafana and Prometheus. We are always super impatient, so we love cool things like Trickster. Dashboards that automatically refresh should now load, on average, 90% faster. Oh yeah!…


CI/CD in Kubernetes

Continuous integration and continuous deployment (CI/CD) is a hallmark of solid Linux and DevOps work, specifically in Kubernetes and Jenkins. The key here is to create automated tools around the process of failure detection (at Crafty Penguins, we refer to this as “failing fast!”). Too often, we see that it takes too long to get…