Showing posts with the label DevOps

Production Docker: Dropping Alpine for Distroless to kill CVEs and bloated layers

If you run a vulnerability scan on your "slim" production images right now, the results might terrify you. I recently audited a fleet of microservices running on standard debian:bullseye-…

Prometheus Storage Full? Scaling to S3 with Thanos Sidecar

It started with a classic paging alert at 3:14 AM: DiskUsageHigh: 95% on prometheus-data. We were running a standard Prometheus setup on Kubernetes, collecting metrics from about 400 microservices…

Prometheus HA: From Full Disks to Infinite Retention with Thanos Sidecar

Two weeks ago, our production Kubernetes cluster (v1.28, running on AWS m5.xlarge instances) fired a critical alert at 3:00 AM: DiskPressure on the node hosting our…

Surviving the 2-Minute Warning: Zero Downtime on EKS Spot Instances

It started with a subtle anomaly in our Datadog dashboards. Every day at roughly 10:00 AM UTC—coinciding with the daily market price fluctuation in our chosen availability zone—our API gateway succe…