Troubleshoot a Worker Service That Keeps Disappearing
A systemd-managed worker process starts cleanly, prints a few heartbeat lines, then silently vanishes, over and over again. Diagnose why the process keeps dying and stabilize the service.
Focused hands-on problems designed to help you hone your DevOps or server-side skills. Some challenges are more educational, while others are based on real-world scenarios. The platform provides hints and feedback for each challenge, including automated solution checks.
Practice pausing and resuming a running container: start a resource-hungry container, pause it, inspect its state, and then unpause it back to life.
Practice pausing (freezing) and resuming (thawing) a resource-hungry Linux process using the cgroup v2 freezer mechanism.
This challenge focuses on debugging memory usage issues in a Go application deployed in a Kubernetes cluster. The goal is to ensure the application can handle moderate traffic without crashing, even when it runs with significantly constrained memory resources.
Hack your way through this challenge making the OOM kills invisible again.
Prove your SRE skills: deploy a resource-greedy application to a Kubernetes cluster and keep it running for a while without disrupting the service.
Run a multi-container Docker Compose application while limiting its total CPU and memory usage, without specifying limits for the individual containers.
Learn how to fine-tune the container's cgroup to make the container exit when one of its processes runs out of memory.
Learn how to set up a cgroup v2 group so that the OOM killer terminates the entire process group when one process runs out of memory.
Start a Linux process and limit its CPU and memory usage with cgroups.
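Several of the challenges above revolve around the same few cgroup v2 interface files. As a rough illustration only, and not the intended solution to any challenge, the sketch below shows how CPU and memory limits could be expressed through `cpu.max`, `memory.max`, and `memory.oom.group`. The helper names `limit_settings` and `apply_limits` are hypothetical, and the code assumes cgroup v2 is mounted at `/sys/fs/cgroup`, that the `cpu` and `memory` controllers are enabled for the parent group, and that the caller has root privileges.

```python
from pathlib import Path

# Assumption: the unified cgroup v2 hierarchy is mounted here.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def limit_settings(cpu_fraction, memory_bytes, oom_group=False, period_us=100_000):
    """Map cgroup v2 interface file names to the values to write (hypothetical helper)."""
    if cpu_fraction <= 0 or memory_bytes <= 0:
        raise ValueError("limits must be positive")
    # cpu.max takes "<quota> <period>" in microseconds; 0.5 of a CPU is "50000 100000".
    quota_us = int(cpu_fraction * period_us)
    settings = {
        "cpu.max": f"{quota_us} {period_us}",
        "memory.max": str(memory_bytes),
    }
    if oom_group:
        # With memory.oom.group set to 1, an OOM kill takes down the whole group.
        settings["memory.oom.group"] = "1"
    return settings


def apply_limits(name, pid, settings, root=CGROUP_ROOT):
    """Create the group, write the limits, then move the process into it (hypothetical helper)."""
    group = Path(root) / name
    group.mkdir(exist_ok=True)
    for filename, value in settings.items():
        (group / filename).write_text(value)
    # Writing a PID to cgroup.procs migrates that process into the group.
    (group / "cgroup.procs").write_text(str(pid))
    return group
```

For example, `apply_limits("demo", os.getpid(), limit_settings(0.5, 100 * 1024 * 1024, oom_group=True))` would, under the assumptions above, cap the current process at half a CPU and 100 MiB of memory. Freezing and thawing the same group uses the `cgroup.freeze` file in that directory (write `1` to freeze, `0` to thaw).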