Troubleshoot a Worker Service That Keeps Disappearing
A background service, worker.service, has been deployed on this server.
It's supposed to run continuously, picking up batches from a job queue.
But there's a problem: the service starts cleanly, prints a few heartbeat lines,
then picks up a batch of work and shortly afterwards dies silently.
systemd restarts it, but job handling never advances beyond the first batch.
Your task: diagnose the root cause of the service's repeated death and find a way
to make worker.service stay up and running for an extended period of time.
Hint 1
Since the worker is a systemd service, you can use the journalctl command to browse its logs
and the systemctl command to check its status.
Tail the worker's logs and see if you can spot any issues.
Then review the systemctl status output for any additional clues.
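As a starting point, these invocations cover both steps (the unit name worker.service is taken from the task; the flag choices are one reasonable way to browse the logs, not the only one):

```shell
# Show the last 50 log lines for the unit, without paging
journalctl -u worker.service -n 50 --no-pager

# Follow new log lines live while systemd restarts the service
journalctl -u worker.service -f

# Current state, most recent exit code or signal, and restart history
systemctl status worker.service
```

In the status output, pay particular attention to the Main PID exit line and how the process last terminated.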
Hint 2
When a service is launched by systemd, it's placed inside a dedicated cgroup,
and any resource limits configured on the unit
(MemoryMax, MemoryHigh, CPUQuota, etc.) are enforced through that cgroup.
If a process inside such a cgroup tries to use more memory than the cgroup allows,
the kernel's OOM killer terminates it with SIGKILL and records the event in the
kernel log (accessible via journalctl -k).
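On the real machine you would search the kernel log with journalctl -k piped into grep. The snippet below works on a hypothetical sample line (the PID, process name, and sizes are invented for illustration) so you can see what such an event looks like and how to filter for it:

```shell
# A made-up sample of the kind of line the kernel logs after a cgroup OOM kill;
# against the live system, run:  journalctl -k | grep -i 'killed process'
sample='Memory cgroup out of memory: Killed process 1234 (worker) total-vm:524288kB, anon-rss:262144kB'

# Extract the kill event (victim PID and process name) from the log line
printf '%s\n' "$sample" | grep -o 'Killed process [0-9]* ([^)]*)'
# -> Killed process 1234 (worker)
```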
Inspect the unit file to see what limits are currently in place:
systemctl cat worker.service
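For illustration, a unit carrying a tight memory ceiling might contain something like the following excerpt (the paths and values here are invented; the real unit on the server will differ):

```ini
[Service]
ExecStart=/usr/local/bin/worker
# A MemoryMax this small would kill the worker as soon as a batch
# pushes its memory use past the limit.
MemoryMax=64M
```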
Hint 3
Editing /etc/systemd/system/worker.service directly works,
but the systemd-native way to override a single directive is via a drop-in file:
sudo systemctl edit worker.service
This opens an editor where you can add a small [Service] section that overrides
specific directives - no need to copy the rest of the unit.
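As a sketch, assuming the cgroup memory limit turns out to be the culprit, the drop-in could contain nothing more than (the 512M figure is an invented placeholder; pick a value the worker's batches actually fit under):

```ini
[Service]
# Raise the memory ceiling, or use MemoryMax=infinity to remove it entirely
MemoryMax=512M
```

systemctl edit writes this to /etc/systemd/system/worker.service.d/override.conf and reloads the unit definitions when you save; restart the service afterwards (sudo systemctl restart worker.service) so the new limit takes effect.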