Ubuntu Server Troubleshooting Checklist

By VPS.DO
February 22, 2026

This checklist provides a structured, layered approach to diagnose and resolve the most common problems on Ubuntu Server (24.04 LTS Noble Numbat and later point releases in 2026). Start at the top and work down, gathering evidence at each step before jumping to fixes. The goal is systematic elimination rather than random guessing.

1. Quick Health Snapshot (First 30–60 seconds)

Collect baseline data before making changes:

Current load, uptime, and CPU/memory overview
uptime — high load with low CPU usage usually points to I/O wait or uninterruptible sleep
top or htop (install if missing: sudo apt install htop) — sort by %CPU / %MEM, check %wa (iowait), %st (steal in VMs)
free -h + vmstat 1 5 — look for swap activity (si/so >0), high memory pressure
df -h + df -i — disk space / inode exhaustion
ss -tulnp — listening ports and owning processes
ip -c addr + ip route — network interfaces and default route
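
The baseline commands above can be captured in one file so that a "before" state exists to compare against later. The following is a minimal sketch, not a standard tool: the function name collect_snapshot and the default output path are assumptions made for illustration. It uses plain ip addr rather than ip -c addr so that color codes do not end up in the file.

```shell
#!/bin/sh
# Hypothetical helper: save a baseline snapshot to a file and print its path.
# Tools that are not installed are noted and skipped rather than treated as errors.
collect_snapshot() {
    out="${1:-/tmp/snapshot-$(date +%Y%m%d-%H%M%S).txt}"
    : > "$out" || return 1
    for cmd in "uptime" "free -h" "vmstat 1 5" "df -h" "df -i" \
               "ss -tulnp" "ip addr" "ip route"; do
        printf '===== %s =====\n' "$cmd" >> "$out"
        tool=${cmd%% *}                      # first word of the command line
        if command -v "$tool" >/dev/null 2>&1; then
            # $cmd is intentionally unquoted so it splits into command + arguments
            $cmd >> "$out" 2>&1 || echo "(command failed: $cmd)" >> "$out"
        else
            echo "(not installed: $tool)" >> "$out"
        fi
    done
    echo "$out"
}
```

Usage: run collect_snapshot /root/before.txt before changing anything, run it again with a different file name afterwards, and compare the two with diff. Without sudo, ss -tulnp omits the process names of sockets owned by other users.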

2. Boot & Early System Problems

Symptoms: server won’t boot, stuck in initramfs, no login prompt, very long boot time.

Check last boot logs: journalctl -b -1 (previous boot)
Slow boot units: systemd-analyze blame + systemd-analyze critical-chain
Initramfs emergency shell: check dmesg for disk/RAID/LUKS failures
GRUB issues: boot parameters missing/wrong → edit GRUB at boot (e or c key), or update-grub from recovery
Kernel panic or hardware faults: sudo dmesg | grep -i error for the current boot, journalctl -k -b -1 for the kernel log of the previous boot, check /var/log/kern.log
Fix common: fsck on unmounted filesystems, regenerate initramfs (update-initramfs -u -k all)

3. Networking & Connectivity Failures

Symptoms: no SSH, can’t ping, DNS fails, intermittent drops.

Interface status: ip link show, ip -c addr
Test layers: ping 8.8.8.8 → ping google.com → traceroute 8.8.8.8
DNS: resolvectl status, systemd-resolved logs (journalctl -u systemd-resolved)
Netplan config: netplan generate --debug, netplan try to test changes safely
Firewall: sudo ufw status verbose or sudo nft list ruleset
Common fixes: fix YAML syntax, restart systemd-networkd, check carrier (ethtool eth0), driver/firmware (dmesg | grep firmware)
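
The layered test above (reach an address first, then resolve a name) can be sketched as a small function that reports the first layer that fails. The function names are hypothetical. The sketch also swaps in getent hosts for the name check, instead of ping google.com, so that a DNS failure is not confused with blocked ICMP.

```shell
#!/bin/sh
# Probes are separate functions so they can be replaced, for example on
# networks where ICMP is filtered. All names here are illustrative.
probe_ip()  { ping -c 1 -W 2 "$1" >/dev/null 2>&1; }
probe_dns() { getent hosts "$1" >/dev/null 2>&1; }

# Prints the first failing layer ("ip" or "dns"), or "ok" if both pass.
diagnose_network() {
    ip_target="${1:-8.8.8.8}"
    name_target="${2:-google.com}"
    if ! probe_ip "$ip_target"; then
        echo "ip"     # no reachability by address: check link, route, firewall
        return 1
    fi
    if ! probe_dns "$name_target"; then
        echo "dns"    # address works, name does not: check resolvectl status
        return 1
    fi
    echo "ok"
}
```

Usage: diagnose_network with no arguments tests 8.8.8.8 and google.com. A result of "ip" points back at the interface, route, and firewall checks, and "dns" points at systemd-resolved.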

4. Service Startup & Runtime Failures

Symptoms: service won’t start, restarts loop, crashes after start.

Status & logs: sudo systemctl status servicename + journalctl -u servicename -xe
Recent errors: journalctl -u servicename --since "10 minutes ago"
Dependency chain: systemctl list-dependencies servicename
Config syntax: nginx -t, apachectl configtest, sshd -t
Fix common: missing dependencies (apt install), wrong permissions, socket activation conflicts, cgroup/resource limits
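
One way to apply the config syntax checks above is to run them automatically before every restart, so that a broken file never takes down a working service. This is a sketch under assumptions: config_test_cmd and safe_restart are hypothetical names, and the mapping covers only the three checks listed above.

```shell
#!/bin/sh
# Map a service name to its syntax check. Prints nothing and returns 1
# for services that have no known check.
config_test_cmd() {
    case "$1" in
        nginx)     echo "nginx -t" ;;
        apache2)   echo "apachectl configtest" ;;
        ssh|sshd)  echo "sshd -t" ;;
        *)         return 1 ;;
    esac
}

# Hypothetical wrapper: restart only if the config check (when one exists) passes.
safe_restart() {
    svc="$1"
    if check=$(config_test_cmd "$svc"); then
        # $check is intentionally unquoted so it splits into command + arguments
        sudo $check || {
            echo "config check failed for $svc; not restarting" >&2
            return 1
        }
    fi
    sudo systemctl restart "$svc"
}
```

Usage: safe_restart nginx runs nginx -t first and stops there if it reports an error. Services without a known check are restarted directly.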

5. Resource Exhaustion & Performance Degradation

Symptoms: high load, slow response, unresponsive services.

CPU/Memory: htop (sort by CPU/MEM), ps -eo pid,%cpu,%mem,rss,cmd --sort=-%cpu | head
Disk I/O: iostat -xmdz 1, iotop --only (install sysstat & iotop)
Swap thrashing: vmstat 1, free -h, cat /proc/meminfo | grep -i swap
Open files / processes: ulimit -n, lsof | wc -l, ps aux | wc -l
Fix common: kill high-resource processes (SIGTERM first), add zram/swap, tune vm.swappiness, upgrade storage (NVMe), optimize application
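
Swap thrashing shows up in vmstat as non-zero si/so values sustained across samples. The sketch below counts such samples. It assumes the default vmstat column layout, with si and so in columns 7 and 8, and the function name is hypothetical.

```shell
#!/bin/sh
# Count vmstat samples that show swap-in (si) or swap-out (so) activity.
# Reads vmstat output on stdin. Lines 1-2 are headers and line 3 reports
# averages since boot, so only lines 4 onward are treated as live samples.
swap_active_samples() {
    awk 'NR > 3 && ($7 > 0 || $8 > 0) { n++ } END { print n + 0 }'
}
```

Usage: vmstat 1 10 | swap_active_samples. A count near the number of live samples suggests sustained swapping, while an occasional non-zero sample is usually harmless.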

6. Disk Space & Filesystem Issues

Symptoms: “No space left on device”, services fail to write logs.

Usage: df -h, du -h --max-depth=1 / | sort -hr
Inodes: df -i
Top consumers: ncdu / (install ncdu) or sudo du -sh /var/* /home/*
Common culprits: /var/log (huge journals/logs), /var/cache/apt, old snaps/kernels, deleted-but-open files (lsof +L1)
Fix: apt clean/autoclean/autoremove, journalctl --vacuum-size=500M, snap cleanup, truncate logs
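
The space and inode checks above can be reduced to a short list of filesystems that are close to full. A minimal sketch follows. The function name and the 90% default threshold are assumptions, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# List mount points at or above a usage threshold (default 90%).
# Reads `df -P` output on stdin; -P keeps each filesystem on a single line.
full_filesystems() {
    awk -v limit="${1:-90}" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)                 # "94%" becomes "94"
        if (pct + 0 >= limit + 0) print $6, $5
    }'
}
```

Usage: df -P | full_filesystems 85 for disk space, and df -Pi | full_filesystems 85 for inodes, since the percentage sits in the same column in both outputs.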

7. Security & Authentication Problems

Symptoms: login fails, brute-force noise, unauthorized access.

SSH/auth logs: tail -f /var/log/auth.log, journalctl -u ssh
Failed logins: grep "Failed password" /var/log/auth.log | wc -l
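Firewall blocks: sudo ufw status, journalctl -k | grep "UFW BLOCK" (blocked packets are logged by the kernel, and also to /var/log/ufw.log when ufw logging is on)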
Fix common: harden SSH (keys only, no root), install fail2ban, review sudoers, check AppArmor (aa-status)
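
A raw count of failed logins says little about where they come from. The sketch below groups "Failed password" lines by source address. It assumes the usual OpenSSH message shape ("... from ADDRESS port N ..."), and the function name is hypothetical.

```shell
#!/bin/sh
# Summarise "Failed password" lines by source address, busiest first.
# Reads auth log lines on stdin and prints "<count> <address>" per line.
failed_logins_by_ip() {
    awk '/Failed password/ {
        ip = ""
        # Keep the word after the last "from", in case a username is "from".
        for (i = 1; i < NF; i++) if ($i == "from") ip = $(i + 1)
        if (ip != "") count[ip]++
    }
    END { for (ip in count) print count[ip], ip }' | sort -rn
}
```

Usage: sudo cat /var/log/auth.log | failed_logins_by_ip | head. Addresses with high counts are candidates for a fail2ban jail or a firewall rule.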

8. Package & Update-Related Issues

Symptoms: broken packages, failed upgrades, dependency hell.

Fix broken installs: sudo apt update --fix-missing && sudo apt install -f
Held packages: apt-mark showhold
Partial upgrades: sudo dpkg --configure -a
PPA conflicts: remove problematic PPAs, apt policy packagename
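
The repair commands above can be chained into one sequence that stops at the first failure. The ordering here, finishing interrupted dpkg work before asking apt to resolve dependencies, is an assumption rather than something the checklist prescribes. The run and repair_packages names are hypothetical. By default the sketch only prints what it would do.

```shell
#!/bin/sh
# Print each step; execute it only when DRY_RUN is set to something other than 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

# Assumed ordering: complete pending dpkg configuration, refresh package
# lists, fix broken dependencies, then list held packages for review.
repair_packages() {
    run sudo dpkg --configure -a &&
    run sudo apt update --fix-missing &&
    run sudo apt install -f &&
    run apt-mark showhold
}
```

Usage: call repair_packages to review the steps, then DRY_RUN=0 repair_packages to execute them.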

General Rules of Thumb (2026 Context)

Always check logs first — journalctl -b -p err..emerg for critical errors since boot
Use systemd-analyze suite for boot/performance insight
Prefer non-destructive tests: netplan try, nginx -t, sshd -t
Document before/after states — screenshot top, df -h, free -h
If VM/cloud: check hypervisor metrics (CPU steal, burst credits, IOPS limits)
For persistent issues: boot older kernel from GRUB, use recovery mode, or live USB diagnostics

Most real-world Ubuntu Server issues fall into one of these eight categories and resolve quickly once you follow the layered diagnostic path. Start broad (uptime/top/df), narrow to logs/services/network/disk, then apply targeted fixes.

If you describe your specific symptom (e.g. “SSH refused”, “high load low CPU”, “boot hangs 2 minutes”, paste output from uptime, top, journalctl -b -p err), I can walk through the exact checklist section for your case.
