Ubuntu Server Troubleshooting Checklist

This checklist provides a structured, layered approach to diagnosing and resolving the most common problems on Ubuntu Server (24.04 LTS Noble Numbat and its later point releases, as of 2026). Start at the top and work down, gathering evidence at each step before applying fixes. The goal is systematic elimination rather than guesswork.

1. Quick Health Snapshot (First 30–60 seconds)

Collect baseline data before making changes:

  • Current load, uptime, and CPU/memory overview
  • uptime — high load with low CPU usage usually points to I/O wait or uninterruptible sleep
  • top or htop (install if missing: sudo apt install htop) — sort by %CPU / %MEM, check %wa (iowait), %st (steal in VMs)
  • free -h + vmstat 1 5 — look for swap activity (si/so >0), high memory pressure
  • df -h + df -i — disk space / inode exhaustion
  • ss -tulnp — listening ports and owning processes
  • ip -c addr + ip route — network interfaces and default route
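
To keep that baseline for later comparison, the commands above can be appended to a single file. The following is a minimal sketch, not a standard tool: the helper names capture and snapshot and the /tmp/health-* directory are assumptions made for this example.

```shell
#!/bin/sh
# Sketch: record baseline command output in one file before changing anything.
# capture, snapshot, and the /tmp path are hypothetical names for this example.

# capture FILE CMD [ARGS...]: append CMD's output to FILE, or note that CMD is missing.
capture() {
    capture_out=$1
    shift
    {
        printf '### %s\n' "$*"
        if command -v "$1" >/dev/null 2>&1; then
            "$@" 2>&1 || printf '(exit status %s)\n' "$?"
        else
            printf '(skipped: %s not installed)\n' "$1"
        fi
        printf '\n'
    } >> "$capture_out"
}

# snapshot [DIR]: run the section 1 commands and print the path of the result file.
snapshot() {
    snap_dir=${1:-/tmp/health-$(date +%Y%m%d-%H%M%S)}
    mkdir -p "$snap_dir" || return 1
    for snap_cmd in "uptime" "free -h" "vmstat 1 5" "df -h" "df -i" \
                    "ss -tulnp" "ip addr" "ip route"; do
        # Word splitting is intentional: each entry is a command plus its flags.
        capture "$snap_dir/snapshot.txt" $snap_cmd
    done
    printf '%s\n' "$snap_dir/snapshot.txt"
}
```

Calling snapshot (as root if ss should show processes owned by other users) takes a few seconds because of vmstat 1 5. Running it again after a fix gives two files that can be compared with diff.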

2. Boot & Early System Problems

Symptoms: server won’t boot, stuck in initramfs, no login prompt, very long boot time.

  • Check last boot logs: journalctl -b -1 (previous boot)
  • Slow boot units: systemd-analyze blame + systemd-analyze critical-chain
  • Initramfs emergency shell: check dmesg for disk/RAID/LUKS failures
  • GRUB issues: boot parameters missing/wrong → edit GRUB at boot (e or c key), or update-grub from recovery
  • Kernel panic or hardware faults: dmesg | grep -i error, check /var/log/kern.log
  • Common fixes: run fsck on unmounted filesystems only, regenerate the initramfs (update-initramfs -u -k all)
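
The output of systemd-analyze blame can be long, so it may help to filter it down to units above a time threshold. This is a rough sketch under stated assumptions: slow_units is a hypothetical helper name, and the parser only understands the min, s, and ms tokens that blame normally prints.

```shell
#!/bin/sh
# Sketch: list units from `systemd-analyze blame` that took at least N seconds.
# slow_units is a hypothetical helper; only "min", "s" and "ms" tokens are parsed.

# slow_units SECONDS: reads blame output on stdin, prints "<seconds> <unit>".
slow_units() {
    awk -v limit="$1" '
    {
        secs = 0
        # Every field except the last (the unit name) is a duration token.
        for (i = 1; i < NF; i++) {
            if ($i ~ /^[0-9.]+min$/)     secs += $i * 60
            else if ($i ~ /^[0-9.]+ms$/) secs += $i / 1000
            else if ($i ~ /^[0-9.]+s$/)  secs += $i
        }
        if (secs >= limit) printf "%.1f %s\n", secs, $NF
    }'
}

# Usage on a live system:
#   systemd-analyze blame | slow_units 5
```

A unit that appears here is not necessarily the cause of a slow boot, since units start in parallel; systemd-analyze critical-chain shows which of them actually delayed the boot.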

3. Networking & Connectivity Failures

Symptoms: no SSH, can’t ping, DNS fails, intermittent drops.

  • Interface status: ip link show, ip -c addr
  • Test layers: ping 8.8.8.8 → ping google.com → traceroute 8.8.8.8
  • DNS: resolvectl status, systemd-resolved logs (journalctl -u systemd-resolved)
  • Netplan config: netplan generate --debug, netplan try to test changes safely
  • Firewall: sudo ufw status verbose or sudo nft list ruleset
  • Common fixes: fix YAML syntax, restart systemd-networkd, check carrier (ethtool eth0), driver/firmware (dmesg | grep firmware)
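
The layer-by-layer test can be scripted so that it stops at the first failing layer. The sketch below is illustrative only: net_layers is a hypothetical function name, and it uses getent hosts for the DNS step instead of a second ping so that name resolution is tested on its own.

```shell
#!/bin/sh
# Sketch: test routing, raw IP reachability, then DNS, and stop at the first failure.
# net_layers is a hypothetical helper; 8.8.8.8 and google.com are the targets used above.

net_layers() {
    if ! ip route show default 2>/dev/null | grep -q '^default'; then
        echo "no default route: check the netplan config and link state"
        return 1
    fi
    if ! ping -c 2 -W 2 8.8.8.8 >/dev/null 2>&1; then
        echo "default route present but 8.8.8.8 unreachable: check gateway, firewall, upstream"
        return 2
    fi
    if ! getent hosts google.com >/dev/null 2>&1; then
        echo "IP connectivity works but DNS lookup fails: check resolvectl status"
        return 3
    fi
    echo "routing, IP connectivity and DNS all respond"
}
```

The return code identifies the failing layer (1 routing, 2 IP, 3 DNS), which is convenient when the check runs from cron or a monitoring hook. Some networks drop ICMP, so a failed ping is a hint rather than proof.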

4. Service Startup & Runtime Failures

Symptoms: service won’t start, restarts loop, crashes after start.

  • Status & logs: sudo systemctl status servicename + journalctl -u servicename -xe
  • Recent errors: journalctl -u servicename --since "10 minutes ago"
  • Dependency chain: systemctl list-dependencies servicename
  • Config syntax: nginx -t, apachectl configtest, sshd -t
  • Common fixes: install missing dependencies (apt install), correct file permissions, resolve socket activation conflicts, adjust cgroup/resource limits
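
For restart loops, systemctl show exposes machine-readable properties that are easier to script than the status output. A minimal sketch follows; summarize_unit is a hypothetical helper, and the threshold of 3 restarts is an arbitrary assumption rather than a systemd default.

```shell
#!/bin/sh
# Sketch: summarize a unit from `systemctl show` properties and flag restart loops.
# summarize_unit is a hypothetical helper; the threshold of 3 restarts is arbitrary.

# Reads KEY=VALUE lines on stdin.
summarize_unit() {
    awk -F= '
        $1 == "NRestarts"      { restarts = $2 }
        $1 == "Result"         { result = $2 }
        $1 == "ExecMainStatus" { status = $2 }
        END {
            printf "result=%s exit=%s restarts=%s\n", result, status, restarts
            if (restarts + 0 >= 3) print "likely restart loop"
        }
    '
}

# Usage on a live system (servicename is a placeholder):
#   systemctl show -p NRestarts,Result,ExecMainStatus servicename | summarize_unit
```

If the loop is flagged, the journal for the unit usually shows the same error repeating on each attempt, which is the entry worth reading first.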

5. Resource Exhaustion & Performance Degradation

Symptoms: high load, slow response, unresponsive services.

  • CPU/Memory: htop (sort by CPU/MEM), ps -eo pid,%cpu,%mem,rss,cmd --sort=-%cpu | head
  • Disk I/O: iostat -xmdz 1, sudo iotop --only (install sysstat & iotop)
  • Swap thrashing: vmstat 1, free -h, cat /proc/meminfo | grep -i swap
  • Open files / processes: ulimit -n, lsof | wc -l, ps aux | wc -l
  • Common fixes: stop high-resource processes (SIGTERM first, SIGKILL only if that fails), add zram/swap, tune vm.swappiness, upgrade storage (NVMe), optimize the application
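
Swap thrashing shows up in the si and so columns of vmstat. The sketch below counts how many samples show swap traffic; swap_activity is a hypothetical helper, and it skips the first sample because vmstat reports averages since boot on that line.

```shell
#!/bin/sh
# Sketch: count vmstat samples with non-zero swap-in (si) or swap-out (so).
# swap_activity is a hypothetical helper; it reads `vmstat 1 N` output on stdin.

swap_activity() {
    awk '
        # Column header row: remember which columns hold si and so.
        $1 == "r" {
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        # Data rows start with a number; the first one is the average since boot.
        si && so && $1 ~ /^[0-9]+$/ {
            n++
            if (n > 1 && ($si > 0 || $so > 0)) hits++
        }
        END {
            if (n < 2) print "need at least two samples"
            else printf "%d of %d samples show swap traffic\n", hits, n - 1
        }
    '
}

# Usage on a live system:
#   vmstat 1 10 | swap_activity
```

Occasional swap traffic is normal; sustained non-zero values across most samples, together with high %wa, are the pattern that suggests thrashing.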

6. Disk Space & Filesystem Issues

Symptoms: “No space left on device”, services fail to write logs.

  • Usage: df -h, du -h --max-depth=1 / | sort -hr
  • Inodes: df -i
  • Top consumers: ncdu / (install ncdu) or sudo du -sh /var/* /home/*
  • Common culprits: /var/log (huge journals/logs), /var/cache/apt, old snaps/kernels, deleted-but-open files (lsof +L1)
  • Fix: apt clean/autoclean/autoremove, journalctl --vacuum-size=500M, remove old snap revisions (snap list --all, then snap remove NAME --revision=REV), truncate logs
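
Both space and inode exhaustion can be checked with the same small filter, because df -P and df -Pi put the usage percentage in the fifth column. This is a sketch only: full_filesystems is a hypothetical helper and the 90% default is an assumed threshold.

```shell
#!/bin/sh
# Sketch: print mount points whose usage is at or above a percentage threshold.
# full_filesystems is a hypothetical helper; 90 is an assumed default threshold.
# Mount points containing spaces are not handled.

# full_filesystems [PERCENT]: reads `df -P` or `df -Pi` output on stdin.
full_filesystems() {
    awk -v limit="${1:-90}" 'NR > 1 && $5 + 0 >= limit { print $6, $5 }'
}

# Usage on a live system:
#   df -P  | full_filesystems 90    # disk space
#   df -Pi | full_filesystems 90    # inodes
```

The -P flag keeps each filesystem on one line, which is what makes the column positions reliable enough to parse.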

7. Security & Authentication Problems

Symptoms: login fails, brute-force noise, unauthorized access.

  • SSH/auth logs: tail -f /var/log/auth.log, journalctl -u ssh
  • Failed logins: grep "Failed password" /var/log/auth.log | wc -l
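  • Firewall blocks: sudo ufw status verbose; blocked packets are logged through the kernel when ufw logging is on, so check /var/log/ufw.log or journalctl -k | grep UFW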
  • Common fixes: harden SSH (keys only, no root login), install fail2ban, review sudoers, check AppArmor (aa-status)
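
A raw count of failed logins says little about where they come from. The sketch below groups them by source address; failed_by_ip is a hypothetical helper, and it assumes the usual sshd message layout in which the address follows the word "from".

```shell
#!/bin/sh
# Sketch: count "Failed password" lines per source address, busiest first.
# failed_by_ip is a hypothetical helper; it assumes the address follows "from".

# Reads auth log lines on stdin, prints "<count> <address>".
failed_by_ip() {
    grep 'Failed password' |
    awk '{
        ip = ""
        # Keep the token after the last "from", in case a user name is "from".
        for (i = 1; i < NF; i++) if ($i == "from") ip = $(i + 1)
        if (ip != "") print ip
    }' |
    sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Usage on a live system:
#   sudo cat /var/log/auth.log | failed_by_ip | head
```

A handful of addresses with very high counts is typical brute-force noise and a reasonable input for fail2ban or a firewall rule.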

8. Package & Update-Related Issues

Symptoms: broken packages, failed upgrades, dependency hell.

  • Fix broken installs: sudo apt update --fix-missing && sudo apt install -f
  • Held packages: apt-mark showhold
  • Partial upgrades: sudo dpkg --configure -a
  • PPA conflicts: remove problematic PPAs, apt policy packagename
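
Before running repairs it can be useful to see which packages were left half-installed or unconfigured. The following sketch filters dpkg -l by its status column; unfinished_pkgs is a hypothetical helper, and dpkg --audit reports similar information natively.

```shell
#!/bin/sh
# Sketch: list packages that dpkg marks as unpacked, half-installed,
# half-configured, or waiting on triggers.
# unfinished_pkgs is a hypothetical helper; it reads `dpkg -l` output on stdin.

unfinished_pkgs() {
    # First column: desired state (i=install, h=hold) followed by current status
    # (H=half-installed, U=unpacked, F=half-configured, W/t=trigger states).
    awk '$1 ~ /^[ih][HUFWt]/ { print $2, $1 }'
}

# Usage on a live system:
#   dpkg -l | unfinished_pkgs
```

Packages listed here are the ones sudo dpkg --configure -a will try to finish, so the list gives a preview of what that command is about to touch.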

General Rules of Thumb (2026 Context)

  • Always check logs first — journalctl -b -p err..emerg for critical errors since boot
  • Use systemd-analyze suite for boot/performance insight
  • Prefer non-destructive tests: netplan try, nginx -t, sshd -t
  • Document before/after states — screenshot top, df -h, free -h
  • If VM/cloud: check hypervisor metrics (CPU steal, burst credits, IOPS limits)
  • For persistent issues: boot older kernel from GRUB, use recovery mode, or live USB diagnostics

Most real-world Ubuntu Server issues fall into one of these eight categories and resolve quickly once you follow the layered diagnostic path. Start broad (uptime/top/df), narrow to logs/services/network/disk, then apply targeted fixes.