Why Ubuntu Server Load Is High but CPU Usage Is Low
On Ubuntu Server (and Linux in general), a high load average combined with low CPU utilization (high %id in top/htop, low user/system time) is one of the most common yet confusing performance symptoms. The key misunderstanding is that load average is not a direct measure of CPU usage—it reflects how many processes are runnable (ready to run on CPU) or in uninterruptible sleep (usually waiting for I/O).
What Load Average Actually Measures on Linux
Linux calculates the load average (shown by uptime, top, cat /proc/loadavg) as the exponentially decaying average number of tasks that are:
- In state R (runnable / running on CPU)
- In state D (uninterruptible sleep — typically waiting for synchronous disk I/O, NFS, or certain kernel operations)
Unlike traditional Unix systems (where load mostly ignored I/O wait), Linux includes D-state tasks in the load average. This makes load average a much broader indicator of system contention.
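The kernel exposes these averages directly in /proc/loadavg, alongside a runnable/total task count. A minimal sketch reading the fields (Linux-only; layout per proc(5)):

```bash
#!/bin/sh
# /proc/loadavg holds: 1-, 5-, 15-minute load averages,
# then runnable/total task counts, then the most recent PID.
read one five fifteen tasks lastpid < /proc/loadavg
echo "1-min load:  $one"
echo "5-min load:  $five"
echo "15-min load: $fifteen"
echo "runnable/total tasks: $tasks"
```

Note that the runnable count here does not include D-state tasks, while the averages do, which is exactly why the two can disagree.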
Rule of thumb for interpretation (on an N-core system):
- Load average << N → system is underutilized
- Load average ≈ N → healthy saturation
- Load average >> N (especially 5–10× N or more) → serious contention, even if CPU appears mostly idle
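The rule of thumb above can be checked with a one-liner that normalizes the 1-minute load against the online core count (output format is illustrative):

```bash
#!/bin/sh
# Per-core load: values well above 1.0 indicate contention.
cores=$(nproc)
load1=$(cut -d' ' -f1 /proc/loadavg)
# awk handles the floating-point division
awk -v l="$load1" -v c="$cores" \
    'BEGIN { printf "load/core = %.2f (%s over %d cores)\n", l / c, l, c }'
```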
When CPU %usage is low but load is high, the extra load almost always comes from many processes stuck in D state (uninterruptible I/O wait).
Most Common Causes on Ubuntu Server
- Disk I/O Bottleneck (by far the #1 cause). Slow or saturated storage causes processes to block in D state while waiting for read/write completion. Typical culprits:
- Failing HDD/SSD
- Overloaded mechanical disks (high seek times)
- Database checkpoints, large log writes, backups/rsync
- Many small random reads/writes (e.g., high-concurrency web app hitting SQLite/MySQL/PostgreSQL on slow disk)
- NFS mounts with high latency
Symptoms: high %wa (iowait) in top, high %util and await in iostat -x, and processes stuck in D state (ps aux | awk '$8 ~ /D/')
- Memory Pressure Leading to Thrashing or Swap. Even without visible swapping (si/so in vmstat = 0), heavy anonymous memory pressure can cause short stalls during page reclaim. More extreme: actual swapping to a slow disk → massive D-state spikes. Check: vmstat 1, free -h, swapon --show, and AnonHugePages / SwapTotal / SwapFree in /proc/meminfo.
- Network File System or Remote Storage Latency. NFS, Ceph, iSCSI, or cloud EBS volumes (especially burstable gp2 volumes that have exhausted their I/O credits) → processes block waiting for network I/O. Similar effect: high D-state count, low local CPU/disk activity.
- Excessive Context Switching or Many Short-Lived Processes. Thousands of quick system calls or fork/exec cycles (e.g., a misconfigured CGI/PHP-FPM setup, runaway logging) can inflate load without high sustained CPU.
- Virtualization Steal Time (cloud/VM environments). On hypervisors (KVM, VMware, cloud providers), stolen cycles show up as %st in top. If %st is high, the guest appears idle but load climbs because the scheduler can't get physical CPU time.
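Steal is also visible as the 8th value on the aggregate cpu line of /proc/stat (cumulative jiffies since boot, not a percentage); a rough one-shot check:

```bash
#!/bin/sh
# Fields on the "cpu" line of /proc/stat (see proc(5)):
# user nice system idle iowait irq softirq steal guest guest_nice
# $1 is the "cpu" label, so steal is $9 and iowait is $6.
awk '/^cpu / { printf "steal jiffies since boot: %s (iowait: %s)\n", $9, $6 }' /proc/stat
```

Nonzero and growing steal on a supposedly idle guest points at a noisy neighbor or an undersized instance.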
Rarer causes include kernel bugs and certain drivers blocking in uninterruptible paths, but these are uncommon on modern Ubuntu kernels.
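On kernels with pressure-stall information (PSI, enabled by default on Ubuntu 20.04 and later), /proc/pressure/io summarizes exactly this kind of non-CPU waiting; a tolerant sketch that degrades gracefully on older kernels:

```bash
#!/bin/sh
# PSI: "some avg10" is the share of the last 10 seconds in which
# at least one task was stalled waiting on I/O.
if [ -r /proc/pressure/io ]; then
    grep '^some' /proc/pressure/io
else
    echo "PSI not available on this kernel"
fi
```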
Step-by-Step Diagnosis on Ubuntu Server
Run these in order to pinpoint the cause:
- Check current load and CPU breakdown
```bash
uptime
top    # press 1 for per-core view; look at %wa and %id
htop   # (install if needed) — sort by STATE (D = bad)
```
- Look for processes in D state
```bash
ps -eo pid,ppid,state,pcpu,comm | awk '$3 ~ /^D/'
# or:
ps aux | awk '$8 ~ /D/ {print}'
```
- Inspect disk I/O
```bash
sudo apt install sysstat iotop
iostat -xmdz 1    # %util near 100%, high await = saturated/slow disk
sudo iotop --only # which processes are doing heavy I/O
vmstat 1          # high bi/bo, non-zero wa
```
- Check memory/swap pressure
```bash
free -h
vmstat 1 5    # watch the si/so columns for swap activity
sar -r 1 5    # %memused, kbcommit
```
- Historical view (if sysstat collection is enabled)
```bash
sar -u    # CPU + iowait history
sar -d    # per-device disk stats
sar -r    # memory history
```
- If in a VM/cloud, look for %st (steal) in top/htop, in virt-top, or in your cloud provider's metrics.
Quick Fixes Depending on Root Cause
- Disk bottleneck → Upgrade to NVMe, tune I/O scheduler (none/kyber), add noatime mount option, move hot data to tmpfs/redis/memcached, optimize queries/indexes.
- Swap thrashing → Increase RAM, lower swappiness (vm.swappiness=10 or lower), enable zram.
- NFS/remote storage → Increase read/write sizes, tune mount options (rsize,wsize,actimeo), or cache locally.
- Application-level → Fix chatty loops, batch writes, use async I/O where possible.
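For the swap-thrashing case, the current swappiness value can be inspected non-destructively, and the persistent fix sketched in comments (applying it requires root; the config filename is illustrative):

```bash
#!/bin/sh
# Current value (Ubuntu's default is 60; lower = prefer reclaiming
# page cache over swapping anonymous memory)
cat /proc/sys/vm/swappiness

# Apply at runtime (root required):
#   sudo sysctl vm.swappiness=10
# Persist across reboots:
#   echo 'vm.swappiness=10' | sudo tee /etc/sysctl.d/99-swappiness.conf
```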
In short: High load + low CPU = too many processes waiting for something that isn’t the CPU—usually slow storage or network I/O. Start with iostat -x, iotop, and counting D-state processes; that resolves ~80–90% of these cases on Ubuntu Server.
If you can share output from uptime, top (header + top processes), iostat -x 1 5, or describe your workload (web, DB, containers, cloud VM?), more precise diagnosis is straightforward.