Mastering Linux Disk Usage: Essential Commands and Tools

Master Linux disk usage with practical, no-nonsense guidance—from filesystems and inodes to df, du, and snapshot pitfalls. Learn the commands and workflows that help you find hidden space hogs, prevent outages, and plan storage growth with confidence.

Managing disk usage is a core responsibility for system administrators, developers, and website operators. On Linux systems, efficient disk management avoids performance bottlenecks, prevents service outages, and ensures predictable growth planning. This article dives into the technical details of disk usage — from low-level filesystem concepts to practical command-line tools and workflows — helping you master storage on VPS and dedicated servers.

Understanding the fundamentals: filesystems, inodes, and blocks

Before running commands, it’s important to understand how Linux stores data. Filesystems like ext4, XFS, and Btrfs abstract physical storage into blocks and metadata structures.

  • Blocks: Files are stored in fixed-size blocks (commonly 4 KiB). The filesystem maps files to these blocks; partial blocks still consume the full block on disk.
  • Inodes: Each file/directory has an inode containing metadata (ownership, permissions, timestamps) and pointers to blocks. A filesystem can run out of inodes before running out of space, causing “No space left on device” errors.
  • Reserved space: Ext-family filesystems reserve ~5% by default for root to prevent fragmentation and allow recovery. On small partitions, this reserved space can be adjusted using tune2fs -m.
  • Logical volumes & snapshots: LVM and Btrfs/ZFS snapshots complicate usage reporting because deleted files may remain referenced by snapshots, still consuming space.

Essential commands to inspect disk usage

The CLI provides powerful tools to get both high-level and granular views of disk usage.

df — filesystem-level usage

df reports disk space usage for mounted filesystems. Common options:

  • df -h human-readable sizes (KiB/MiB/GiB).
  • df -i show inode usage to detect inode exhaustion.
  • df -T show filesystem types (ext4, xfs, etc.).

Example: df -h /var quickly shows space for the partition hosting /var. If usage is unexpectedly high, drill down with directory-level tools.
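As a sketch, df output can feed a simple threshold alert; the default mount point and the 90% cutoff below are arbitrary examples, and df -P is used because its one-record-per-filesystem POSIX format is stable for scripting:

```shell
#!/bin/sh
# Warn when a filesystem crosses a usage threshold.
# df -P guarantees one line per filesystem, safe to parse with awk.
MOUNT="${1:-/}"   # mount point to check (example default)
THRESHOLD=90      # percent full; arbitrary example cutoff

used=$(df -P "$MOUNT" | awk 'NR==2 { gsub("%", "", $5); print $5 }')

if [ "$used" -ge "$THRESHOLD" ]; then
    echo "WARNING: $MOUNT is ${used}% full"
else
    echo "OK: $MOUNT is ${used}% full"
fi
```

A check like this pairs naturally with cron or a monitoring agent for early warning.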

du — directory and file sizes

du summarizes disk usage for directories. Useful options:

  • du -sh * human-readable totals for items in the current directory.
  • du -ah /path | sort -rh | head -n 20 list the top 20 largest files and directories under /path.
  • du --max-depth=1 -h /var quickly enumerates top-level usage inside /var.

Tip: Run du as root to include files inaccessible to your user. Combine with --apparent-size when dealing with sparse files to view logical sizes.
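A minimal demonstration of the "largest first" pipeline, run against a scratch directory so it is safe to try anywhere (the directory names and file sizes are invented for illustration):

```shell
#!/bin/sh
# Build a throwaway directory tree, then report its entries largest-first.
demo=$(mktemp -d)
mkdir -p "$demo/logs" "$demo/cache"
dd if=/dev/zero of="$demo/logs/big.log" bs=1024 count=2048 2>/dev/null   # ~2 MiB
dd if=/dev/zero of="$demo/cache/small.bin" bs=1024 count=64 2>/dev/null  # ~64 KiB

# -h sizes (4.0K, 2.1M, ...) sort correctly with sort -rh
du -sh "$demo"/* | sort -rh | head -n 20

rm -rf "$demo"
```

The same pipeline pointed at /var or /home is usually the fastest way to find where space went.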

lsblk, blkid, and fdisk/parted — block device insights

  • lsblk -f shows device tree and filesystem labels/UUIDs.
  • blkid prints device UUIDs and types.
  • fdisk -l or parted -l lists partition layouts and sizes.

These are crucial when resizing partitions, managing LVM, or troubleshooting mount points that point to unexpected devices.

lsof and fuser — deleted-but-open files

Processes can hold file handles open even after the file is deleted, continuing to consume space. To find such files:

  • lsof +L1 (or lsof / | grep '(deleted)') lists open files that have been unlinked.
  • fuser -m /path shows PIDs using a mount. Restarting or terminating the process frees the space.
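When lsof is not installed, the same information can be recovered by walking /proc directly; this is a sketch of the underlying mechanism, not a replacement for lsof's richer output:

```shell
#!/bin/sh
# Find processes holding deleted files open by scanning /proc.
# The kernel suffixes the symlink target with "(deleted)" once the
# file has been unlinked but a process still holds it open.
for fd in /proc/[0-9]*/fd/*; do
    target=$(readlink "$fd" 2>/dev/null) || continue
    case "$target" in
        *"(deleted)")
            pid=${fd#/proc/}; pid=${pid%%/*}
            echo "PID $pid still holds: $target"
            ;;
    esac
done
```

Restarting the process that appears here (or asking it to reopen its logs) releases the space.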

ncdu and graphical utilities

ncdu (NCurses Disk Usage) is an interactive, fast directory-size viewer ideal for remote servers. Install via the distro package manager and run ncdu / to browse usage and delete items directly from the interface.

iotop and atop — I/O performance monitoring

High disk I/O is often correlated with high disk usage or inefficient operations. Use iotop to find processes with the most read/write bandwidth in real-time. atop provides longer-term historic views if enabled.

Common causes of unexpected disk usage and how to resolve them

Log growth and rotation issues

Logs in /var/log can grow quickly. Use logrotate to compress and rotate logs; its configuration lives in /etc/logrotate.d/. For immediate relief, compress old logs with gzip, or compress everything older than a threshold with find /var/log -type f -mtime +30 -exec gzip {} \;.
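The find invocation can be rehearsed safely on a scratch directory before pointing it at /var/log (note the \; terminator, which must be escaped so the shell passes it to find; touch -d here assumes GNU coreutils):

```shell
#!/bin/sh
# Rehearse "compress logs older than 30 days" on a throwaway directory.
logdir=$(mktemp -d)
touch -d "40 days ago" "$logdir/old.log"   # backdate mtime (GNU touch)
touch "$logdir/fresh.log"

find "$logdir" -type f -mtime +30 -exec gzip {} \;

ls "$logdir"    # old.log.gz is now compressed; fresh.log is untouched
rm -rf "$logdir"
```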

Cache directories and package caches

Package managers and application caches (e.g., /var/cache/apt/archives) consume space. Clean them periodically:

  • APT: apt-get clean
  • DNF/YUM: dnf clean all or yum clean all
  • Composer/NPM/Yarn caches: use their respective clean commands

Sparse files and virtual machine disks

VM disk images or databases may contain large sparse files. Use du --apparent-size vs du to detect sparsity. To reclaim space inside QCOW2 images or virtual disk files, use the hypervisor’s tools (e.g., qemu-img convert/compress) and zero free space inside the guest, then shrink the image.
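The gap between logical and on-disk size is easy to reproduce with a throwaway sparse file created by truncate:

```shell
#!/bin/sh
# A sparse file: 100 MiB of apparent size, almost no blocks allocated.
img=$(mktemp)
truncate -s 100M "$img"

du -k "$img"                     # on-disk usage in KiB (near zero)
du -k --apparent-size "$img"     # logical size: 102400 KiB

rm -f "$img"
```

A large difference between the two numbers is the signature of sparse VM images and preallocated database files.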

Snapshots and backups

Snapshots (LVM, Btrfs, ZFS) can keep deleted data reachable. Regularly prune old snapshots with retention policies and monitor snapshot space consumption. For backups stored locally, consider offloading to remote storage to reduce local usage.

Advanced techniques and preventive measures

Filesystem tuning and resizing

When you need more capacity, options include:

  • Resize the filesystem: ext4 can be grown online with resize2fs; shrinking an ext4 filesystem requires unmounting it first. XFS grows online with xfs_growfs but cannot be shrunk at all.
  • Extend logical volumes: LVM makes it easy to extend storage by adding PVs to a VG and then performing lvextend followed by a filesystem resize.
  • Adjust reserved blocks: Reclaim space on small partitions by lowering the reserved-block percentage, e.g. tune2fs -m 1 /dev/sda1 (use cautiously).
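To put the reserved-block percentage in perspective, a back-of-the-envelope calculation (the 500 GiB partition size is an arbitrary example):

```shell
#!/bin/sh
# Estimate space returned to ordinary users when lowering the ext4
# reserved-block percentage from the 5% default to 1% (tune2fs -m 1).
size_gib=500   # example partition size
old_pct=5
new_pct=1

reclaimed_gib=$(( size_gib * (old_pct - new_pct) / 100 ))
echo "Lowering the reserve frees about ${reclaimed_gib} GiB"
```

On a 500 GiB partition that is roughly 20 GiB, which is why trimming the reserve is worthwhile on large data volumes that root never writes to.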

Quotas and access control

Implement user or group quotas via the kernel quota system to prevent single users from exhausting space. Enable with quotaon and configure limits via edquota. This is especially valuable on shared hosting or multi-tenant VPS instances.
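Quota accounting must be enabled at mount time before quotaon can do anything. A sketch of the required fstab entry, where the device path and mount point are placeholders:

```
# /etc/fstab -- usrquota/grpquota enable per-user and per-group accounting
/dev/vg0/home   /home   ext4   defaults,usrquota,grpquota   0 2
```

After remounting, initialize the accounting files with quotacheck -cug /home, activate enforcement with quotaon /home, and set per-user limits with edquota -u username.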

Automated housekeeping

Automate cleanup using systemd timers, cron jobs, or tools like tmpreaper and systemd-tmpfiles to prune temporary directories. Pair automation with alerting (monitor filesystems via Nagios, Zabbix, Prometheus) to catch growth trends early.
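As a sketch, a systemd service/timer pair that prunes an application's temp directory weekly; the unit names, path, and 14-day retention are illustrative assumptions, not from the article:

```
# /etc/systemd/system/prune-tmp.service
[Unit]
Description=Prune files older than 14 days from /srv/app/tmp

[Service]
Type=oneshot
ExecStart=/usr/bin/find /srv/app/tmp -type f -mtime +14 -delete

# /etc/systemd/system/prune-tmp.timer
[Unit]
Description=Weekly temp pruning

[Timer]
OnCalendar=weekly
Persistent=true

[Install]
WantedBy=timers.target
```

Enable it with systemctl enable --now prune-tmp.timer; Persistent=true catches up on runs missed while the machine was off.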

Comparing tools and approaches: choose the right fit

There is no one-size-fits-all tool. Choose based on typical tasks and scale.

  • Interactive cleanup: ncdu excels at interactive exploration and quick manual pruning.
  • Scripted reporting: combine du, find, and sort in scripts to generate regular reports.
  • Performance troubleshooting: iotop, atop, and sar provide I/O and historical metrics for investigating slowdowns related to disk activity.
  • Long-term capacity planning: Use monitoring systems to record trends and base procurement/resizing decisions on growth rates rather than single measurements.

Practical usage scenarios

Scenario: a web server running out of /var

Steps to diagnose and resolve:

  • Run df -h /var and du --max-depth=1 -h /var to find large subdirectories.
  • Inspect logs: ls -lh /var/log and rotate/compress using logrotate.
  • Check for open-but-deleted files: lsof | grep deleted.
  • Clear package caches: apt-get clean.
  • If more space is needed, extend the partition/LV or move large data directories to another mount and symlink them.
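The diagnostic steps above can be rolled into one small triage script (the /var default and the 10-entry limit are arbitrary examples):

```shell
#!/bin/sh
# One-pass triage for a filling filesystem: overall usage, biggest
# immediate subdirectories, and deleted-but-open files.
target="${1:-/var}"

echo "== Filesystem usage =="
df -h "$target"

echo "== Largest immediate subdirectories =="
du -h --max-depth=1 "$target" 2>/dev/null | sort -rh | head -n 10

echo "== Deleted-but-open files on this system =="
for fd in /proc/[0-9]*/fd/*; do
    readlink "$fd" 2>/dev/null | grep -q '(deleted)' && echo "$fd"
done
```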

Scenario: database growth on a VPS

For databases, avoid ad-hoc deletions. Instead:

  • Run VACUUM or OPTIMIZE commands appropriate to the DBMS (e.g., PostgreSQL VACUUM, MySQL OPTIMIZE TABLE) to reclaim space.
  • Consider moving database files to dedicated disks or LVM logical volumes for independent scaling.
  • Implement retention policies and partitioning to automatically purge old data.

Choosing storage for VPS and cloud environments

When selecting a VPS plan or storage tier, evaluate these criteria:

  • IOPS and throughput: For high database or CMS activity, disk I/O matters more than raw capacity. Choose SSD-backed plans with predictable IOPS.
  • Scalability: Look for options that allow online volume resizing or easy migration to larger instances.
  • Snapshot and backup features: Built-in snapshotting simplifies backups but remember snapshots consume space and need lifecycle management.
  • Filesystem flexibility: If you require advanced features like deduplication or compression, consider filesystems (or providers) that offer them.

For operators in or targeting the US market, providers with local presence can reduce latency for end-users. Learn more about one such provider here: USA VPS. For general hosting options, visit the provider site at VPS.DO.

Summary

Effective disk usage management on Linux blends understanding filesystem internals, using the right diagnostic tools, and applying suitable remediation and preventive measures. Regular monitoring, automated housekeeping, and sensible storage choices reduce the risk of outages and keep systems performant. Start with df and du for quick diagnostics, use ncdu for interactive cleanup, and combine LVM/filesystem resizing when you need to grow capacity. Implement quotas and snapshot lifecycle policies on shared systems to avoid surprises. When selecting VPS storage, prioritize IOPS, scalability, and snapshot/backup capabilities to match your workload.

If you are evaluating VPS options that make storage management easier, consider providers offering SSD-backed VPS, flexible volume resizing, and snapshot support for efficient backups. For a US-based option, see USA VPS, and for more details about offerings, visit VPS.DO.
