Linux Backup & Restore: A Practical, Step-by-Step Guide
Mastering Linux backup and restore doesn't have to be daunting. This practical, step-by-step guide walks you through the core tools, workflows, and storage choices that keep your systems resilient. Whether you're protecting a single VPS or hundreds of servers, you'll learn how to pick the right backup types and ensure consistent, application-aware restores when you need them most.
Reliable backups are the backbone of any resilient Linux deployment. Whether you’re running a small business website on a single VPS or managing hundreds of servers in production, a well-designed backup and restore strategy prevents downtime, data loss, and compliance headaches. This article presents a practical, step-by-step guide to backing up and restoring Linux systems with rich technical detail—covering core tools, workflows, storage considerations, and how to choose the right approach for your environment.
Fundamental concepts and backup types
Before diving into tools and commands, it’s important to understand key concepts and backup types so you can choose the right approach.
- Full backup: captures an entire filesystem, disk image, or dataset. Simple but often slow and storage intensive.
- Incremental backup: stores only data changed since the last full or incremental backup. Saves space and bandwidth but requires a chain of increments to restore.
- Differential backup: captures changes since the last full backup. Faster restore than incremental but grows larger over time.
- Image vs file-level backups: image (block-level) backups like dd or LVM snapshots copy an entire partition or disk, including metadata and boot sectors. File-level backups (tar, rsync, borg) copy files and preserve metadata like permissions and extended attributes.
- Offline vs online (hot) backups: offline backups require unmounting or downtime; online backups use filesystem-aware snapshots (LVM/ZFS/Btrfs) or application-aware tools (MySQL hot backups) to capture consistent state without downtime.
Consistency and application-aware backups
For databases and transactional services, consistency matters. File-copying a live database can lead to corruption or incomplete transactions. Use application-aware utilities:
- MySQL/MariaDB: mysqldump for logical backups, or Percona XtraBackup for hot, non-blocking physical backups.
- PostgreSQL: pg_dump for logical dumps; pg_basebackup or filesystem snapshots combined with transaction log shipping (WAL archiving) for point-in-time recovery.
- MongoDB: mongodump or filesystem snapshots; for replica sets, prefer taking backups from secondary nodes.
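As a concrete illustration, a consistent logical dump of InnoDB tables can be taken without blocking writes; paths are illustrative, and credentials are assumed to come from an option file such as ~/.my.cnf:
mysqldump --single-transaction --routines --all-databases | gzip -c > /backups/mysql-$(date +%F).sql.gz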
Core Linux backup tools and how to use them
Below are common tools with practical usage patterns and example commands.
rsync (file-level, efficient transfers)
rsync is ideal for file-level synchronization and incremental transfers over SSH. It preserves permissions, hard links, and extended attributes when used with appropriate flags.
Example: mirror /var/www to a remote backup host
rsync -aAXv --delete --numeric-ids --exclude='cache/' --exclude='tmp/' /var/www/ backupuser@backup.example.com:/backups/www/
- -a: archive mode (recursive, preserves permissions/ownership)
- -A: preserve ACLs
- -X: preserve extended attributes
- --delete: remove files on the destination that were deleted on the source
Note: exclude patterns are matched relative to the source directory, which is why the example uses cache/ and tmp/ rather than absolute paths like /var/www/cache.
tar (simple file-level archives)
tar is universally available and ideal for creating single-file archives. Combine with gzip or xz for compression. Use extended options to preserve SELinux context and sparse files.
tar --numeric-owner --acls --xattrs --selinux --sparse -cpf /backups/www-$(date +%F).tar --files-from=/tmp/www-filelist.txt
(--selinux and --sparse require GNU tar; omit --selinux on systems built without SELinux support.)
To extract while preserving permissions:
tar -xpf /backups/www-2025-11-12.tar -C /restore/
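GNU tar can also produce incremental archives by tracking state in a snapshot metadata file; a minimal sketch using the same paths as above. The first run against a fresh .snar file is a full backup; subsequent runs capture only changes:
tar --listed-incremental=/backups/www.snar -cpzf /backups/www-incr-$(date +%F).tar.gz /var/www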
dd (block-level, raw images)
dd creates a byte-for-byte image of a disk or partition. Use with caution: it's slow, and the resulting image is the size of the source device unless piped through compression.
dd if=/dev/sda bs=4M conv=sync,noerror | gzip -c > /backups/sda.img.gz
Note: dd is not snapshot-aware; use LVM snapshot before dd to avoid capturing an inconsistent state.
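To restore, reverse the pipeline from a rescue environment; this overwrites the target device, so double-check the device name first:
gunzip -c /backups/sda.img.gz | dd of=/dev/sda bs=4M status=progress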
LVM, Btrfs, and ZFS snapshots (hot consistent backups)
Filesystem-level snapshots provide point-in-time consistency with minimal downtime. Common pattern: create snapshot, copy snapshot contents to backup storage, then remove snapshot.
LVM example:
lvcreate -L1G -s -n root_snap /dev/vg0/root
Mount the snapshot read-only and copy its contents with rsync (a snapshot is a block device, so it must be mounted before file-level tools can read it):
mkdir -p /mnt/root_snap
mount -o ro /dev/vg0/root_snap /mnt/root_snap
rsync -aAX --delete /mnt/root_snap/ backupuser@backup:/backups/root/
Then unmount and remove the snapshot:
umount /mnt/root_snap
lvremove -y /dev/vg0/root_snap
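Btrfs and ZFS offer native snapshot and replication commands; minimal sketches below, assuming / is a Btrfs subvolume and an illustrative ZFS pool named rpool, with remote targets on matching filesystems.
Btrfs example:
btrfs subvolume snapshot -r / /.snapshots/root-$(date +%F)
btrfs send /.snapshots/root-$(date +%F) | ssh backupuser@backup "btrfs receive /backups/root"
ZFS example:
zfs snapshot rpool/root@$(date +%F)
zfs send rpool/root@$(date +%F) | ssh backupuser@backup "zfs receive -F backup/root"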
Borg and Restic (deduplicating, encrypted backups)
Both borg and restic are modern choices for encrypted deduplicated backups with remote repository support and pruning policies.
Borg init and backup example:
borg init --encryption=repokey /srv/backup/repo
borg create --stats --progress /srv/backup/repo::$(date +%F) /etc /var/www
Automate pruning:
borg prune -v --list /srv/backup/repo --keep-daily=7 --keep-weekly=4 --keep-monthly=6
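The same workflow in restic, assuming an SFTP-reachable repository (the repository path is illustrative):
restic -r sftp:backupuser@backup.example.com:/srv/backup/restic init
restic -r sftp:backupuser@backup.example.com:/srv/backup/restic backup /etc /var/www
restic -r sftp:backupuser@backup.example.com:/srv/backup/restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prune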
Backup architecture and storage considerations
Design your backup storage and retention with the 3-2-1 rule: at least three copies of data, two different media types, and one copy offsite.
- Onsite fast storage for quick restores—local disks or attached SAN.
- Offsite/remote storage for disaster recovery—object storage (S3-compatible), remote VPS, or another datacenter.
- Cold vs hot storage: longer-term archives can be placed in cheaper cold storage, but restore times will be greater.
- Encryption: always encrypt backups containing sensitive data. Use repository-level encryption (borg/restic) or filesystem-level encryption with keys stored separately.
- Retention and compliance: implement retention based on RPO/RTO and legal requirements; use automated pruning to control storage costs.
Bandwidth and transfer optimization
Use deduplication (borg/restic), compression, and chunking to reduce transfer sizes. For large datasets, seed initial backups via shipping encrypted disks or using a cloud provider’s import service.
Backup scheduling and automation
Automate backups using cron or systemd timers. Ensure jobs run under an account with least-privilege and with secure key management for remote access.
Example cron entry for borg at 03:00 daily (crontab schedules take five fields, and % must be escaped as \% inside a crontab):
0 3 * * * /usr/local/bin/borg create --compression lz4 /srv/backup/repo::$(date +\%F) /etc /var/www && /usr/local/bin/borg prune /srv/backup/repo --keep-daily=7 --keep-weekly=4
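The systemd-timer equivalent is a oneshot service plus a timer unit (unit names are illustrative; borg's built-in {hostname} and {now} placeholders sidestep cron's % escaping):
/etc/systemd/system/borg-backup.service:
[Unit]
Description=Daily borg backup
[Service]
Type=oneshot
ExecStart=/usr/local/bin/borg create --compression lz4 /srv/backup/repo::{hostname}-{now} /etc /var/www
/etc/systemd/system/borg-backup.timer:
[Unit]
Description=Run borg-backup daily at 03:00
[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true
[Install]
WantedBy=timers.target
Enable it with systemctl enable --now borg-backup.timer.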
Use logging and alerting (email, Slack, PagerDuty) to notify on failures. Monitor repository health and run periodic test restores.
Restore strategies and testing restores
Restoring quickly and reliably is the true test of any backup strategy. Plan restores for the following scenarios:
- Single file recovery—use borg/restic/tar to extract a single path.
- Filesystem or server recovery—use image backups or recreate VM from template and restore application data.
- Disaster recovery (site failure)—ensure offsite backups and documented runbooks for RTO.
Example: restore a single file with borg (borg extract writes paths relative to the current working directory, so run it from the desired restore root):
borg extract /srv/backup/repo::2025-11-12 var/www/html/index.php
For databases, verify that restored data files are consistent and rebuild indexes if necessary. For physical restores from dd images, ensure the partition table and bootloader are restored correctly (use grub-install if needed).
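For example, a logical MySQL dump taken with the earlier mysqldump command can be replayed in one line (file name is illustrative):
gunzip -c /backups/mysql-2025-11-12.sql.gz | mysql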
Regular testing
Schedule periodic restores to a staging environment to validate the backup chain, scripts, and recovery time. Automate a daily smoke test for critical services: spin up a temporary VM, restore the most recent backup, and run health checks.
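A minimal smoke-test sketch for the borg repository used above; the checked file path is illustrative, and a real test would restore the full application and run its health checks:
#!/bin/sh
set -e
# Name of the most recent archive in the repository
LATEST=$(borg list --last 1 --format '{archive}{NL}' /srv/backup/repo)
# Extract one known-critical file into a scratch directory and verify it is non-empty
mkdir -p /tmp/restore-test && cd /tmp/restore-test
borg extract /srv/backup/repo::"$LATEST" var/www/html/index.php
test -s var/www/html/index.php && echo "restore OK: $LATEST"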
Comparison of approaches: pros and cons
Choosing the right toolset depends on data type, RTO/RPO, infrastructure, and budget.
- rsync + remote host: simple, robust, great for file systems. Cons: no built-in encryption/dedup unless layered.
- tar: ubiquitous and simple, but not optimized for incremental or remote backups without extra tooling.
- dd: perfect for full-disk forensic images; cons: large images and potential inconsistency unless snapshotted.
- LVM/Btrfs/ZFS snapshots: excellent for live systems; cons: filesystem-specific and requires planning.
- Borg/Restic: deduplication, encryption, and efficient remote storage. Cons: initial learning curve and repository management.
- Commercial/cloud backup: managed, integrates with cloud providers; cons: cost and dependency on provider.
How to choose the right backup solution
Follow these steps to select an approach:
- Identify critical data and services, and define RTO (recovery time objective) and RPO (recovery point objective).
- Choose file-level vs image-level based on whether you need bare-metal restores or application portability.
- Prefer snapshot-capable filesystems or application-aware tools for databases and transactional systems.
- Use deduplicating, encrypted repositories (borg/restic) if bandwidth and storage cost matter.
- Automate and integrate monitoring and alerting. Ensure keys and access credentials are securely managed.
- Test restores regularly and maintain runbooks that cover common failure scenarios.
Practical checklist for deployment
- Inventory data and services that need protection.
- Decide backup frequency and retention policy by data importance.
- Implement encryption and key rotation policies for backup repositories.
- Automate jobs via cron or systemd, add logging and alerts.
- Store one copy offsite and verify integrity with checksums (see the example after this checklist).
- Document recovery steps and run periodic restore drills.
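For example, checksums can be generated when an archive is created and verified again after transfer (file name is illustrative):
sha256sum /backups/www-2025-11-12.tar > /backups/www-2025-11-12.tar.sha256
sha256sum -c /backups/www-2025-11-12.tar.sha256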
Example real-world setup for a VPS-hosted web app:
- Use LVM snapshots or filesystem snapshot (if available) before backup.
- Use borg to back up /etc, /var/www, and database dumps to a remote repository hosted on a separate VPS.
- Set up MySQL logical dumps using a cron job that runs before the snapshot or use Percona XtraBackup for physical hot backups.
- Prune with borg to keep 7 daily, 4 weekly, and 6 monthly archives.
- Store an encrypted copy of the repository in offsite object storage for disaster recovery.
Summary
Effective Linux backups blend the right tools, consistent processes, and diligent testing. Use snapshots and application-aware utilities to ensure consistency, choose deduplicating encrypted repositories for efficient remote storage, and automate with monitoring and alerting. Most importantly, practice restores frequently—backups are only as good as your ability to recover.
If you run your services on VPS infrastructure and need reliable offsite backup targets or additional nodes for remote storage, consider a provider with global presence and flexible plans. For example, VPS.DO offers high-performance US VPS options suitable for hosting backup repositories and mirror targets. Learn more at USA VPS from VPS.DO.