Learn Linux Data Recovery: Step-by-Step Recovery from Corrupted Drives

Lost files or a corrupted filesystem don't have to be a disaster. This practical Linux data recovery guide walks you through core concepts and a step-by-step workflow for safely imaging, diagnosing, and restoring drives. Follow non-invasive diagnostics and the golden rule (don't write to the affected disk) to maximize your chances of getting data back.

Data loss on Linux systems can be costly for site operators, developers, and businesses. Whether caused by hardware failure, filesystem corruption, accidental deletion, or misconfigured RAID/LVM, a systematic recovery approach greatly improves the chance of restoring critical data. The following guide explains the underlying principles and provides a step-by-step technical workflow for recovering data from corrupted drives, with practical recovery commands, tool choices, and selection guidance for different scenarios.

Understanding the fundamentals of Linux data recovery

Successful recovery depends on understanding how Linux storage is organized and what can be repaired. Key concepts include:

  • Partition tables (MBR/GPT) – define where partitions start and end; corruption here can make entire filesystems invisible.
  • Filesystems – ext4, XFS, Btrfs, and others each manage metadata (inodes, superblocks, journals) differently; recovery tools and approaches vary accordingly.
  • Logical volume management (LVM) – LVM introduces another abstraction layer; physical corruption may require PV/LV metadata recovery.
  • Software RAID (mdadm) – degraded arrays need reassembly before filesystems can be mounted.
  • Encryption (LUKS) – encrypted volumes require correct passphrases/keys to access underlying data; recovery may focus on header backups.

Before any repair action, the golden rule is: do not write to the affected disk. Always work from a copy (image) whenever possible to avoid making irreversible changes.

Initial diagnostic steps

Begin with non-invasive checks and documentation. These help you choose the appropriate tools and avoid compounding damage.

  • Physically inspect hardware for obvious failure signs (noise, overheating).
  • Check SMART attributes: smartctl -a /dev/sdX to detect impending hardware failure.
  • Record partition layout: fdisk -l /dev/sdX or sgdisk -p /dev/sdX.
  • Check dmesg/kernel logs for errors related to the device: dmesg | tail -n 100.
  • Document everything you do (commands, outputs, timestamps). This is essential for audits and when escalating to professionals.
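
The checklist above can be captured in a small logging wrapper so that every command, its output, and a timestamp end up in one file. This is a minimal sketch, not a finished tool: DEV, LOG_DIR, log_cmd, and collect_diagnostics are hypothetical names chosen for illustration, and the commands it wraps are the read-only ones listed above.

```shell
#!/bin/sh
# Read-only diagnostics logger (sketch). DEV and LOG_DIR are placeholders.
DEV="${DEV:-/dev/sdX}"
LOG_DIR="${LOG_DIR:-./recovery-notes}"

# Run one command, recording a UTC timestamp, the command line, all output,
# and the exit status in $LOG_DIR/session.log.
log_cmd() {
    mkdir -p "$LOG_DIR" || return 1
    {
        printf '=== %s | %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*"
        status=0
        "$@" 2>&1 || status=$?
        printf '=== exit status: %s\n\n' "$status"
    } >> "$LOG_DIR/session.log"
}

# The non-invasive checks from the list above; none of them write to $DEV.
collect_diagnostics() {
    log_cmd smartctl -a "$DEV"
    log_cmd fdisk -l "$DEV"
    log_cmd sgdisk -p "$DEV"
    log_cmd sh -c 'dmesg | tail -n 100'
}
```

Running DEV=/dev/sdb collect_diagnostics as root would append four timestamped entries to the log. A missing tool is recorded as a failed entry rather than stopping the run, which keeps the record complete.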

When to image the drive

If SMART or dmesg indicates hardware issues, or if the drive contains critical data, immediately create a block-level image and operate on that image. Imaging preserves evidence and allows multiple recovery attempts without further risking the original.

Recommended imaging approaches:

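  • GNU ddrescue (preferred for failing drives): ddrescue -n /dev/sdX /mnt/recovery/image.dd /mnt/recovery/image.log. The -n option skips the slow scraping phase so the readable areas are copied first. The third argument is the map file (called the log file in older releases); rerun with the same map file, for example adding -d -r3, to resume and retry the bad areas. The -f option is only needed when the output is a device or partition rather than a regular file.
  • Classic dd (only for stable drives): dd if=/dev/sdX of=/mnt/recovery/image.dd bs=4M conv=noerror,sync. With conv=noerror,sync a read error can zero-fill up to a full bs-sized block, so use a small block size, or ddrescue, if read errors are possible.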
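
As a rough illustration of the imaging step, the sketch below runs ddrescue in two passes and then records a checksum and removes write permission from the finished image. The function names, the image.map file name, and the decision to seal the image are assumptions made for this example; the ddrescue options used (-n, -d, -r3) are standard ones, but check them against your installed version.

```shell
#!/bin/sh
# Two-pass imaging sketch. Nothing runs until image_drive is called explicitly.

# Pass 1 copies everything that reads cleanly; pass 2 retries the bad areas.
# $1 = source device (placeholder, e.g. /dev/sdX), $2 = output directory.
image_drive() {
    src="$1"; out_dir="$2"
    img="$out_dir/image.dd"; map="$out_dir/image.map"
    ddrescue -n "$src" "$img" "$map" || return 1      # no scraping on first pass
    ddrescue -d -r3 "$src" "$img" "$map" || return 1  # direct access, 3 retry passes
    seal_image "$img"
}

# Record a checksum and drop write permission so the master image stays pristine.
seal_image() {
    sha256sum "$1" > "$1.sha256" || return 1
    chmod a-w "$1" "$1.sha256"
}
```

A call such as image_drive /dev/sdX /mnt/recovery would leave image.dd, image.map, and image.dd.sha256 behind. Later, sha256sum -c /mnt/recovery/image.dd.sha256 confirms that the master image has not changed while you experiment on copies.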

Filesystem-specific recovery techniques

Each filesystem has tools optimized for its metadata layout. Below are practical, technical steps for common filesystems.

Ext2/Ext3/Ext4

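  • Attach a working copy of the image to a loop device, for example losetup --find --show /mnt/recovery/work.dd (add --partscan for whole-disk images, which exposes partitions as /dev/loop0p1 and so on; the commands below assume the filesystem appears as /dev/loop0). Mount read-only where possible, using a mount point outside the directory that holds the image: mount -o ro,noload /dev/loop0 /mnt/recovered. The noload option keeps ext3/ext4 from replaying the journal, which a plain ro mount still does.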
  • Check superblocks and backups: dumpe2fs /dev/loop0 | grep -i superblock.
  • If the main superblock is corrupted, run e2fsck using a backup superblock: e2fsck -b 32768 /dev/loop0 (replace 32768 with a valid backup number).
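  • Run a full filesystem check, starting with a read-only pass: e2fsck -f -n -v /dev/loop0. Review the reported problems, then rerun without -n to answer each prompt interactively. Use -y (answer yes to everything) only if you understand the changes and are working on a copy of the image.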
  • For file carving, use photorec (part of TestDisk) to recover raw files when metadata is lost.
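
The backup-superblock step can be sketched as two small helpers: one extracts the backup locations from dumpe2fs output, the other tries candidates with a read-only e2fsck pass. The function names are hypothetical, and the parser assumes dumpe2fs prints lines of the form "Backup superblock at N, Group descriptors at ...". If dumpe2fs cannot read the filesystem at all, commonly cited default locations such as 32768 (4 KiB blocks) or 8193 (1 KiB blocks) can be passed as candidates instead.

```shell
#!/bin/sh
# Print backup superblock numbers, one per line, from dumpe2fs output on stdin.
backup_superblocks() {
    awk '/Backup superblock at/ { sub(/,$/, "", $4); print $4 }'
}

# Try each candidate with a read-only check (-n) and print the first one that
# e2fsck can actually open. $1 = device (placeholder), remaining args = candidates.
try_superblocks() {
    dev="$1"; shift
    for sb in "$@"; do
        status=0
        e2fsck -n -b "$sb" "$dev" >/dev/null 2>&1 || status=$?
        # Exit codes below 8 mean e2fsck ran; 4 only says errors were left unfixed.
        if [ "$status" -lt 8 ]; then
            echo "$sb"
            return 0
        fi
    done
    return 1
}
```

Typical use on a loop device backed by a working copy would be: try_superblocks /dev/loop0 $(dumpe2fs /dev/loop0 2>/dev/null | backup_superblocks). The number it prints is a candidate for a later e2fsck -b run.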

XFS

  • Never run xfs_repair on a mounted filesystem. To inspect the data first, mount read-only without log replay: mount -o ro,norecovery /dev/loop0 /mnt/recovered, and unmount again before any repair.
  • A normal mount replays the journal (log), which is often enough to restore consistency; if mounting fails, use xfs_repair. Start with a dry run that reports problems without changing anything: xfs_repair -n /dev/loop0.
  • In extreme cases, you may need xfs_repair -L /dev/loop0 to zero the log, which discards any transactions that were not yet written to disk. Use it only when the log cannot be replayed.
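
The dry-run step lends itself to a small triage helper. The sketch below relies on the documented behaviour that xfs_repair -n exits with 0 when no corruption is found and 1 when corruption is detected; any other status is treated as "could not complete". XFS_REPAIR and xfs_dry_run are hypothetical names, and the override variable exists only so the logic can be tried without XFS tools installed.

```shell
#!/bin/sh
# XFS_REPAIR is a hypothetical override; by default the real xfs_repair is used.
XFS_REPAIR="${XFS_REPAIR:-xfs_repair}"

# Run a no-modify check and summarise the result.
# $1 = unmounted device or image (placeholder), $2 = optional report file.
xfs_dry_run() {
    status=0
    "$XFS_REPAIR" -n "$1" > "${2:-/dev/null}" 2>&1 || status=$?
    case "$status" in
        0) echo "clean: no corruption reported" ;;
        1) echo "corrupt: review the report before repairing a copy" ;;
        *) echo "error: check could not complete (status $status)" ;;
    esac
}
```

For example, xfs_dry_run /dev/loop0 /mnt/recovery/xfs-report.txt keeps the full report for your notes and prints a one-line verdict.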

Btrfs

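  • Try to mount read-only first: mount -o ro /dev/loop0 /mnt/recovered.
  • Use the built-in tools cautiously: btrfs check --repair is aggressive and can make damage worse, so run btrfs check --readonly first. If the filesystem will not mount, btrfs restore can copy files out of it without writing to it.
  • Btrfs keeps multiple copies of its metadata; the btrfs rescue subcommands (for example super-recover and chunk-recover) can restore a damaged superblock or chunk tree.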

LVM, RAID, and encrypted volumes

  • LVM: scan and activate volumes: pvscan, vgscan, vgchange -ay. If metadata is damaged, check /etc/lvm/backup and /etc/lvm/archive for backups to restore with vgcfgrestore.
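  • RAID/mdadm: inspect member metadata with mdadm --examine /dev/sdX1 /dev/sdY1, then attempt a read-only assembly: mdadm --assemble --readonly /dev/md0 /dev/sdX1 /dev/sdY1. Add --run only if you need to start a degraded array with missing members. Where possible, image the assembled array (or assemble from images of the member disks) and mount that image rather than the originals.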
  • LUKS: ensure you have correct headers and passphrase. You can back up and restore LUKS headers with cryptsetup luksHeaderBackup and cryptsetup luksHeaderRestore. Without the header or key, data is effectively unrecoverable.
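
Because a lost LUKS header makes the data unrecoverable, it is worth backing the header up before any other work. The sketch below first checks for the LUKS magic bytes ("LUKS" followed by 0xBA 0xBE at offset 0) and only then calls cryptsetup luksHeaderBackup. The function names are hypothetical, the check does not cover volumes with detached headers, and cryptsetup isLuks is the usual way to perform the same test when cryptsetup is available.

```shell
#!/bin/sh
# Succeed if the device or image starts with the LUKS magic bytes.
has_luks_header() {
    magic="$(dd if="$1" bs=6 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')"
    [ "$magic" = "4c554b53babe" ]
}

# Save a header backup. $1 = device or image, $2 = backup file (placeholder);
# keep the backup on different media than the encrypted volume.
backup_luks_header() {
    dev="$1"; backup="$2"
    if ! has_luks_header "$dev"; then
        echo "no LUKS header found on $dev" >&2
        return 1
    fi
    cryptsetup luksHeaderBackup "$dev" --header-backup-file "$backup"
}
```

A call such as backup_luks_header /dev/sdX2 /mnt/recovery/sdX2-luks-header.img would refuse to proceed on a device that does not look like LUKS, which guards against pointing the command at the wrong partition.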

Partition table and boot metadata recovery

If partitions have been deleted or the table corrupted, tools like TestDisk and gdisk can help rebuild partition tables. Typical steps:

  • Run testdisk /path/to/image and use the guided interface to locate lost partitions and rewrite the partition table.
  • For GPT specifically, gdisk can restore GPT entries from the backup header: gdisk /dev/sdX then use the recovery options.
  • After rebuilding partitions, check filesystems as described above. Always operate on an image copy first.
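
Before rebuilding anything, it helps to know which partition-table signatures survive on the image. The following sketch reads the two well-known markers directly: the "EFI PART" GPT header signature at the start of the second sector and the 0x55 0xAA boot signature at bytes 510 and 511. It assumes 512-byte logical sectors, and hex_at and ptable_kind are hypothetical helper names.

```shell
#!/bin/sh
# Print COUNT bytes at OFFSET of FILE as a lowercase hex string.
hex_at() {
    dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null | od -An -v -tx1 | tr -d ' \n'
}

# Classify a disk image as gpt, mbr, or none by its on-disk signatures.
# GPT is tested first because GPT disks also carry a protective MBR.
ptable_kind() {
    img="$1"
    if [ "$(hex_at "$img" 512 8)" = "4546492050415254" ]; then  # "EFI PART"
        echo gpt
    elif [ "$(hex_at "$img" 510 2)" = "55aa" ]; then
        echo mbr
    else
        echo none
    fi
}
```

A result of none on an image that used to boot suggests the table itself was overwritten, which is the case TestDisk's partition search is meant for. On a live GPT disk, sgdisk --backup=FILE /dev/sdX saves the current table to a file before you change anything.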

File carving and when metadata is gone

When filesystem metadata is destroyed, traditional fsck won’t help. File carving tools scan raw data for file signatures and extract recoverable files. Common tools:

  • photorec – works well for photos, documents, and many common file types.
  • scalpel – configurable signature-based carver for advanced users.

File carving recovers raw files without filenames, directory structures, or timestamps. Use it as a last resort when metadata-based approaches fail.
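
To make the idea of signature scanning concrete, here is a deliberately simple sketch that lists the byte offsets where the JPEG start-of-image marker (FF D8 FF) appears in a raw image. It is far slower and much less capable than photorec or scalpel and is meant only to show the principle; find_jpeg_offsets and carve_range are hypothetical names.

```shell
#!/bin/sh
# Print the byte offset of every FF D8 FF sequence in FILE, one per line.
find_jpeg_offsets() {
    od -An -v -tx1 "$1" | tr -s ' ' '\n' | awk '
        NF == 0 { next }                      # skip the blank line od/tr may emit
        {
            if (p2 == "ff" && p1 == "d8" && $1 == "ff") print n - 2
            p2 = p1; p1 = $1; n++             # n counts bytes seen so far
        }'
}

# Copy LENGTH bytes starting at OFFSET from FILE into OUT (byte-at-a-time, slow).
carve_range() {
    dd if="$1" of="$4" bs=1 skip="$2" count="$3" 2>/dev/null
}
```

Real carvers also look for the matching end marker, validate the structure in between, and handle fragmentation, which is why the dedicated tools are the better choice for actual recovery work.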

Best practices and advantages of each approach

Recovery strategies are chosen based on failure type and urgency:

  • Imaging then analysis – safest; preserves original media and allows repeated attempts. Best when hardware is failing or data is critical.
  • Filesystem repair tools – effective for logical corruption (journal replay, missing inodes). Faster than carving and preserves metadata when successful.
  • File carving – useful when metadata is irrecoverable; recovers data but loses structure and names.
  • Professional services – recommended for physical damage or when imaging fails; they have clean-room facilities and specialized hardware but are costly.

Selection guidance and preventive considerations

Choosing the right recovery plan depends on your environment and risk tolerance:

  • If you run production servers or host customer sites, implement regular backups (off-host snapshots, incremental backups). Cloud backups and offsite replication reduce recovery complexity.
  • For mission-critical servers, consider using stable storage classes and RAID with hot spares, but remember RAID is not a substitute for backups.
  • For small teams without appliance-level backups, prioritize taking images and engaging a specialist earlier rather than later when failing hardware is suspected.
  • Automate health checks (SMART monitoring) and scheduled filesystem checks to detect issues proactively.
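
As one possible starting point for automated health checks, the sketch below turns the output of smartctl -H into a single word that a cron job or monitoring script can act on. It assumes the report contains either "test result: PASSED" (typical for ATA and NVMe devices) or "SMART Health Status: OK" (typical for SCSI devices); smart_verdict is a hypothetical name, and anything else is treated as needing attention.

```shell
#!/bin/sh
# Read a `smartctl -H` report on stdin and print "ok" or "attention".
smart_verdict() {
    if grep -Eq 'test result: PASSED|SMART Health Status: OK'; then
        echo ok
    else
        echo attention
    fi
}
```

For example, a scheduled job could run smartctl -H /dev/sdX | smart_verdict and send an alert whenever the result is not ok. The smartd daemon shipped with smartmontools offers a more complete version of the same idea.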

Summary

Recovering data from corrupted Linux drives is a disciplined process: diagnose without writing to the disk, create a block-level image when possible, and then apply filesystem-aware repair tools or file carving as appropriate. Ext filesystems typically allow superblock recovery and e2fsck fixes; XFS and Btrfs have their own rescue utilities and caveats; LVM, RAID, and encryption add complexity and require additional metadata handling. Throughout, documenting steps and avoiding writes to the original device are critical to preserving recoverability.

For hosting environments and recovery workflows, reliable infrastructure for backups and test environments is essential. If you operate remote or cloud-based services, consider combining on-site imaging with off-site backups and regular snapshot schedules. If you need resilient hosting for recovery testing or staging, check out VPS.DO’s USA VPS offerings as a reliable platform to run recovery tools, store disk images, or automate backup workflows: https://vps.do/usa/.
