Repairing Linux Disks and Recovering Lost Partitions
Disk failures and lost partitions can lead to boot failures, data loss, and extended downtime in Linux systems. Whether due to accidental deletion, filesystem corruption, or hardware failure, knowing how to diagnose and recover disk partitions is critical for system administrators and DevOps engineers.
In this guide, you will learn:
- How to diagnose disk failures & missing partitions
- Step-by-step recovery methods using fdisk, parted, and testdisk
- Enterprise case studies on real-world partition loss
- Best practices to prevent disk failures and data loss
Next in the series: Troubleshooting RAID Failures & Recovery Techniques
1. Understanding Linux Disk Failures & Lost Partitions
Common Causes of Disk & Partition Loss
Failure Type | Cause | Error Message |
---|---|---|
Accidental Deletion | fdisk or parted used incorrectly | Partition table missing |
Filesystem Corruption | Power failure, bad shutdown | ext4-fs error (device sda1) |
Disk Failure | Bad sectors, aging storage device | I/O error or disk read failure |
Bootloader Issues | Corrupt MBR or GPT | error: no such partition |
2. Diagnosing Disk & Partition Issues
Step 1: Check Disk Health & Errors
Before attempting recovery, confirm that the system still detects the disk, then check whether it is physically damaged.
Verify disk connectivity and partitions:
lsblk
fdisk -l
parted -l
Expected Output Example:
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0  500G  0 disk
├─sda1   8:1    0  500M  0 part /boot
└─sda2   8:2    0   50G  0 part /
If the partitions do not appear, the partition table may be corrupted or deleted.
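The check above can also be scripted. The following is a minimal sketch, assuming the raw output format of `lsblk -rno NAME,TYPE` (one device per line, name then type); the function name `count_partitions` is hypothetical and not part of any tool.

```shell
# Hypothetical helper: counts partition entries in `lsblk -rno NAME,TYPE` output
# read from stdin. A result of 0 for a disk that should have partitions
# suggests a missing or unreadable partition table.
count_partitions() {
  awk '$2 == "part" { n++ } END { print n + 0 }'
}

# Example usage (reads the real disk, so it is left commented out):
#   lsblk -rno NAME,TYPE /dev/sda | count_partitions
```

Keeping the parsing separate from the `lsblk` call makes it possible to try the function on saved output before pointing it at a real disk.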
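Check the disk's SMART health and bad-sector indicators:
smartctl -a /dev/sda
Expected Output:
SMART overall-health self-assessment test result: PASSED
If the result is FAILED, the disk is likely physically damaged and should be imaged or replaced before any repair is attempted. A PASSED result does not rule out bad sectors, so also review the attribute table that smartctl -a prints (for example, the reallocated and pending sector counts).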
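For unattended checks, the health line can be turned into a simple verdict. This is a rough sketch that assumes the exact result line shown above; the function name `smart_verdict` and its three output words are made up for illustration, and drives that report health in a different format will fall through to "unknown".

```shell
# Hypothetical helper: reads smartctl output on stdin and prints
# "healthy", "failing", or "unknown" based on the overall-health line.
smart_verdict() {
  result=$(sed -n 's/^SMART overall-health self-assessment test result: *//p')
  case "$result" in
    PASSED) echo "healthy" ;;
    "")     echo "unknown" ;;   # line not found, e.g. SMART unsupported
    *)      echo "failing" ;;   # anything other than PASSED is treated as a failure
  esac
}

# Example usage (needs root and a real disk, so it is left commented out):
#   smartctl -H /dev/sda | smart_verdict
```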
3. Repairing Disk & Partition Issues
Below are recovery methods for common disk partition failures.
Fix 1: Recover Lost Partitions Using testdisk
If a partition was accidentally deleted, testdisk can often recover it, provided the disk has not been written to since the deletion.
1. Install testdisk:
sudo apt install testdisk # Debian/Ubuntu
sudo yum install testdisk # CentOS/RHEL (EPEL repository)
2. Launch testdisk and select the affected disk:
sudo testdisk
3. Confirm the detected partition table type, choose "Analyse", then select "Quick Search". If Quick Search does not find the partition, try "Deeper Search".
4. Identify the lost partition and mark it for recovery
5. Choose "Write" to save the partition table, then reboot:
reboot
Expected Outcome: If successful, the lost partition will be restored.
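Fix 2: Repair Corrupt Filesystems with fsck
If the partition exists but is unreadable, use fsck to repair filesystem corruption.
1. Boot into emergency mode or a Live USB so that the filesystem is unmounted (or mounted read-only), then run fsck. Never run fsck on a filesystem that is mounted read-write, because that can cause further corruption:
fsck -y /dev/sda2
Command Breakdown:
fsck → Checks and repairs filesystem errors
-y → Automatically answers "yes" to every repair prompt
2. If the filesystem was mounted read-only during the check, remount it read-write (if it was unmounted, mount it normally instead):
mount -o remount,rw /dev/sda2
Expected Outcome: If successful, the filesystem will be repaired and mounted correctly.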
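Whether the repair succeeded can be read from the exit status of fsck, which is a bitmask documented in the fsck(8) manual page. The sketch below decodes that status; the function name `describe_fsck_status` and the wording of the messages are illustrative choices, not part of fsck.

```shell
# Hypothetical helper: decodes an fsck exit status (a bitmask, per fsck(8)).
describe_fsck_status() {
  status=$1
  if [ "$status" -eq 0 ]; then
    echo "no errors"
    return 0
  fi
  out=""
  # Each entry is "bit:meaning"; several bits can be set at once.
  for pair in "1:errors corrected" "2:reboot required" \
              "4:errors left uncorrected" "8:operational error" \
              "16:usage error" "32:cancelled by user" \
              "128:shared library error"; do
    bit=${pair%%:*}
    text=${pair#*:}
    if [ $((status & bit)) -ne 0 ]; then
      out="${out:+$out; }$text"
    fi
  done
  echo "$out"
}

# Example usage (runs a real check, so it is left commented out):
#   fsck -y /dev/sda2; describe_fsck_status "$?"
```

A status with bit 4 set means the filesystem still has errors, so it should not be remounted read-write until the cause is understood.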
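Fix 3: Restore Partition Table with parted
If the partition table is corrupted and testdisk cannot recover it, use parted to recreate it.
1. Launch parted:
sudo parted /dev/sda
2. Check for partitions:
print
3. If the partition table is missing, recreate it. Warning: mklabel discards every existing partition table entry, so run it only when the table is already unreadable, and recreate each partition with its exact original start and end. The single full-disk partition below is only an example layout. parted's rescue START END command can also search a given range for a lost partition.
mklabel gpt
mkpart primary ext4 1MiB 100%
quit
Expected Outcome: The disk has a valid partition table again. mkpart does not create or format a filesystem, so existing data becomes accessible only if the new boundaries match the original partition.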
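When a dump made earlier with `sfdisk -d` is available, the original boundaries do not have to be guessed. The following sketch, which assumes the `start=..., size=...` line format that sfdisk writes in sectors, prints one parted command per saved partition without running anything. The function name `dump_to_mkpart` is hypothetical, and `primary` is kept from the example above (on a GPT disk parted treats it as the partition name).

```shell
# Hypothetical helper: reads an `sfdisk -d` dump on stdin and prints the
# parted commands that would recreate each partition at its saved sectors.
# It only prints; review the output before running any of it.
dump_to_mkpart() {
  disk=$1
  sed -n 's/.*start= *\([0-9][0-9]*\), *size= *\([0-9][0-9]*\).*/\1 \2/p' |
  while read -r start size; do
    end=$((start + size - 1))   # last sector of the partition, inclusive
    printf 'parted -s %s unit s mkpart primary %ss %ss\n' "$disk" "$start" "$end"
  done
}

# Example usage, assuming backup.txt was produced by `sfdisk -d /dev/sda`:
#   dump_to_mkpart /dev/sda < backup.txt
```

If the dump matches the disk exactly, restoring it directly with `sfdisk /dev/sda < backup.txt` is simpler; the printed commands are mainly useful when only some partitions need to be recreated.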
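Fix 4: Reinstall GRUB on a Corrupt Bootloader
If GRUB fails to detect partitions, reinstall it:
1. Boot into a Linux Live USB
2. Mount the root partition, the /boot partition, and the virtual filesystems that the chroot needs:
mount /dev/sda2 /mnt
mount /dev/sda1 /mnt/boot
mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
3. Chroot into the system:
chroot /mnt
4. Reinstall GRUB. The commands below use the CentOS/RHEL names; on Debian/Ubuntu the equivalents are grub-install /dev/sda and update-grub. They target BIOS/MBR boot; on UEFI systems the EFI system partition must be mounted as well.
grub2-install /dev/sda
grub2-mkconfig -o /boot/grub2/grub.cfg
5. Exit the chroot and reboot the system:
exit
reboot
Expected Outcome: If successful, GRUB will detect all partitions.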
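Because the GRUB installer is named differently across distributions, a recovery script can look up whichever name is present inside the chroot. This is a small sketch under that assumption; `pick_grub_install` is a made-up function name.

```shell
# Hypothetical helper: prints the name of the GRUB installer found in PATH,
# trying the CentOS/RHEL name first and the Debian/Ubuntu name second.
# Returns non-zero if neither is available.
pick_grub_install() {
  for candidate in grub2-install grub-install; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Example usage inside the chroot (writes to the disk, so it is commented out):
#   installer=$(pick_grub_install) && "$installer" /dev/sda
```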
4. Enterprise Case Study: Data Recovery After Partition Loss
Scenario:
A financial services company accidentally deleted a key partition on a production database server.
Symptoms:
- The server refused to boot (No such partition)
- Running fdisk -l showed missing partitions
- The database was inaccessible, causing an outage
Investigation:
- Engineers used a Live USB to inspect the disk structure
- testdisk detected the deleted partitions
- The partition table was corrupt, preventing boot
Solution:
- Used testdisk to restore the deleted partition
- Ran fsck -y /dev/sda2 to repair filesystem errors
- Reinstalled GRUB to detect and boot into the restored partition
Lessons Learned:
- Always back up partition tables before making changes
- Use LVM snapshots for rapid recovery
- Automate disk monitoring with smartctl
5. Best Practices to Prevent Disk Failures
To minimize disk-related failures, follow these best practices:
- Enable disk health monitoring (smartctl -a /dev/sda for spot checks, or the smartd daemon for continuous monitoring)
- Schedule periodic fsck checks to catch filesystem corruption early
- Keep multiple backups of partition tables (sfdisk -d /dev/sda > backup.txt), stored on a different disk
- Use RAID for redundancy in production environments
- Automate disk failure alerts with monitoring tools (Prometheus, Zabbix)
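The partition table backup mentioned above is easy to automate. The sketch below writes a timestamped `sfdisk -d` dump into a directory of your choice; the function name `backup_partition_table`, the `SFDISK` override variable, and the file naming scheme are all assumptions made for illustration.

```shell
# SFDISK can be overridden to point at another binary or a test stub.
SFDISK=${SFDISK:-sfdisk}

# Hypothetical helper: dumps the partition table of a device into
# <dest_dir>/<device>-<timestamp>.sfdisk and prints the resulting path.
backup_partition_table() {
  device=$1
  dest_dir=$2
  stamp=$(date +%Y%m%d-%H%M%S)
  out="$dest_dir/$(basename "$device")-$stamp.sfdisk"
  if ! "$SFDISK" -d "$device" > "$out"; then
    rm -f "$out"   # do not leave an empty or partial dump behind
    return 1
  fi
  printf '%s\n' "$out"
}

# Example usage (needs root and a real disk, so it is left commented out):
#   backup_partition_table /dev/sda /var/backups
```

A dump created this way can later be restored with `sfdisk /dev/sda < FILE`, which overwrites the current partition table with the saved one.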
Summary
Issue | Cause | Solution |
---|---|---|
Lost Partition | Accidental deletion | Restore using testdisk |
Corrupt Filesystem | Improper shutdown | Run fsck -y /dev/sda2 |
Missing Partition Table | MBR/GPT corruption | Recreate using parted |
Bootloader Not Detecting Partitions | GRUB corruption | Reinstall GRUB (grub2-install) |