In April 1992 the extended file system (ext), the first file system purpose-built for the Linux kernel, was created. Its metadata structures were very similar to those of the traditional Unix File System (UFS), with some extensions enabling a data volume of up to 2GB in size. The second incarnation of the file system, ext2, was introduced as its replacement in January 1993. The Linux operating system has since become one of the most popular platforms for server systems, particularly for ISPs and cloud service providers.
The maximum possible volume size for the ext2 file system depends on the block size and the kernel implementation, ranging from 2TB up to 32TB. Likewise, the largest possible file size is between 16GB and 2TB. Although ext3 and ext4 have since been introduced, ext2 is still in use, mainly on SD cards and USB flash drives, where journaling is not required. All information about the data structures and implementation of each file system is available from the open source community. Using this information, data recovery is possible in most situations.
Linux Extended File System Features
Journaling was introduced with Ext3 in 2001, which allows the automatic recovery of file and directory metadata following a system crash or power failure. As with UFS, data allocation is block based, with indirect allocation blocks used to store the block addresses of large files. Although a driver was developed which allows data compression, it is rarely used, as it provides little advantage.
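The 16GB and 2TB file size limits quoted earlier can be roughly reproduced from the indirect block scheme. The following is a minimal sketch, assuming the usual ext2 layout of 12 direct block pointers plus one single, one double and one triple indirect pointer, each pointer being 4 bytes. The function name and the 2TB cap (taken here to come from a 32-bit count of 512-byte sectors in the inode) are illustrative assumptions rather than anything stated in this article.

```python
SECTOR_COUNT_CAP = 2**32 * 512  # assumed cap: 32-bit count of 512-byte sectors


def max_file_size(block_size: int, pointer_size: int = 4, direct: int = 12) -> int:
    """Approximate the largest file (in bytes) that indirect blocks can map."""
    # Number of block pointers that fit in one indirect block.
    ptrs = block_size // pointer_size
    # Direct blocks, then single, double and triple indirect levels.
    blocks = direct + ptrs + ptrs**2 + ptrs**3
    return min(blocks * block_size, SECTOR_COUNT_CAP)


for bs in (1024, 2048, 4096):
    print(f"{bs}-byte blocks: about {max_file_size(bs) / 2**30:.0f} GiB")
```

With 1KB blocks the triple indirect level dominates and gives roughly 16GB, while with 4KB blocks the pointer scheme could map about 4TB, so the assumed sector count cap of 2TB becomes the limit.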
In 2008 the Ext4 file system was introduced, which extends the maximum possible volume size to 1EB and the maximum file size to 16TB. The date and time values were also enhanced to allow a much larger date range while providing nanosecond resolution, a huge improvement over the previous resolution of a single second.
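As an illustration of how the extended timestamps can be decoded, the sketch below assumes the layout described in the kernel's ext4 documentation: the original signed 32-bit seconds field is kept, and an extra 32-bit field holds two additional epoch bits in its low bits and a 30-bit nanosecond count in its high bits. The function and parameter names are hypothetical.

```python
def decode_ext4_timestamp(seconds_field: int, extra_field: int) -> tuple[int, int]:
    """Return (seconds since the Unix epoch, nanoseconds) from the two raw fields."""
    # The original 32-bit field is interpreted as signed, as in ext2 and ext3.
    if seconds_field >= 2**31:
        seconds_field -= 2**32
    epoch_bits = extra_field & 0x3   # low 2 bits extend the seconds range
    nanoseconds = extra_field >> 2   # high 30 bits hold the nanoseconds
    return seconds_field + (epoch_bits << 32), nanoseconds


# A nanosecond value of 500 with no epoch extension.
print(decode_ext4_timestamp(1_700_000_000, 500 << 2))
```

On an inode without the extra field, only the one-second resolution value is available, which is the situation in ext2 and ext3.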
Internals of the Linux Extended File System
Internally the structure is very similar to UFS, where inodes and usage bitmaps are allocated in fixed positions for each cylinder group (called a block group in the extended file systems). With Ext3, performance with large directories was significantly improved by the introduction of HTree indexing, which allows much faster searching, deletion and insertion of items within a large directory than in previous versions of the file system.
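Because inodes sit at fixed positions within each group, an inode can be located by arithmetic alone. The sketch below shows the commonly documented calculation, assuming inode numbers start at 1 and that the inodes per group, inode size and block size have already been read from the superblock. The function name is hypothetical, and the start of each group's inode table would still have to be read from that group's descriptor.

```python
def locate_inode(inode_number: int, inodes_per_group: int,
                 inode_size: int, block_size: int) -> tuple[int, int, int, int]:
    """Return (group, index in group, block within inode table, byte offset in block)."""
    if inode_number < 1:
        raise ValueError("inode numbers start at 1")
    group = (inode_number - 1) // inodes_per_group
    index = (inode_number - 1) % inodes_per_group
    byte_offset = index * inode_size  # offset from the start of the inode table
    return group, index, byte_offset // block_size, byte_offset % block_size


# Inode 2 is conventionally the root directory.
print(locate_inode(2, inodes_per_group=8192, inode_size=256, block_size=4096))
```

This is also why recovery tools can walk every inode on a damaged volume without needing any directory information.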
Storing each allocated block explicitly is inefficient, both for disk space usage and for random access to data within a file. To address this, extents were added in Ext4, which allow a run of contiguous blocks to be defined more efficiently, in chunks of up to 128MB (with 4KB blocks). A maximum of four extents can be stored in the inode, while an extent tree held in additional data blocks stores any remaining data allocation runs. Until Ext4 the number of subdirectories allowed in a directory was limited to 32,000, but it is now in theory unlimited.
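The limit of four extents follows from the space available in the inode: the 60-byte block map area holds a 12-byte header followed by up to four 12-byte extent records. The sketch below parses that area, assuming the little-endian layout given in the kernel's ext4 documentation (magic number 0xF30A, then entry count, capacity, tree depth and generation). The function name is hypothetical, and extents flagged as uninitialised (lengths above 32768) are not treated specially.

```python
import struct

EXTENT_MAGIC = 0xF30A


def parse_inode_extents(i_block: bytes) -> list[tuple[int, int, int]]:
    """Return (logical block, length in blocks, physical start block) per extent."""
    if len(i_block) != 60:
        raise ValueError("the inode block map area is 60 bytes")
    magic, entries, _max_entries, depth, _generation = struct.unpack_from("<HHHHI", i_block, 0)
    if magic != EXTENT_MAGIC:
        raise ValueError("inode does not use extents")
    if depth != 0:
        raise ValueError("extents are stored in tree blocks outside the inode")
    if entries > 4:
        raise ValueError("at most four extents fit in the inode")
    extents = []
    for i in range(entries):
        logical, length, start_hi, start_lo = struct.unpack_from("<IHHI", i_block, 12 + 12 * i)
        # The physical start block is 48 bits, split across two fields.
        extents.append((logical, length, (start_hi << 32) | start_lo))
    return extents


# One maximum-length extent: 32768 blocks of 4KB is 128MB.
header = struct.pack("<HHHHI", EXTENT_MAGIC, 1, 4, 0, 0)
record = struct.pack("<IHHI", 0, 32768, 0, 1000)
print(parse_inode_extents(header + record + bytes(36)))
```

A single 12-byte record here describes 128MB of data, which under the older scheme would need 32,768 individual block pointers.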
Data Recovery From Linux Extended File System
The Linux Extended file systems still in common use, ext2, ext3 and ext4, are clearly defined, making the data recovery process relatively simple as all the data structures are well known. File systems of this nature do however suffer a problem when directory index information is lost, as the name of the file or directory associated with an inode is not contained within the inode. Reformatting a Linux Extended volume will overwrite the contents of all inodes, resulting in the loss of all metadata. The only viable data recovery option left open in such a situation is a raw data trawl, looking for known file headers.
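A raw data trawl of this kind can be sketched as follows. This is a simplified illustration, not a description of any particular recovery tool: it assumes files begin on block boundaries, checks only three well-known signatures, and works on data already held in memory. The names are hypothetical.

```python
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "pdf": b"%PDF-",
}


def find_signatures(data: bytes, block_size: int = 4096) -> list[tuple[int, str]]:
    """Return (offset, type) for every block that starts with a known file header."""
    hits = []
    for offset in range(0, len(data), block_size):
        for name, magic in SIGNATURES.items():
            if data.startswith(magic, offset):
                hits.append((offset, name))
    return hits


# A tiny synthetic "volume" of three blocks with two recognisable headers.
volume = bytearray(3 * 4096)
volume[4096:4101] = b"%PDF-"
volume[8192:8195] = b"\xff\xd8\xff"
print(find_signatures(bytes(volume)))
```

Finding a header only gives the start of a file. Determining its length and reassembling fragmented files requires format-specific knowledge, and the original file names cannot be recovered this way.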
Should the operating system be unable to automatically repair and mount the extended file system, it is important that no third party tools are used in an attempt to fix the problem, as they may destroy important data structures required during data recovery. If the operating system is unable to mount the file system, it tends to indicate that severe damage has occurred, requiring professional data recovery services to analyse the data volume, as a full understanding of the underlying data structures is essential.