How NTFS Reads a File
How NTFS Reads a File
How NTFS Reads a File
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
<strong>How</strong> <strong>NTFS</strong> <strong>Reads</strong> a <strong>File</strong><br />
On <strong>NTFS</strong> (New Technology <strong>File</strong> System) there is metadata (data which "lives above" your data).<br />
Metadata are special files that the <strong>NTFS</strong> file system driver uses to manage an <strong>NTFS</strong> volume. The most<br />
famous piece of metadata is the MFT (Master <strong>File</strong> Table), which is a special file typically consisting of<br />
1024-byte records. Each file or directory on the volume is described by at least one of these MFT<br />
records. It may take several MFT records to fully describe a file—especially if it is compressed. A 271MB<br />
compressed file can require over 450 MFT records!<br />
Below is an example of exactly what occurs when <strong>NTFS</strong> goes to read in a 1-cluster 1 file that we’ll name<br />
\Flintstone\Barney.txt.<br />
1. The volume's boot record is read to find the cluster address of the first cluster of the MFT.<br />
2. The first cluster of the MFT is read, which is used to locate all of the pieces of the MFT.<br />
3. MFT record 5 is read as it is predefined to be the MFT record of the root directory 2 .<br />
4. Data cluster 0 of the root directory is read in and searched for "Flintstone".<br />
5. If "Flintstone" is not found, then at least one other data cluster of the root directory needs to be<br />
read to find it.<br />
6. The MFT record for the "Flintstone" directory is read in.<br />
7. Data cluster 0 of the "Flintstone" directory is read in and searched for "Barney.txt".<br />
8. If "Barney.txt" is not found, then at least one other data cluster of the "Flintstone" directory needs<br />
to be read to find it.<br />
9. The MFT record for the "Barney.txt" file is read in<br />
10. Data cluster 0 of the "Barney.txt" file is read in.<br />
This is a worst-case scenario. It presumes the volume is not yet mounted, so the <strong>NTFS</strong> cache 3 is empty<br />
at step 1, and the MFT itself needs to be located. But it shows how many I/Os 4 are required to get at a<br />
file that is only one level removed from the root directory: 10. Each one of those 10 I/Os requires a head<br />
movement.<br />
As these I/Os are to different pieces of data on the volume (different MFT records, directories, and files);<br />
the contiguity of the pieces would only matter if the different segments were in close proximity to the MFT<br />
records representing them.<br />
If you follow the step-by-step I/O sequence outlined above, you'll see that every time a new directory is<br />
encountered in the path it requires an additional two or three I/Os. For obvious performance reasons it is<br />
beneficial to keep the depth of your directory structure at a minimum.<br />
1 cluster—A group of sectors on a disk which is treated as a unit.<br />
2 root directory—The top level in a hierarchical filing system, which comes up when the computer is first turned on. On a PC<br />
the root directory is called C:. Other directories branch down from the root directory.<br />
3 cache—A temporary storage area for frequently-accessed or recently-accessed data.<br />
4 I/O— an acronym for input/output; the communication operation between computer components and/or users.
© 2005 Executive Software International, Inc. All Rights Reserved. Re published in part from articles published in 2001. Executive Software and<br />
Diskeeper are registered trademarks or trademarks owned by Executive Software International, Inc. All other trademarks are the property of<br />
their respective owners.