29.07.2013 Views

How NTFS Reads a File

How NTFS Reads a File

How NTFS Reads a File

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

<strong>How</strong> <strong>NTFS</strong> <strong>Reads</strong> a <strong>File</strong><br />

On <strong>NTFS</strong> (New Technology <strong>File</strong> System) there is metadata (data which "lives above" your data).<br />

Metadata are special files that the <strong>NTFS</strong> file system driver uses to manage an <strong>NTFS</strong> volume. The most<br />

famous piece of metadata is the MFT (Master <strong>File</strong> Table), which is a special file typically consisting of<br />

1024-byte records. Each file or directory on the volume is described by at least one of these MFT<br />

records. It may take several MFT records to fully describe a file—especially if it is compressed. A 271MB<br />

compressed file can require over 450 MFT records!<br />

Below is an example of exactly what occurs when <strong>NTFS</strong> goes to read in a 1-cluster 1 file that we’ll name<br />

\Flintstone\Barney.txt.<br />

1. The volume's boot record is read to find the cluster address of the first cluster of the MFT.<br />

2. The first cluster of the MFT is read, which is used to locate all of the pieces of the MFT.<br />

3. MFT record 5 is read as it is predefined to be the MFT record of the root directory 2 .<br />

4. Data cluster 0 of the root directory is read in and searched for "Flintstone".<br />

5. If "Flintstone" is not found, then at least one other data cluster of the root directory needs to be<br />

read to find it.<br />

6. The MFT record for the "Flintstone" directory is read in.<br />

7. Data cluster 0 of the "Flintstone" directory is read in and searched for "Barney.txt".<br />

8. If "Barney.txt" is not found, then at least one other data cluster of the "Flintstone" directory needs<br />

to be read to find it.<br />

9. The MFT record for the "Barney.txt" file is read in<br />

10. Data cluster 0 of the "Barney.txt" file is read in.<br />

This is a worst-case scenario. It presumes the volume is not yet mounted, so the <strong>NTFS</strong> cache 3 is empty<br />

at step 1, and the MFT itself needs to be located. But it shows how many I/Os 4 are required to get at a<br />

file that is only one level removed from the root directory: 10. Each one of those 10 I/Os requires a head<br />

movement.<br />

As these I/Os are to different pieces of data on the volume (different MFT records, directories, and files);<br />

the contiguity of the pieces would only matter if the different segments were in close proximity to the MFT<br />

records representing them.<br />

If you follow the step-by-step I/O sequence outlined above, you'll see that every time a new directory is<br />

encountered in the path it requires an additional two or three I/Os. For obvious performance reasons it is<br />

beneficial to keep the depth of your directory structure at a minimum.<br />

1 cluster—A group of sectors on a disk which is treated as a unit.<br />

2 root directory—The top level in a hierarchical filing system, which comes up when the computer is first turned on. On a PC<br />

the root directory is called C:. Other directories branch down from the root directory.<br />

3 cache—A temporary storage area for frequently-accessed or recently-accessed data.<br />

4 I/O— an acronym for input/output; the communication operation between computer components and/or users.


© 2005 Executive Software International, Inc. All Rights Reserved. Re published in part from articles published in 2001. Executive Software and<br />

Diskeeper are registered trademarks or trademarks owned by Executive Software International, Inc. All other trademarks are the property of<br />

their respective owners.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!