(updated 12/16/2012)

Hard drive - defragmentation

by Myles White

To understand defragmentation, it helps to know how files are stored on a hard drive (or any other erasable, rewritable medium). When a disk is empty and you choose to save a file, the operating system looks for the first empty portion of the disk and begins laying down the data. The disk is divided into sectors, and the file system groups sectors into clusters. A cluster is the smallest portion the system can allocate (more on this later), so to fill a portion of a cluster is to fill all of it.
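As a rough illustration of the "fill a portion, fill all of it" rule, the sketch below counts how many clusters a file occupies and how many allocated bytes go unused. The 4,096-byte cluster size and the function names are assumptions chosen for the example; real cluster sizes depend on the file system and the size of the volume.

```python
def clusters_needed(file_size_bytes, cluster_size_bytes=4096):
    """Whole clusters a file occupies; a partly used cluster counts as full."""
    # Integer ceiling division: round up to the next whole cluster.
    return (file_size_bytes + cluster_size_bytes - 1) // cluster_size_bytes


def slack_bytes(file_size_bytes, cluster_size_bytes=4096):
    """Bytes allocated to the file that hold none of its data."""
    allocated = clusters_needed(file_size_bytes, cluster_size_bytes) * cluster_size_bytes
    return allocated - file_size_bytes


# A 10,000-byte file on a disk with (assumed) 4,096-byte clusters.
print(clusters_needed(10_000))  # 3
print(slack_bytes(10_000))      # 2288
```

Under these assumptions, a 10,000-byte file takes three clusters, and the last 2,288 bytes of the third cluster are unavailable to any other file.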

The file is written, cluster by cluster and sector by sector, until it's done. The next time you go to write to disk, the same process occurs. Regardless of the file system in use, a record of what the file is called, where it starts and where it ends is left on the disk in another area. At this point, all of the clusters in use are right next door to each other and are said to be contiguous.

When you erase a file, or edit it to make it longer or shorter, the operating system has a dilemma. If it could only write a file to contiguous clusters on the disk, it would soon run out of usable space. So it still looks for the first empty cluster when it begins writing, but it stores only as much of the file there as the contiguous free room allows.
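The behavior described above can be sketched with a toy model. Here the disk is a plain list of clusters, and `write_file`, `delete_file` and `count_fragments` are hypothetical helpers invented for this illustration; they do not correspond to any real file system's routines.

```python
def write_file(disk, name, n_clusters):
    """Fill the first free clusters found, scanning from the start of the disk."""
    placed = []
    for i, owner in enumerate(disk):
        if len(placed) == n_clusters:
            break
        if owner is None:          # None marks a free cluster
            disk[i] = name
            placed.append(i)
    if len(placed) < n_clusters:
        for i in placed:           # not enough room: undo the partial write
            disk[i] = None
        raise OSError("disk full")
    return placed


def delete_file(disk, name):
    """Mark every cluster owned by the file as free again."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def count_fragments(clusters):
    """Number of contiguous runs in an ascending list of cluster numbers."""
    if not clusters:
        return 0
    return 1 + sum(1 for a, b in zip(clusters, clusters[1:]) if b != a + 1)


disk = [None] * 12
write_file(disk, "A", 3)      # clusters 0-2
write_file(disk, "B", 3)      # clusters 3-5
write_file(disk, "C", 3)      # clusters 6-8
delete_file(disk, "B")        # leaves a three-cluster hole
d = write_file(disk, "D", 5)  # fills the hole, then continues past C
print(d)                      # [3, 4, 5, 9, 10]
print(count_fragments(d))     # 2
```

File D needs five clusters but the hole left by B holds only three, so the remainder lands after C and the file ends up in two pieces.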

Additional portions of the file, known as fragments, are scattered all over the disk surface. Now we not only have the file's name and its starting and ending locations stored, but we also have pointers to tell the operating system where the rest of its pieces are.

Again, regardless of your operating system, in today's computer market you will be using some form of data cache to speed up disk activity. This cache may be implemented through software and use your system's electronic memory (RAM), it may be physical memory on your computer's drive controller or on the disk drive itself, or it may be any combination of the three. All data caching schemes operate on the assumption that what the system will want next from the disk is right next door to the last thing requested. In short, the cache reads ahead and parks extra data in memory. The next time a read request takes place, the operating system looks in the much faster cache first, instead of going to the slower hard drive.

Among the other pieces of data stored in the cache is the file system's allocation information, with all those location pointers, which means the hard drive doesn't physically have to go back and forth to the file location storage area for instructions. This breaks down when a disk is badly fragmented, because the data the cache read ahead is no longer the data the file needs next. Bad fragmentation is something that will have taken place, or soon will, after you've gone and deleted lots of files you didn't need.
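To see why fragmentation undermines a read-ahead cache, consider the deliberately simplified model below. It assumes a single cache buffer that, on every miss, is refilled with the requested cluster and the few clusters that physically follow it. The function name, the read-ahead length of four and the cluster numbers are all made up for the illustration; real caches are considerably more elaborate.

```python
def count_disk_reads(cluster_order, read_ahead=4):
    """Count trips to the physical disk when reading clusters in the given order.

    Simplified model: on a cache miss, the cache is replaced by the requested
    cluster plus the next read_ahead - 1 clusters that physically follow it.
    """
    cached = set()
    disk_reads = 0
    for c in cluster_order:
        if c not in cached:
            disk_reads += 1
            cached = set(range(c, c + read_ahead))
    return disk_reads


# The same 16-cluster file, stored two ways.
contiguous = list(range(100, 116))               # clusters 100, 101, ... 115
fragmented = [100 + 10 * i for i in range(16)]   # 16 fragments, 10 clusters apart

print(count_disk_reads(contiguous))  # 4
print(count_disk_reads(fragmented))  # 16
```

In this model the contiguous file is read with four trips to the disk, because each trip brings in the next three clusters as well. The fragmented copy needs sixteen trips, one per cluster, since nothing the cache reads ahead is ever part of the file.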

Whether you use a third-party defragmentation utility (such as one from Symantec's Norton Group) or one that comes with your operating system doesn't matter. That it be compatible with whatever operating system you have definitely does matter. For example, defragmenters written for Windows 3.1x don't understand long file names. A defragmenter for Windows 95a, with its 16-bit File Allocation Table (FAT), won't work with what is known as OEM Service Release 2 (OSR2) and its 32-bit file structure (FAT32). The same problems will occur with OSR2's file structure and with Windows NT's NTFS.

The point here is that a badly fragmented disk drive (over 20 per cent fragmented) can cause a real speed hit to your system and virtually destroy the effectiveness of the cache.


© Products of Concord North Ltd.