
Why do disks write in a way that causes fragmentation?



#1 paul1149

  • Members
  • 9 posts

Posted 16 May 2010 - 07:46 AM

I pondered where to pose this question. It probably relates more to the OS than to the disk, but since it's common to so many OSes, I decided to place it here.

I've been watching how quickly and easily disks fragment, and I'm wondering why OSes - all the versions of Windows I've used, anyway, which is up to XP - write to disk in such a way that files immediately begin to fragment. Let's say I have a 1 MB Word document that is written to disk with no extra elbow room, and I open it and add one word on the first page. As I understand it, Windows will save the file to the same spot and place the overflow somewhere else.

Why doesn't it instead write the file to a larger slot that will accept the whole thing, and free up the old space for a smaller file? Doing so would cause more free-space gaps but far less fragmentation. File tables would have to be updated frequently, but I think that would be less work than maintaining hundreds of locations for a large file.

This seems like plain common sense to me, so there must be a good reason why it's not done, if I understand correctly what's happening.
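To make the trade-off concrete, here is a toy sketch in Python of the two policies on a pretend 24-block disk. This is purely illustrative - nothing like how a real filesystem is implemented - and the file layout is made up:

FREE = "."

def first_free_run(disk, size):
    """Index of the first run of `size` free blocks, or None."""
    run = 0
    for i, block in enumerate(disk):
        run = run + 1 if block == FREE else 0
        if run == size:
            return i - size + 1
    return None

def write(disk, start, size, name):
    for i in range(start, start + size):
        disk[i] = name

# File A fills blocks 0-7, file B sits right behind it in blocks 8-9.
disk = list("AAAAAAAABB" + FREE * 14)

# Policy 1 (what I see Windows doing): leave the file in place and put
# the two overflow blocks in the first free gap.
in_place = disk.copy()
write(in_place, first_free_run(in_place, 2), 2, "A")
print("".join(in_place))   # AAAAAAAABBAA............  A is now in two pieces

# Policy 2 (what I'm suggesting): move the whole, now 10-block, file to
# a slot that fits it and free the old one.
moved = disk.copy()
write(moved, 0, 8, FREE)                          # release the old extent
write(moved, first_free_run(moved, 10), 10, "A")
print("".join(moved))      # ........BBAAAAAAAAAA....  A contiguous, free space now split

As the output shows, policy 2 keeps the file in one piece but splits the free space, which is exactly the trade-off I'm asking about.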


#2 MrBruce1959

    My cat Oreo

  • BC Advisor
  • 6,377 posts
  • Gender:Male
  • Location:Norwich, Connecticut, USA

Posted 16 May 2010 - 08:30 AM

Well, all hard drives really are is storage devices.

The OS does reserve sectors within blocks for future data; however, once those blocks are filled, it moves on to the next set of available blocks on the disk, and so on.

Once this has happened, the file's data is separated into two different blocks in two different areas of the hard drive, and thus fragmentation starts happening.

As more data from the same file is sent to the drive, it goes to the next available block.

You see, in theory your operating system has no clue ahead of time how many blocks your files will consume. There is no way that can be determined, so it uses the next available set of blocks as it continues to store the data the file consists of, and it records those locations in the filesystem's allocation table (the FAT, or the MFT on NTFS) so the file is mapped for future retrieval.

A disk defragmenting program checks that allocation table - sort of like the catalog in a library that tells you what shelf and section of the library a book is located in - and checks for a flag that says a file has data segments in different sections of the drive. It then collects those pieces and moves them together into the same blocks with their counterparts, reconnecting the broken-up file into one steady stream on the disk.

It is sort of like how a librarian would go about the library and reorganize the books that a kid had strewn about, putting them back on the shelf in the correct order again.

My explanation may not be the best technical way of explaining it, but then again, if I were too technical, those who were not lucky like me to have gone to trade school for computer electronics would not understand the lingo we were taught to use.
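If it helps to see it, here is a very rough sketch in Python of that collect-the-pieces step. The file names and block numbers are made up, and a real defragmenter works from the filesystem's allocation map and moves actual data, not entries in a Python dictionary:

# Toy extent map: file name -> list of (start_block, length) runs on disk.
extents = {
    "report.doc": [(0, 8), (10, 2), (40, 3)],   # three fragments
    "photo.jpg":  [(12, 20)],                   # already contiguous
}

def is_fragmented(runs):
    return len(runs) > 1

def defragment(runs, new_start):
    """Rewrite a file's fragments as one contiguous run at new_start."""
    total = sum(length for _, length in runs)
    return [(new_start, total)]

for name, runs in extents.items():
    if is_fragmented(runs):
        # A real tool would first find a free run big enough, copy the
        # data block by block, then update the allocation map.
        extents[name] = defragment(runs, new_start=100)

print(extents)   # report.doc collapses to a single run: [(100, 13)]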

#3 bigalexe

  • Members
  • 170 posts
  • Gender:Male
  • Location:Michigan, USA

Posted 16 May 2010 - 09:11 AM

So let me re-state your analogy and see if I understand this.

A library receives a shipment of books, and the library has to shelve the books for storage. This particular library has its shelves divided into 2-inch sections for placing books (disk sectors). This is because the librarians and pages who shelve the books have no idea how much space the books will take up before shelving them (the computer doesn't know ahead of time how much space a file will need).

So the staff goes about shelving all the books (writing data), and instead of placing them cover to cover they shelve the books in line with the next available section (because this is quicker than adjusting the placement of each book individually). So when the shelving is completed, we end up with a bunch of gaps between the books.

So then Mr. Manager (the hard drive defrag tool) comes around and shoves all of the books together so they are cover to cover. He is upset his staff wasted so much shelf space. Then at the end of many of the shelves there are large spaces of varying sizes, so he goes around and finds the books that will fit there best. In the end we have all of the books cover to cover on the shelves, and all of the free shelf space in one spot.

At this point we have just installed the OS and programs for the first time, and defragged.

Then the doors of the library open for business. Patrons remove books and place them on tables (RAM?), and the staff reshelve the items, again only by whole sections. Also, many patrons bring in donations which must be fitted into the system (data creation). So in a few weeks we need to do the whole organizing process over again, because things aren't in order anymore.

There, now you can set up defragmentation of your local library.

#4 paul1149

  • Topic Starter
  • Members
  • 9 posts

Posted 16 May 2010 - 09:31 AM

If the OS has no idea how large the file will be on disk, then I can definitely see that putting the overflow in the first available slot is about the best it can do. Evidently there is no way to correlate a file's live memory usage with its eventual spatial requirements on disk. Food for thought. Thanks for the input.

p.

#5 MrBruce1959

    My cat Oreo

  • BC Advisor
  • 6,377 posts
  • Gender:Male
  • Location:Norwich, Connecticut, USA

Posted 16 May 2010 - 10:01 AM

Bigalexe, that's a very good way of putting it, and thanks for adding in the part about RAM, which is like the library patron who possesses the book at the moment and is reading it.


Team work!

That's what I love about the cool members we have here at BC!! YOU GUYS ROCK!!!!! :thumbsup:

#6 Eyesee

    Bleepin Teck Shop

  • BC Advisor
  • 3,539 posts
  • Gender:Male
  • Location:In the middle of Kansas

Posted 16 May 2010 - 01:57 PM

Excellent explanations of defragmentation, guys!
In the beginning there was the command line.

#7 Platypus

  • Moderator
  • 13,727 posts
  • Gender:Male
  • Location:Australia

Posted 17 May 2010 - 01:07 AM

The library analogy is a good one.

Paul, the two approaches you mention to locating files on disk are in fact two legitimate storage allocation strategies used in file systems. They're called "First (or Next) Fit" and "Best Fit". Each has advantages and disadvantages, the most obvious being the one you've observed - the conflicting outcome regarding file fragmentation and drive (free-space) fragmentation.

First fit is faster at allocating locations. Next fit should be faster again: it works the same way, but the search resumes from the last allocated position instead of continually re-scanning from the beginning.

Best fit is slower because potentially all of the available empty slots have to be assessed in order to find the one closest to the required size.

File/operating systems tend to use faster strategies to enhance performance.
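For anyone who wants to see the mechanics, here are minimal sketches of the three strategies in Python, operating on a free list of (start, length) holes. These are illustrative only - not how any particular filesystem implements them - and the free list itself is made up:

def first_fit(free, size):
    """Scan from the start; take the first hole big enough."""
    for i, (start, length) in enumerate(free):
        if length >= size:
            return i, start
    return None

def next_fit(free, size, cursor=[0]):
    """Like first fit, but resume scanning where the last search stopped.
    (The mutable default argument is a toy stand-in for persistent state.)"""
    n = len(free)
    for k in range(n):
        i = (cursor[0] + k) % n
        start, length = free[i]
        if length >= size:
            cursor[0] = i
            return i, start
    return None

def best_fit(free, size):
    """Examine every hole; take the smallest one that still fits.
    Note it cannot stop early - this is why it is the slower strategy."""
    fitting = [(length, i, start)
               for i, (start, length) in enumerate(free) if length >= size]
    if not fitting:
        return None
    _, i, start = min(fitting)
    return i, start

free_list = [(0, 4), (10, 12), (30, 5)]   # three holes of 4, 12 and 5 blocks
print(first_fit(free_list, 5))   # (1, 10) - first hole that is big enough
print(best_fit(free_list, 5))    # (2, 30) - the 5-block hole, wasting nothing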

The extension of an existing file (appending) that you've mentioned is another consideration. Adding content to a file like a Word .doc may or may not change its size. A plain text file grows in direct proportion to its content, while a .doc file is a container in which adding a word may not affect the file size at all, or may make a disproportionate change in space allocation. This is further complicated by Word creating a backup file while an existing file is being edited, which also moves around the available space for the active file.
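On that last point: many programs (Word among them, though its actual temp/backup file naming is more involved) save by writing the new version out separately and then swapping it over the old name, which is itself a reason the file's blocks move. A minimal sketch of the pattern in Python - the filename here is made up:

import os

def safe_save(path, data):
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(data)        # the new version lands in freshly allocated blocks
    os.replace(tmp, path)    # then atomically takes over the old name

# Even a one-word edit rewrites the whole file elsewhere, so the file's
# blocks move and the old blocks return to the free pool.
safe_save("essay.txt", "the whole document, plus one new word")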

The choice is made to avoid the complexity and processing overhead of reducing fragmentation during file access, as computers are productivity-orientated, especially in commercial environments. The "time-wasting" overhead is offloaded into the separate process of defragmentation, which can then be done in otherwise unproductive "offline" time. These strategies actually trace right back to the early days of computing, when processing time was very expensive and clients paid for it by the minute.

Edited by Platypus, 17 May 2010 - 01:08 AM.


#8 bigalexe

  • Members
  • 170 posts
  • Gender:Male
  • Location:Michigan, USA

Posted 26 May 2010 - 09:23 PM

Now here is the real challenge... go defrag your local library and use computer terms to explain how you were "helping" them. Yes, I am that evil.

#9 paul1149

  • Topic Starter
  • Members
  • 9 posts

Posted 26 May 2010 - 09:37 PM

Platypus, thanks much for that explanation. Sorry for my lateness getting back here; somehow I missed the thread notification.

What you're saying makes a lot of sense. It speaks of real-world trade-offs among some interesting potential solutions, and I like the way you describe what's happening as "off-loading" the problem into defragmentation.

I think implicit in all that is the fact that the OS does indeed have some idea of how large the file's footprint on disk will be; otherwise the various strategies would be meaningless.

Anyway, when I see fragmentation pile up so quickly, at least I'll no longer feel that it's completely unnecessary - though I personally would still wish for a more best-fit-style solution.

BW,
p.

#10 Andrew

    Bleepin' Night Watchman

  • Moderator
  • 8,250 posts
  • Gender:Not Telling
  • Location:Right behind you

Posted 27 May 2010 - 01:12 AM

Here is an excellent and understandable article that explains what fragmentation is, why it happens, and how some filesystems are designed to avoid/mitigate it.

#11 paul1149

  • Topic Starter
  • Members
  • 9 posts

Posted 27 May 2010 - 07:26 AM

Andrew, that was an excellent introduction to filesystems and fragmentation. It raises other questions, such as how disks access information, and how the differing access speeds of different parts of the disk factor into the writing strategy, weighed against the fragmentation it causes.

Also interesting are the comments on the article, which I'm in the process of reading. With the noise filtered out, they add more insight.

bw,
p.



