
Effective storage reduction due to format - how much?

Old 10-12-07, 01:08 PM
  #1  
Senior Member
Thread Starter
 
Join Date: Mar 2003
Posts: 789
Likes: 0
Received 0 Likes on 0 Posts
Effective storage reduction due to format - how much?

Hi,

A 500GB drive does not allow you to actually store 500GB. There is some loss due to the format of the drive.

I have a couple of questions for which I need your technical expertise:

1. Does anybody know how to calculate how much effective storage a formatted 500GB drive gives you?
2. Does this depend on the file system? Assuming I would go with NTFS, how much would I have?
3. Does this depend on the vendor's implementation? Assume I go with Seagate?
4. Is this a fixed % loss, or does this depend on the actual size of the raw harddisk? For instance, how much would I have if I formatted a 250GB drive or an 80 GB drive?

Thanks for your help!
Old 10-12-07, 01:39 PM
  #2  
DVD Talk Hero
 
Join Date: Aug 2001
Location: in da cloud
Posts: 26,196
Likes: 0
Received 0 Likes on 0 Posts
if you need help for school you will have to do the math yourself

all drive manufacturers use 1000MB per GB even though in reality it's a different #. you first have to figure out the true unformatted size of your drive and then account for the formatting
Old 10-17-07, 01:12 AM
  #3  
Senior Member
Thread Starter
 
Join Date: Mar 2003
Posts: 789
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by al_bundy
if you need help for school you will have to do the math yourself

all drive manufacturers use 1000MB per GB even though in reality it's a different #. you first have to figure out the true unformatted size of your drive and then account for the formatting
Hmm. Thank you for your enlightening answer.

Not exactly a school project - I have been given the task to build a storage unit for work of somewhere between 300 and 900 TB of data. In shelves of 15 drives at a time, 500 GB per disk, RAID 5 with one hot spare drive, I need to figure out the effective storage capacity per shelf, allowing me to calculate the number of shelves needed for the SAN.

So, back to my question. I do realize that 1GB=1024MB, thank you. Is the capacity loss then really only a result of the conversion, given that manufacturers say that 1GB=1000MB?

What about my other questions? Does it depend on the file system? Does it depend on the drive manufacturer etc?

Please no remarks on school projects. I wish this project hadn't landed in my lap, but it has, and I am looking for guidance, not condescending comments.

My sincere thanks for any help.
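
For reference, the shelf arithmetic described in this post can be sketched roughly as follows. This is only a sketch under stated assumptions: one RAID 5 group per shelf (so one drive's worth of capacity goes to parity), one hot spare per shelf, decimal units throughout (1 TB = 1000 GB), and no file system overhead. The function names are made up for illustration.

```python
import math

def usable_gb_per_shelf(drives_per_shelf=15, drive_gb=500, hot_spares=1):
    """Raw (decimal) GB usable per shelf with a single RAID 5 group.

    Assumption: all non-spare drives form one RAID 5 group, which gives up
    one drive's worth of capacity to parity.
    """
    raid_members = drives_per_shelf - hot_spares
    data_drives = raid_members - 1  # RAID 5 parity costs one drive
    return data_drives * drive_gb

def shelves_needed(target_tb, shelf_gb):
    """Number of whole shelves needed to hold target_tb (decimal TB)."""
    return math.ceil(target_tb * 1000 / shelf_gb)

shelf_gb = usable_gb_per_shelf()
print(shelf_gb)                       # 6500
print(shelves_needed(300, shelf_gb))  # 47
print(shelves_needed(900, shelf_gb))  # 139
```

These counts are before the binary conversion and formatting overhead discussed in the replies, so the real number of shelves would be somewhat higher.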
Old 10-17-07, 11:48 AM
  #4  
DVD Talk Legend
 
Join Date: Jan 2000
Posts: 16,173
Likes: 0
Received 0 Likes on 0 Posts
LolaRennt

The exact raw capacity will vary from drive model to drive model and from manufacturer to manufacturer, depending on what they consider a 500GB drive to be in the raw.

The math stays the same once you have the raw capacity. I would suggest you find the exact model of hard drive you want and then work the math to determine formatted size. Each drive has a data sheet with the exact physical specs on it that can be downloaded.

here is some info on the math

http://seagate.custhelp.com/cgi-bin/...hp?p_faqid=336

more info

http://physics.nist.gov/cuu/Units/binary.html

Once you have the basic binary vs decimal math down, the file system choice will determine how much useable space is left for the OS to actually use after formatting

The seagate article above touches on that at the bottom.

NTFS is generally more efficient, since this is a server, you should use NTFS anyway

As a not so accurate rule of thumb, you will lose about 7.5% to 8% of the drive's unformatted capacity once you format it with NTFS. Most of that is the decimal-to-binary conversion rather than the file system itself.

500GB drive = about 460GB usable when formatted with NTFS
To make the math easy: if you had 6 of these drives in a hardware RAID 5, you would have about 2.3TB (5 x 460GB) of usable space to a Windows OS using the NTFS file system

However, depending on the size of each data file stored, you could lose more to overhead in the file system.
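
The binary-versus-decimal step in that math can be sketched roughly as below. It assumes the drive holds exactly 500 x 10^9 bytes (the data sheet gives the real figure) and leaves the file system overhead out, since the 460GB number above is only a rule of thumb. The function names are hypothetical.

```python
def decimal_gb_to_binary_gb(decimal_gb):
    """Convert vendor GB (10**9 bytes) to the GB an OS reports (2**30 bytes)."""
    return decimal_gb * 10**9 / 2**30

def raid5_usable(drive_capacity, drive_count):
    """RAID 5 keeps one drive's worth of parity, so n drives store n - 1."""
    return drive_capacity * (drive_count - 1)

per_drive = decimal_gb_to_binary_gb(500)
print(round(per_drive, 1))                # 465.7
print(round(raid5_usable(per_drive, 6)))  # 2328 binary GB, before file system overhead
print(raid5_usable(460, 6))               # 2300, using the 460GB rule-of-thumb figure
```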
Old 10-17-07, 12:16 PM
  #5  
Senior Member
Thread Starter
 
Join Date: Mar 2003
Posts: 789
Likes: 0
Received 0 Likes on 0 Posts
Thank you. This is the info I needed.
Old 10-17-07, 01:54 PM
  #6  
DVD Talk God
 
twikoff's Avatar
 
Join Date: Feb 2000
Location: Right Behind You!!!
Posts: 79,497
Likes: 0
Received 0 Likes on 0 Posts
generally speaking...
multiply by .931
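
As far as I can tell, the .931 figure is simply the ratio between a vendor gigabyte and a binary gigabyte, which a couple of lines can check. Note it covers only the unit conversion, not any file system overhead.

```python
# Ratio between a vendor gigabyte (10**9 bytes) and a binary gigabyte (2**30 bytes).
factor = 10**9 / 2**30
print(round(factor, 3))        # 0.931
print(round(500 * factor, 1))  # 465.7

# For terabyte-labelled capacities the ratio is a little lower.
tb_factor = 10**12 / 2**40
print(round(tb_factor, 3))     # 0.909
```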
Old 10-17-07, 08:25 PM
  #7  
DVD Talk Gold Edition
 
Join Date: Mar 2000
Posts: 2,827
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by 4KRG

However, depending on the size of each data file stored, you could lose more to overhead in the file system.
I know with other operating systems I've worked with that you can specify the cluster/blocking factor, etc. on the drive. We would do that depending on whether we were going to use the volume for fewer large files or large numbers of smaller files, so that each time a file was created/extended the allocation was appropriate to the usage of the volume.

It seems from a quick google search that you can do that with NTFS and Windows...

Just one quick example:

http://www.pcguide.com/ref/hdd/file/...Cluster-c.html

Again not sure about windows and NTFS, but on other OS/file systems it was worthwhile to review this...
Old 10-17-07, 09:29 PM
  #8  
DVD Talk Special Edition
 
danwiz's Avatar
 
Join Date: Dec 2000
Location: Fairbanks, Alaska
Posts: 1,858
Received 0 Likes on 0 Posts
Yes, you can certainly specify cluster size when you format a drive as NTFS. The smallest cluster size my Windows XP-SP2 offers me is 512 bytes. From what I know about this cluster size business (not very much), the smaller the cluster size you specify, the better you will utilize the drive. Which cluster size you should specify doesn't necessarily depend on how big or small the files you are storing are. Even if every file you store on the drive is 500 MB, you have to keep in mind that if you specify 4,096 bytes as cluster size there may still be 20 bytes of overflow of the file into an extra 4,096 byte cluster, which will essentially "occupy" the entire cluster. On the other hand, if you format the drive with a cluster size of 512 bytes, then if that 500 MB file overflows by a few bytes onto a 512 byte cluster you have ended up "saving" 4,096 - 512 bytes in "overhead". Does this make sense?! I just recently started understanding this myself.
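
The overflow example above can be checked with a small sketch. It assumes a 500 MB file that spills over by 20 bytes and looks only at the unused tail of the last cluster; `slack_bytes` is a made-up helper, not anything Windows provides.

```python
def slack_bytes(file_size, cluster_size):
    """Bytes left unused in the last cluster allocated to a file."""
    remainder = file_size % cluster_size
    return 0 if remainder == 0 else cluster_size - remainder

file_size = 500 * 1024 * 1024 + 20  # a 500 MB file that overflows by 20 bytes
for cluster in (512, 4096):
    print(cluster, slack_bytes(file_size, cluster))
# 512 492
# 4096 4076
```

The difference between the two cases is 4076 - 492 = 3584 bytes, which is the "4,096 - 512" saving mentioned in the post. Per file that is tiny next to 500 MB, so for large files the cluster size matters little for space.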
Old 10-17-07, 10:01 PM
  #9  
DVD Talk Gold Edition
 
Join Date: Mar 2000
Posts: 2,827
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by danwiz
Yes, you can certainly specify cluster size when you format a drive as NTFS. The smallest cluster size my Windows XP-SP2 offers me is 512 bytes. From what I know about this cluster size business (not very much), the smaller the cluster size you specify, the better you will utilize the drive. Which cluster size you should specify doesn't necessarily depend on how big or small the files you are storing are. Even if every file you store on the drive is 500 MB, you have to keep in mind that if you specify 4,096 bytes as cluster size there may still be 20 bytes of overflow of the file into an extra 4,096 byte cluster, which will essentially "occupy" the entire cluster. On the other hand, if you format the drive with a cluster size of 512 bytes, then if that 500 MB file overflows by a few bytes onto a 512 byte cluster you have ended up "saving" 4,096 - 512 bytes in "overhead". Does this make sense?! I just recently started understanding this myself.
Yes, but there are other concerns too besides just space utilization, such as how much a file fragments because its constantly grabbing smaller allocations all over the disk vs grabbing a larger amount at one time that it may actually use (or not). There are also performance concerns. Again most of my understanding comes from other OS/file systems but I would think the basics would still be similar...
Old 10-18-07, 07:38 AM
  #10  
DVD Talk Limited Edition
 
Join Date: Nov 1999
Location: In the ATL
Posts: 6,740
Likes: 0
Received 0 Likes on 0 Posts
ps why dont you go with larger drives ? for that much data, having the minimum number of shelves would be nice I'd think. Drives may be a bit more per mb, but you'll get less heat, noise, power consumption, failed drives and use less space.

pps - in that big a project, buy a drive of the kind you want, format it and see how much usable space it has
Old 10-18-07, 11:04 AM
  #11  
DVD Talk Gold Edition
 
Join Date: Oct 2004
Location: Anytown, U.S.A.
Posts: 2,213
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by twikoff
generally speaking...
multiply by .931
This appears to be good enough.
Old 10-18-07, 11:18 AM
  #12  
Senior Member
Thread Starter
 
Join Date: Mar 2003
Posts: 789
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by ravan
ps why dont you go with larger drives ? for that much data, having the minimum number of shelves would be nice I'd think. Drives may be a bit more per mb, but you'll get less heat, noise, power consumption, failed drives and use less space.

pps - in that big a project, buy a drive of the kind you want, format it and see how much usable space it has
First of all: thank you to everybody who has replied.

I understand the cluster size quite well. I know this reduces the storage available per HD, but it is hard to predict beforehand. Luckily, I will be storing large files, so the impact should be minimal.

Ravan, you bring up an interesting point. I could go with 1TB drives, but I have heard (various internet sources, and various podcasts - TWIT) that drives larger than 250 GB have serious issues with errors like bad sectors. An unofficial quote from one TWIT episode went something like "drives larger than 250GB spend most of their time error correcting".

A limited-scope observation kind of confirms this. I have 2 storage systems in my network. One which is a 2TB storage unit, raid 5 over 8 drives, each 250 GB (Maxtor drives). This system has been on 24x7 for about 4 years now, and not a single drive has failed. I have not had to rebuild the raid even once!

My other storage system is a 90 harddisk volume, each harddisk is 500 GB (seagate drives). This system has been running 24x7 for a little under 2 years, and I have had to replace about 7 drives now.

Anecdotal evidence, I agree, but it does point to some issues that exist in the one system, and not in the other one. I use different brands in both systems, but I do not believe that quality-wise the brands would be that different.
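
To put rough numbers on the two systems described above, here is a crude back-of-the-envelope sketch. It assumes failures are spread evenly over drive-years and ignores drive age, brand, and environment; the function name is made up for illustration.

```python
def annualized_failure_rate(failures, drives, years):
    """Failures per drive-year, a crude estimate that ignores drive age and replacements."""
    return failures / (drives * years)

afr_500 = annualized_failure_rate(7, 90, 2)
print(round(afr_500 * 100, 1))  # 3.9 (percent per year)

# If the 8-drive system had the same per-drive-year rate, the chance of seeing
# no failures in 8 * 4 = 32 drive-years would be:
p_no_failures = (1 - afr_500) ** (8 * 4)
print(round(p_no_failures, 2))  # 0.28
```

Under those assumptions, a failure-free four years on 8 drives would not be that unusual even at the rate seen on the 500 GB system, so the sample is too small to say much either way.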

The main issue with bigger harddisks appears to be the too-high data density, which makes drives more sensitive (bad sectors, broken drive heads etc.) and therefore error-prone.

Long story to say that currently, I don't trust these very big drives, and likely will stick to maximum of 500GB per drive.

Your ideas?
Old 10-18-07, 11:22 AM
  #13  
DVD Talk God
 
twikoff's Avatar
 
Join Date: Feb 2000
Location: Right Behind You!!!
Posts: 79,497
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by LolaRennt
Long story to say that currently, I don't trust these very big drives, and likely will stick to maximum of 500GB per drive.

Your ideas?
well.. the technology in the big drives is different, and 'supposed' to be better.. I havent really heard of many issues with them..

but reliability aside, 500gb is the sweet spot on price right now.. so I wouldnt go any higher than that, for that reason alone.
Old 10-18-07, 12:30 PM
  #14  
X
Administrator
 
X's Avatar
 
Join Date: Oct 1987
Location: AA-
Posts: 10,763
Likes: 0
Received 4 Likes on 3 Posts
It seems that the newer perpendicular-recording drives would give you higher capacity with lower power (and therefore heat generating) requirements while not packing the platters to the max data density requiring extensive error-correction.
Old 10-18-07, 12:36 PM
  #15  
DVD Talk Legend
 
Dr Mabuse's Avatar
 
Join Date: Jun 2007
Location: 75 clicks above the Do Lung bridge...
Posts: 18,950
Likes: 0
Received 0 Likes on 0 Posts
if reliability of drives is an issue...

you should use SCSI disks...

they are designed, and tested in ways serial drives aren't, for reliability over life of the drive...
Old 10-18-07, 12:54 PM
  #16  
DVD Talk Hero
 
Join Date: Aug 2001
Location: in da cloud
Posts: 26,196
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by LolaRennt
First of all: thank you to everybody who has replied.

I understand the cluster size quite well. I know this reduces the storage available per HD, but it is hard to predict beforehand. Luckily, I will be storing large files, so the impact should be minimal.

Ravan, you bring up an interesting point. I could go with 1TB drives, but I have heard (various internet sources, and various podcasts - TWIT) that drives larger than 250 GB have serious issues with errors like bad sectors. An unofficial quote from one TWIT episode went something like "drives larger than 250GB spend most of their time error correcting".

A limited-scope observation kind of confirms this. I have 2 storage systems in my network. One which is a 2TB storage unit, raid 5 over 8 drives, each 250 GB (Maxtor drives). This system has been on 24x7 for about 4 years now, and not a single drive has failed. I have not had to rebuild the raid even once!

My other storage system is a 90 harddisk volume, each harddisk is 500 GB (seagate drives). This system has been running 24x7 for a little under 2 years, and I have had to replace about 7 drives now.

Anecdotal evidence, I agree, but it does point to some issues that exist in the one system, and not in the other one. I use different brands in both systems, but I do not believe that quality-wise the brands would be that different.

The main issue with bigger harddisks appears to be the too-high data density, which makes drives more sensitive (bad sectors, broken drive heads etc.) and therefore error-prone.

Long story to say that currently, I don't trust these very big drives, and likely will stick to maximum of 500GB per drive.

Your ideas?
just ask EMC, the crack dealers of the storage industry

seriously for a storage unit this size you should be buying a real SAN and not building something yourself

the switch fabric alone will benefit in other areas like backup

Last edited by al_bundy; 10-18-07 at 12:57 PM.
Old 10-18-07, 12:57 PM
  #17  
X
Administrator
 
X's Avatar
 
Join Date: Oct 1987
Location: AA-
Posts: 10,763
Likes: 0
Received 4 Likes on 3 Posts
Originally Posted by Dr Mabuse
if reliability of drives is an issue...

you should use SCSI disks...

they are designed, and tested in ways serial drives aren't, for reliability over life of the drive...
Twice as many hard drives at 4 times the price each might make a difference to who's paying for it though.
Old 10-18-07, 01:17 PM
  #18  
DVD Talk Legend
 
Dr Mabuse's Avatar
 
Join Date: Jun 2007
Location: 75 clicks above the Do Lung bridge...
Posts: 18,950
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by X
Twice as many hard drives at 4 times the price each might make a difference to who's paying for it though.
this is true...

but the design and testing i mentioned is why they cost more...

sometimes buying quality and performance is a better move...

i am pretty much a pure SCSI user even in my PC's... exclusively on my SUN stations...

when i set up arrays i strongly suggest SCSI to clients...

sometimes the cheaper way is more pressing than the quality though...

so to every thing... turn turn turn...
Old 10-18-07, 01:36 PM
  #19  
DVD Talk God
 
twikoff's Avatar
 
Join Date: Feb 2000
Location: Right Behind You!!!
Posts: 79,497
Likes: 0
Received 0 Likes on 0 Posts
but the money you save could purchase quite a few hot spare for the raid config
Old 10-18-07, 02:07 PM
  #20  
DVD Talk Hero
 
Join Date: Aug 2001
Location: in da cloud
Posts: 26,196
Likes: 0
Received 0 Likes on 0 Posts
i use SCSI at work and cheapo at home. i never saw a difference in quality. last year we had a bunch of 300GB drives fail. these were rebranded HP's. Two of them were from the same RAID5 array in one week.
Old 10-18-07, 02:16 PM
  #21  
X
Administrator
 
X's Avatar
 
Join Date: Oct 1987
Location: AA-
Posts: 10,763
Likes: 0
Received 4 Likes on 3 Posts
Originally Posted by twikoff
but the money you save could purchase quite a few hot spare for the raid config
You could actually create a RAID 5 array of the 300 - 900 TB array and have a few hot spare arrays available if you're willing to spend 8 times as much.
Old 10-18-07, 03:04 PM
  #22  
DVD Talk Legend
 
Join Date: Jan 2000
Posts: 16,173
Likes: 0
Received 0 Likes on 0 Posts
Didn't there use to be a forum member here that swore up and down that SCSI drives were not built any better? I can't remember the name and don't feel like searching. The argument was the SCSI drive was just meant for a customer that could pay more and felt that if they paid more they got a better product.

My experience has been that SCSI drives seem more reliable, but I have far fewer of those in operation, usually have them connected to a machine with high end power filtering, static free environment, temperature controlled environment (68F) and they are never turned off.

Conversely, the drives I have the most problems with are laptop drives. They are constantly turned off and on, never have clean power, are tossed about like garbage, and are hit with ESD constantly.

hmmm, is it the drive or is it the environment and handling?

I lean towards it being the environment and handling. If you take a well built drive and treat it like shit, it will break. If you take a not so well built drive and baby it, it will probably run a long time.


life's lessons

And never ever never put drives from the same lot into a RAID 5 array. Always buy them from different sources. If there is a defect you WILL lose more than one. Ask me how I know.

Also, don't locate your data center under the men's room on the floor above. Ask me how I know that one.
Old 10-18-07, 03:23 PM
  #23  
DVD Talk Legend
 
Dr Mabuse's Avatar
 
Join Date: Jun 2007
Location: 75 clicks above the Do Lung bridge...
Posts: 18,950
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by 4KRG
Didn't there use to be a forum member here that swore up and down that SCSI drives were not built any better? I can't remember the name and don't feel like searching. The argument was the SCSI drive was just meant for a customer that could pay more and felt that if they paid more they got a better product.

My experience has been that SCSI drives seem more reliable, but I have far fewer of those in operation, usually have them connected to a machine with high end power filtering, static free environment, temperature controlled environment (68F) and they are never turned off.

Conversely, the drives I have the most problems with are laptop drives. They are constantly turned off and on, never have clean power, are tossed about like garbage, and are hit with ESD constantly.

hmmm, is it the drive or is it the environment and handling?

I lean towards it being the environment and handling.
SCSI drives are designed and tested to be much more reliable... and they are...

it's not a marketing gimmick or whatever this person "swore" to...

on that idea i know a guy on a forum who 'swears to' aliens being in charge of the US Government... and we've all been duped...

the entire notion of serial drives was conceived as a way to make a cheap piece of hardware for non-essential uses... long ago...

i always like those Macs for being pure SCSI so long when everyone else went to cheap drives... SUN too...
Old 10-18-07, 03:33 PM
  #24  
DVD Talk Legend
 
Join Date: Jan 2000
Posts: 16,173
Likes: 0
Received 0 Likes on 0 Posts
Originally Posted by Dr Mabuse
on that idea i know a guy on a forum who 'swears to' aliens being in charge of the US Government... and we've all been duped...
Right, just like I know a guy on the forum that swears SCSI drives are built better

I am not a complete retard, thanks though.

I didn't say I agreed with him entirely, but he claimed to work for Maxtor or something like that.

I am sure someone else here remembers the thread; it was some time ago. He said something to the effect of "keep paying more for the SCSI drives then, my company can use the profit", or that it helps his retirement, or something along those lines.
Old 10-19-07, 11:21 AM
  #25  
Senior Member
Thread Starter
 
Join Date: Mar 2003
Posts: 789
Likes: 0
Received 0 Likes on 0 Posts
Hi,

Again, thanks for this discussion. It is good to see different viewpoints.

Summary for my implementation:
I need the large amount of storage, within my budget. I'd LOVE to go SCSI and create RAID 5 over the entire 900TB, have everything in a secure bunker, and a men's washroom beneath the server room ;-), but cost is of course the limiting factor.

The perpendicular recording brings in an interesting point. My "claim" that bigger harddisks (larger than 250GB) are less reliable is a little older, and it predates the perpendicular technology.

I have been trying to find some harder evidence (reports on long-term burn-in tests etc.) to see which harddisk is the most reliable and yet cheap. Does anybody know of any test centre or review board that has its findings posted on the internet?

Cost-wise, unfortunately, I am stuck with going the SATA way. I am thinking of Seagate Barracuda ES drives. These are certified for 24x7 operation.

I am torn between these 2 drives:
500 GB SAS version:
http://www.seagate.com/ww/v/index.js...&reqPage=Model

500 GB SATA version:
http://www.seagate.com/ww/v/index.js...&reqPage=Model

BUT. I honestly do not know the difference between SATA and SAS. I know the abbreviations (Serial ATA and Serial Attached SCSI), but that SAS drive is not a SCSI drive, right? Can anybody explain the difference to me, and which one I should go for?
