
1000^n is only really used by HDD manufacturers to make their storage look larger. Everyone else uses 1024^n. XiB was invented to solve the confusion of everyone using XB but adoption was low and personally I think XiB is more confusing anyways (given people naturally assume XB is 1024^n).

I think what confuses consumers more is byte vs bit, or more specifically, that the capitalisation of the ‘B’ matters.
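To make the bit-versus-byte point concrete, here is a rough Python sketch. The 100 Mbps link speed is just an assumed example figure, and the variable names are illustrative:

```python
# Assumed example: a link advertised as 100 Mbps (megabits per second, decimal).
link_mbps = 100

# Lowercase "b" is bits, uppercase "B" is bytes; 8 bits per byte assumed.
bytes_per_second = link_mbps * 1000**2 / 8

print(f"{bytes_per_second / 1000**2:.2f} MB/s")   # 12.50 MB/s
print(f"{bytes_per_second / 1024**2:.2f} MiB/s")  # 11.92 MiB/s
```

So the factor of 8 between "b" and "B" dwarfs the roughly 5% gap between MB and MiB.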



1000^n is used by:

- hard drive and storage manufacturers

- measurements of network speeds (1Mbps is 1000000 bits per second)

- online storage providers (Dropbox's paid plan for 2TB is 2000GB for instance)

- macOS and iOS

- Ubuntu (see https://wiki.ubuntu.com/UnitsPolicy )

- most modern GNOME applications and most GUI applications on desktop Linux

- hard drive manufacturers have been sued over this, and US courts have agreed with hard drive manufacturers that 1 GB = 1000 MB

- the International System of Units (SI) and the International Electrotechnical Commission (IEC) both use the decimal definitions (1 kB = 1000 B)

The last major hold-outs are Microsoft Windows and old command-line applications that want to preserve backwards compatibility with any script that might parse their output.

Personally, I don't find the XiB units confusing, as they are the only units with a consistent unambiguous definition. The XB units are much more confusing because you can't be sure whether the 1000^n definition is intended or not.


The thing is, that’s all relatively recent in computing terms. For the vast majority of computing history it’s been 1024^n because that’s how the hardware works — you can’t have 1000^n in a binary system where bytes are multiples of 8. All the 1000^n values you get are where the raw value in bytes is taken and then divided by powers of 1000 but that leads to values that are technically not accurate (eg values that are not addressable).

And here lies the problem, before HDD manufacturers decided to change things up, computer science was already standardised on using 1024^n. Sure, there were some outliers in network theory, but it was pretty easy because if you needed a precise value then you knew it was 1024^n and if you just wanted an approximate value you could still divide by 1000 in your head. It worked, everyone understood it and everyone was happy.

This whole “1000^n is more human friendly” only appears so because we now have multiple interpretations and people without a tech background making decisions about it. But frankly, if you can’t wrap your head around 1024^n then you’re already in the group of users who honestly don’t need to worry about the precision of getting the scaling right. Those who it does matter for honestly find 1024^n easier.


> And here lies the problem, before HDD manufacturers decided to change things up, computer science was already standardised on using 1024^n.

And long before that, the SI units defined k as 10^3, M as 10^6 etc. This is how the prefixes are used everywhere for every unit, with the one exception of computer scientists playing it the US way.

I personally prefer 1024 as well, but honestly, the scientific side of that argument is a lost cause.


It went downhill as early as 1.44MB floppies being 1440*1024.
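The arithmetic behind that mixed-unit label can be checked with a quick Python sketch (variable names are only for illustration):

```python
# The "1.44MB" floppy label mixes both conventions: 1440 * 1024 bytes.
floppy_bytes = 1440 * 1024

decimal_mb = floppy_bytes / 1000**2   # megabytes, 1000^2 bytes each
binary_mib = floppy_bytes / 1024**2   # mebibytes, 1024^2 bytes each

print(floppy_bytes)   # 1474560
print(decimal_mb)     # 1.47456
print(binary_mib)     # 1.40625
```

So the capacity is "1.44" in neither pure convention, only in units of 1000 * 1024 bytes.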


The failure was the initial use of K=1024. Someone should have invented KiB early on.


Different operating systems prefer one way or the other, or a mix. macOS uses powers of 1000, most Linux desktop environments use the correct symbol (1000 and kB, or 1024 and KiB etc), but Unix tools tend to use 1024 and "K". I believe Windows uses the non-standard symbol.

I much prefer to work with powers of 1000. Running "df" on our storage cluster shows

  2675230000214900

Although, since my terminal has this font [1] installed, it actually displays like this, with the 2 underlined:

  2͟6752͟3͟0͟0002͟1͟4͟900

It's easier to think about 2.6PB than 2.3PiB (how many 200GB files do I have space for, etc).

[1] https://blog.janestreet.com/commas-in-big-numbers-everywhere...
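
For anyone who wants to check the figures, a small Python sketch of the conversion, using the byte count quoted above (variable names are illustrative, and the 200GB file size is taken as decimal):

```python
# Byte count quoted from the "df" output above.
total_bytes = 2675230000214900

pb = total_bytes / 1000**5    # petabytes (decimal)
pib = total_bytes / 1024**5   # pebibytes (binary)

# How many 200 GB (decimal) files fit, using integer division.
files_200gb = total_bytes // (200 * 1000**3)

print(f"{pb:.2f} PB")    # 2.68 PB
print(f"{pib:.2f} PiB")  # 2.38 PiB
print(files_200gb)       # 13376
```

The 2.6 and 2.3 in the comment are the same values truncated to one decimal place.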


That’s fine if storage is your only concern, but memory doesn’t work like that. RAM has to be grouped in powers of 2 and assigned in binary. 1000^n doesn’t make any sense at a low level. It was a convention that followed later when HDD manufacturers wanted to make their drives look bigger.


Well, it makes more sense to think about memory in pages or at least words (or cache lines if you think about performance), disks in blocks and network in packets anyway; memory is very rarely byte addressable natively... not being picky, just underlining that power-of-two addressable bytes do not sum it up perfectly either.


NAND chips are built in the same way, but SSD manufacturers sell 256GB (not GiB) drives. That's partially because some SSDs use the difference (256GiB - 256GB) as a reserved area for wear leveling. Of course the primary reason is to align with HDDs.


Except that's not how it's stored on the disk itself. All of the sectors on your drives are in base 2.

2, 4, 8, 16, 32, 64, 128, 256, 512, 1024 sector sizes...

So even if your files are 200GB, the actual space they consume is not base 10 as you count on your storage cluster, but 200GB plus the last remaining sector that is used but unfilled, as the rest of that sector cannot be used by another file.

You can count your sizes in Base 10 but the actual use of space is still Base 2.

If a file is exactly 1000 bits and your sector sizes are 1024... that 1000 bit file is using 1024 bits of space on that drive.

And no, just because they advertise or display drives as having 1,000,000 Bytes = 1MB... it doesn't change how space is sectored out on that drive itself.
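
A minimal sketch of that rounding, assuming the only overhead is the unfilled tail of the last sector (real filesystems allocate in blocks or clusters and add metadata, which this ignores). `allocated_size` is a hypothetical helper, not any tool's API:

```python
def allocated_size(file_size: int, sector_size: int = 512) -> int:
    """Round file_size up to a whole number of sectors (hypothetical helper)."""
    sectors = -(-file_size // sector_size)  # ceiling division on integers
    return sectors * sector_size

# A size of 1000 in 1024-sized sectors occupies one full sector.
print(allocated_size(1000, 1024))     # 1024

# One byte past a sector boundary costs a whole extra 512-byte sector.
print(allocated_size(200_000_001))    # 200000512
```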


A 200,000,000B file vs 200,000,512B makes essentially no difference when purchasing hardware, planning storage, moving data, sizing systems etc.

Someone telling me a file is 200MB when they mean 200MiB (210MB) can make a difference, which is why we should strive for accuracy.
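
To put rough numbers on that, here is a Python sketch of how the gap between the two conventions grows with each prefix (`gap_percent` is an illustrative helper, and "KB" is used loosely for the decimal kilobyte):

```python
# Relative gap between each binary prefix (1024^n) and its decimal
# counterpart (1000^n).
prefixes = ["K", "M", "G", "T", "P"]

def gap_percent(n: int) -> float:
    """How much larger 1024^n is than 1000^n, in percent."""
    return (1024**n / 1000**n - 1) * 100

for n, p in enumerate(prefixes, start=1):
    print(f"{p}iB vs {p}B: {gap_percent(n):.1f}%")

# The 200 MiB file mentioned above, expressed in decimal MB.
print(200 * 1024**2 / 1000**2)  # 209.7152
```

The loop prints gaps of 2.4%, 4.9%, 7.4%, 10.0% and 12.6%, so the ambiguity matters more the larger the unit.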


I do use 1000^n, but I agree that most people tend to use 1024^n. 1000^n kind of makes more sense since "kilo", "mega" etc. are the actual SI prefixes for powers of 1000. I don't know who or what caused this chaos but 1000^n is definitely more human friendly.


I feel the problem may be that, unlike just about every other unit in SI, bits are discrete, not continuous. Except in a few very specific subfields of theoretical CS, there's no concept of a fractional bit. You can have kilometers and millimeters, you can have kilobits and kilobytes but not millibits and millibytes.

The nature of bits is that of a base-2 system, so using powers of 10 for counting them is only superficially human friendly - in practice it's human-unfriendly, because it flies in the face of how bits are used. All hardware and all software groups them by powers of 2, that's inherent to what bits are.


> 1000^n is definitely more human friendly.

It may be more human friendly, but 1024^n is more programmer friendly, especially at the low level.


Yes, powers of two match physical reality of binary computer architectures, while powers of ten in computing are a marketing concept.


1000^n might be more human friendly but computers aren’t decimal machines and a byte isn’t 10bits. 1024^n technically makes sense as a unit for binary machines that have 8bits to a byte.

Everyone was happy with the 1024^n convention in the 80s. The problem was HDD manufacturers got greedy and switched to 1000^n to make their drives sound like they had more storage. That's what started the confusion.


RFC 1059 (NTP) was published in 1988 and refers to 56k modems. Does a 56k modem operate at 57344 bits per second or 56000 bits per second? Your claim implies the former, but I'm pretty sure it was always the latter.

> a byte isn’t 10bits

It could be. Historically, the number of bits per byte varied somewhat from machine to machine. Many standards used the term 'octets' to avoid ambiguity.


Historically yes. But even as early as the 60s 8bit was the norm. IIRC C then “standardised” 8bits (though ASCII went some way to doing that prior to C).



