Measuring digital data

Article by: Dr. Lauro Rizzatti

Where better to look for proof of exponential progress of science than in the mind-boggling escalation of prefixes associated with physical metrics?

« Previously: Behind the mind-boggling growth of digital data storage

Let's consider an oft-overlooked anomaly with regard to measuring digital data. This anomaly is that digital data is measured using a binary system, not a decimal or metric system. The basic unit of digital data is the bit ("b"), and eight bits make up a byte ("B"). Alphanumeric characters are coded in bytes, one per character. The storage industry uses bytes, while the networking industry refers to transmission speeds using bits-per-second.

In the metric system, the prefix kilo means 1,000, or 10 to the power of 3 (10³), but in digital usage 1kb (kilobit) or 1kB (kilobyte) corresponds to two to the power of 10 (2¹⁰), which equates to 1,024 bits or 1,024 bytes, respectively. In other words, 1kB is a little larger than 1,000 bytes. This is a small difference of 2.4% that, oftentimes, no one cares about. However, the gap compounds with each larger prefix: when the amount of information reaches a trillion bytes (1TB), the difference amounts to about 10%, and that's no longer trivial. Table 1 illustrates the multiplying factor associated with using a binary system.

[storage metric prefixes table 02]
__Table 1:__ *Examples of prefixes used to measure digital data with a binary system (Source: Lauro Rizzatti)*
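The widening gap can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not taken from the original article; the function name `binary_excess_percent` and the prefix list are illustrative assumptions. It compares the binary reading of each prefix (2 to the power of 10n) with the decimal reading (10 to the power of 3n).

```python
# Illustrative sketch: how much larger the binary reading of a prefix is
# than the decimal (SI) reading. Names here are hypothetical, chosen for
# this example only.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta"]


def binary_excess_percent(n: int) -> float:
    """Percent by which 2**(10*n) exceeds 10**(3*n)."""
    return (2 ** (10 * n) / 10 ** (3 * n) - 1) * 100


for n, name in enumerate(PREFIXES, start=1):
    excess = binary_excess_percent(n)
    print(f"{name:>5}: 2^{10 * n} vs 10^{3 * n} -> +{excess:.1f}%")
```

Each step up the prefix ladder multiplies the ratio by another factor of 1.024, which is why a 2.4% gap at the kilo level becomes roughly 10% at the tera level and about 18% at the zetta level.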

Attempts to solve this conundrum have been made by several organisations that have suggested a separate set of prefixes for the binary system, such as kibi for 1,024, mebi for 1,048,576, gibi for 1,073,741,824, and so forth. To date, none of these are in general use.
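To see what the binary prefixes mean in practice, here is a small Python sketch that expresses a byte count in kibi/mebi/gibi-style units. The helper `format_binary` and the unit list are assumptions made for illustration, as is the premise that a capacity advertised as 1TB means exactly 10 to the power of 12 bytes.

```python
# Illustrative sketch: express a byte count using binary prefixes.
# format_binary is a hypothetical helper written for this example.
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB"]


def format_binary(num_bytes: int) -> str:
    """Return num_bytes scaled to the largest binary unit below 1,024."""
    value = float(num_bytes)
    for unit in IEC_UNITS:
        # Stop once the value fits, or when we run out of units.
        if value < 1024 or unit == IEC_UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


# Assumption: an advertised 1TB capacity is 10**12 bytes (decimal).
print(format_binary(10**12))  # prints "931.32 GiB"
```

Under that assumption, the same capacity reads as 1TB in decimal units and about 931 in gibi units, which is the discrepancy the binary prefixes were meant to make explicit.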

Consumers continue to ignore the difference, while disk drive and computer manufacturers targeting consumers only mention it in passing in the "small print." Enterprise storage companies, on the other hand, now live in the terabyte/petabyte era and do distinguish between the two—at least when calculating and comparing costs.

Digital data storage supply and demand

The advent of the computer accelerated our ability to create data, but it also brought a new challenge. Now that we can generate data blazingly fast, how do we store it?

My Compaq 386 desktop from around 1989 had a hard disk drive (HDD) with a capacity of about 100MB. In 2001, about 12 years later, the data storage capacity of my laptop HDD amounted to about 2GB, roughly a 20X increase, or a little more than one order of magnitude. My 2016 laptop boasts a solid-state hybrid drive (SSHD) with 1TB of capacity. That's an increase of about 500 times in 15 years.

It's far easier to generate zettabytes of data than to manufacture zettabytes of data storage capacity. A wide gap is emerging between data generation and the production of hard drives and flash storage. In Figure 1, the blue bar chart maps data growth, both actual and estimated, over a 20-year period. The orange bar chart tracks storage factory capacity.

[storage supply demand table]
__Figure 1:__ *This chart shows storage supply and demand growth over two decades (Source: Recode)*

By 2020, demand for capacity will outstrip production by six zettabytes, or nearly double the demand of 2013 alone.

Next: EDA tools double design data annually »
