A Byte of Rice - The Scale of Enterprise Data

Jul 3, 2020 4:06:05 PM / by John Bald


Let's begin with an analogy: ONE BYTE OF DATA = ONE GRAIN OF RICE. Here is what that volume looks like when we apply it to the larger units of measure for data.

Byte: one grain of rice
Kilobyte: a handful of rice
Megabyte: a big pot of rice
Gigabyte: a truck full of rice
Terabyte: a container ship full of rice
Petabyte: covers the Province of Nova Scotia (55,284 km²)
Exabyte: covers Western Canada
Zettabyte: fills the Pacific Ocean (660,000,000 km³)
Yottabyte: an Earth-sized rice ball
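
For anyone who wants to see the grain counts behind that list, here is a minimal sketch. It assumes the decimal (SI) definition, where each unit is 1,000 times the one before it, rather than the binary 1,024 convention. The names UNITS and grains_of_rice are made up for this illustration.

```python
# Assumption: decimal (SI) prefixes, so each unit is 1,000x the previous one.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]


def grains_of_rice(unit: str) -> int:
    """Return how many grains (bytes) one `unit` holds, at one grain per byte."""
    return 1000 ** UNITS.index(unit)


# Print one line per unit with its grain count.
for unit in UNITS:
    print(f"1 {unit:<9} = {grains_of_rice(unit):>33,} grains")
```

Every step down the list adds three more zeros, which is why the picture jumps so quickly from a handful to an ocean.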


Not long ago, a terabyte was considered a significantly large amount of data. In the new age of smartphones and the Internet of Things (IoT), data is being created at a mind-boggling rate. Not only is it being created directly by humans, it is also being generated automatically by machines and thousands of other types of internet-connected devices.


Worldwide data is expected to hit 175 zettabytes by 2025, representing a 61% Compound Annual Growth Rate (CAGR).

- IDC: “Data Age 2025”


This is the internet, right? Who cares how many funny cat videos, memes, and pictures of people’s breakfast are clogging up the servers at Facebook, Twitter, etc.?




The truth is, nobody cares. However, this is not the data that hackers and manipulators are looking for. What they are interested in is the personal information of the 2.2 billion monthly Facebook users and the 336 million monthly Twitter users. After being fined 5 billion dollars by the FTC (Federal Trade Commission) in 2019 for violating consumers’ privacy rights, Facebook is now very concerned about data security and the protection of PII (Personally Identifiable Information).

There are additional problems that organizations need to be prepared for, and they apply not only to this newly generated data but also to the legacy data that already exists. Yes, the dark data.

New personal data regulations like CCPA (California Consumer Privacy Act) and GDPR (General Data Protection Regulation) are being enforced in order to protect citizens around the globe, and fines are being levied against organizations that have had data leaks or security breaches, or that cannot prove compliance when audited. Fines are fine, but they don’t solve the problem.

It’s not just the social media giants that are concerned, and it’s not just consumers’ information that is at stake.

Almost every organization, whether it is a financial institution, utility, government agency, or manufacturer, has sensitive data that must be managed, secured, or even legally disposed of by a certain time. Files containing IP (Intellectual Property) can pose a huge risk if they were to fall into the wrong hands.

The largest organizations are generating terabytes and petabytes all on their own.


In the case of General Electric, “Just one of their gas power station turbines generates around 500 gigabytes of data a day.”

- Bernard Marr, “GE – Big Data and the industrial internet”
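
To put that turbine figure next to the units in the rice list, here is a rough back-of-the-envelope sketch. It takes the quoted 500 gigabytes a day at face value and assumes decimal units (1 terabyte = 1,000 gigabytes). The constant and function names are made up for this illustration.

```python
GB_PER_DAY = 500            # figure quoted above for a single gas turbine
TERABYTE_GB = 1_000         # assumption: decimal units, 1 TB = 1,000 GB
PETABYTE_GB = 1_000_000     # assumption: decimal units, 1 PB = 1,000,000 GB


def days_to_reach(target_gb: float, rate_gb_per_day: float = GB_PER_DAY) -> float:
    """Days needed to accumulate `target_gb` at a constant daily rate."""
    return target_gb / rate_gb_per_day


print(days_to_reach(TERABYTE_GB))   # days for one turbine to reach a terabyte
print(days_to_reach(PETABYTE_GB))   # days for one turbine to reach a petabyte
```

At that rate a single turbine fills a container ship of rice every two days, and a fleet of turbines reaches petabyte territory far sooner than any one machine would.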


Throw in a pandemic like COVID-19, and data security is at even greater risk, with employees working from home and accessing data from remote locations. Organizations that have deeper pockets and are better prepared technologically will face less risk, but that is not the case for every company.

The ability to manage the amount of data that is being generated is beyond human scale.

[Graph: managing accelerating data growth]


The growth illustrated in the graph above, taken from our webinar (Reach Your Full Potential By Maximizing Corporate Knowledge), has put a huge magnifying glass on data security, risk management, and regulatory compliance.

And finally, the good news! Companies like Shinydocs have created, and are continually developing, software and automated workflows that address this growing problem. By leveraging AI (Artificial Intelligence), RPA (Robotic Process Automation), machine learning and other current technologies, we are solving this problem at machine speed.

By doing this, we are helping the humans (the CIOs, the CDOs, the records managers, the data governance specialists, the members of the legal department, and therefore the CEO) sleep at night.

Topics: Enterprise, Security, AI, Dark Data, GDPR, Machine Learning, Automation

Written by John Bald
