all bits considered data to information to knowledge


How big is Big Data?

There is no shortage of definitions for the "Big Data" buzzword. Usually it is described in some number of "V"s – volume, velocity, variety (plug in your favorite data-related problem).

I believe that Big Data is defined only by our ability to process it.

There has always been Big Data, ever since it was chiseled into stone, one symbol at a time.

We were already talking about big data when it was written on papyrus, vellum, and paper; we invented libraries, the Dewey Decimal System, Hollerith cards, and computers – all in the name of processing ever-increasing volumes of data, ever faster. Once upon a time a terabyte of data was "unimaginably big" (hence a company named "Teradata"); now a petabyte appears to be the "BIG" yardstick, only to be replaced by the exabyte, the zettabyte, and so on in the near future. Instead of batch processing we are moving to real time, and, as with every bit of digital information, we are still storing numbers that to us represent text, video, sound and – yes – numbers.

Electronic data processing has come full circle – from unstructured sequential files, to structured hierarchical/network/relational databases, to NoSQL graph/document databases and Hadoop's processing of sequential files.

Each round brings us closer to "analog" data – raw data that does not have to be disassembled into bits and bytes to be understood and analyzed.

Crossing the artificial chasm between digital and analog data will be the next frontier.


Ecclesiastes 1:9–11:
What has been is what will be,
and what has been done is what will be done,
and there is nothing new under the sun.

Is there a thing of which it is said,
“See, this is new”?
It has been already
in the ages before us.

There is no remembrance of former things,
nor will there be any remembrance
of later things yet to be among those who come after.