Big data: what it is and why it matters
By Jean-Claude Elias - Sep 29, 2016 - Last updated at Sep 29, 2016
Each trade, each specialty has its own jargon, but Information Technology (IT) is particularly good at introducing new terms all the time, challenging you to keep learning relentlessly, at the risk of being quickly outpaced if you don’t. Words that appear to be ordinary vocabulary suddenly take on new meaning.
Almost everybody now knows that in the world of IT the word cloud is not “a visible mass of condensed water vapour floating in the atmosphere” but refers to computing and storage services delivered over the Internet. Big data, on the other hand, is a relatively new term in the field; it has a very specific meaning and is not just a vague expression about the size of the information stored digitally.
Big data refers to amounts of data so huge that, precisely because of their size, analysing them makes it possible to draw conclusions, reach decisions and find patterns that would be meaningless or impossible to obtain with smaller amounts of data.
Examples of big data include records on the entire population of a country, the army of a superpower, or the client base of a multinational bank or corporation. It could be the huge database of subscribers to a social network such as Facebook or LinkedIn, the customers of Amazon’s giant online shop, or the subscribers to Google’s gigantic Gmail service. On the other hand, the membership database of your local country club understandably does not qualify as big data. It really takes tens of millions or even billions of records to qualify.
Analysing big data, typically with advanced mathematics, lets researchers and scientists develop methods and reach decisions that can lead to concrete action. The scope of applications is extremely wide and we have only seen the tip of the iceberg so far.
It could be for a straightforward commercial purpose: for example, to better understand customer trends, tastes, needs, purchasing patterns and behaviour, so as to come up with more efficient advertising and, in the end, achieve higher sales and profit.
When medical big data shared between a large number of hospitals and laboratories across several countries (an increasingly common process these days) is analysed, it could lead to finding cures and medicines for diseases. The “bigger” the data, the higher the probability of reaching concrete, tangible results.
Only modern networks and fast server infrastructure make the collection of big data possible, which is why it was not such a hot subject 10 or 15 years ago. And yes, the cloud is a major contributor to the global big data effort; it is an integral part of the process.
Data mining, another typical IT expression, is one aspect of using big data. It consists of analysing big data with a very specific purpose in mind, not a general one. It could be, for instance, searching for the occurrence of the text “audio CD” in e-mails, studying the frequency of occurrence, the country of the senders, their age, their gender, and so forth, for marketing purposes. In other instances it could help the authorities fight crime and terrorism, by analysing e-mails or messages exchanged over social networks and searching for specific, meaningful keywords.
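For readers who like to see the idea in practice, here is a minimal Python sketch of the “audio CD” example: it counts how many messages mention a keyword and groups the matches by the sender's country. The sample messages, the field names and the function name are all hypothetical, invented for illustration. A real data-mining job would run the same kind of logic over millions of records on far more powerful infrastructure.

```python
from collections import Counter

# Hypothetical sample records, invented for illustration only.
# Each record holds the sender's country code and the message text.
messages = [
    {"country": "JO", "text": "Looking for an audio CD of classical music"},
    {"country": "FR", "text": "Do you still sell Audio CDs?"},
    {"country": "JO", "text": "Meeting moved to Tuesday"},
    {"country": "US", "text": "My audio CD player broke, any shop nearby?"},
]


def count_keyword_by_country(records, keyword):
    """Count messages containing `keyword` (case-insensitive), per country."""
    needle = keyword.lower()
    counts = Counter()
    for record in records:
        if needle in record["text"].lower():
            counts[record["country"]] += 1
    return counts


hits = count_keyword_by_country(messages, "audio CD")
total = sum(hits.values())

# Report each country's number of matching messages, then the overall share.
for country, n in sorted(hits.items()):
    print(f"{country}: {n} matching message(s)")
print(f"{total} of {len(messages)} messages mention the keyword")
```

The same grouping could be done by age or gender simply by counting on a different field. With only four messages the result means nothing, but over tens of millions it starts to show real patterns.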
The more the world is connected, the bigger big data will become and the more significant the results of its analysis.
Some will see in this the dreaded aspect of the Big Brother notion, whereas others will see better business, more efficient scientific research, improved healthcare and human communication. It all depends on what you do with it and who is doing it. One thing is certain: collecting, using and analysing big data makes for a very powerful tool.