Why is Big Data growing so fast?

TheStoneAgedDeveloper
5 min read · Jul 16, 2022

The big data industry is forecast to grow at a compound annual growth rate (CAGR) of around 10%. The Global Tech Council has written that revenues in the big data market will exceed $123.2 billion by 2025. Global information technology spending amounted to $2.7 trillion, with big data accounting for 7.5% of that figure. These significant numbers raise several questions: why is the big data market growing so rapidly, and what is the driving force? I will address these questions in this short and sweet article, and also shed some light on the challenges we face in this data-driven ecosystem today!

Photo by Markus Winkler on Unsplash

Experts have made four key predictions that will propel the data market forward over the next few years:

1. The increase in data volumes

Most data enthusiasts, and consumers of data-generating devices, agree that the amount of data generated will increase exponentially in the future. Ever heard of a zettabyte? The global data volume, or "data sphere", is expected to reach 181 zettabytes by 2025. The number of users on the internet is driving this constant churn of data: billions of connected devices and systems are creating, collecting and sharing data every day on a global scale. The IDC (International Data Corporation) has estimated that 6 billion people will interact over the internet by 2025, with each connected user interacting on average once every 18 seconds.

Volume of data/information created, captured, copied, and consumed worldwide from 2010 to 2025 | Statista.com

2. Machine Learning

Machine learning and intelligent systems have made a rather impressive entrance into the tech world. At this moment in time, several industry experts and research institutions are still working to understand the powerful phenomenon behind machine learning. Machine learning can be defined as "a developing technology used to augment everyday operations and business processes", or as "a type of intelligence which allows applications to become more accurate at predicting outcomes without being explicitly programmed to do so".

The unsupervised machine learning approach will improve a computer's ability to learn from data through cognitive services and deeper personalisation. Spending on predictive analytics is expected to reach a staggering $22 billion by the end of 2026. Machine learning is still a largely untapped market. A true gem.
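To make "unsupervised" concrete: the algorithm finds structure in data with no labels attached. Below is a minimal sketch using a toy k-means clustering on one-dimensional values; the data and function names are purely illustrative, not any particular library's API.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Minimal k-means: group unlabelled 1-D points into k clusters."""
    random.seed(seed)
    centroids = random.sample(points, k)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to its cluster's mean.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two obvious groups of values (say, session lengths in minutes) —
# no one told the algorithm there were two groups, it discovers them.
data = [1.0, 1.2, 0.8, 9.8, 10.1, 10.4]
print(kmeans(data, k=2))  # centroids near 1.0 and 10.1
```

No labels, no training targets: the structure emerges from the data itself, which is the core of the unsupervised approach mentioned above.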

3. High demand for data engineers/scientists

Data science is a relatively new field, but demand for roles in it is already high. As data volumes continue to grow in size and complexity, more data professionals will be needed to bridge the gap between need and availability. A 2019 (pre-COVID) survey by KPMG covering 108 countries found that 67% of respondents struggled to fill roles because of data skill shortages. The top three in-demand skills were big data/analytics, security, and AI/machine learning. So what makes a good data professional?

I will touch on four core components a competent data professional should possess and have a deep understanding of:

  • Data platforms and tools e.g. Cloud data platforms (Databricks, AWS, Azure, GCP) and data visualisation tools.
  • Programming languages: Python, Java, Scala, Golang.
  • Machine learning algorithms.
  • Data manipulation techniques: building data pipelines, ETL processes and data processing.
Top 10 skills businesses say their sector has insufficient skills in | https://www.gov.uk/government/publications/quantifying-the-uk-data-skills-gap/quantifying-the-uk-data-skills-gap-full-report
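The last of those skills, building data pipelines and ETL processes, can be sketched in a few lines. This is a toy extract–transform–load flow: the CSV content, field names, and table name are all hypothetical, invented here for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file or API.
RAW_CSV = """user_id,country,spend
1,UK,10.50
2,US,
3,UK,7.25
"""

def extract(text):
    """Extract: parse rows out of the raw CSV text."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with missing spend, cast fields to proper types."""
    return [(int(r["user_id"]), r["country"], float(r["spend"]))
            for r in rows if r["spend"]]

def load(rows, conn):
    """Load: write the cleaned rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spend (user_id INT, country TEXT, amount REAL)")
    conn.executemany("INSERT INTO spend VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
print(conn.execute("SELECT COUNT(*), SUM(amount) FROM spend").fetchone())
```

Real pipelines swap in distributed tools (Spark on Databricks, AWS Glue, and so on), but the extract–transform–load shape stays the same.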

4. The rise of ‘fast and actionable data’

Faster data allows for processing in real time. Many open-source data ecosystems, NoSQL databases included, have traditionally processed data in batch mode, which runs counter to the idea of fast data. Streaming data, or stream processing, allows for real-time analytics. This is an advantage that can increase the value of an organisation by allowing it to make decisions quickly and efficiently as the data arrives. The IDC predicts that 30% of all global data will be real-time by 2025. Big data is essentially worthless without analysis, since its volume is so large and much of it is unstructured. Actionable data is often the difference between an organisation falling behind in its market and thriving as a leader.
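The batch-versus-stream distinction above can be sketched with a generator: instead of collecting a day's events and analysing them overnight, each event is processed the moment it arrives. The event source and threshold below are invented for illustration; a real deployment would read from something like Kafka or Kinesis.

```python
def event_stream():
    """Stand-in for a real streaming source: yields events one at a time."""
    for amount in [120, 80, 430, 55, 990]:
        yield {"amount": amount}

def running_alerts(stream, threshold=400):
    """Act on each event as it arrives rather than in a nightly batch."""
    total = 0
    for event in stream:
        total += event["amount"]
        if event["amount"] > threshold:
            # In a real pipeline this is where a fraud alert or a
            # dashboard update would fire immediately.
            yield (event["amount"], total)

alerts = list(running_alerts(event_stream()))
print(alerts)  # [(430, 630), (990, 1675)]
```

The decision (an alert) is produced while the stream is still flowing, which is exactly the "fast and actionable" property the section describes.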

Challenges:

  1. Data volumes — Most datasets in the past have been stored and processed using open-source ecosystems such as Hadoop and NoSQL. However, these ecosystems require manual configuration and troubleshooting, and are very complicated for some companies to use.
  2. Machine learning — There are ethical and regulatory issues surrounding the use of machine learning in some sectors, such as banking. Giants such as IBM are calling for transparency in machine learning models by accompanying them with algorithms that monitor bias.
  3. The data security problem — Data security and privacy have been, and will continue to be, an issue. Growing data volumes make it harder to protect data from attacks and to keep protection at optimal levels. Average cyber-related losses amounted to $4.7 million across all companies in the US. Some of the reasons behind the data security problem are as follows:
  • The skills gap in security — last year, in 2021, the number of unfilled cybersecurity positions was estimated at 3.5 million.
  • Cyber attacks are evolving — the threats and attacks used by hackers are becoming more complex, and keeping up with protection is tougher than ever before.
  • Security standards — some companies and organisations still choose to ignore data security standards, which makes it tougher for the industry as a whole to be "data secure".

  4. Lack of proper understanding — Creating a modern culture of data within an organisation is tough. Several companies have failed, and will continue to fail, to propel data initiatives forward due to insufficient understanding. Not having a transparent picture, or having employees and data professionals who do not understand the problem at hand, is detrimental on both a small and a large scale. A potential solution is to develop continually in line with market demands, and to hold workshops and seminars so that the company as a whole understands what is happening in the fast-changing world of data.

I will conclude my short article with a quote from Hal Varian, Chief Economist at Google:

“The ability to take data — to be able to understand it, to process it, to extract value from it, to visualise it, to communicate it — is going to be a hugely important skill in the next decades.”


Data and Cloud Enthusiast | Self Improvement | Active reader 📚 turned writer