Top 20 Latest Research Problems in Big Data and Data Science
Problem statements in 5 categories, research methodology and research labs to follow
Towards Data Science
Dr. Sunil Kumar Vuppala
Jun 27 2020
Even though big data is in the mainstream of operations as of 2020, there are still open issues and challenges that researchers can address. Some of these issues overlap with the data science field. This article covers 20 interesting, current research problems at the intersection of big data and data science, based on my personal experience (with due respect to the intellectual property of my organizations) and the latest trends in these domains [1,2]. These problems are grouped under 5 categories, namely:
Core Big data area to handle the scale
Handling Noise and Uncertainty in the data
Security and Privacy aspects
Intersection of Big data and Data science
The article also covers a research methodology for solving the specified problems and lists the top research labs working in these areas that are worth following.
I encourage researchers to solve applied research problems, which will have more impact on society at large. I stress this point because we analyze hardly 1% of the available data, while we generate terabytes of new data every day. These problems are not specific to any one domain and can be applied across domains.
Let me first introduce the 8 V’s of big data (based on an interesting article from Elena), namely Volume, Value, Veracity, Visualization, Variety, Velocity, Viscosity, and Virality. If we look closely at the questions on the individual V’s in Fig 1, they raise interesting points for researchers. Even though they are business questions, there are underlying research problems. For instance, 02-Value: “Can you find it when you most need it?” calls for analyzing the available data and giving context-sensitive answers when needed.
Fig 1: 8 V’s of big data. Courtesy: Elena
Having understood the 8 V’s of big data, let us look into the details of the research problems to be addressed. General big data research topics are along the lines of:
Scalability — Scalable Architectures for parallel data processing
Real-time big data analytics — Stream data processing of text, image, and video
Cloud Computing Platforms for Big Data Adoption and Analytics — Reducing the cost of complex analytics in the cloud
Security and Privacy issues
Efficient storage and transfer
How to efficiently model uncertainty
Quantum computing for Big Data Analytics
Next, let me cover some of the specific research problems across the five categories listed above.