Image Credit: Alleksana on Pexels
By Lilian H. Hill
Big data refers to extremely large and complex datasets generated at high speed from a wide variety of sources, including social media, sensors, transactions, and mobile devices. These datasets are so vast and varied that traditional data processing tools cannot handle them efficiently; therefore, advanced technologies and analytics are required to extract meaningful insights. Because big data is so large and complex, AI is increasingly used to make sense of it. However, Jones (2025) points out that we cannot abdicate our responsibility for making sense of data to machines. Instead, we need to identify the mistakes AI is making and the opportunities it is missing. Relating data literacy to big data underscores the importance of developing data analysis skills in today’s world.
Big data is often characterized by the 5 Vs (Saeed & Husamaldin, 2021):
1. Volume: Refers to the massive amount of data generated every second from sensors, social media, transactions, and more that organizations must store, manage, and analyze.
2. Velocity: The speed at which data are generated, processed, and analyzed. Real-time or near-real-time data processing is crucial for making informed decisions promptly.
3. Variety: Describes the different types of data, including structured, semi-structured, and unstructured, such as text, images, videos, audio, and sensor data.
4. Veracity: Focuses on data quality, accuracy, and trustworthiness. Low veracity can lead to misleading insights if the data are incomplete, inconsistent, or biased.
5. Value: Emphasizes the importance of extracting meaningful and actionable insights from data to inform decisions and generate business or societal impact.
Some authors (Saeed & Husamaldin, 2021) refer to 8 or even 10 Vs, which also include:
6. Variability: Relates to data inconsistency and the changing meaning of data over time or across contexts. For example, the same word in different datasets may have different implications.
7. Visualization: Concerns how data are represented visually to enable human understanding and insight. Effective data visualization helps communicate complex patterns and support data-driven decisions.
8. Volatility: Refers to how long data remain relevant and how long they should be stored. Some data have a short shelf life and quickly lose value, requiring timely processing.
9. Validity: Refers to how accurately and appropriately data reflect what they are intended to measure or represent for a specific purpose. Although validity may seem similar to veracity, the two are distinct concepts. A dataset can have high veracity, meaning it is trustworthy, yet still lack validity if it does not align with its intended application. Simply put, a dataset cannot be assumed to be suitable or reliable for decision-making without proper validation.
Wesson et al. (2022) propose an additional V relating to research ethics:
10. Virtuosity: Integrates frameworks of equity and justice. This includes analytical approaches to advancing equity, including social computational big data, fairness in machine learning algorithms, and data augmentation techniques. Wesson et al. (2022) emphasize the concept of data absenteeism, referring to who is left out of data collection and the role of positionality in shaping research outcomes. They further state that a fundamental aspect of any scientific endeavor is understanding both the methods used to collect or generate data and the disparities between the study population and the broader target population.
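Several of the Vs above can be checked in a small, concrete way. The following is a minimal sketch, not a definitive method: it counts a few simple veracity problems (missing values, duplicate records, impossible values) and compares each group's share of the sample with its share of the target population, in the spirit of the concern Wesson et al. (2022) raise about who is left out of data collection. The records, field names, function names, and target shares are all invented for illustration.

```python
from collections import Counter

# Hypothetical toy records; field names and values are invented for illustration.
records = [
    {"id": 1, "age": 34, "region": "north", "income": 52000},
    {"id": 2, "age": None, "region": "north", "income": 48000},
    {"id": 3, "age": 29, "region": "south", "income": -100},
    {"id": 4, "age": 41, "region": "north", "income": 61000},
    {"id": 4, "age": 41, "region": "north", "income": 61000},
]


def veracity_report(rows):
    """Count simple quality problems: missing values, duplicate ids, impossible values."""
    missing = sum(1 for r in rows if any(v is None for v in r.values()))
    id_counts = Counter(r["id"] for r in rows)
    duplicates = sum(c - 1 for c in id_counts.values() if c > 1)
    impossible = sum(1 for r in rows if r["income"] is not None and r["income"] < 0)
    return {"missing": missing, "duplicates": duplicates, "impossible": impossible}


def representation_gap(rows, field, target_shares):
    """Sample share minus assumed target-population share for each group."""
    counts = Counter(r[field] for r in rows)
    total = sum(counts.values())
    return {
        group: round(counts.get(group, 0) / total - share, 2)
        for group, share in target_shares.items()
    }


report = veracity_report(records)
# Assumed target population: half "north", half "south" (an illustrative assumption).
gap = representation_gap(records, "region", {"north": 0.5, "south": 0.5})
print(report)
print(gap)
```

A positive gap means a group is overrepresented in the sample relative to the assumed target population, and a negative gap means it is underrepresented. Checks like these do not establish validity on their own; whether a dataset fits its intended purpose still requires human judgment.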
Big Data and Job Opportunities
Big data presents both unprecedented opportunities and significant challenges. The demand for individuals who can critically and ethically navigate an information landscape characterized by its size and complexity is growing rapidly. The acceleration of digitalization has amplified the demand for digital competencies across various employment sectors. This trend is particularly evident in scientific fields, where employers increasingly seek candidates proficient in digital skills. A comprehensive analysis of 126,360 scientific job advertisements from Science Careers, spanning 2019 to 2023, highlights this shift (Zhang et al., 2024). The study reveals a consistent upward trajectory in the requirement for digital proficiencies, with higher-paying positions more frequently requiring such skills. Demand for expertise in data analysis, statistics, and statistical software (e.g., Python and R) has grown, while traditional skills like data collection have become less critical.
This trend aligns with broader labor market projections. For instance, the U.S. Bureau of Labor Statistics (2025) anticipates a 36% growth in data scientist roles from 2023 to 2033, driven by the increasing reliance on data-driven decision-making across industries. Similarly, the World Economic Forum (2025) forecasts a 30-35% rise in demand for roles such as data analysts and scientists, propelled by advancements in frontier technologies. These projections underscore the crucial importance of integrating digital skills into educational curricula to equip the future workforce for the evolving demands of the scientific and technological sectors.
Data analytics is integral to various aspects of business operations, including informed decision-making, operational efficiency, customer understanding, competitive advantage, risk management, personalization, and innovation. By aligning curricula with these industry demands, educational institutions can prepare graduates to make effective contributions to data-driven strategies and innovations in their respective fields.
Big Data and Job Skills
Big data amplifies the importance of statistical reasoning and computational thinking, which are essential components of advanced data literacy. Machine learning and AI techniques used to analyze big data require users to understand how models are trained, what features are prioritized, and how predictions are generated. Without this understanding, users may misinterpret automated outputs as objective truth when, in fact, they may reflect biased or flawed assumptions embedded in the data (O’Neil, 2016).
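To make the point about embedded bias concrete, here is a deliberately simplified sketch, assuming a toy "model" that does nothing more than memorize the approval rate seen for each group in past decisions. The data, group names, and threshold are invented for illustration, and real machine learning models are far more complex. Even so, the sketch shows how a prediction that looks objective can simply reproduce a pattern already present in the training data.

```python
from collections import defaultdict

# Hypothetical historical decisions as (group, approved) pairs; toy data for illustration.
history = [
    ("group_a", 1), ("group_a", 1), ("group_a", 1), ("group_a", 0),
    ("group_b", 1), ("group_b", 0), ("group_b", 0), ("group_b", 0),
]


def train(rows):
    """'Train' by memorizing the approval rate observed for each group."""
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for group, label in rows:
        totals[group] += 1
        approvals[group] += label
    return {group: approvals[group] / totals[group] for group in totals}


def predict(model, group, threshold=0.5):
    """Predict approval (1) when the learned rate for the group reaches the threshold."""
    return 1 if model[group] >= threshold else 0


model = train(history)
prediction_a = predict(model, "group_a")
prediction_b = predict(model, "group_b")
print(model)
print(prediction_a, prediction_b)
```

The only feature this toy model uses is group membership, so every member of the second group is rejected regardless of individual merit. A data-literate user would ask what the model was trained on and which features drive its output before trusting the result.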
Data visualization and storytelling are essential skills when working with large datasets. Given the overwhelming volume of information, the ability to distill meaningful patterns, trends, and insights through clear visuals becomes a necessary skill for decision-making in business, policy, and research. Tools such as Tableau, Power BI, and Python libraries (e.g., Seaborn, Matplotlib) make this possible, but their effective use requires both technical proficiency and ethical awareness.
Organizations generate increasing volumes of data daily, making the roles of data analysis and analytics pivotal in effectively managing and leveraging this information. Consequently, educational programs in data analysis and analytics must evolve to align with the industry's dynamic needs and meet professional expectations (Booker et al., 2024).
In conclusion, the rise of big data transforms data literacy from a helpful skill into a critical form of digital citizenship. It enables individuals not only to work with complex information but also to scrutinize how data are collected, analyzed, and used. In a world where algorithms and data models increasingly drive decisions, widespread data literacy is essential to ensure that big data serves the public good rather than undermining it.
References
Booker, Q. E., Rebman, C. M., Wimmer, H., Levkoff, S. B., Powell, P., & Breese, J. L. (2024). Data analytics position description analysis: Skills review and implications for data analytics curricula. Information Systems Education Journal, 22(3), 76–87.
Jones, B. (2025). Data literacy fundamentals: Understanding the power and value of data (2nd ed.). Data Literacy Press.
Saeed, N., & Husamaldin, L. (2021). Big data characteristics (V’s) in industry. Iraqi Journal of Industrial Research, 8, 1–9. https://doi.org/10.53523/ijoirVol8I1ID52
U.S. Bureau of Labor Statistics. (2025, April 18). Fastest growing occupations. https://www.bls.gov/ooh/fastest-growing.htm
World Economic Forum. (2025, January 7). Future of Jobs 2025. https://www.weforum.org/publications/the-future-of-jobs-report-2025/
Zhang, G., Wang, L., Shang, F., & Wang, X. (2024). What are the digital skills sought by scientific employers in potential candidates? Journal of Higher Education Policy and Management, 47(1), 20–37. https://doi.org/10.1080/1360080X.2024.2374392