What Is Big Data and How Is It Used for Decisions?

Series: HTQ Digital Technologies: The Study Podcast | Module: Unit 5: Big Data and Visualisation | Episode 22 of 80 | Hosts: Alex with Sam, Digital Technologies Specialist

Key Takeaways

✓ Big data is characterised by the five Vs: Volume (the enormous scale of data generated), Velocity (the speed at which it is produced and processed), Variety (the diversity of data types and sources), Veracity (the uncertainty and reliability of the data) and Value (the potential insights and outcomes it enables).
✓ Traditional relational databases and processing tools are not designed to handle big data: specialist technologies such as Hadoop, Spark and NoSQL databases have emerged to address the unique challenges of big data storage and processing.
✓ Data visualisation is an essential complement to statistical analysis in big data contexts: the human brain can detect patterns in well-designed charts that are invisible in tables of numbers, making visualisation a critical decision-support tool.
✓ Big data is not inherently valuable: the value depends entirely on asking the right questions, applying the right analytical techniques and communicating findings in a way that decision-makers can act on.
✓ The ethical dimensions of big data, including privacy, consent, discrimination and transparency, are as important as the technical dimensions and must be considered at every stage of a data project.

Full Transcript

Alex: Welcome back to The Study Podcast. I'm Alex and today Sam and I are starting Unit 5, which is all about big data and visualisation. Sam, big data is one of those terms that gets thrown around enormously. What does it actually mean?

Sam: The term was popularised as a way of describing data sets that are too large, too varied or arrive too quickly to be processed by conventional database and analytics tools. A widely used definition is built on the five Vs: Volume, the sheer scale of the data; Velocity, the speed at which it is generated and needs to be processed; Variety, the range of formats from structured database records to images, videos, text and sensor readings; Veracity, the uncertainty about the accuracy and completeness of the data; and Value, the useful insights that can be extracted from it.

Alex: So this isn't just about having a bigger database?

Sam: Not at all. The volume dimension alone doesn't define big data. A single organisation today might generate more raw data than a large corporation did thirty years ago, not because it is doing fundamentally different things, but because the sensors, systems and interactions that generate data have proliferated so dramatically. Every click, every sensor reading, every transaction, every social media post adds to the torrent.

Alex: What has changed to make this data useful rather than just overwhelming?

Sam: Three things, really. The dramatic reduction in the cost of storing data, so it has become economical to keep data indefinitely rather than deleting it. The development of distributed computing technologies like Hadoop and Spark that can process enormous data sets by spreading the work across many machines simultaneously. And the advancement of machine learning, which provides methods for finding patterns in data that are too complex for traditional analytical approaches to detect.
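
To make the idea of "spreading the work across many machines" concrete, here is a minimal single-machine sketch of the map and reduce pattern that frameworks such as Hadoop and Spark build on. It is an illustration only, not the API of either framework: the function names (split_into_chunks, map_chunk, reduce_counts, count_events) and the sample events are hypothetical, and the "workers" are simulated with an ordinary loop rather than separate machines.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(records, n_chunks):
    """Partition records into n_chunks slices, one per hypothetical worker."""
    return [records[i::n_chunks] for i in range(n_chunks)]


def map_chunk(chunk):
    """Map step: count the events in one chunk, independently of the others."""
    return Counter(chunk)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into a single count."""
    return left + right


def count_events(records, n_chunks=3):
    chunks = split_into_chunks(records, n_chunks)
    # On a real cluster each chunk would be processed on a different machine
    # at the same time; here the chunks are simply processed one after another.
    partials = [map_chunk(chunk) for chunk in chunks]
    return reduce(reduce_counts, partials, Counter())


# Hypothetical event stream standing in for clicks, purchases and sensor readings.
events = ["click", "purchase", "click", "sensor", "click", "purchase", "sensor"]
totals = count_events(events)
print(totals)
```

The point of the design is that map_chunk never needs to see the whole data set, so the chunks can live on different machines, and only the small partial counts have to be brought together at the end.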

Alex: How does visualisation fit into this?

Sam: Visualisation is what transforms big data analysis from an exercise that only data scientists can understand into something that informs decisions made by non-technical people. The human visual system is extraordinarily good at detecting patterns, trends and outliers in well-designed charts and diagrams, even when those same patterns would be completely invisible in a table of numbers. A dashboard showing a company's sales performance across regions over time, with trends and anomalies highlighted, can drive better decisions in a ten-minute management meeting than a spreadsheet that would take hours to interpret.

Alex: Are there risks in data visualisation? Can it mislead?

Sam: Absolutely, and this is important to understand. A well-crafted chart can make a very weak trend look compelling, or make a large change look insignificant by manipulating the scale of the axes. Cherry-picking the time period or the subset of data to show can give a very misleading picture. Good data literacy includes the ability to critically evaluate visualisations, understanding that how data is presented is a choice that reflects the assumptions and sometimes the agenda of the person who made the chart.
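
The effect of axis scale that Sam describes can be checked with simple arithmetic. The sketch below uses made-up sales figures (100 and 104, a 4% rise) and a hypothetical helper, bar_height_ratio, to compare how tall the second bar is drawn relative to the first when the vertical axis starts at zero and when it starts just below the data.

```python
def bar_height_ratio(value_a, value_b, axis_start=0.0):
    """Ratio of the drawn height of bar B to bar A when the y-axis begins at axis_start."""
    if axis_start >= min(value_a, value_b):
        raise ValueError("axis_start must be below both values")
    return (value_b - axis_start) / (value_a - axis_start)


# Illustrative figures only: a 4% year-on-year increase.
sales_last_year = 100.0
sales_this_year = 104.0

honest = bar_height_ratio(sales_last_year, sales_this_year, axis_start=0.0)
truncated = bar_height_ratio(sales_last_year, sales_this_year, axis_start=98.0)

print(f"Axis from 0:  second bar is drawn {honest:.2f}x the height of the first")
print(f"Axis from 98: second bar is drawn {truncated:.2f}x the height of the first")
```

With the axis starting at zero the bars differ by the true 4%; with the axis starting at 98 the same data is drawn with the second bar three times as tall as the first, which is how a weak trend can be made to look dramatic.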

Alex: So there's an ethical dimension to data visualisation.

Sam: Very much so. The tools of data visualisation are genuinely powerful, and with that power comes a responsibility to represent data honestly and to communicate uncertainty rather than hiding it. This is something that the data specialist roles we'll talk about later in this unit take seriously.

Alex: A really compelling introduction to this unit. Thanks, Sam. We'll dig into the specific analytical techniques in the next lesson.
