
Normalisation: Eliminating Redundancy in Database Design

Podcast episode 21: Normalisation: Eliminating Redundancy in Database Design. Alex and Sam explore key concepts from the Pearson BTEC Higher Nationals in Computing. Full transcript included.

Series: HTQ Computing: The Study Podcast  |  Module: Unit 4: Database Design and Development  |  Episode 21 of 80  |  Hosts: Alex with Sam, Computing Specialist
Key Takeaways
  • Normalisation is the process of organising a database to reduce redundancy and improve data integrity.
  • First Normal Form requires that all attribute values are atomic, meaning no repeating groups or multi-valued attributes.
  • Second Normal Form eliminates partial dependencies, ensuring that all non-key attributes depend on the entire primary key.
  • Third Normal Form removes transitive dependencies, so that non-key attributes depend only on the primary key, not on other non-key attributes.
  • A fully normalised database is easier to maintain and update, though in some high-performance scenarios deliberate denormalisation may be appropriate.
Full Transcript

Alex: Today we're exploring normalisation, which is one of the core techniques in database design. Sam, what problem is normalisation solving?

Sam: Normalisation solves the problems of data redundancy and update anomalies in databases. If the same information is stored in multiple places, keeping it consistent becomes very difficult. If you update it in one place and miss it in another, you have inconsistent data. Normalisation gives us a systematic way to organise data so that each piece of information is stored in exactly one place.

Alex: Can you give us a concrete example of the kind of problem normalisation addresses?

Sam: Imagine you have a flat table for orders that includes the customer's name, address, and email alongside the order details. If the same customer places ten orders, their name and address are stored ten times. If they move house, you have to update ten rows. And if you miss even one, the data is inconsistent. Normalisation would separate the customer data into its own table, storing the address once, with orders linking to it via a customer ID.
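The split Sam describes can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not the course's own schema: the table and column names (`customers`, `orders`, `customer_id`, and so on) are assumptions chosen to match the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalised design: customer details live in exactly one row,
# and each order references that row via customer_id.
cur.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    email   TEXT NOT NULL,
    address TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    order_date  TEXT NOT NULL
);
""")

cur.execute("INSERT INTO customers VALUES (1, 'Ada Lovelace', 'ada@example.com', '1 Old Street')")
cur.executemany("INSERT INTO orders VALUES (?, 1, ?)",
                [(10, '2024-01-05'), (11, '2024-02-09')])

# Moving house is now a single-row update, however many orders exist.
cur.execute("UPDATE customers SET address = '2 New Road' WHERE customer_id = 1")

# Every order sees the new address through the join -- no row was missed.
row = cur.execute("""
    SELECT c.address
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    WHERE o.order_id = 11
""").fetchone()
print(row[0])
```

Because the address exists in one place, the update anomaly from the flat-table version simply cannot occur.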

Alex: Let's walk through the normal forms. What's the first normal form?

Sam: First Normal Form, or 1NF, has a simple but important rule: every attribute value must be atomic, meaning it stores a single, indivisible value. You can't have a column that stores multiple values, like a comma-separated list of phone numbers. Each piece of information must be in its own column. This ensures the data is structured in a way that the database can process efficiently.
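The phone-number case Sam mentions can be shown in a few lines. This is a hedged sketch using plain Python tuples rather than a real database; the contact data is invented for illustration. The 1NF repair is to replace the packed string with one row per contact–phone pair.

```python
# A non-atomic column: several phone numbers packed into one string.
contacts = [
    (1, "Ada", "0117 111111,0117 222222"),
    (2, "Sam", "0117 333333"),
]

# 1NF repair: one row per (contact_id, phone) pair, each value atomic.
phones = [
    (contact_id, phone.strip())
    for contact_id, _name, phone_list in contacts
    for phone in phone_list.split(",")
]
print(phones)
```

With atomic values, the database can index, search, and join on individual phone numbers instead of pattern-matching inside a string.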

Alex: And Second Normal Form?

Sam: Second Normal Form builds on 1NF by requiring that every non-key attribute depends on the entire primary key, not just part of it. This is only relevant when the primary key is composite, made up of multiple columns. If you have a table where the primary key is a combination of OrderID and ProductID, but the product name depends only on the ProductID and not on the OrderID, the product name violates 2NF. The solution is to move it to a separate Products table.
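The OrderID/ProductID example can be made concrete with a small sketch. The data and names below are illustrative assumptions: a flat order-lines table keyed on `(order_id, product_id)`, where `product_name` depends only on `product_id` and therefore violates 2NF.

```python
# Flat order-lines table, key (order_id, product_id); product_name
# depends only on product_id -- a partial dependency.
order_lines = [
    (100, 1, "Widget", 2),
    (100, 2, "Gadget", 1),
    (101, 1, "Widget", 5),
]

# 2NF repair: product_name moves to a Products table keyed by product_id.
products = {product_id: name for _, product_id, name, _ in order_lines}

# The order-lines table keeps only attributes that depend on the full key.
order_items = [(order_id, product_id, qty)
               for order_id, product_id, _name, qty in order_lines]

print(products)     # each product name now stored exactly once
print(order_items)  # quantity depends on the whole (order, product) key
```

Note that "Widget" appeared twice in the flat table but exactly once after the split, which is precisely the redundancy 2NF removes.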

Alex: And Third Normal Form?

Sam: Third Normal Form requires that non-key attributes depend only on the primary key, not on other non-key attributes. If you have a table with a customer ID, a city, and a postcode, and the city can be determined from the postcode alone, then city has a transitive dependency on the primary key through postcode. To fix it, you'd separate postcodes and their corresponding cities into their own lookup table.
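The postcode example translates directly into a small sketch. The sample rows are invented, and it assumes (as the transcript does) that a postcode determines exactly one city, which real postal data does not always guarantee.

```python
# customer_id -> postcode -> city: city depends on the key only
# transitively, through postcode.
customers = [
    (1, "BS1 4ST", "Bristol"),
    (2, "BS1 4ST", "Bristol"),
    (3, "LS1 1UR", "Leeds"),
]

# 3NF repair: the postcode-to-city mapping becomes its own lookup table.
postcode_city = {postcode: city for _, postcode, city in customers}

# The customers table keeps only attributes that depend directly on its key.
customers_3nf = [(cid, postcode) for cid, postcode, _ in customers]

print(postcode_city)
print(customers_3nf)
```

If Bristol were ever renamed, the change would be one row in the lookup table rather than an update scattered across every customer.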

Alex: What happens in practice? Do you always fully normalise?

Sam: For most transactional systems, you'd aim for Third Normal Form as a minimum. Going further, to Boyce-Codd Normal Form or beyond, is sometimes appropriate for particularly complex designs. However, in data warehousing and reporting contexts, you sometimes deliberately denormalise, accepting some redundancy in exchange for faster query performance. The key is to make conscious, informed decisions rather than normalising or denormalising without understanding why.

Alex: Is there a practical process for normalising an existing flat dataset?

Sam: Yes. Start with 1NF: make sure all attributes are atomic. Then move to 2NF by identifying and removing partial dependencies. Then 3NF by identifying and removing transitive dependencies. At each step you'll be creating new tables and establishing foreign key relationships. It's methodical work, and there are established techniques including functional dependency analysis that support the process.
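The functional dependency analysis Sam mentions can be sketched as a simple check over sample data. The function name and data below are illustrative, and this is only a heuristic: a real analysis reasons about the domain's rules, not just the rows you happen to have.

```python
def functionally_determines(rows, lhs, rhs):
    """Return True if, in this sample, the lhs columns determine the
    rhs column (equal lhs values always give an equal rhs value).
    A sketch for exploring candidate dependencies in a flat dataset."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        # setdefault stores the first rhs seen for this key, then
        # every later row with the same key must agree with it.
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True

rows = [
    {"postcode": "BS1 4ST", "city": "Bristol"},
    {"postcode": "BS1 4ST", "city": "Bristol"},
    {"postcode": "LS1 1UR", "city": "Leeds"},
]
print(functionally_determines(rows, ("postcode",), "city"))
```

Dependencies confirmed this way guide which columns move into new tables at the 2NF and 3NF steps.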

Alex: Brilliant. Thanks, Sam. In the next lesson we'll get into building the actual database using SQL.