DDD  ·  Database Design & Development

Database Concepts

Lesson DDD1 of 10 Approx 60 min Films database required
Learning intentions
  • Understand what a database is and why databases are used to store data
  • Identify key database structures: tables, records, and fields
  • Explain the role of primary keys and foreign keys in a relational database
  • Recognise common data types: TEXT, INTEGER, REAL, and NULL
Success criteria
  • I can define the terms: database, table, record, and field
  • I can explain what a primary key is and why every table needs one
  • I can explain what a foreign key is and how it links two tables
  • I can identify the correct data type for a given piece of data
  • I can give one advantage of a relational database over a flat-file spreadsheet
💾
Database file required for this lesson

Download films.sql and import it into DB Browser for SQLite before the warm-up. If you haven't set up DB Browser yet, follow the setup guide first.

↓ Download films.sql
Warm up — what do you already know?

Answer before the lesson begins. These check prior knowledge — it's fine if you're unsure.

1. You have used spreadsheets before. In a spreadsheet each row holds data about one thing. What is a single row of data called in a database?

2. Which of the following is not a real-world example of a database?

3. You imported films.sql into DB Browser for SQLite before this lesson. How many tables does the films database contain?

Key vocabulary

Database
An organised collection of structured data, stored electronically and managed by a DBMS.
DBMS
Database Management System — software that stores, retrieves, and manages data (e.g. SQLite, MySQL).
Table
A grid of rows and columns that stores data about one type of entity (e.g. all directors).
Record
A single row in a table — all the data about one item (e.g. one director or one film).
Field
A single column in a table — stores one piece of information shared by all records (e.g. last_name).
Primary Key (PK)
A field whose value uniquely identifies each record and is never NULL — e.g. director_id.
Foreign Key (FK)
A field in one table that matches the primary key of another — used to link two tables.
Data Type
Specifies the kind of value a field stores: TEXT, INTEGER, REAL, or NULL.
NULL
A special marker meaning a value is missing or unknown — not zero and not blank text.
Relationship
A link between two tables, created using a foreign key, that avoids storing duplicate data.

Database Concepts

What is a database?

A database is an organised collection of structured data stored electronically. Unlike a document or a spreadsheet, a database is designed so data can be stored, searched, sorted, and updated with precision and speed. Every digital service you use depends on databases: streaming a film, logging into a school app, searching for a product online — all of these involve database queries happening in the background.

A Database Management System (DBMS) is the software that controls access to the database. It handles storing data on disk, running queries, preventing two users from corrupting the same record at once, and enforcing rules about what data is allowed. Examples include SQLite, MySQL, and PostgreSQL. In this course we use SQLite via the free tool DB Browser for SQLite.

Tables, records, and fields

Data in a relational database is stored in tables. Each table stores information about one type of entity — a person, a product, a film, a director. You can think of a table like a spreadsheet grid, but with stricter rules:

  • Each column is a field — it stores one piece of information about every record (e.g. last_name, nationality).
  • Each row is a record — it holds all the information about one specific item (e.g. one director).
  • Every cell in a table holds exactly one value (or NULL).

Our films database has two tables. The director table has 15 records and 4 fields. The film table has 52 records and 8 fields. Each table stores a different type of entity, and the two tables are linked together using a key.

Primary keys

Every table must have a primary key (PK) — a field whose value uniquely identifies each record. A primary key must satisfy three rules:

  • Unique — no two records in the table can have the same value.
  • Not NULL — every record must have a value; it can never be missing.
  • Stable — ideally the value never changes once it has been assigned.

In the director table, the primary key is director_id. Every director is assigned a unique integer (1, 2, 3…). Using a number is more reliable than using a name: two different directors could share the same surname, but they can never share the same ID.

Foreign keys

A foreign key (FK) is a field in one table whose value refers to the primary key of another table. It creates a relationship between the two tables.

In the film table, the field director_id is a foreign key. For the film The Dark Knight, this field contains the value 2. Looking that value up in the director table tells us that director 2 is Christopher Nolan — so The Dark Knight was directed by Christopher Nolan.

Without foreign keys, we would have to copy Nolan's name, nationality, and all other details into every one of his films. Storing the same data in multiple places is called data redundancy: it wastes space, and if the data needs to change, it must be updated in every row — which is error-prone. Foreign keys eliminate this by storing director details once and linking to them.

Data types

Every field has a data type that defines what kind of value it can store. SQLite uses four main types:

  • TEXT — any sequence of characters (e.g. a film title, a genre label, a certificate like "12A").
  • INTEGER — whole numbers only, positive or negative (e.g. a year, a runtime in minutes, an ID).
  • REAL — decimal numbers (e.g. a rating of 8.4).
  • NULL — a special marker meaning the value is missing or unknown. It is not the same as zero or an empty string.

Choosing the right type matters. Storing a year as TEXT prevents the database from doing arithmetic on it (e.g. finding all films made after 2000 with year > 2000). Storing a rating as INTEGER would silently truncate the decimal — 8.4 would become 8.

NULL values

Some fields are allowed to be NULL, meaning the data simply was not recorded. In our films database, runtime_mins can be NULL — four films are missing this information. This is intentional: it teaches you how to handle missing data in queries.

NULL is not the same as zero (which would mean "zero minutes long") and not the same as a blank text string. It means the value is genuinely unknown. When querying for NULL values, you cannot use = NULL — you must use IS NULL or IS NOT NULL. (This will be covered in a later SQL lesson.)

Why use a database instead of a spreadsheet?

Spreadsheets work well for small, simple datasets handled by one person. Databases offer important advantages at scale:

  • No data redundancy — data is stored once and linked by foreign keys, not duplicated in every row.
  • Data integrity — the DBMS enforces rules (e.g. a foreign key value must exist in the related table), preventing invalid entries.
  • Scalability — databases handle millions of records efficiently; large spreadsheets slow to a crawl.
  • Multi-user access — many users can read and write simultaneously without corrupting data.
  • Powerful queries — SQL lets you search, filter, sort, and join across multiple tables in ways spreadsheet formulas cannot match.

Worked examples

Example 1 — Identifying tables, records, and fields

This is an extract from the director table in the films database:

director_id first_name last_name nationality
1StevenSpielbergAmerican
2ChristopherNolanBritish
3RidleyScottBritish
1
The table name is director. It stores information about film directors.
2
There are 4 fields (columns): director_id, first_name, last_name, nationality. Each field stores one type of information.
3
Each row is one record. This extract shows 3 records — though the full table has 15.
4
The cell containing Spielberg is the value of the last_name field in record 1. Every record has a value for every field (unless the field allows NULL).
Example 2 — Identifying primary keys and foreign keys

Below is one record from each table, showing how they are linked:

director
director_id PK first_name last_name nationality
2ChristopherNolanBritish
film
film_id PK title year certificate runtime_mins rating genre director_id FK
9The Dark Knight200812A1529.0Action2
1
In the director table, director_id is the primary key. It is unique for every director and is never NULL.
2
In the film table, film_id is the primary key. Every film has its own unique ID.
3
The director_id column in film is a foreign key. Its value (2) matches Christopher Nolan's director_id in the director table.
4
This tells us The Dark Knight was directed by Christopher Nolan — without repeating his name or nationality inside the film table.
Example 3 — Identifying data types

For each field in the film table, the correct data type and the reasoning:

Field Example value Data type Reason
film_id9INTEGERWhole number ID — no decimal needed.
titleThe Dark KnightTEXTA sequence of characters — any length.
year2008INTEGERWhole number — allows maths like year > 2000.
certificate12ATEXTA label (PG, 15, 18) — not used in arithmetic.
runtime_mins152INTEGER (nullable)Whole minutes; some records are NULL.
rating9.0REALDecimal — INTEGER would truncate 8.4 to 8.
genreActionTEXTA category label — stored as text.
director_id2INTEGERWhole number FK matching director.director_id.
Now you try

Look at this extract from the film table, which includes a record with a NULL value:

film_idtitleyearcertificate runtime_minsratinggenredirector_id
22Big Fish2003PG1258.0Fantasy5
23Corpse Bride2005PGNULL7.3Animation5

For this extract, answer the following:

  1. What is the name of the table?
  2. How many records are shown? How many fields does this table have?
  3. Which field is the primary key? Give a reason.
  4. Which field is a foreign key? What table does it link to, and what does the value 5 tell us?
  5. Why does runtime_mins show NULL for Corpse Bride? What does NULL mean in this context?
  1. Table name: film
  2. Records shown: 2. Fields: 8 (film_id, title, year, certificate, runtime_mins, rating, genre, director_id).
  3. Primary key: film_id — every film has a unique integer ID that is never NULL. No two films share the same ID.
  4. Foreign key: director_id. It links to the director_id field in the director table. The value 5 tells us both films were directed by the director with ID 5 — Tim Burton.
  5. NULL in runtime_mins: The runtime for Corpse Bride was not recorded in this database. NULL means the value is missing or unknown — it does not mean the film is zero minutes long.
Common mistakes
Confusing a field with a record. A field is a column — it defines one attribute that all records share (e.g. last_name). A record is a row — it is all the data about one specific item (e.g. one director). Fields run across the top; records run down the side.
Thinking NULL means zero or blank. NULL means the value is missing or unknown. A film with NULL in runtime_mins is not a film with zero runtime — it is a film where the runtime was not recorded. NULL + 1 = NULL, not 1. This distinction matters in every SQL query involving NULL.
Mixing up primary key and foreign key. The primary key is in the table it identifies (e.g. director_id in director). The foreign key is a copy of that value in another table (e.g. director_id in film). Same field name — different tables, different roles.
Assuming a primary key must be a number. A primary key just needs to be unique and never NULL. Auto-incrementing integers are most common, but a field like isbn (for books) or email could also serve as a primary key if every record is guaranteed to have a unique value.
Exam tip

The exam often asks you to "identify a suitable primary key and justify your choice." To earn full marks you must state both the field name and give a reason. A strong justification covers two points:

  1. The value is unique for every record — no two records share it.
  2. The value is never NULL — every record will always have one.

Example answer: "director_id is a suitable primary key because each director is assigned a unique ID number that is never empty, so every record can be individually identified."

Saying only "it identifies the record" will not score the justification mark. You must reference uniqueness and/or not NULL.

Task Set

Questions 1–5 are auto-checked. Questions 6–9 are self-marked — write your answer, then reveal the model answer to check your work.

1. In a database table, what is a single column that stores one attribute for every record called? TYPE 1

2. The film table stores a film's rating, e.g. 8.4. Which data type is correct for this field? TYPE 1

3. The film table has a director_id column that holds the ID of the director of each film. What type of key is this? TYPE 1

4. Some records in the film table show NULL in the runtime_mins field. What does NULL mean? TYPE 1

5. Which of the following is not a requirement for a primary key? TYPE 1

6. Describe the difference between a primary key and a foreign key. Use the director and film tables as examples in your answer. TYPE 2

A primary key uniquely identifies each record in its own table. It must be unique and never NULL. In the director table, director_id is the primary key — every director has a different ID number that is never empty.

A foreign key is a field in one table that links to the primary key of another table. In the film table, director_id is a foreign key — its value matches a director_id in the director table, telling us which director made each film. The same field name appears in both tables but plays different roles.

7. Give two advantages of storing film and director data in a relational database rather than a single flat-file spreadsheet. TYPE 2

Any two of the following:

1. No data redundancy — director details are stored once in the director table and linked by a foreign key, rather than repeated in every row of the film table. If a director's details change, they only need to be updated in one place.

2. Data integrity — the DBMS enforces rules (e.g. a director_id in the film table must match an existing director), preventing invalid or inconsistent data from being entered.

3. Scalability — databases handle very large numbers of records efficiently; a large spreadsheet becomes slow and difficult to manage.

4. Powerful queries — SQL allows complex searching, sorting, filtering, and joining across multiple tables in ways that spreadsheet formulas cannot match.

8. Open the director table in DB Browser for SQLite. Write down: (a) the name of the primary key field, (b) the data type of each field, and (c) the total number of records in the table. TYPE 3

(a) Primary key: director_id

(b) Data types:
    director_id — INTEGER
    first_name — TEXT
    last_name — TEXT
    nationality — TEXT

(c) Total records: 15

9. In the film table, four records have NULL in the runtime_mins field. (a) Name one of those films. (b) Explain what NULL means in this context. (c) Explain why a primary key can never be NULL. TYPE 3

(a) Any one of: Corpse Bride (2005), The Hateful Eight (2015), Roma (2018), or Nope (2022).

(b) NULL means the runtime for that film was not recorded in the database. It does not mean the film is zero minutes long — it means the value is missing or unknown.

(c) A primary key must uniquely identify every record. If a primary key were NULL, there would be no way to tell records apart — two records with NULL primary keys would be indistinguishable. The uniqueness guarantee would be broken, making it impossible to link records from other tables reliably using a foreign key.
Teacher notes — Shift+T to hide

Suggested timing: 60 minutes. Warm up + vocab 10 min; notes with live DB Browser demo 20 min; worked examples 10 min; now you try 5 min; task set 15 min.

Practical setup: Pupils should have already imported films.sql into DB Browser for SQLite before this lesson begins. Tasks 8 and 9 require the database to be open. Warm-up Q3 is a quick diagnostic — any pupil who answers 1 table probably hasn't imported the file yet.

Key misconception to address: NULL ≠ 0. A useful provocation: ask "what is NULL + 1?" The answer is NULL, not 1. Ask pupils why that might matter for a query that totals runtimes.

Live demo suggestion: Open DB Browser, browse the film table, and point out the four NULL values in runtime_mins. Show the schema tab to discuss data types.

Extension question: Could last_name in the director table ever serve as a primary key? (No — two directors could share a surname. This reinforces why integer IDs are used.)

SQA command words covered: "identify", "describe", "explain". Remind pupils that for "explain" they need a reason, not just a definition. For "describe" they need more than one point about the topic.