Learning Intentions
- Understand what a data dictionary is and why databases need one
- Identify the five types of validation: presence, restricted choice, field length, range, foreign key
- Create a complete data dictionary with all required fields and validation rules
- Explain how validation enforces data quality and referential integrity
Success Criteria
- You can name all five validation types with a real-world example for each
- You can create a complete data dictionary for a multi-entity database
- You can identify which validation type to apply to each attribute
- You can explain why validation prevents errors and protects data integrity
1. What is a data dictionary? TYPE 1
2. Which validation type ensures a field is not left blank? TYPE 1
3. What does a foreign key validation check? TYPE 1
Key vocabulary
Data Dictionary and Validation
What is a Data Dictionary?
A data dictionary is a document (often a table) that describes every aspect of a database's structure. It answers: What tables exist? What columns does each table have? What type of data does each column hold? What rules prevent bad data from being entered?
A data dictionary is created during the Design phase and updated as the database evolves. It serves as:
- Documentation: Future developers understand the database without reading code.
- Specification: Database developers know exactly what to build.
- Quality assurance: Testers know what validation rules to test.
- Audit trail: If data looks wrong, the data dictionary shows what rules should have prevented it.
Components of a Data Dictionary
A complete data dictionary includes columns for:
- Entity: The table name (e.g., Student, Course, Enrollment).
- Attribute: The field/column name (e.g., StudentID, FirstName, DateOfBirth).
- Key Type: Is this a Primary Key (PK), Foreign Key (FK), or neither?
- Data Type: Text, Integer, Decimal, Date, Boolean, etc.
- Size: For text, max length (e.g., VARCHAR(50)). For numbers, precision (Integer 0–32,767).
- Validation Rule: What constraints apply? (e.g., "Presence check," "Range 0–100," "Restricted to A–F").
The Five Validation Types
1. Presence Check
A presence check ensures a field is not left blank. If a field is mandatory, the database rejects any record where that field is empty.
Example: StudentID is a primary key, so it must have a value. If someone tries to add a student without an ID, the database rejects the entry.
- Rule: StudentID must not be NULL
- Error: "Student ID is required"
2. Restricted Choice
A restricted choice validation limits entries to a predefined set of allowed values. The field only accepts values from a list.
Example: A Grade field should only accept A, B, C, D, E, or F. Entering "X" is rejected.
- Rule: Grade must be one of {A, B, C, D, E, F}
- Error: "Grade must be A, B, C, D, E, or F"
3. Field Length Validation
Field length validation limits how many characters can be entered into a text field. This prevents data that's too long for the field.
Example: A FirstName field is defined as VARCHAR(30), meaning up to 30 characters. A 9-character name is accepted. A 50-character name is rejected.
- Rule: FirstName must be ≤ 30 characters
- Error: "First name cannot exceed 30 characters"
4. Range Validation
Range validation ensures numeric or date values fall within acceptable limits. Test scores must be 0–100, dates must be between 2020 and 2030, ages must be 0–120.
Example: A TestScore field should only accept 0–100. Entering –5 or 150 is rejected.
- Rule: TestScore must be ≥ 0 AND ≤ 100
- Error: "Test score must be between 0 and 100"
5. Foreign Key Validation
Foreign key validation is the most powerful. It ensures a value in one table exists as a primary key in another table. This enforces referential integrity.
Example: An Enrollment table has a StudentID field referencing the Student table. Before allowing enrollment for StudentID=55, the database checks: does StudentID 55 exist in Student? If not, enrollment is rejected.
- Rule: StudentID in Enrollment must match a StudentID in Student
- Error: "That student does not exist"
Why Validation Matters
Validation prevents garbage-in-garbage-out (GIGO). Without validation:
- Users might leave required fields blank, creating incomplete records.
- Typos might be entered (e.g., Grade="X" instead of "A").
- Out-of-range values might be stored (e.g., Age=–5).
- Orphaned records might exist (e.g., enrollment for a deleted student).
- Reports and analysis become unreliable because data is inconsistent.
Validation catches errors before they're stored, ensuring data quality. This is much cheaper than fixing bad data later.
Worked examples
Scenario: Design a data dictionary for a library that tracks books, borrowers, and loans.
Scenario: Design validation rules for a Grades table with GradeID, StudentID (FK), CourseID (FK), Mark, Grade, DateRecorded.
Scenario: A Booking table has CustomerID referencing Customer table. What happens without and with foreign key validation?
Design validation rules for a Hospital Appointment System with Patient, Doctor, and Appointment tables.
Attributes to validate: PatientID, FirstName, LastName, DateOfBirth, PhoneNumber (11 chars), DoctorID, Speciality, AppointmentDate, AppointmentTime
Answer:
- PatientID: Presence check, Primary key, Auto-increment
- FirstName/LastName: Presence check, Field length ≤ 50
- DateOfBirth: Presence check, Range (0–120 years old)
- PhoneNumber: Presence check, Field length = 11
- DoctorID (FK): Foreign key → Doctor.DoctorID
- Speciality: Presence check, Restricted choice (Cardiology, Surgery, etc.)
- AppointmentDate: Presence check, Range ≥ today
- AppointmentTime: Presence check, Restricted choice (09:00, 10:00, 14:00, etc.)
Past paper question: "Design a data dictionary for a Hotel Booking system with Customer, Room, and Booking tables. Include validation rules."
Strong answer: Create a table with Entity, Attribute, Key Type, Data Type, Size, and Validation. Include at least one of each validation type. Explain why each validation is necessary. Weak answers list attributes without validation; strong answers explain the rules.
1. Which validation type prevents a user from leaving a field blank? TYPE 1
2. A field is defined as VARCHAR(25). Which validation does this enforce? TYPE 1
3. What does a foreign key validation ensure? TYPE 1
4. Which is NOT a standard validation type? TYPE 1
5. Describe what a data dictionary is and explain why databases need one. TYPE 2
Write 2-3 sentences.
6. A Mark field should only accept 0–100. What validation type applies and how would you specify it? TYPE 2
Write 1-2 sentences.
7. Explain how foreign key validation prevents orphaned records. TYPE 2
Write 2-3 sentences.
8. Design a data dictionary for a Cinema Booking system with Movie, Screen, and Booking tables. Create a table showing Entity, Attribute, Key Type, Data Type, and Validation. Include at least one of each validation type (10+ attributes). TYPE 3
Create a complete table.
9. A university database stores Student, Course, and Enrollment records. For each attribute (StudentID, CourseCode, EnrollmentDate, Grade), identify appropriate validation rules and explain why each is necessary. TYPE 3
Write 6-8 sentences covering all four attributes.
10. Compare a database WITH foreign key validation to one WITHOUT, using a real-world example. Explain consequences of missing validation (data corruption, orphaned records). TYPE 3
Write 6-8 sentences with a specific example.
Lesson duration: 50–60 minutes. Work through examples step-by-step with pupils.
Key points:
- Data dictionary is a specification, not just a list. Every field needs validation.
- Validation is a design decision, not an afterthought.
- Foreign key validation is most powerful—it maintains referential integrity.
- Validation prevents garbage-in-garbage-out (GIGO).
Common misconceptions: Students confuse data type with validation. Some think foreign keys are optional. Underestimate complexity of validation rules.
Past paper tips: Q5 asks students to design a data dictionary. Weak answers list attributes without validation; strong answers create clear tables with rules.