- Understand why computers need a system to represent text characters
- Describe how extended ASCII uses 8 bits to represent 256 characters
- Identify the structure of the ASCII table (non-printable, uppercase, lowercase, symbols)
- Understand how Unicode extends ASCII to support all world languages
- Convert between a character and its ASCII decimal value
- I can explain what ASCII stands for and how it works
- I can state that extended ASCII uses 8 bits and represents 256 characters
- I can identify where uppercase letters, lowercase letters and non-printable characters sit in the ASCII table
- I can explain one limitation of ASCII that led to Unicode
- I can decode a short ASCII message given a table of values
Answer before the lesson begins. These check prior knowledge — it's fine if you're unsure.
1. How many different values can be represented using 8 bits?
2. What is the binary number 01000001 in denary?
3. What does the term "bit" stand for?
Key vocabulary
Data Representation: Text
Why do computers need a code for text?
Computers store everything as binary — 1s and 0s. Numbers can be represented directly in binary, but characters (letters, punctuation, symbols) have no natural binary equivalent. For a computer to store the letter 'A', it must be agreed in advance what binary pattern represents it. Without a shared standard, a file saved on one computer might produce garbled characters when opened on another. This is why character encoding standards were developed — they create a universal agreement between all computers about which binary number maps to which character.
ASCII — the original standard
ASCII (American Standard Code for Information Interchange) was developed in the 1960s and became the dominant early standard. Original ASCII used 7 bits, providing 128 possible values (0–127). Extended ASCII added an eighth bit, doubling capacity to 256 characters (0–255). Each character is assigned a unique code number, and the computer stores that number in binary.
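The jump from 128 to 256 values follows from the rule that each extra bit doubles the number of patterns. A minimal Python sketch of that arithmetic (the function name `values_for_bits` is made up for illustration):

```python
def values_for_bits(bits: int) -> int:
    """Return how many distinct values a given number of bits can represent."""
    return 2 ** bits


for bits in (7, 8):
    count = values_for_bits(bits)
    # The highest code is one less than the count because codes start at 0.
    print(f"{bits} bits -> {count} values (codes 0 to {count - 1})")
# 7 bits -> 128 values (codes 0 to 127)
# 8 bits -> 256 values (codes 0 to 255)
```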
The first 128 codes (0–127) break into key ranges; codes 128–255 are the extended set:
- Codes 0–31: Non-printable control characters. These don't display on screen — they send instructions to devices. Code 7 triggers a beep, code 10 is a newline, code 13 is a carriage return.
- Code 32: The space character — the first printable character.
- Codes 48–57: The digits 0–9.
- Codes 65–90: Uppercase letters A–Z.
- Codes 97–122: Lowercase letters a–z.
- Codes 33–47, 58–64, 91–96, 123–126: Punctuation marks and symbols.
- Code 127: Delete, another non-printable control character.
A key pattern to notice: 'A' is 65 and 'a' is 97 — a difference of exactly 32. This consistent offset means software can convert between upper and lowercase by simple arithmetic, without needing a lookup table.
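The offset of 32 can be tried out in Python, where the built-in `ord()` gives a character's code and `chr()` turns a code back into a character. This is only a sketch of the idea: the helper names `to_lower` and `to_upper` are invented for this example, and real software normally uses built-in string methods instead.

```python
CASE_OFFSET = 32  # 'a' (97) minus 'A' (65)


def to_lower(letter: str) -> str:
    """Convert one uppercase letter to lowercase by adding 32 to its code."""
    code = ord(letter)
    if 65 <= code <= 90:  # only A-Z are shifted
        return chr(code + CASE_OFFSET)
    return letter  # anything else is returned unchanged


def to_upper(letter: str) -> str:
    """Convert one lowercase letter to uppercase by subtracting 32."""
    code = ord(letter)
    if 97 <= code <= 122:  # only a-z are shifted
        return chr(code - CASE_OFFSET)
    return letter


print(to_lower("A"), to_upper("z"), to_upper("!"))  # a Z !
```

The range checks matter: adding 32 to a digit or symbol would produce an unrelated character, so only letters are shifted.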
| Decimal | Character | Notes |
|---|---|---|
| 65 | A | First uppercase letter |
| 66 | B | |
| 67 | C | |
| 68 | D | |
| 69 | E | |
| 70 | F | |
| 90 | Z | Last uppercase letter |
| 97 | a | First lowercase letter (= 65 + 32) |
| 98 | b | |
| 99 | c | |
| 100 | d | |
| 101 | e | |
| 102 | f | |
| 122 | z | Last lowercase letter (= 90 + 32) |
| 32 | space | First printable character |
| 33 | ! | Exclamation mark |
| 48 | 0 | The digit character '0' (code 48, not the number zero) |
| 57 | 9 | Digit nine |
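The layout of the table can be expressed as a series of range checks. The sketch below is one possible way to write it; the function name `classify` and the category labels are assumptions made for this example, and code 127 (delete) is treated as a control character.

```python
def classify(code: int) -> str:
    """Name the part of the (extended) ASCII table that a code belongs to."""
    if not 0 <= code <= 255:
        raise ValueError("extended ASCII codes run from 0 to 255")
    if code <= 31 or code == 127:  # 127 is the delete control code
        return "non-printable"
    if code == 32:
        return "space"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase"
    if 97 <= code <= 122:
        return "lowercase"
    if code <= 126:  # whatever is left between 33 and 126
        return "symbol"
    return "extended"  # 128-255


for code in (10, 32, 48, 65, 97, 33):
    print(code, classify(code))
```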
How text is stored in memory
When you type the word "Hi!" on a keyboard, the computer stores three separate ASCII values: 72 (H), 105 (i), 33 (!). Each takes 8 bits (1 byte) of storage. A 100-character text file therefore requires 100 bytes of storage — before any formatting. Every space, every punctuation mark, every character you can type has its own unique ASCII code. The computer never stores the letter itself — only the number that represents it.
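To see the stored values for "Hi!", the short Python sketch below prints each character with its code and the 8-bit binary pattern for that code. The helper name `to_binary_codes` is hypothetical; `ord()` and `format()` are standard Python built-ins.

```python
def to_binary_codes(text: str) -> list[tuple[str, int, str]]:
    """Return (character, decimal code, 8-bit binary string) for each character."""
    return [(ch, ord(ch), format(ord(ch), "08b")) for ch in text]


for ch, code, bits in to_binary_codes("Hi!"):
    print(ch, code, bits)
# H 72 01001000
# i 105 01101001
# ! 33 00100001
```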
Unicode — solving ASCII's limitations
Extended ASCII was created before computing was truly global. Its 256 character slots are sufficient for English and some Western European languages, but completely inadequate for Chinese (70,000+ characters), Arabic, Japanese, emoji, and many other scripts. Additionally, different companies assigned different characters to the 128–255 slots, creating compatibility problems between systems.
Unicode solves this by using more bits per character. At N5 level it is treated as a 16-bit code, providing 65,536 possible values, far more than extended ASCII's 256. (The full modern standard goes further still, with room for over a million code points, which is how it covers the world's writing systems plus thousands of symbols and emoji.) Unicode is now the global standard; when you send a text message containing an emoji, Unicode is the encoding making that possible. Crucially, Unicode is backwards-compatible with ASCII: the first 128 Unicode values are identical to ASCII, so existing English text files need no conversion.
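Python strings are Unicode, so `ord()` can be used to check the backwards-compatibility claim: ASCII characters keep their familiar codes, while characters outside ASCII get larger values. The helper `fits_in_ascii` is a made-up name for this sketch, and the sample characters are chosen only for illustration.

```python
def fits_in_ascii(ch: str) -> bool:
    """True if the character's Unicode value is inside original ASCII (0-127)."""
    return ord(ch) < 128


print(ord("A"), fits_in_ascii("A"))    # 65 True  (same value as in ASCII)
print(ord("€"), fits_in_ascii("€"))    # 8364 False
print(ord("中"), fits_in_ascii("中"))  # 20013 False
```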
File size implications
Because every character is stored as 8 bits (1 byte) in ASCII, calculating the storage requirement of a text file is straightforward: number of characters × 8 bits. A 1,000-character essay would need 1,000 bytes = 8,000 bits of storage. In Unicode (16 bits per character), the same essay would need 2,000 bytes — double the storage, but with the ability to represent any character from any language on the planet. This trade-off between storage size and language coverage is a key concept in data representation.
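The storage calculation can be written as a small function. This sketch assumes the simplified N5 model used in these notes (8 bits per character for ASCII, 16 bits for Unicode); the name `storage_bits` is invented for the example.

```python
def storage_bits(num_characters: int, bits_per_character: int = 8) -> int:
    """Storage needed for plain text: number of characters x bits per character."""
    return num_characters * bits_per_character


essay = 1000  # characters in the essay

ascii_bits = storage_bits(essay, 8)
unicode_bits = storage_bits(essay, 16)

# Divide by 8 to convert bits to bytes.
print(ascii_bits, "bits =", ascii_bits // 8, "bytes")      # 8000 bits = 1000 bytes
print(unicode_bits, "bits =", unicode_bits // 8, "bytes")  # 16000 bits = 2000 bytes
```

To work from the text itself rather than a count, pass `len(text)` as the first argument; spaces are included in `len()`, just as they count towards file size.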
Worked examples
ASCII values given: 72, 101, 108, 108, 111. What word do they spell?
Answer: 72 → H, 101 → e, 108 → l, 108 → l, 111 → o, so the word is "Hello".
Convert the word "Cat" to its ASCII decimal values.
Answer: 67, 97, 116.
How many bits are needed to store the word "Computing" in extended ASCII?
Answer: 9 characters × 8 bits = 72 bits.
The ASCII code for 'A' is 65. What is the ASCII code for 'a'? And if 'Z' is 90, what is 'z'?
Answer: 'a' is 65 + 32 = 97, and 'z' is 90 + 32 = 122.
An ASCII message reads: 87, 101, 108, 108, 32, 100, 111, 110, 101, 33
Use the partial reference table below to decode it:
33=! | 32=space | 87=W | 100=d | 101=e | 108=l | 110=n | 111=o
Decode each value in order and write the complete message.
Working:
- 87 → W
- 101 → e
- 108 → l
- 108 → l
- 32 → [space]
- 100 → d
- 111 → o
- 110 → n
- 101 → e
- 33 → !
Answer: "Well done!"
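The same decoding can be checked in Python, where `chr()` does the table lookup. The sketch below decodes the message and also converts a word the other way; the function names `decode` and `encode` are hypothetical helpers written for this example.

```python
def decode(codes: list[int]) -> str:
    """Turn a list of ASCII decimal values into text."""
    return "".join(chr(code) for code in codes)


def encode(text: str) -> list[int]:
    """Turn text into a list of ASCII decimal values."""
    return [ord(ch) for ch in text]


print(decode([87, 101, 108, 108, 32, 100, 111, 110, 101, 33]))  # Well done!
print(encode("Cat"))  # [67, 97, 116]
```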
A character is first mapped to its code number, and that number is then stored in binary: 'A' → 65 → 01000001. There is an extra step between the character and the binary pattern.
In N5 exam questions, you will always be given a reference table to decode ASCII. The skill being tested is your ability to use the table correctly and to explain the concept, not memorisation. Practise converting in both directions: decimal → character and character → decimal. Also be ready to calculate storage: number of characters × 8 bits (ASCII) or × 16 bits (Unicode). Watch out for spaces: a space is ASCII code 32, and it counts as a character when calculating file size.
Questions 1–5 are auto-checked. Questions 6–10 are self-marked — write your answer, then reveal the model answer to check your work.
1. What does ASCII stand for? TYPE 1
2. How many characters can extended ASCII represent? TYPE 1
3. The ASCII code for 'A' is 65. What is the ASCII code for 'a'? TYPE 1
4. How many bits does extended ASCII use to represent each character? TYPE 1
5. ASCII codes 0–31 are known as: TYPE 1
6. Using the ASCII table, decode the following values: 78, 53, 32, 67, 83 TYPE 2
Reference: 32=space, 51=3, 53=5, 67=C, 78=N, 83=S
Answer: "N5 CS"
7. A pupil types the name "Anya" into a text box. How many bits are needed to store this name in extended ASCII? Show your working. TYPE 2
Each character = 8 bits (1 byte).
4 × 8 = 32 bits (4 bytes).
8. Explain why Unicode was developed to replace ASCII as the global standard. TYPE 2
9. A text file contains 500 characters. (a) How many bytes of storage does it require using ASCII? (b) How many bytes would it require if stored using Unicode (16-bit)? (c) Explain the trade-off in using Unicode instead of ASCII. TYPE 3
(a) ASCII: 500 × 1 byte = 500 bytes.
(b) Unicode (16-bit): 500 × 2 bytes = 1,000 bytes.
(c) Unicode can represent characters from all world languages and scripts, making it globally inclusive. However, it uses twice the storage per character compared to ASCII, so files take up more space.
10. Which of the following is a reason Unicode was developed? TYPE 1
Suggested timing: ~45 minutes. Warm up 8 min; notes + ASCII table 15 min; worked examples 8 min; now you try 5 min; task set 9 min.
- Pupils often confuse 7-bit ASCII (128 chars) and 8-bit extended ASCII (256 chars). Emphasise the doubling rule — each extra bit doubles the number of possible values.
- The uppercase/lowercase offset of 32 is a nice pattern to highlight — it sometimes appears directly in past paper questions ("what is the code for 'b' if 'B' is 66?").
- If time allows, have pupils encode their own first name into ASCII decimal values using the reference table — effective consolidation and immediately personal.
- Unicode depth: for N5, pupils only need to know it uses 16 bits and supports more characters than ASCII. The term "UTF-8" is not required at this level.
- CS2 leads directly into CS3 (graphics). The underlying theme throughout CS1–CS4 is: all data is ultimately stored as binary numbers, whether it's integers, text, or images.
- SQA command words covered in task set: identify, state, explain, calculate, describe.