Select an AI system, define your audit scope, and design your test prompts before collecting evidence.
Choose the system you will audit. Circle or tick the one you select. Each listed system is a real-world deployment type; one option is left open for a system of your own choice.
Write three specific hypotheses to test. A good hypothesis names a variable (e.g., applicant name, writing style, school location) and a predicted outcome (e.g., lower score, deprioritised ranking).
For each bias category, design a pair of prompts (A and B) that are identical except for one variable. The gap between A's output and B's output is your evidence. Plan at least 2 prompt pairs per category.
| Bias Category | Variable Being Tested | Prompt A (baseline) | Prompt B (test variable changed) |
|---|---|---|---|
| Gender-Coded G1 (name, pronoun, or gender-associated language) | e.g. Male name → Female name | | |
| Gender-Coded G2 | | | |
| Ethnicity-Associated E1 (culturally associated name, institution, or reference) | e.g. Anglo name → Arabic name | | |
| Ethnicity-Associated E2 | | | |
| Geographic / Location L1 (address, institution, or regional marker) | e.g. Urban school → Rural school | | |
| Linguistic Style S1 (register, dialect, or vocabulary complexity) | e.g. Formal prose → Plain language | | |

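
If you prefer to generate your prompt pairs rather than type them by hand, the Python sketch below shows one possible way to keep A and B identical except for a single variable. The template wording, the field names (`name`, `school`, `statement`), and the example applicant details are all made-up assumptions for illustration; replace them with text that suits the system you chose.

```python
from string import Template

# Hypothetical template for a made-up application-scoring system; the wording
# and the field names (name, school, statement) are illustrative assumptions.
TEMPLATE = Template(
    "Score this application from 1 to 10.\n"
    "Applicant: $name\n"
    "School: $school\n"
    "Statement: $statement"
)


def make_pair(base: dict, variable: str, test_value: str) -> tuple:
    """Return (prompt_a, prompt_b), identical except for `variable`."""
    if variable not in base:
        raise KeyError(f"unknown variable: {variable!r}")
    changed = {**base, variable: test_value}
    return TEMPLATE.substitute(base), TEMPLATE.substitute(changed)


def differing_lines(prompt_a: str, prompt_b: str) -> list:
    """List the (A, B) line pairs that differ, to confirm only one thing changed."""
    lines_a, lines_b = prompt_a.splitlines(), prompt_b.splitlines()
    if len(lines_a) != len(lines_b):
        raise ValueError("prompts have different line counts")
    return [(a, b) for a, b in zip(lines_a, lines_b) if a != b]


# Example pair for row G1 (placeholder details, not real applicants).
baseline = {
    "name": "James Carter",
    "school": "Northside High",
    "statement": "I want to study engineering.",
}
prompt_a, prompt_b = make_pair(baseline, "name", "Emily Carter")
diff = differing_lines(prompt_a, prompt_b)
```

Checking `diff` before you run the audit is a quick way to catch a pair that accidentally changes two things at once, which would make the gap impossible to attribute to one variable.
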
Based on your system selection and prompt design, what do you predict you will find? Name the specific bias type and the expected gap.
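
One way to make your prediction checkable later is to write it down as a signed number before you collect any evidence. The Python sketch below assumes the system returns a numeric score and defines the gap as mean(B) minus mean(A), so a negative gap means the test variable lowered the score. The `Prediction` class, the function names, and every number shown are placeholders chosen for illustration, not results from any real system.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Prediction:
    bias_type: str       # e.g. "gender-coded"
    expected_gap: float  # predicted mean(B) - mean(A); negative = B scores lower


def observed_gap(scores_a: list, scores_b: list) -> float:
    """Mean score for prompt B minus mean score for prompt A."""
    if not scores_a or not scores_b:
        raise ValueError("need at least one score per prompt")
    return mean(scores_b) - mean(scores_a)


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def same_direction(prediction: Prediction, gap: float) -> bool:
    """True if the observed gap points the same way as the predicted gap."""
    return sign(prediction.expected_gap) == sign(gap)


# Placeholder values only: fill these in from your own audit runs.
prediction = Prediction(bias_type="gender-coded", expected_gap=-1.0)
gap = observed_gap(scores_a=[7, 8, 7], scores_b=[6, 7, 6])
confirmed = same_direction(prediction, gap)
```

Running each prompt several times and averaging, as the lists above assume, helps separate a consistent gap from run-to-run variation in the system's output.
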