
Study design and research governance
The guidance below highlights high-value applications of AI during data collection and quality control, along with practical considerations and prompt templates you can copy, paste, and adapt to your own needs.
Aim: Keep problems out of your dataset by building quality in from day one.

Turn your method into a protocol checklist and track deviations from the plan in a deviation log.
Define your dataset clearly with a ‘what each field means’ data dictionary and collection context.
Design error checks, validation rules, and ‘what could go wrong’ lists (failure modes), plus what you’ll do when a check flags a problem.
Test AI-generated checks on edge cases before using them at scale.
Don’t let AI ‘normalize’ messy data: keep raw data separate, and document cleaning rules and exceptions.
Avoid pasting raw sensitive data into tools that are not authorized to receive that data; when in doubt, use synthetic examples or summaries.
Keep a clear record of what changed and why.
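
As a rough illustration of the points above, the sketch below drives validation from a small data dictionary, reports problems without altering the raw record, and runs the checks against a handful of edge cases before they are trusted at scale. The field names, ranges, and the `validate_record` helper are all hypothetical placeholders, not part of any particular study or library; swap in your own dictionary.

```python
from copy import deepcopy

# Hypothetical data dictionary: field name -> meaning plus validation rule.
DATA_DICTIONARY = {
    "participant_id": {"meaning": "Study-assigned ID (not a real identifier)",
                       "type": str, "required": True},
    "age_years": {"meaning": "Age at enrolment, whole years",
                  "type": int, "required": True, "min": 18, "max": 110},
    "systolic_mmhg": {"meaning": "Seated systolic blood pressure",
                      "type": int, "required": False, "min": 60, "max": 260},
}

def validate_record(record, dictionary=DATA_DICTIONARY):
    """Return a list of readable problems. The record itself is never modified."""
    problems = []
    for field, rule in dictionary.items():
        value = record.get(field)
        if value is None or value == "":
            if rule["required"]:
                problems.append(f"{field}: missing required value")
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, rule["type"]) or isinstance(value, bool):
            problems.append(
                f"{field}: expected {rule['type'].__name__}, got {type(value).__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            problems.append(f"{field}: {value} below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{field}: {value} above maximum {rule['max']}")
    # Flag fields that the data dictionary does not define.
    for field in record:
        if field not in dictionary:
            problems.append(f"{field}: not in data dictionary")
    return problems

# Synthetic edge cases (no real data): each pairs a record with the number
# of problems the checks are expected to report.
EDGE_CASES = [
    ({"participant_id": "P001", "age_years": 18}, 0),    # boundary value is allowed
    ({"participant_id": "P002", "age_years": 17}, 1),    # just below the minimum
    ({"participant_id": "P003", "age_years": "42"}, 1),  # number stored as text
    ({"participant_id": "", "age_years": 30}, 1),        # empty required field
    ({"participant_id": "P004", "age_years": 30, "notes": "x"}, 1),  # unknown field
]

for raw, expected_count in EDGE_CASES:
    snapshot = deepcopy(raw)
    found = validate_record(raw)
    assert len(found) == expected_count, (raw, found)
    assert raw == snapshot  # the raw record is left untouched
```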
📑 Copy and paste prompt: turning protocol into a checklist and deviation log |
|---|
You are helping me prevent mistakes during data collection by turning our method into two practical tools: (1) a simple checklist everyone can follow the same way, and (2) a deviation log to record anything that goes off-plan so we can explain it later. Method summary (no sensitive info): [paste] Output: 1. Protocol checklist: step-by-step checklist with Must (critical) and Should (recommended) steps, plus common mistakes to avoid. 2. Deviation log template: a table with columns for date, step, what changed, why, potential impact, what we did, who approved/owned it. Rules: Don’t invent study details. If something is missing, write TBD and list the question I need to answer. |
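
If you would rather keep the deviation log as a file than as a document table, one possible shape is sketched below. The column names mirror the template in the prompt above; the `append_deviation` helper, the file name, and the example entry are hypothetical and purely illustrative.

```python
import csv
import os
import tempfile
from datetime import date

# Columns follow the deviation log template described in the prompt above.
COLUMNS = ["date", "step", "what_changed", "why",
           "potential_impact", "what_we_did", "approved_or_owned_by"]

def append_deviation(path, **entry):
    """Append one row to the log; refuse blank or unexpected columns."""
    entry.setdefault("date", date.today().isoformat())
    missing = [c for c in COLUMNS if not str(entry.get(c, "")).strip()]
    extra = [k for k in entry if k not in COLUMNS]
    if missing or extra:
        raise ValueError(f"missing: {missing}, unexpected: {extra}")
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        if is_new:
            writer.writeheader()  # header only once, when the file is created
        writer.writerow(entry)

# Usage with a made-up entry, written to a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "deviation_log.csv")
    append_deviation(
        log_path,
        date="2024-01-15",
        step="Measurement",
        what_changed="Backup device used for two sessions",
        why="Primary device failed calibration",
        potential_impact="TBD",
        what_we_did="Flagged affected sessions for sensitivity analysis",
        approved_or_owned_by="Study lead",
    )
    with open(log_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    assert rows[0]["step"] == "Measurement"
```

Rejecting blank fields forces an explicit "TBD" where something is not yet known, which matches the rule in the prompt about not inventing details.
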
📑 Copy and paste prompt: generate a failure modes list |
|---|
Generate a ‘what could go wrong’ list for data collection and QC. Process summary: [paste]. Identify failure modes across: recruitment/eligibility, measurement, data entry, device/software, timing, missing data, duplicates, contamination, protocol drift. For each failure mode, give: early warning signals; detection method (check or audit); mitigation or corrective action; and whether it must be logged as a deviation. Keep it practical and action-oriented. |