Why GAMP 5 AI Validation Is Different
The GAMP guidance series has shaped GxP computerised system validation since the first GAMP guide in 1994; GAMP 5, published in 2008 and revised in a second edition in 2022, is the current primary methodology. Its category-based risk classification, lifecycle approach, and IQ/OQ/PQ validation framework have guided thousands of system validations across pharmaceutical, biotech and medical device organisations. For conventional software, the framework works well.
AI and machine learning systems challenge several GAMP 5 assumptions. GAMP 5 was designed for deterministic software: given the same inputs, the system produces the same outputs. AI systems — particularly those that learn from operational data — may not meet this condition. A model that was validated against historical training data may behave differently as it processes new operational data over time. The second edition of GAMP 5, published in 2022, extends the framework to address these characteristics. The six validation gaps in this guide are the most common failures to apply this extended framework correctly.
The Six GAMP 5 AI Validation Failures
Gap 1: Undefined Intended Use Boundaries
Undefined intended use boundaries are the single most common GAMP 5 AI validation gap. Intended use boundaries define the data types, ranges, contexts and decision types for which the AI model was trained and validated. Without defined boundaries, there is no basis for determining when the system is operating within validated conditions — and no mechanism for detecting when it is not. Remediation: document intended use boundaries in the user requirements specification and validation summary report, and implement monitoring that detects out-of-boundary operation.
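The out-of-boundary monitoring described above can be sketched as a simple runtime check of model inputs against the documented validated ranges. The feature names and ranges below are hypothetical examples, not values from any real validation package:

```python
# Sketch: flag model inputs that fall outside the intended use boundaries
# documented at validation. Feature names and ranges are illustrative.
VALIDATED_RANGES = {
    "temperature_c": (15.0, 30.0),  # range covered by training/validation data
    "ph": (6.5, 8.0),
}

def out_of_boundary(sample: dict) -> list:
    """Return the names of features outside their validated range (or missing)."""
    violations = []
    for feature, (lo, hi) in VALIDATED_RANGES.items():
        value = sample.get(feature)
        if value is None or not (lo <= value <= hi):
            violations.append(feature)
    return violations

# A sample with temperature above the validated range triggers one violation.
alerts = out_of_boundary({"temperature_c": 34.2, "ph": 7.1})
```

In practice each flagged sample would be logged and the associated model output quarantined or escalated, rather than silently accepted.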
Gap 2: Ungoverned Training Data
Training data is the foundation on which model behaviour is built. If training data quality is not assessed, provenance is not documented, and preprocessing decisions are not recorded, the validation package cannot demonstrate that the model was built on reliable foundations. This is a data integrity failure. Remediation: apply ALCOA+ principles to training data — document provenance, preprocessing methodology, quality assessment criteria, and data version control. Retain training datasets as GxP records.
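One minimal way to make a retained training dataset tamper-evident is a provenance manifest that records a content hash, size, and preprocessing note for each file. This is a sketch under the assumption that dataset files are available as byte content; the file name and preprocessing note are invented examples:

```python
# Sketch: a minimal provenance manifest for a training dataset.
# SHA-256 hashes make the retained GxP record tamper-evident.
import hashlib
import json
from datetime import datetime, timezone

def make_manifest(files: dict, preprocessing_note: str) -> dict:
    """Record content hash and size per file, plus preprocessing decisions."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "preprocessing": preprocessing_note,
        "files": {
            name: {"sha256": hashlib.sha256(data).hexdigest(), "bytes": len(data)}
            for name, data in files.items()
        },
    }

# Hypothetical dataset file and preprocessing note.
manifest = make_manifest(
    {"batch_records.csv": b"id,result\n1,pass\n"},
    "removed duplicate batch IDs; no value imputation performed",
)
record = json.dumps(manifest, indent=2)  # serialised form for retention
```

A real implementation would also capture who created the dataset and approve the manifest under the document control procedure.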
Gap 3: Test Datasets That Are Not Independent
A model tested against data that was used in training will appear more capable than it is. Test datasets must be selected before training begins, must not overlap with training data, and must represent the operational conditions the model will face — not just the conditions it was trained on. Remediation: document test dataset selection methodology in the validation protocol, confirm independence from training data, and include operational edge cases that were not well-represented in training.
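Confirming independence from training data can be as simple as an exact-overlap check, assuming each record can be serialised to a canonical string. The record format below is a made-up example:

```python
# Sketch: verify that no record appears in both the training and test sets.
def overlapping_records(train: list, test: list) -> set:
    """Return records present in both datasets; an empty set confirms independence."""
    return set(train) & set(test)

# Hypothetical canonical record strings; one training record has leaked
# into the test set and should be detected.
train_set = ["lot=A1;assay=98.2", "lot=A2;assay=97.5"]
test_set = ["lot=B1;assay=96.9", "lot=A2;assay=97.5"]
leaked = overlapping_records(train_set, test_set)
```

Exact matching only catches verbatim duplication; near-duplicate records (the same sample with trivial formatting differences) need fuzzier checks, which is why the selection methodology itself belongs in the validation protocol.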
Gap 4: Acceptance Criteria Defined After Testing
Acceptance criteria for AI systems must be defined before validation testing begins — not selected after observing results. Post-hoc selection of metrics that the model happens to meet is not validation. Remediation: define primary performance metrics, acceptance thresholds, and the minimum test dataset size required to demonstrate statistical confidence in the metrics before the validation protocol is executed.
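The minimum test dataset size mentioned above can be estimated before protocol execution with the standard normal-approximation sample-size formula for a proportion. The expected accuracy and margin below are illustrative inputs, not recommended values:

```python
# Sketch: minimum number of test samples so that accuracy can be estimated
# to within +/- `margin` at ~95% confidence (normal approximation,
# n >= z^2 * p * (1 - p) / margin^2).
import math

def min_test_set_size(expected_accuracy: float, margin: float, z: float = 1.96) -> int:
    """Sample size for a binomial proportion, rounded up."""
    p = expected_accuracy
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

# Example: expecting ~95% accuracy, demonstrated to within +/- 2 points.
n = min_test_set_size(expected_accuracy=0.95, margin=0.02)
```

Writing this number into the protocol before testing removes the temptation to stop collecting test data once a favourable result appears.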
Gap 5: No Continuous Performance Monitoring
AI systems that are not continuously monitored may drift from their validated performance without detection. A model that was validated with 97% accuracy may, over time, degrade to 89% — below the acceptance criterion established at validation — without any system-generated alert. Remediation: implement continuous monitoring for primary performance metrics, define alert thresholds, establish escalation procedures for threshold breaches, and integrate monitoring into the change control procedure as a trigger for revalidation assessment.
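The monitoring described above can be sketched as a rolling-window accuracy tracker that alerts when performance drops below the acceptance criterion fixed at validation. The threshold and window size are illustrative assumptions:

```python
# Sketch: rolling-window performance monitor with a fixed alert threshold.
from collections import deque

class AccuracyMonitor:
    def __init__(self, threshold: float, window: int):
        self.threshold = threshold          # acceptance criterion from validation
        self.results = deque(maxlen=window) # 1 = correct outcome, 0 = incorrect

    def record(self, correct: bool) -> bool:
        """Record one outcome; return True if an alert should be raised."""
        self.results.append(1 if correct else 0)
        if len(self.results) < self.results.maxlen:
            return False  # window not yet full, too little data to judge
        accuracy = sum(self.results) / len(self.results)
        return accuracy < self.threshold

# Simulated stream at 90% accuracy against a 95% threshold: alerts begin
# as soon as the 100-sample window fills.
monitor = AccuracyMonitor(threshold=0.95, window=100)
alerts = [monitor.record(i % 10 != 0) for i in range(200)]
```

A production version would feed alerts into the escalation procedure and change control trigger the text describes, rather than just returning a boolean.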
Gap 6: Change Control Blind to AI-Specific Changes
Standard software change control was designed for intentional, human-initiated changes. AI systems can change their behaviour through mechanisms that traditional change control does not capture: model retraining, fine-tuning, changes to training data, and in some architectures, continuous learning from operational data. Remediation: extend your change control procedure with AI-specific change categories and define the revalidation requirements for each category.
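The category-to-requirement mapping might look like the following sketch. Both the category names and the actions are illustrative, not prescriptive; a real procedure would define these under QA ownership:

```python
# Sketch: AI-specific change categories mapped to the revalidation action
# each one triggers. All entries are hypothetical examples.
REVALIDATION_RULES = {
    "retraining_same_data": "re-execute performance test suite",
    "retraining_new_data": "full revalidation incl. training data review",
    "fine_tuning": "impact assessment + targeted regression testing",
    "training_data_change": "data integrity review + performance re-test",
    "continuous_learning_update": "confirm monitoring coverage; periodic review",
}

def revalidation_action(change_category: str) -> str:
    """Look up the required action; unknown categories escalate to QA."""
    return REVALIDATION_RULES.get(change_category, "escalate to QA for assessment")

action = revalidation_action("fine_tuning")
```

The key design point is the default branch: a change that does not fit a pre-defined category must be assessed, never waved through.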
Remediation Priority Framework
| Gap | Risk Level | Typical Remediation Timeline | Inspection Consequence if Unaddressed |
|---|---|---|---|
| Intended use boundaries | Critical | 2–4 weeks | Major finding — caps element capability assessment |
| Training data governance | Critical | 4–12 weeks (depends on data reconstruction complexity) | Critical data integrity finding |
| Independent test datasets | High | 2–6 weeks | Major finding — validation package rejected |
| Pre-defined acceptance criteria | High | 1–2 weeks if model not yet deployed | Major finding for new systems; observation for legacy |
| Continuous monitoring | High | 4–8 weeks | Major finding — system may be considered uncontrolled |
| AI change control | Medium | 2–4 weeks | Observation — potential major if changes have been made without assessment |
Our consultants are GAMP 5 and AI validation specialists, and we can deliver a gap assessment proposal within 48 hours.