Gazelle Global • Leeds, England, United Kingdom
Role & seniority
AI Test Lead (AI Foundry)
Stack/tools
AI testing methods blending traditional QA with AI-specific risk testing
Model-evaluation techniques (prompt variability, edge-case simulation, output consistency)
Post-deployment monitoring, drift detection, incident triage
Adversarial input datasets, edge-case datasets, synthetic data tooling
Tools: LLM-monitoring, bias scanners, prompt-diff, data pipelines, human-in-the-loop workflows
Governance: align with risk, legal, security, compliance; ISO42001; Responsible AI policies
Top 3 responsibilities
Design and manage enterprise AI/Copilot testing strategy; define test approaches across functional, performance, accuracy, reliability, ethics and bias
Establish and run model evaluation, explainability, safety controls; embed quality gates; conduct post-deployment validation and continuous monitoring
Lead incident investigations, root-cause analysis across data, model logic, prompts and human-in-the-loop; drive remediation and risk-alignment with governance forums
Must-have skills
Experience designing/leading tests for complex AI/data-driven systems under regulatory/high-assurance constraints
Deep understanding of AI-specific risks (hallucinations, bias, drift, explainability gaps) and targeted testing approaches
Proficiency in post-deployment monitoring, drift detection, risk assessment, governance/audit alignment, and stakeholder management
We are looking for an experienced AI Test Lead (AI Foundry) to ensure AI and Copilot solutions are safe, reliable and compliant, covering both traditional QA and AI-specific risks (bias, hallucination, explainability). The role defines assurance methods, quality gates and post-deployment monitoring to meet internal policy and regulator expectations.
Responsibilities
Design and manage the enterprise testing strategy for AI/Copilot, blending traditional QA with AI-specific methods.
Define test approaches for functional, performance, accuracy, reliability, ethical compliance and bias detection.
Establish model-evaluation techniques (prompt variability, edge-case simulation, output consistency, scenario reasoning).
Validate explainability, traceability and safety controls against policy and regulatory requirements.
Evaluate and test human-in-the-loop workflows and decision checkpoints for appropriate oversight.
Embed quality gates in iterative delivery, preventing progression without assurance evidence.
Develop and maintain specialised test datasets, including adversarial, low-quality, domain-specific and edge-case inputs, to rigorously challenge model robustness and identify systemic weaknesses.
Provide AI test engineering support to delivery squads, advising on model-readiness criteria, testability risks and quality implications of design decisions, ensuring solutions are verifiable throughout the lifecycle.
Define and run post-deployment validation, drift detection, incident triage and continuous model monitoring.
Partner with Risk, Legal, Security and Compliance teams to meet control frameworks and audit standards.
Provide inputs to risk/impact assessments, policy adherence checks and governance submissions.
Lead incident investigations for unexpected AI behaviours, conducting deep-dive root-cause analysis across data quality, model logic, prompt flows, integration layers and human-in-the-loop steps; identify systemic failure points, recommend corrective actions, and drive end-to-end remediation to prevent recurrence.
Maintain test documentation, evaluation logs, datasets and reproducible evidence for audit.
Uplift AI testing capability across teams through standards, templates, training and hands-on support.
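One of the model-evaluation techniques named above, output consistency, checks whether the same prompt yields stable answers across repeated generations. A minimal illustration of the idea, assuming a hypothetical `call_model` stub in place of a real LLM call:

```python
from difflib import SequenceMatcher
from itertools import combinations

def call_model(prompt: str, seed: int) -> str:
    """Hypothetical stand-in for an LLM call; a real harness would query
    the deployed model with temperature/seed variation."""
    canned = {0: "Refunds are processed within 5 days.",
              1: "Refunds are processed within 5 days.",
              2: "Refunds take up to five business days."}
    return canned[seed % 3]

def consistency_score(prompt: str, runs: int = 3) -> float:
    """Mean pairwise string similarity across repeated generations;
    low scores flag unstable outputs for the same input."""
    outputs = [call_model(prompt, seed=i) for i in range(runs)]
    sims = [SequenceMatcher(None, a, b).ratio()
            for a, b in combinations(outputs, 2)]
    return sum(sims) / len(sims)

score = consistency_score("When is my refund processed?")
print(f"consistency: {score:.2f}")  # 1.0 would mean identical output every run
```

In practice a semantic-similarity metric would replace the character-level `SequenceMatcher`, but the quality-gate shape is the same: a threshold on the score decides whether the prompt passes.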
Champion continuous improvement of AI assurance, evaluating new testing tooling (LLM monitoring, bias scanners, prompt-diff tools, synthetic data generators) and maturing standards as organisational AI adoption scales.
Ensure responsible AI principles (e.g. transparency, explainability, ISO42001) are incorporated into all development.
Provide insight to support business cases, investment decisions, risk assessments and prioritisation discussions at AI governance forums.
Manage escalations, supporting the wider Data & AI Leadership team.
Functional/Technical (Role Specific)
Higher education qualification (or equivalent experience) in Ethics, Law, Risk Management, Social Sciences, Data/Computer Science or a relevant field.
Experience designing and leading testing for complex digital or data-driven systems, including multi-component architectures, API-integrated platforms, event-driven workflows and systems operating under regulatory or high-assurance constraints.
Clear understanding of AI-specific risks such as hallucinations, bias, drift, explainability gaps, safety breaches and misuse pathways, paired with the ability to design targeted tests that uncover model blind spots and systemic weaknesses.
Knowledge of model-evaluation techniques, prompt-testing strategies and scenario-based testing approaches, including stress-testing prompts, adversarial input creation, failure-mode exploration and behaviour-driven evaluation.
Familiarity with governance, audit and regulatory standards for AI, data and digital services, ensuring testing evidence aligns with internal risk frameworks, ISO42001 controls, Responsible AI policies and external regulatory expectations.
Experience developing structured QA strategies that integrate traditional and AI-specific assurance, mapping out test plans, risk-based prioritisation, acceptance criteria, model-readiness thresholds and quality gates aligned to lifecycle stages.
Ability to define and execute test plans across functional, non-functional, ethical and performance dimensions, validating accuracy, latency, robustness, security, fairness, reliability and user-journey consistency.
Strong analytical mindset with the ability to identify root causes of defects or unexpected AI behaviour, performing deep-dive diagnostics across data pipelines, vector stores, prompt flows, orchestration logic and human-in-the-loop checkpoints.
Experience with post-deployment monitoring, drift detection and continuous validation, designing alerts, retraining triggers, performance thresholds and evaluation cadences to maintain long-term model integrity.
Comfortable learning and adapting to emerging AI technologies and engineering patterns.
Excellent stakeholder management and communication skills, including senior-level engagement.
Commercial awareness and a value-driven mindset.
Use of professional networks and external influencers, with clear evidence of learning and development to build and maintain skills and expertise.
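Drift detection of the kind described above is often done by comparing a live score distribution against a training-time baseline and alerting when a divergence metric crosses a threshold. A minimal, illustrative sketch using the Population Stability Index (one common metric choice, assumed here for illustration):

```python
import math

def psi(expected, actual, bins=4):
    """Population Stability Index: divergence of a live sample (actual)
    from a baseline (expected). Common rule of thumb: < 0.1 stable,
    0.1-0.25 moderate drift, > 0.25 significant drift (alert/retrain)."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(bins + 1)]
    edges[0], edges[-1] = float("-inf"), float("inf")  # catch out-of-range values

    def fractions(data):
        counts = [0] * bins
        for x in data:
            for i in range(bins):
                if edges[i] <= x < edges[i + 1]:
                    counts[i] += 1
                    break
        return [max(c / len(data), 1e-4) for c in counts]  # floor avoids log(0)

    return sum((a - e) * math.log(a / e)
               for e, a in zip(fractions(expected), fractions(actual)))

baseline = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]  # e.g. scores at launch
live = [0.7, 0.75, 0.8, 0.85, 0.9, 0.9, 0.95, 1.0]   # shifted live scores
print(f"PSI = {psi(baseline, live):.2f}")  # well above 0.25: flag for triage
```

Wiring such a check into a scheduled evaluation cadence, with the threshold acting as a retraining trigger, is one straightforward way to realise the alerting described in the requirement.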
Sector (desirable)
Understanding of the financial services industry, markets and competitors.
Understanding of how financial services organisations operate and the associated regulatory environment, or other regulated industries.
Awareness of the Mutual Sector and the needs and interests of Members.
Commercial
Ability to work with autonomy and make operational decisions.
Experience of delivering organisational change.
Understanding of related functions and/or services outside of the role's direct remit.
Experience of managing a set of internal and external stakeholder relationships.
Interpersonal
Good interpersonal skills and the ability to build and maintain strong working relationships.
Ability to work effectively in diverse teams.
A problem-solving approach, with the curiosity and proactivity to engage with and understand both the strategic business goals and our customers' needs.
Ability to identify areas of improvement and create innovative approaches to delivering better-quality service.
Experience working in cross-functional teams and agile environments.
Ability to identify, nurture and realise the potential in others.
Strong communication, engagement and influencing skills.
Ability to effectively represent YBS through building collaborative relationships.