Gazelle Global • Leeds, England, United Kingdom
Role & seniority
AI Test Lead (AI Foundry)
Stack/tools
AI testing methods blending traditional QA with AI-specific risk testing
Model-evaluation techniques (prompt variability, edge-case simulation, output consistency)
Post-deployment monitoring, drift detection, incident triage
Adversarial input datasets, edge-case datasets, synthetic data tooling
Tools: LLM-monitoring, bias scanners, prompt-diff, data pipelines, human-in-the-loop workflows
Governance: align with risk, legal, security, compliance; ISO42001; Responsible AI policies
Top 3 responsibilities
Design and manage enterprise AI/Copilot testing strategy; define test approaches across functional, performance, accuracy, reliability, ethics and bias
Establish and run model evaluation, explainability, safety controls; embed quality gates; conduct post-deployment validation and continuous monitoring
Lead incident investigations, root-cause analysis across data, model logic, prompts and human-in-the-loop; drive remediation and risk-alignment with governance forums
Must-have skills
Experience designing/leading tests for complex AI/data-driven systems under regulatory/high-assurance constraints
Deep understanding of AI-specific risks (hallucinations, bias, drift, explainability gaps) and targeted testing approaches
Proficiency in post-deployment monitoring, drift detection, risk assessment, governance/audit alignment, and stakeholder management
We are looking for an experienced AI Test Lead (AI Foundry) to ensure AI and Copilot solutions are safe, reliable and compliant, covering both traditional QA and AI-specific risks (bias, hallucination, explainability). The role defines assurance methods, quality gates and post-deployment monitoring to meet internal policy and regulator expectations.
Responsibilities
Design and manage the enterprise testing strategy for AI/Copilot, blending traditional QA with AI-specific methods.
Define test approaches for functional, performance, accuracy, reliability, ethical compliance and bias detection.
Establish model-evaluation techniques (prompt variability, edge-case simulation, output consistency, scenario reasoning).
Validate explainability, traceability and safety controls against policy and regulatory requirements.
Evaluate and test human-in-the-loop workflows and decision checkpoints for appropriate oversight.
Embed quality gates in iterative delivery, preventing progression without assurance evidence.
Develop and maintain specialised test datasets, including adversarial, low-quality, domain-specific and edge-case inputs, to rigorously challenge model robustness and identify systemic weaknesses.
Provide AI test engineering support to delivery squads, advising on model-readiness criteria, testability risks and quality implications of design decisions, ensuring solutions are verifiable throughout the lifecycle.
Define and run post-deployment validation, drift detection, incident triage and continuous model monitoring.
Partner with Risk, Legal, Security and Compliance teams to meet control frameworks and audit standards.
Provide inputs to risk/impact assessments, policy adherence checks and governance submissions.
Lead incident investigations for unexpected AI behaviours, conducting deep-dive root-cause analysis across data quality, model logic, prompt flows, integration layers and human-in-the-loop steps; identify systemic failure points, recommend corrective actions, and drive end-to-end remediation to prevent recurrence.
Maintain test documentation, evaluation logs, datasets and reproducible evidence for audit.
Uplift AI testing capability across teams through standards, templates, training and hands-on support.
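One of the model-evaluation techniques named above, output consistency, checks whether the same prompt yields stable answers across repeated generations. A minimal illustration of the idea, assuming a hypothetical `call_model` stub in place of a real LLM call:

```python
from difflib import SequenceMatcher
from itertools import combinations

def call_model(prompt: str, seed: int) -> str:
    """Hypothetical stand-in for an LLM call; a real harness would query
    the deployed model with temperature/seed variation."""
    canned = {0: "Refunds are processed within 5 days.",
              1: "Refunds are processed within 5 days.",
              2: "Refunds take up to five business days."}
    return canned[seed % 3]

def consistency_score(prompt: str, runs: int = 3) -> float:
    """Mean pairwise string similarity across repeated generations;
    low scores flag unstable outputs for the same input."""
    outputs = [call_model(prompt, seed=i) for i in range(runs)]
    sims = [SequenceMatcher(None, a, b).ratio()
            for a, b in combinations(outputs, 2)]
    return sum(sims) / len(sims)

score = consistency_score("When is my refund processed?")
print(f"consistency: {score:.2f}")  # 1.0 would mean identical output every run
```

In practice a semantic-similarity metric would replace the character-level `SequenceMatcher`, but the quality-gate shape is the same: a threshold on the score decides whether the prompt passes.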
Champion continuous improvement of AI assurance, evaluating new testing tooling (LLM monitoring, bias scanners, prompt-diff tools, synthetic data generators) and maturing standards as organisational AI adoption scales.
Ensure responsible AI principles (e.g. transparency, explainability, ISO42001) are incorporated into all development.
Provide insight to support business cases, investment decisions, risk assessments and prioritisation discussions at AI governance forums.
Manage escalations, supporting the wider Data & AI Leadership team.
Functional/Technical (Role Specific)
Higher education qualification (or equivalent experience) in Ethics, Law, Risk Management, Social Sciences, Data/Computer Science or a relevant field.
Experience designing and leading testing for complex digital or data-driven systems, including multi-component architectures, API-integrated platforms, event-driven workflows and systems operating under regulatory or high-assurance constraints.
Clear understanding of AI-specific risks such as hallucinations, bias, drift, explainability gaps, safety breaches and misuse pathways, paired with the ability to design targeted tests that uncover model blind spots and systemic weaknesses.
Knowledge of model-evaluation techniques, prompt-testing strategies and scenario-based testing approaches, including stress-testing prompts, adversarial input creation, failure-mode exploration and behaviour-driven evaluation.
Familiarity with governance, audit and regulatory standards for AI, data and digital services, ensuring testing evidence aligns with internal risk frameworks, ISO42001 controls, Responsible AI policies and external regulatory expectations.
Experience developing structured QA strategies that integrate traditional and AI-specific assurance, mapping out test plans, risk-based prioritisation, acceptance criteria, model-readiness thresholds and quality gates aligned to lifecycle stages.
Ability to define and execute test plans across functional, non-functional, ethical and performance dimensions, validating accuracy, latency, robustness, security, fairness, reliability and user-journey consistency.
Strong analytical mindset with the ability to identify root causes of defects or unexpected AI behaviour, performing deep-dive diagnostics across data pipelines, vector stores, prompt flows, orchestration logic and human-in-the-loop checkpoints.
Experience with post-deployment monitoring, drift detection and continuous validation, designing alerts, retraining triggers, performance thresholds and evaluation cadences to maintain long-term model integrity.
Comfortable learning and adapting to emerging AI technologies and engineering patterns.
Excellent stakeholder management and communication skills, including senior-level engagement.
Commercial awareness and a value-driven mindset.
Use of professional networks and external influencers, with clear evidence of learning and development to build and maintain skills and expertise.
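Drift detection of the kind described above is often done by comparing a live score distribution against a training-time baseline and alerting when a divergence metric crosses a threshold. A minimal, illustrative sketch using the Population Stability Index (one common metric choice, assumed here for illustration):

```python
import math

def psi(expected, actual, bins=4):
    """Population Stability Index: divergence of a live sample (actual)
    from a baseline (expected). Common rule of thumb: < 0.1 stable,
    0.1-0.25 moderate drift, > 0.25 significant drift (alert/retrain)."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(bins + 1)]
    edges[0], edges[-1] = float("-inf"), float("inf")  # catch out-of-range values

    def fractions(data):
        counts = [0] * bins
        for x in data:
            for i in range(bins):
                if edges[i] <= x < edges[i + 1]:
                    counts[i] += 1
                    break
        return [max(c / len(data), 1e-4) for c in counts]  # floor avoids log(0)

    return sum((a - e) * math.log(a / e)
               for e, a in zip(fractions(expected), fractions(actual)))

baseline = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]  # e.g. scores at launch
live = [0.7, 0.75, 0.8, 0.85, 0.9, 0.9, 0.95, 1.0]   # shifted live scores
print(f"PSI = {psi(baseline, live):.2f}")  # well above 0.25: flag for triage
```

Wiring such a check into a scheduled evaluation cadence, with the threshold acting as a retraining trigger, is one straightforward way to realise the alerting described in the requirement.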
Sector (desirable)
Understanding of the financial services industry, markets and competitors.
Understanding of how financial services organisations operate and the associated regulatory environment, or other regulated industries.
Awareness of the Mutual Sector and the needs and interests of Members.
Commercial
Ability to work with autonomy and make operational decisions.
Experience of delivering organisational change.
Understanding of related functions and/or services outside of the role's direct remit.
Experience of managing a set of internal and external stakeholder relationships.
Interpersonal
Good interpersonal skills and the ability to build and maintain strong working relationships.
Ability to work effectively in diverse teams.
A problem-solving approach, with the curiosity and proactivity to engage with and understand both the strategic business goals and our customers' needs.
Ability to identify areas of improvement and create innovative approaches to delivering better-quality service.
Experience working in cross-functional teams and agile environments.
Ability to identify, nurture and realise the potential in others.
Strong communication, engagement and influencing skills.
Ability to effectively represent YBS through building collaborative relationships.