Role & seniority: Senior/lead Quality Engineer focused on automation and QA leadership for AI-powered applications (7+ years QA, 3+ years in QA leadership).

Stack/tools: Playwright, PyTest, Cypress, JUnit; REST API testing; CI/CD, DevOps, cloud-native (Azure, Kubernetes, GitHub Actions); Agentic AI frameworks (LangChain, LangGraph, RAG pipelines, Vector DBs).

Top 3 responsibilities

Lead transformation from manual QA to automation-first practices in LLM/SLM-powered domains (healthcare/education).
Design and implement quality strategies addressing LLM/SLM-specific challenges (latency, token efficiency, hallucinations, SME UAT cycles).
Ensure robust automation, CI/CD pipelines, and reliable deployments for regulated environments.

Must-have skills

Deep understanding of LLM/SLM quality issues and healthcare/education regulatory constraints.
Strong automation expertise with the listed tools; experience with API testing.
CI/CD, DevOps, cloud-native systems; familiarity with observability and monitoring concepts.

Nice-to-haves

Playwright MCP (Model Context Protocol server for browser automation) experience.
Hands-on with AI evaluation tools (Promptfoo, DeepEval, OpenAI Evals).
AI observability/monitoring (Datadog); AI security testing (prompt injection, adversarial robustness).

Location & work type: Not specified; please confirm.

Full Description

Required Skills & Qualifications

This role requires deep awareness of quality challenges unique to LLM- and SLM-powered Agentic AI applications, especially in healthcare and education, where correctness, reliability, and compliance are essential.
7+ years in Quality Engineering/Automation, with 3 years in QA leadership roles.
Proven experience transforming teams from manual QA to automation-first.
Awareness of LLM/SLM quality challenges (latency unpredictability, token inefficiency, hallucinations, SME UAT cycles).
Strong automation expertise (Playwright, PyTest, Cypress, JUnit, REST API testing).
Familiarity with Agentic AI frameworks (LangChain, LangGraph, RAG pipelines, Vector DBs).
Experience in healthcare or education applications with regulatory constraints.
Solid background in CI/CD, DevOps, and cloud-native systems (Azure, Kubernetes, GitHub Actions).

Nice to Have (Big Plus)

Experience with Playwright MCP (Model Context Protocol server for browser automation) for scaling automation.
Hands-on with AI evaluation tools (Promptfoo, DeepEval, OpenAI Evals).
Familiarity with AI observability & monitoring (Datadog).
Background in AI security testing (prompt injection, adversarial robustness).

AI TEST LEAD
