Role & seniority: QA Lead – AI & GenAI Systems (8–12+ years in software QA; 3–5+ years leading QA teams)

Stack/tools: API, backend, and integration test automation; CI/CD pipelines; GenAI/LLM-based testing; RAG architectures (vector databases, embeddings); LLMOps/MLOps tools; test management: Jira, Confluence, TestRail (preferred)

Top 3 responsibilities

Define end-to-end QA strategy, test planning, and AI/non-AI quality metrics; own QA governance
Lead automation initiatives, design automated test frameworks, and integrate QA into CI/CD for rapid feedback
Validate GenAI/LLM systems, non-deterministic outputs, RAG pipelines, safety/bias guardrails; mentor QA teams; collaborate with engineering/ML/product

Must-have skills

8–12+ years QA experience with 3–5+ years in leadership
Strong API, backend, and integration test automation; hands-on CI/CD
Proven GenAI/LLM testing in production; non-deterministic/probabilistic validation
Experience with RAG, embeddings, vector databases; familiarity with LLMOps/MLOps; multi-agent testing
SDLC/Agile knowledge; effective cross-functional collaboration

Nice-to-haves

Startup/fast-paced product experience
AI evaluation frameworks, benchmarking, A/B testing
Prompt engineering, prompt regression/versioning
Experience with Jira/Confluence/TestRail

Location & work type: Remote position (India)

Full Description

Job Description

This is a remote position.

Job Summary

We are seeking an experienced and hands-on QA Lead – AI & GenAI Systems to own and drive quality across next-generation AI-powered products. The ideal candidate will bring deep expertise in traditional QA practices along with strong experience testing GenAI, LLM-based, and agentic systems in high-scale production environments.

You will define the QA strategy, lead automation initiatives, establish AI-specific quality metrics, and work closely with engineering, ML, and product teams to ensure reliability, safety, and performance of complex AI workflows. This role is critical in shaping quality standards for non-deterministic systems, multi-agent architectures, and retrieval-augmented generation (RAG) pipelines in a fast-paced startup environment.

Responsibilities

Own and define the end-to-end QA strategy, test planning, and quality metrics for AI and non-AI systems.
Lead and mentor QA engineers, ensuring best practices in automation, test design, and execution.
Design and implement automated test frameworks for API, backend, integration, and regression testing.
Integrate QA processes into CI/CD pipelines to enable continuous testing and rapid feedback loops.
Collaborate closely with engineering, ML, and product teams to validate functional and non-functional requirements.
Define and track AI-specific quality metrics including accuracy, relevance, hallucination rate, latency, and consistency.
Test GenAI / LLM-based systems including hosted and open-source model integrations.
Validate non-deterministic behaviors, probabilistic outputs, and prompt-based workflows.
Lead testing for RAG systems, including vector search, embeddings, retrieval accuracy, and response grounding.
Execute safety, bias, and guardrail testing to ensure responsible AI behavior.
Support evaluation frameworks such as human-in-the-loop, offline benchmarking, and online experimentation.
Validate data quality used for model training, fine-tuning, and inference pipelines.
Test agent workflows involving multi-step reasoning, tool calling, memory/state handling, and orchestration logic.
Collaborate on prompt engineering, prompt regression testing, and prompt versioning strategies.

Requirements

Essential Skills

8–12+ years of experience in software QA with 3–5+ years leading QA teams.
Strong expertise in API, backend, and integration test automation.
Hands-on experience with CI/CD pipelines and automated regression testing.
Proven experience testing GenAI / LLM-based applications in production environments.
Deep understanding of non-deterministic systems and probabilistic output validation.
Experience with RAG architectures, embeddings, vector databases, and retrieval quality testing.
Familiarity with LLMOps / MLOps tools, model monitoring, and evaluation pipelines.
Experience testing multi-agent systems or agent orchestration frameworks.
Strong understanding of SDLC, Agile methodologies, and quality governance.

Quality Assurance Lead, Remote, India
