Codvo.ai • Pune, Maharashtra, India
Role & seniority: Principal QA – AI & Conversational Systems (senior/principal level)
Stack/tools: LLMs and conversational AI; AI evaluation metrics; AI safety and bias validation frameworks; synthetic call testing; prompt robustness and fallback testing; RAG/knowledge grounding validation
Key responsibilities:
- Lead evaluation of AI-driven voice bots, agent assist, and summarization systems; design LLM validation frameworks for accuracy, safety, and latency
- Define hallucination detection, response accuracy scoring, and bias/safety testing standards; ensure compliance with data privacy requirements
- Run synthetic call testing, AI load benchmarking, prompt robustness validation, and fallback handling
Requirements:
- Experience with LLM or conversational AI testing
- Hands-on exposure to AI evaluation metrics and AI safety/bias validation
- Ability to design and implement AI testing frameworks and validation processes
- Background in responsible AI, bias testing, and security/compliance testing
- Experience with RAG and knowledge grounding validation
- Familiarity with latency and performance benchmarking for AI systems
Location & work type: Not specified; role implies remote or on-site at Codvo, with potential global team collaboration.
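The responsibilities above mention hallucination detection, response accuracy scoring, and RAG/knowledge grounding validation. As a rough illustration of what a grounding check can look like, here is a toy heuristic (not Codvo's actual framework; all names are hypothetical) that flags answer sentences with low token overlap against the retrieved context. Production frameworks typically use NLI models or LLM judges instead of bag-of-words overlap.

```python
# Toy grounding/hallucination check: flag answer sentences whose
# token overlap with the retrieved context falls below a threshold.
# This is only an illustrative heuristic, not a production metric.

def token_overlap(sentence: str, context: str) -> float:
    """Fraction of the sentence's tokens that also appear in the context."""
    sent_tokens = set(sentence.lower().split())
    ctx_tokens = set(context.lower().split())
    if not sent_tokens:
        return 0.0
    return len(sent_tokens & ctx_tokens) / len(sent_tokens)

def flag_unsupported(answer: str, context: str, threshold: float = 0.5) -> list[str]:
    """Return answer sentences that look unsupported by the context."""
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    return [s for s in sentences if token_overlap(s, context) < threshold]
```

A claim absent from the context ("lifetime warranty" when the context only covers a 30-day refund window) would be flagged, while grounded sentences pass.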
Principal QA – AI & Conversational Systems
Core Responsibilities
- Hallucination detection and response accuracy scoring
- RAG and knowledge grounding validation
- Synthetic call testing and AI load benchmarking
- Prompt robustness and fallback validation
- Sensitive data leakage and AI compliance testing
Ideal Background
- Experience in LLM or conversational AI testing
- Hands-on exposure to AI evaluation metrics
- Strong understanding of AI safety and bias validation
- Experience designing AI testing frameworks
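Prompt robustness and fallback validation, listed above, can be smoke-tested by running paraphrased or noisy variants of an utterance through the bot and checking that each reply either answers or triggers a safe fallback. The sketch below is a hypothetical harness (the `bot` callable, keywords, and fallback marker are all stand-ins), not a description of Codvo's tooling.

```python
# Toy prompt-robustness harness: classify each reply as "answered"
# (contains a required keyword), "fallback" (contains the safe
# fallback marker), or "failed". All names here are illustrative.

def robustness_report(bot, variants, required_keywords, fallback_marker="I'm not sure"):
    """Run each prompt variant through `bot` and classify the reply."""
    report = {}
    for prompt in variants:
        reply = bot(prompt)
        if any(k.lower() in reply.lower() for k in required_keywords):
            report[prompt] = "answered"
        elif fallback_marker.lower() in reply.lower():
            report[prompt] = "fallback"
        else:
            report[prompt] = "failed"
    return report
```

In practice the variant list would come from paraphrase generation or synthetic call traffic, and any "failed" entry (neither an answer nor a graceful fallback) would be a test failure.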