Quest Global • Bengaluru, Karnataka, India
Role & seniority: AI/ML Engineer (senior-level) focused on AI evaluation, testing, and deployment
Stack/tools: Python (expert), LLM/RAG pipelines, LangSmith, DeepEval, TruLens, Pytest, Playwright, LangGraph; cloud: AWS Bedrock, Azure AI; APIs: OpenAI, AWS Bedrock, Anthropic; CI/CD; Docker, Kubernetes
Build and maintain model evaluation pipelines assessing response quality, coherence, factual grounding, and semantic accuracy (RAGAS framework)
Develop systems for hallucination and bias detection; implement checks for false facts and inherited data biases
Conduct prompt engineering for testing, automate AI testing within CI/CD, manage evaluation datasets and synthetic test data
Expert Python for automation and AI library interaction
Good understanding of LLM architectures, RAG pipelines, and model drift
Proficiency with testing/observability tools (Pytest, Playwright; TruLens, LangGraph)
Experience with cloud-native MLOps (AWS Bedrock, Azure AI), Docker, Kubernetes
Strong communication, problem-solving, and collaboration
Experience with LangSmith, DeepEval, TruLens in practice
API integration work with OpenAI, AWS Bedrock, Anthropic
Safety/compliance knowledge (ethical standards, regulations like EU AI Act)
Location & work type: Bengaluru, Karnataka, India; work type not specified
Job Requirements
At Quest Global, it’s not just what we do but how and why we do it that makes us different. With over 25 years as an engineering services provider, we believe in the power of doing things differently to make the impossible possible. Our people are driven by the desire to make the world a better place—to make a positive difference that contributes to a brighter future. We bring together technologies and industries, alongside the contributions of diverse individuals who are empowered by an intentional workplace culture, to solve problems better and faster.
Key Responsibilities
Model Evaluation: Implementing evaluation pipelines to assess model response quality, coherence, factual grounding, and semantic accuracy, drawing on experience with the RAGAS framework.
Hallucination & Bias Detection: Developing systems to identify "hallucinations" (false facts) and inherited data biases, and operationalizing checks for both.
Prompt Engineering for Testing: Crafting complex prompts to test model boundaries.
Automated AI Testing: Building and maintaining automated testing frameworks using specialized tools (e.g., LangSmith, DeepEval, or TruLens) integrated into CI/CD pipelines.
Dataset Management: Defining and managing high-quality evaluation datasets and synthetic test data for model benchmarking.
API Integration: Integrating third-party APIs (OpenAI, AWS Bedrock, Anthropic) into applications to leverage pre-trained models.
Safety & Compliance: Ensuring AI outputs adhere to ethical standards, legal regulations (like the EU AI Act), and company safety policies.
We are known for our extraordinary people who make the impossible possible every day. Questians are driven by hunger, humility, and aspiration. We believe that our company culture is the key to our ability to make a true difference in every industry we reach. Our teams regularly invest time and dedicated effort into internal culture work, ensuring that all voices are heard.
We wholeheartedly believe in the diversity of thought that comes with fostering a culture rooted in respect, where everyone belongs, is valued, and feels inspired to share their ideas. We know embracing our unique differences makes us better, and that solving the world's hardest engineering problems requires diverse ideas, perspectives, and backgrounds. We shine the brightest when we tap into the many dimensions that thrive across over 21,000 difference-makers in our workplace.
Work Experience
Programming: Expert-level Python is typically mandatory for script automation and interacting with AI libraries.
Good understanding of LLM architectures, Retrieval-Augmented Generation (RAG) pipelines, and model drift.
Specialized Testing Frameworks: Proficiency with testing tools like Pytest and Playwright, and with AI evaluation frameworks like TruLens or LangGraph.
Observability & MLOps: Familiarity with cloud-native tools (AWS Bedrock, Azure AI), Docker, and Kubernetes.
Soft Skills: Strong communication, problem-solving, and collaboration skills.