Role & seniority: AI/ML Engineer (senior-level) focused on AI evaluation, testing, and deployment

Stack/tools: Python (expert), LLM/RAG pipelines, LangSmith, DeepEval, TruLens, Pytest, Playwright, LangGraph; cloud: AWS Bedrock, Azure AI; APIs: OpenAI, AWS Bedrock, Anthropic; CI/CD; Docker, Kubernetes

Top 3 responsibilities

Build and maintain model evaluation pipelines assessing response quality, coherence, factual grounding, and semantic accuracy (RAGAS framework)
Develop systems for hallucination and bias detection; implement checks for false facts and inherited data biases
Conduct prompt engineering for testing, automate AI testing within CI/CD, manage evaluation datasets and synthetic test data

Must-have skills

Expert Python for automation and AI library interaction
Deep understanding of LLM architectures, RAG, model drift
Proficiency with testing/observability tools (Pytest, Playwright; TruLens, LangGraph)
Experience with cloud-native MLOps (AWS Bedrock, Azure AI), Docker, Kubernetes
Strong communication, problem-solving, and collaboration

Nice-to-haves

Experience with LangSmith, DeepEval, TruLens in practice
API integration work with OpenAI, AWS Bedrock, Anthropic
Safety/compliance knowledge (ethical standards, regulations like EU AI Act)

Location & work type: Location not specified; work type not specified

Full Description

Job Requirements

At Quest Global, it’s not just what we do but how and why we do it that makes us different. With over 25 years as an engineering services provider, we believe in the power of doing things differently to make the impossible possible. Our people are driven by the desire to make the world a better place—to make a positive difference that contributes to a brighter future. We bring together technologies and industries, alongside the contributions of diverse individuals who are empowered by an intentional workplace culture, to solve problems better and faster.

Key Responsibilities

Model Evaluation: Implementing evaluation pipelines to assess model response quality, coherence, factual grounding, and semantic accuracy. Experience with the RAGAS framework.

Hallucination & Bias Detection: Developing systems that detect "hallucinations" (false facts) and inherited data biases, and operationalizing those checks.

Prompt Engineering for Testing: Crafting complex prompts to test model boundaries.

Automated AI Testing: Building and maintaining automated testing frameworks using specialized tools (e.g., LangSmith, DeepEval, or TruLens) integrated into CI/CD pipelines.

Dataset Management: Defining and managing high-quality evaluation datasets and synthetic test data for model benchmarking.

API Integration: Integrating third-party APIs (OpenAI, AWS Bedrock, Anthropic) into applications to leverage pre-trained models.

Safety & Compliance: Ensuring AI outputs adhere to ethical standards, legal regulations (like the EU AI Act), and company safety policies.

We are known for our extraordinary people who make the impossible possible every day. Questians are driven by hunger, humility, and aspiration. We believe that our company culture is the key to our ability to make a true difference in every industry we reach. Our teams regularly invest time and dedicated effort into internal culture work, ensuring that all voices are heard.

We wholeheartedly believe in the diversity of thought that comes with fostering a culture rooted in respect, where everyone belongs, is valued, and feels inspired to share their ideas. We know embracing our unique differences makes us better, and that solving the world's hardest engineering problems requires diverse ideas, perspectives, and backgrounds. We shine the brightest when we tap into the many dimensions that thrive across over 21,000 difference-makers in our workplace.

Work Experience

Programming: Expert-level Python is typically mandatory for script automation and interacting with AI libraries.

Good understanding of LLM architectures, Retrieval-Augmented Generation (RAG) pipelines, and model drift.

Senior QA Engineer
