NCS • Singapore, Singapore
Role & seniority: LLM / AI Quality Engineer, mid-senior (3+ years in software testing/QA)
Stack/tools: Python/TypeScript; API and performance testing; CI/CD; cloud basics (AWS/Azure/GCP); microservices; observability/evaluation tools (e.g., LangSmith, Weights & Biases, TruLens, Guardrails); data pipelines for RAG
Lead end-to-end evaluation of AI applications (LLM features, RAG, multi-agent workflows) across offline, pre-prod, and prod with test design, execution, and reporting
Design and validate non-functional aspects: performance, latency, cost, safety, security; implement CI/CD integrated tests and canary/A-B testing
Ensure quality and compliance: conduct rubric-based reviews, guardrails validation, data residency/PII controls, and produce risk-aware decision reports
3+ years in software testing/QA with API and performance testing; strong test methodology and tooling
Programming familiarity (Python/TypeScript); CI/CD and version control; cloud basics; microservices
Experience with test design for AI/ML systems and evaluation/observability tooling
ML/MLOps concepts, production model validation and monitoring; AI security testing
Experience with Azure OpenAI/Bedrock/Vertex; token accounting; RAG evaluation tools (e.g., LangSmith, Weights & Biases, Promptfoo)
Automation frameworks (Playwright/Cypress/Selenium); API tools; k6/JMeter
Location & work type: Asia Pacific region
NCS is a leading technology services firm that operates across the Asia Pacific region in over 20 cities, providing consulting, digital services, technology solutions, and more. We believe in harnessing the power of technology to achieve extraordinary things, creating lasting value and impact for our communities, partners, and people. Our diverse workforce of 14,000 has delivered large-scale, mission-critical, and multi-platform projects for governments and enterprises in Singapore and the APAC region.
Job Description
As an LLM / AI Quality Engineer, you will lead the end-to-end evaluation of AI applications—LLM features, RAG systems, and multi-agent workflows—to ensure they meet business outcomes, safety requirements, and platform standards. You will own test design, execution, and reporting across offline, pre-prod, and in-prod stages, integrating with CI/CD and working closely with product, data, and platform teams.
Define evaluation strategies (golden sets, adversarial suites, regressions), pass/fail gates, and SLOs for quality, safety, latency, and cost. Establish rubric-based human reviews (usefulness, faithfulness, safety, clarity) and calibrate annotators. Instrument LLM-as-judge where appropriate with calibration and spot checks.
Measure retrieval precision/recall, MRR/nDCG, and answer faithfulness to sources; detect hallucination and citation errors. Test chunking, prompt templates, filters, and policy chains; monitor stale/poisoned content.
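Retrieval metrics like MRR and nDCG are simple to compute once per-query relevance judgments exist. The sketch below is illustrative only (the helper names `mrr` and `ndcg_at_k` are assumptions, not part of any NCS toolchain) and assumes relevance labels listed in ranked order:

```python
import math

def mrr(ranked_relevance: list[list[int]]) -> float:
    """Mean Reciprocal Rank over queries; each inner list marks
    relevant (1) / not relevant (0) at each rank position."""
    total = 0.0
    for rels in ranked_relevance:
        for rank, rel in enumerate(rels, start=1):
            if rel:
                total += 1.0 / rank
                break
    return total / len(ranked_relevance)

def ndcg_at_k(rels: list[int], k: int) -> float:
    """nDCG@k for one query's graded relevance labels in ranked order."""
    dcg = sum(r / math.log2(i + 1) for i, r in enumerate(rels[:k], start=1))
    ideal = sorted(rels, reverse=True)
    idcg = sum(r / math.log2(i + 1) for i, r in enumerate(ideal[:k], start=1))
    return dcg / idcg if idcg else 0.0
```

In practice these numbers would be tracked per golden-set query and gated in CI rather than computed ad hoc.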
Validate multi-step plans, tool selection, error recovery, retries, and idempotency for functions with side effects. Contract-test JSON schemas and structured outputs across services.
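A contract test for structured outputs can be as small as a key-and-type check. The sketch below uses only the standard library, with a hypothetical `CONTRACT` for a tool-call payload; a production suite would more likely use `jsonschema` or Pydantic models shared across services:

```python
import json

# Hypothetical contract for a tool-call payload the model must emit.
CONTRACT = {"tool": str, "arguments": dict, "confidence": float}

def violations(raw: str, contract: dict) -> list[str]:
    """Return contract violations for one model output (empty list = pass)."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    errs = [f"missing key: {k}" for k in contract if k not in obj]
    errs += [
        f"wrong type for {k}: expected {t.__name__}, got {type(obj[k]).__name__}"
        for k, t in contract.items()
        if k in obj and not isinstance(obj[k], t)
    ]
    return errs
```

Running this against every structured response in a regression suite catches schema drift before a downstream service does.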
Run token-aware load/soak tests (context length, temperature, batching); track p50/p95/p99, throughput, timeouts, cache hit rate, and cost per successful task. Recommend optimizations (prompt/policy changes, retrieval tweaks, caching).
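Latency percentiles and cost-per-success fall out directly from raw run data. This sketch uses nearest-rank percentiles; the `summarize_run` helper and its field names are illustrative assumptions, not a prescribed reporting format:

```python
import math

def summarize_run(latencies_ms: list[float], costs_usd: list[float],
                  succeeded: list[bool]) -> dict:
    """Summarize one load-test run: latency percentiles and cost per successful task."""
    def pct(p: float) -> float:
        # Nearest-rank percentile over the sorted samples.
        ordered = sorted(latencies_ms)
        return ordered[max(0, math.ceil(p / 100 * len(ordered)) - 1)]
    successes = sum(succeeded)
    return {
        "p50_ms": pct(50),
        "p95_ms": pct(95),
        "p99_ms": pct(99),
        "cost_per_success_usd": sum(costs_usd) / successes if successes else float("inf"),
    }
```

Dividing total spend by successful tasks only (rather than all requests) is what makes retries and failures show up in the cost metric.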
Red-team for prompt injection, data exfiltration, indirect injections via retrieved content; validate guardrails pre/post inference. Enforce PII controls, data-residency, and compliance checks; align with organizational security testing practices.
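A pre-inference screen for known injection phrasings is one small piece of this kind of red-teaming. The patterns below are illustrative only; a real suite would pair curated adversarial corpora with model-based classifiers rather than rely on regexes:

```python
import re

# Illustrative-only patterns for common injection phrasings; not exhaustive.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"reveal (your )?(system prompt|hidden instructions)",
    r"disregard the above",
]

def flags_injection(text: str) -> bool:
    """Crude pre-inference screen for known prompt-injection phrasings."""
    lowered = text.lower()
    return any(re.search(pattern, lowered) for pattern in INJECTION_PATTERNS)
```

The same check applied to retrieved documents (not just user input) is what catches the indirect-injection case mentioned above.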
Implement prompt/dataset/version lineage and trace-based evals; automate in CI (pre-merge golden tests, nightly adversarials) with canary/A-B in prod and rollback criteria. Produce clear, decision-ready reports with risk assessments and release recommendations.
Analyze requirements, enhance test plans with additional cases, prepare environments (including cloud), execute tests per plan, and drive defect resolution. Provide regular status updates; manage test activities to schedule; support SIT/UAT and production readiness.
Execute API, performance, and load testing for microservices/web services that underpin AI features; integrate automated testing into CI/CD.
Adopt and improve test standards/methodology; share practices, train teams, participate in peer reviews, and pursue self-directed learning.
Qualifications
3+ years in software testing/QA with strong test methodology and tooling; hands-on API testing and performance testing. Programming familiarity (e.g., Python/TypeScript) and experience with CI/CD and version control. Cloud basics (AWS/Azure/GCP) and microservices fundamentals. Degree/Diploma in CS/IT or equivalent.
Preferred (AI/ML Focus)
Understanding of ML concepts and MLOps; experience with model validation and monitoring in production. Experience with AI-specific security testing and vulnerability assessment.
Familiarity with evaluation/observability tools (any of): LangSmith, Weights & Biases, RAGAS, TruLens, Promptfoo, DeepEval, Guardrails/LlamaGuard, Presidio; plus OpenTelemetry-style LLM traces. Practical exposure to Azure OpenAI/Bedrock/Vertex and model gateways; quota & token accounting know-how.
Tooling & Automation
Modern automation frameworks (e.g., Playwright, Cypress, Selenium), API test tools (Postman/REST Assured), performance tools (k6/JMeter), and CI/CD integration. Data evaluation pipelines for RAG (embedding validation, filtering, drift detection).
Traits
Outcome-oriented, high standards; strong communication and collaboration; customer-focused; proficient in written and spoken English.
Telco Context (Nice-to-Have)
Experience testing copilots/agents for BSS/OSS, NOC analytics, and enterprise care; ability to tie eval KPIs to CSAT, AHT, FCR, MTTR.
Additional Information
Why Join NCS
Lead high-impact Data & AI advisory programs for major enterprises and public sector clients.
Shape enterprise strategies and governance frameworks that drive real transformation.
Work with a talented, multidisciplinary team in a collaborative environment.
Competitive compensation and strong professional development support.
We are driven by our AEIOU beliefs—Adventure, Excellence, Integrity, Ownership, and Unity—and we seek individuals who embody these values in both their professional and personal lives. We are committed to our Impact: Valuing our clients, Growing our people, and Creating our future.
Together, we make the extraordinary happen.
Learn more about us at ncs.co and visit our LinkedIn career site.
Scam Alert
We are aware of fraudulent job offers and impersonations of NCS recruiters. Phishing emails from convincing-looking but fake addresses are also commonly used to make you believe they come from official NCS sources.
Please note that all official communications from NCS Group will only be sent from verified corporate email addresses. Always check that the sender’s email address ends with the genuine NCS domain, @ncs.com.sg, and beware of extra letters, symbols, or misspellings. When in doubt, verify the sender’s identity by contacting us at reachus@ncs.com.sg.