Caseware • Colombia
Role & seniority: AI SDET (Quality Engineering/SDET) in Platform Engineering; foundational, high-impact senior individual contributor / lead capability-building role.
Stack/tools: AI/ML/LLM evaluation (Ragas, DeepEval, LangChain/LangSmith/LangFuse), CI/CD (Jenkins, GitHub Actions), full-stack testing (Frontend/Backend/API), Java/Python/JS/TS, observability (New Relic), load/perf tools (K6, JMeter, Blazemeter).
Define and execute an AI-first quality strategy for a fast-scaling, cloud-native SaaS, including infrastructure and agentic systems.
Integrate AI testing into CI/CD, build automated evaluation pipelines, red-teaming, bias/fairness checks, and continuous monitoring.
Partner with product/data/AI teams to design multi-agent tests, end-to-end AI lifecycle validation, metrics dashboards, and roadmap-driven QA improvements.
7+ years in Quality Engineering/SDET for cloud-native SaaS; 2+ years hands-on with AI/ML/LLM systems.
Strong automated testing infra, CI/CD, test pyramid, full-stack testing, and experience testing LLMs/AI agents/RAG pipelines.
Proficiency in JS/TS; working knowledge of Python/Java; excellent communication; bachelor's degree in CS/AI or related.
ISTQB AI Testing or similar certifications; experience with performance testing tools; red-teaming, ethical AI practices; open-source contributions or notable case studies.
Location & work type: Fully remote; Colombia-b
Overview
Caseware is one of Canada's original Fintech companies, leading the global audit and accounting software industry for over 30 years, with more than 500,000 users across 130 countries and available in 16 languages.
The AI SDET role pioneers and scales AI-driven testing practices to fast-track reliable, safe, and high-performing AI capabilities across the organization.
This is a high-impact, foundational role in Platform Engineering's Quality function, influencing product trust, compliance, and innovation for end users.
Location: This is a fully remote position located in Colombia.
You will be reporting to: Jai Joshi
Contact: Maira Russo - Senior Talent Acquisition Partner
What You'll Be Doing
Quality & AI-First Mindset
Evolve a modern, AI-first quality strategy for a fast-scaling SaaS architecture, including foundational infrastructure and emerging agentic/intelligent systems.
Integrate AI enhancements into CI/CD pipelines (e.g., predictive flakiness detection, automated test generation, self-healing scripts) to improve isolation, data setup, and execution reliability using existing or suggested tools.
Establish scalable testing practices that support hyper-growth and petabyte-scale AI data pipelines.
AI-Focused Test Strategy, Automation & Evaluation
Design deterministic and statistical testing approaches for non-deterministic LLM-based and agentic systems, addressing hallucinations, prompt injection, bias, drift, and safety risks.
Build automated evaluation pipelines and harnesses for correctness, faithfulness, retrieval quality, generation accuracy, tool-calling, planning sequences, and multi-agent flows.
Execute and develop test frameworks for the full AI lifecycle: prompts, datasets, embeddings, model versions, RAG pipelines (end-to-end validation), and guardrails.
Implement red-teaming, bias/fairness checks, and compliance mechanisms; leverage trend frameworks for metrics and observability.
Integrate AI-specific quality signals into CI/CD for automated gating and continuous monitoring.
Cross-Functional & End-to-End Testing
Partner closely with product, data science, AI engineering, and development teams to test AI features, conduct multi-agent simulations, and ensure high-quality roadmap delivery.
Facilitate knowledge sharing and upskilling on AI testing best practices across the Quality Function.
Metrics, Observability & Continuous Improvement
Drive core metrics (DORA, test coverage/effectiveness) plus AI-specific indicators (e.g., hallucination rate, context precision, drift detection).
Build real-time dashboards and support A/B testing of models with post-deployment monitoring.
Culture, Mentorship & Innovation
Champion a quality-first, ethical AI mindset organization-wide.
Mentor SDETs, lead workshops on AI risks/validation, and influence design/deploy/incident processes.
As a foundational hire, define roadmaps and best practices for sustainable AI quality assurance.
Challenges You'll Tackle
Ensuring reliability in agentic systems amid data drift and non-deterministic behavior.
Scaling tests for global SaaS while maintaining low hallucination rates and strong safety guardrails.
Building evaluation from scratch in a rapidly evolving landscape (e.g., multi-modal, agentic flows).
Success in the First 6 Months
Launch foundational AI test frameworks and pipelines, achieving *****% coverage for key AI components.
Reduce AI-related defect escapes by *****% and integrate automated safety/compliance checks into all releases.
Establish metrics dashboards and evaluation loops that enable data-driven iteration on intelligent features.
What You Will Bring
7+ years in Quality Engineering/SDET roles within cloud-native SaaS environments, including 2+ years hands-on with AI/ML/LLM systems.
Expertise in automated testing infrastructure, CI/CD (Jenkins/GitHub Actions), and test pyramid strategies (unit → E2E).
Strong full-stack testing experience (frontend/backend/API) and collaboration with dev teams.
Proven experience testing LLMs, AI agents, RAG pipelines, and related risks (hallucinations, prompt injection, bias, drift).
Proficiency in JS/TS, working knowledge of Python or Java; experience with AI evaluation frameworks (e.g., Ragas, DeepEval, LangChain/LangSmith/LangFuse) and other tools you may have proficiency in.
Knowledge of performance, stress, and load testing tools such as K6, JMeter, and Blazemeter is a plus.
Knowledge of observability (New Relic), statistical testing methods, red-teaming, and ethical AI practices.
Excellent communication and coaching skills; ability to thrive in ambiguity and drive innovation.
Bachelor's/Master's in Computer Science, AI, or related; certifications (e.g., ISTQB AI Testing) a plus.
Strong English language communication and collaboration skills.
We value adaptability in this fast-moving field: equivalent experience and a strong portfolio (e.g., open-source contributions, case studies) are highly regarded.
What's in it for you
Innovation is at our core.
We work with cutting-edge technology in accounting and financial reporting, constantly pushing the boundaries to create impactful software solutions.
We are committed to a collaborative culture, where your ideas are valued, and knowledge sharing is encouraged within a supportive, inclusive team.
Work-life balance is important to us.
We offer flexible work options, remote opportunities, and generous time-off policies to ensure a healthy work-life balance.
We offer competitive compensation, including salary and comprehensive benefits such as health insurance and retirement plans.
We are driven by impactful work.
Your contributions directly affect how our clients manage financial processes and drive their success.
Recognition and rewards matter to us.
We celebrate hard work through recognition programs, performance bonuses, and opportunities for career growth.
We embrace global opportunities.
Work on international projects and collaborate with a diverse, global team.
About Caseware
Caseware's cutting-edge software products are designed for accounting firms, corporations, and governments.
Our teams collaborate, innovate, and build upon our suite of products to shape the future of audits, financial reporting, and financial data analytics.
With a strategic investment from Hg Capital, Caseware is in a growth phase focusing on people and products that have driven our success to date.
Many Voices, One Team: we value diversity and inclusion and welcome candidates of all backgrounds.
For accommodations or questions during the application process, email ******.
Background Check & Security
Selected candidates will undergo a background check through Certn.co, including identity verification and criminal record check.
Executives and senior managers may undergo a soft credit check.
Netherlands and Germany residents are excluded from Certn.co background checks.
Security: all legitimate communications come from @caseware.com addresses and listings appear on reputable job boards and our website.
We will never ask for payment or financial information.
Be cautious of unsolicited offers.