Software Quality Engineer (AI/ML Applications) at Vizient - QATestingJobs.com
Software Quality Engineer (AI/ML Applications)
Vizient • Chicago, Illinois, United States
Onsite • Full-time
Salary: $77,400 - $135,400 / year
Posted Feb 18, 2026
Role & seniority: Mid-level AI/ML QA Engineer (2+ years of relevant experience)
Stack/tools: Python, SQL, test automation frameworks; ML/AI validation; model evaluation; dashboards/monitoring; CI/CD; Agile
Top 3 responsibilities
Develop and execute test strategies for ML and generative AI-powered applications
Design and maintain evaluation frameworks for LLMs (automated scoring, LLM-as-a-judge)
Build/maintain dashboards and monitoring to detect drift, degraded scores, and safety risks; implement proactive AI-driven alerts
Must-have skills
2+ years validating ML or generative AI applications (model evaluation, data quality)
Proficiency in Python and SQL; experience with test automation
Experience evaluating LLMs, prompt regression testing, and human-in-the-loop methodologies
Knowledge of RAG concepts (retrieval quality, relevance, faithfulness, safety)
Experience designing AI evaluation metrics (ranking, calibration, reliability); production health reporting
Strong analytical, documentation, and communication skills; familiarity with Agile and CI/CD; self-starter in fast-paced environments
Nice-to-haves
Experience with model monitoring in production; familiarity with proactive alerting systems
Ability to partner across data science, engineering, product, and security to define quality gates
Location & work type: Chicago, Illinois, United States; onsite, full-time (role includes benefits and incentive eligibility)
Full Description
When you’re the best, we’re the best. We foster an environment where employees feel engaged, satisfied and able to contribute their unique skills and talents while living and working as their authentic selves. We provide extensive opportunities for personal and professional development, building both employee competence and organizational capability to fuel exceptional performance through an inclusive environment both now and in the future.
Summary
In this role, you will validate AI and ML-powered healthcare solutions across the full development lifecycle to ensure data quality, model performance, reliability, and safe deployment in production environments. You will design and execute data-driven and automated test strategies, including model evaluation, prompt regression testing, dataset profiling, and end-to-end pipeline validation. You will partner with data science, engineering, product, and security teams to define measurable quality gates and deliver compliant, explainable, and dependable AI experiences that drive client value.
Responsibilities
Develop and execute test strategies for ML and generative AI-powered applications.
Design and maintain evaluation frameworks for Large Language Models (LLMs), including automated scoring and LLM-as-a-judge methodologies.
Develop prompt regression test suites to detect performance degradation across model and prompt versions.
Evaluate generative AI systems for hallucination risk, factual consistency, grounding accuracy, and safety compliance.
Conduct model evaluation, regression testing, and drift monitoring in development and production environments.
Build dashboards and monitoring tools to detect degraded evaluation scores, drift, or safety risks and support proactive triage.
Design and implement proactive AI-driven alerting and recommendation systems embedded within dashboards and user workflows.
Automate dashboard metric generation and refresh pipelines using Python and data workflows.
Partner with cross-functional teams to define AI quality standards, acceptance criteria, and release gates.
Investigate defects, analyze root causes, and recommend corrective actions to improve reliability and performance.
Qualifications
Relevant degree preferred.
2 or more years of relevant experience required.
Experience validating ML or generative AI-based applications, including model evaluation and data quality assessment, required.
Proficiency in Python, SQL, and test automation frameworks.
Experience evaluating LLM systems, including prompt regression testing and automated or human-in-the-loop judging methodologies.
Familiarity with RAG evaluation concepts, including retrieval quality, context relevance, faithfulness, and safety testing.
Experience designing AI evaluation metrics, including ranking, calibration, and reliability measures.
Experience building model monitoring dashboards and production health reporting.
Understanding of Agile methodologies and CI/CD practices.
Strong analytical, documentation, and communication skills.
Self-starter who thrives in a fast-paced, iterative environment and drives quality initiatives end-to-end amid ambiguity and shifting priorities.
Estimated Hiring Range
At Vizient, we consider skills, experience, and organizational needs in our compensation approach. Geographic factors may adjust the range estimate, and hires typically fall below the top of the range. Compensation decisions are tailored to individual circumstances. The current salary range for this role is $77,400.00 to $135,400.00.
This position is also incentive eligible.
Vizient has a comprehensive benefits plan! Please view our benefits here
Equal Opportunity Employer: Females/Minorities/Veterans/Individuals with Disabilities
The Company is committed to equal employment opportunity for all employees and applicants without regard to race, religion, color, gender identity, ethnicity, age, national origin, sexual orientation, disability status, veteran status or any other category protected by applicable law.