
Test Automation & Reliability Engineer
Jobs via Dice • Phoenix, Arizona, United States
Role & seniority
- Test Automation & Reliability Engineer (entry to mid level; flexible for junior candidates)
Stack / tools
- Test frameworks: Cypress, Playwright (web and API testing)
- Languages/tech: JavaScript, TypeScript, Node.js, GraphQL
- CI/CD: Azure DevOps, GitHub Actions
- Infrastructure: Azure cloud, Kubernetes
- Observability / reliability (nice-to-have): Dynatrace, Mezmo, BigPanda, Nucleus
Top 3 responsibilities
- Design, develop, and maintain automated test frameworks for web and API testing
- Build and maintain integration with CI/CD pipelines to ensure reliable automated testing
- Conduct regression, performance, and security testing; participate in SRE activities (monitoring, incident response, postmortems)
Must-have skills
- Design, develop, and maintain automated test frameworks using Cypress/Playwright
- Experience testing applications built with JavaScript, TypeScript, Node.js, GraphQL
- Build/integrate with CI/CD pipelines using Azure DevOps and GitHub Actions
- Familiarity with Azure cloud and Kubernetes
- Strong quality engineering background with automation best practices; ability to work independently
Nice-to-have
- Experience with Dynatrace, Mezmo (LogDNA), BigPanda, Nucleus
- Understanding of production observability, incident management, SRE concepts (monitoring, alerting, SLAs/SLOs)
- Willingness to learn, grow SRE skills, and own reliability improvements
- Collaboration with developers and product managers; blameless postmortems
Full Description
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Ztek Consulting, is seeking the following. Apply via Dice today!
Role: Test Automation & Reliability Engineer
Location: Phoenix, AZ (Onsite)
Preferred Qualifications: Education & Prior Job Experience
Master's degree in Computer Science, Computer Engineering, Technology, Information Systems (CIS/MIS), Engineering or related technical discipline, or equivalent experience/training
Airline Industry experience
Top 3 required skills
Ability to design, develop, and maintain automated test frameworks using Cypress/Playwright for web and API testing.
Experience testing applications built with JavaScript, TypeScript, Node.js, and GraphQL.
Ability to build and maintain integration with CI/CD pipelines, using Azure DevOps and GitHub Actions, to ensure reliable automated testing.
We will consider junior developers who can demonstrate passion for development and processes.
Nice to Have Skills and Experience
Dynatrace (APM/monitoring)
Mezmo (LogDNA) (log aggregation)
BigPanda (incident intelligence)
Nucleus (security & vulnerability management)
Understanding of production observability and incident management concepts.
Eagerness to learn, grow SRE skills, and take ownership across both testing and reliability domains.
A passion for improving processes and building reliable systems.
Proven ability to work independently and take initiative with minimal guidance.
Strong background in quality engineering, with a solid understanding of automation best practices.
Familiarity with SRE concepts such as monitoring, alerting, and incident response.
Design, develop, and maintain automated test frameworks using Cypress/Playwright for web and API testing.
Experience testing applications built with JavaScript, TypeScript, Node.js, and GraphQL.
Familiarity with navigating and managing resources in Azure cloud and Kubernetes environments.
Define and execute comprehensive test strategies for new features and services.
Implement and manage CI/CD workflows using GitHub Actions or other GitHub-integrated tools.
Build and maintain integration with CI/CD pipelines to ensure reliable automated testing using Azure DevOps and GitHub Actions.
Conduct regression, performance, and security testing.
Collaborate with developers and product managers to ensure high-quality releases.
Participate in the SRE team's daily operations, including system monitoring, alerting, and incident response.
Implement post-deployment validation, health checks, and release safety mechanisms.
Help define and monitor SLAs, SLOs, and error budgets.
Contribute to reliability tooling, observability improvements, and performance diagnostics.
Participate in blameless postmortems and propose solutions to improve system stability.