Role & seniority

Test Automation & Reliability Engineer (entry to mid level; flexible for junior candidates)

Stack / tools

Test frameworks: Cypress, Playwright (web and API testing)

Languages/tech: JavaScript, TypeScript, Node.js, GraphQL

CI/CD: Azure DevOps, GitHub Actions

Infrastructure: Azure cloud, Kubernetes

Observability / reliability (nice-to-have): Dynatrace, Mezmo, BigPanda, Nucleus

Top 3 responsibilities

Design, develop, and maintain automated test frameworks for web and API testing
Build and maintain integration with CI/CD pipelines to ensure reliable automated testing
Conduct regression, performance, and security testing; participate in SRE activities (monitoring, incident response, postmortems)

Must-have skills

Design, develop, and maintain automated test frameworks using Cypress/Playwright
Experience testing applications built with JavaScript, TypeScript, Node.js, GraphQL
Build/integrate with CI/CD pipelines using Azure DevOps and GitHub Actions
Familiarity with Azure cloud and Kubernetes
Strong quality engineering background with automation best practices; ability to work independently

Nice-to-have

Experience with Dynatrace, Mezmo (LogDNA), BigPanda, Nucleus
Understanding of production observability, incident management, SRE concepts (monitoring, alerting, SLAs/SLOs)
Willingness to learn, grow SRE skills, and own reliability improvements
Collaboration with developers and product managers; blameless postmortems

Location &

Full Description

Dice is the leading career destination for tech experts at every stage of their careers. Our client, Ztek Consulting, is seeking the following. Apply via Dice today!

Role: Test Automation & Reliability Engineer

Location: Phoenix, AZ (Onsite)

Preferred Qualifications: Education & Prior Job Experience

Master's degree in Computer Science, Computer Engineering, Technology, Information Systems (CIS/MIS), Engineering or related technical discipline, or equivalent experience/training

Airline Industry experience

Top 3 required skills

Able to design, develop, and maintain automated test frameworks using Cypress/Playwright for web and API testing. Experience testing applications built with JavaScript, TypeScript, Node.js, and GraphQL. Able to build and maintain integration with CI/CD pipelines, using Azure DevOps and GitHub Actions, to ensure reliable automated testing.

We will consider junior developers who can demonstrate passion for development and processes.

Nice to Have Skills and Experience

Dynatrace (APM/monitoring)

Mezmo (LogDNA) (log aggregation)

BigPanda (incident intelligence)

Nucleus (security & vulnerability management)

Understanding of production observability and incident management concepts.

Eagerness to learn, grow SRE skills, and take ownership across both testing and reliability domains.

A passion for improving processes and building reliable systems.

Proven ability to work independently and take initiative with minimal guidance.

Strong background in quality engineering, with a solid understanding of automation best practices.

Familiarity with SRE concepts such as monitoring, alerting, and incident response.

Design, develop, and maintain automated test frameworks using Cypress/Playwright for web and API testing.

Experience testing applications built with JavaScript, TypeScript, Node.js, and GraphQL.

Familiarity with navigating and managing resources in Azure cloud and Kubernetes environments.

Define and execute comprehensive test strategies for new features and services.

Implement and manage CI/CD workflows using GitHub Actions or other GitHub-integrated tools.

Build and maintain integration with CI/CD pipelines, using Azure DevOps and GitHub Actions, to ensure reliable automated testing.

Conduct regression, performance, and security testing.

Collaborate with developers and product managers to ensure high-quality releases.

Participate in the SRE team's daily operations, including system monitoring, alerting, and incident response.

Implement post-deployment validation, health checks, and release safety mechanisms.

Help define and monitor SLAs, SLOs, and error budgets.

Contribute to reliability tooling, observability improvements, and performance diagnostics.

Participate in blameless postmortems and propose solutions to improve system stability.