
Test Automation & Reliability Engineer
Jobs via Dice • Phoenix, Arizona, United States
Role & seniority: Test Automation & Reliability Engineer (senior level; 5+ years in design/development/production reliability)
Stack / tools
- Test automation: Cypress or Playwright for web and API testing
- Tech stack tested: JavaScript, TypeScript, Node.js, GraphQL
- CI/CD: Azure DevOps, GitHub Actions
- Cloud / infra: Azure, Kubernetes (familiarity)
- Observability / reliability (nice-to-have): Dynatrace, Mezmo, BigPanda, Nucleus; SRE concepts; dashboards/runbooks
Top 3 responsibilities
- Design, develop, and maintain automated test frameworks; automate manual operations (toil reduction)
- Build and sustain integration with CI/CD pipelines to ensure reliable automated testing
- Collaborate with product/platform owners to define SLOs/SLIs, monitor reliability, create dashboards and runbooks, and participate in on-call/incident response
Must-have skills
- Experience designing, developing, and maintaining automated test frameworks using Cypress or Playwright for web/API testing
- Strong background testing applications built with JavaScript, TypeScript, Node.js, GraphQL
- Experience integrating automated testing with CI/CD pipelines using Azure DevOps and GitHub Actions
Nice-to-haves
- Production observability and incident management concepts; familiarity with SRE practices
- Experience with Azure cloud, Kubernetes
- Exposure to monitoring/observability tooling (Dynatrace, Mezmo (LogDNA), etc.)
- Ability to work independently, improv
Full Description
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Trident Consulting Inc., is seeking the following. Apply via Dice today!
Title: Test Automation & Reliability Engineer
Location: PHX, AZ
Hybrid working model
Top 3 required skills
Ability to design, develop, and maintain automated test frameworks using Cypress/Playwright for web and API testing.
Experience testing applications built with JavaScript, TypeScript, Node.js, and GraphQL.
Ability to build and maintain integration with CI/CD pipelines, using Azure DevOps and GitHub Actions, to ensure reliable automated testing.
Description
This list is intended to reflect the current job but there may be additional essential functions (and certainly non-essential job functions) that are not referenced. Management will modify the job or require other tasks be performed whenever it is deemed appropriate to do so, observing, of course, any legal obligations including any collective bargaining obligations.
Ensure key stakeholders, product owners, and platform owners are informed of reliability concerns and their potential impact on the customers' experience.
Design, code, test, and deliver solutions to automate manual operations (i.e., toil).
Participate in operations support and on-call rotation shifts (which could include weekends and holidays) for SRE-supported systems and products, with a focus on implementing long-term solutions for any problems identified.
Collaborate with stakeholders, such as product and platform owners, to define service-level objectives (SLOs) and service-level indicators (SLIs) for system operations, focused on the critical features of the customer's journey and experience.
Track and manage reliability performance against agreed SLOs, in partnership with IT monitoring teams or other stakeholders, and ensure systems continue to meet SLOs over time.
Provide expert knowledge on reliability approaches to ensure our organization achieves its goals and roadmap for reliability.
Champion reliability being treated as a feature in products and platforms and promote the concept across all phases of the software development life cycle.
Create dashboards and reports to communicate key metrics to product owners and key stakeholders.
Contribute to documentation and runbooks for owned applications based on operational experience, user feedback, and application changes.
ALL YOU'LL NEED FOR SUCCESS
Minimum Qualifications: Education & Prior Job Experience
Bachelor's degree in Computer Science, Computer Engineering, Technology, Information Systems (CIS/MIS), Engineering or related technical discipline, or equivalent experience/training
5+ years of experience designing, developing, and implementing large-scale solutions in production environments
Preferred Qualifications: Education & Prior Job Experience
Master's degree in Computer Science, Computer Engineering, Technology, Information Systems (CIS/MIS), Engineering or related technical discipline, or equivalent experience/training
Airline industry experience
We will consider junior developers who can demonstrate passion for development and processes.
Nice to Have Skills and Experience
Dynatrace (APM/monitoring)
Mezmo (LogDNA) (log aggregation)
BigPanda (incident intelligence)
Nucleus (security & vulnerability management)
Understanding of production observability and incident management concepts.
Eagerness to learn, grow SRE skills, and take ownership across both testing and reliability domains.
A passion for improving processes and building reliable systems.
Proven ability to work independently and take initiative with minimal guidance.
Strong background in quality engineering, with a solid understanding of automation best practices.
Familiarity with SRE concepts such as monitoring, alerting, and incident response.
Design, develop, and maintain automated test frameworks using Cypress/Playwright for web and API testing.
Experience testing applications built with JavaScript, TypeScript, Node.js, and GraphQL.
Familiarity with navigating and managing resources in Azure cloud and Kubernetes environments.
Define and execute comprehensive test strategies for new features and services.
Implement and manage CI/CD workflows using GitHub Actions or other GitHub-integrated tools.
Build and maintain integration with CI/CD pipelines, using Azure DevOps and GitHub Actions, to ensure reliable automated testing.
Conduct regression, performance, and security testing.
Collaborate with developers and product managers to ensure high-quality releases.
Participate in the SRE team's daily operations, including system monitoring, alerting, and incident response.
Implement post-deployment validation, health checks, and release safety mechanisms.
Help define and monitor SLAs, SLOs, and error budgets.
Contribute to reliability tooling, observability improvements, and performance diagnostics.
Participate in blameless postmortems and propose solutions to improve system stability.