Role & seniority: Support & Automation Engineer (senior-level, production-focused platform reliability role)

Stack/tools: Cloud platforms (Azure preferred; AWS/GCP acceptable), networking/security fundamentals, monitoring/observability tooling, scripting (Python, PowerShell, Bash), infrastructure-as-code (Terraform, Bicep, ARM), automation pipelines, AI-driven runbooks

Top 3 responsibilities

Monitor, protect, and improve client systems; perform RCA; reduce MTTR; design systematic response plans
Build automation to replace manual work; design automated remediation pipelines; integrate AI-driven runbooks and alert → diagnose → resolve workflows
Move clients toward self-healing architectures; identify automation candidates; implement closed-loop remediation and reduce ticket volume

Must-have skills

Strong cloud experience (Azure preferred; AWS/GCP acceptable)
Networking, identity, security fundamentals
Monitoring/observability tooling proficiency
Scripting ability (Python, PowerShell, Bash)
Infrastructure-as-code experience (Terraform, Bicep, ARM)

Nice-to-haves

Experience with Microsoft Fabric, data platforms, or AI workloads
Building LLM-powered tooling or agent workflows
CI/CD pipeline experience
Prior work in managed services or DevOps

Location & work type: Remote position; full-time, client-facing managed services environment

Notes

Emphasis on real production systems, proactive engineering, and AI-ena

Full Description

This is a remote position.

Build Systems That Heal Themselves

Most companies treat support as reactive.

We don’t.

We believe infrastructure should be observable, intelligent, and progressively autonomous.

Our Managed Services team exists to move clients from fragile environments to mature, self-healing systems powered by automation and AI.

We’re looking for Support & Automation Engineers who think beyond tickets. Engineers who see recurring incidents and instinctively ask:

“Why is this happening repeatedly — and how do we eliminate it permanently?”

If that’s how you think, keep reading.

What You’ll Actually Do

This is not a helpdesk role.

This is systems engineering with accountability.

You will:

Monitor & Protect Client Systems

Proactively monitor cloud infrastructure, data platforms, and AI workloads
Identify performance degradation, anomalies, and failure patterns
Perform root cause analysis (RCA) with structured documentation
Reduce MTTR through systematic response design

Remediate with Engineering Discipline

Resolve incidents across cloud, networking, data, and AI systems
Patch, optimize, and harden client environments
Improve reliability through configuration management and guardrails
Translate recurring issues into structured improvement plans

Build Automation That Replaces Manual Work

Write scripts, workflows, and agents to eliminate repetitive support tasks
Design automated remediation pipelines
Implement monitoring-triggered workflows (alert → diagnose → resolve)
Build AI-driven runbooks and intelligent response systems

Move Clients Toward Self-Healing Architecture

Identify automation candidates inside recurring incidents
Implement closed-loop remediation systems
Integrate LLM-based diagnostics and decision support
Reduce ticket volume through engineered automation

Your success is measured not by tickets closed, but by tickets that never happen again.

What Makes This Role Different

You will work on real production systems, not internal sandbox environments
You will design automation that impacts live client operations
You will help architect environments that require less human intervention over time
You will operate at the intersection of Cloud, Data, AI, and Infrastructure

This is operational engineering with strategic depth.

Requirements

What We’re Looking For

Core Technical Foundation

Strong experience with cloud platforms (Azure preferred, AWS/GCP acceptable)
Understanding of networking, identity, and security fundamentals
Experience with monitoring and observability tooling
Scripting ability (Python, PowerShell, Bash, or similar)
Familiarity with infrastructure-as-code (Terraform, Bicep, ARM, etc.)
