The Zig • Nairobi, Kenya
Role & seniority: Support & Automation Engineer (senior-level, production-focused platform reliability role)
Stack/tools: Cloud platforms (Azure preferred; AWS/GCP acceptable), networking/security fundamentals, monitoring/observability tooling, scripting (Python, PowerShell, Bash), infrastructure-as-code (Terraform, Bicep, ARM), automation pipelines, AI-driven runbooks
Responsibilities:
Monitor, protect, and improve client systems; perform RCA; reduce MTTR; design systematic response plans
Build automation to replace manual work; design automated remediation pipelines; integrate AI-driven runbooks and alert → diagnose → resolve workflows
Move clients toward self-healing architectures; identify automation candidates; implement closed-loop remediation and reduce ticket volume
Requirements:
Strong cloud experience (Azure preferred; AWS/GCP acceptable)
Networking, identity, security fundamentals
Monitoring/observability tooling proficiency
Scripting ability (Python, PowerShell, Bash)
Infrastructure-as-code experience (Terraform, Bicep, ARM)
Bonus experience:
Experience with Microsoft Fabric, data platforms, or AI workloads
Building LLM-powered tooling or agent workflows
CI/CD pipeline experience
Prior work in managed services or DevOps
Location & work type: Remote position; full-time, client-facing managed services environment
This is a remote position.
Build Systems That Heal Themselves
Most companies treat support as reactive.
We don’t.
We believe infrastructure should be observable, intelligent, and progressively autonomous.
Our Managed Services team exists to move clients from fragile environments to mature, self-healing systems powered by automation and AI.
“Why is this happening repeatedly — and how do we eliminate it permanently?”
If that’s how you think, keep reading.
What You’ll Actually Do
This is not a helpdesk role.
This is systems engineering with accountability.
You will:

Monitor & Protect Client Systems
Proactively monitor cloud infrastructure, data platforms, and AI workloads
Identify performance degradation, anomalies, and failure patterns
Perform root cause analysis (RCA) with structured documentation
Reduce MTTR through systematic response design

Remediate with Engineering Discipline
Resolve incidents across cloud, networking, data, and AI systems
Patch, optimize, and harden client environments
Improve reliability through configuration management and guardrails
Translate recurring issues into structured improvement plans

Build Automation That Replaces Manual Work
Write scripts, workflows, and agents to eliminate repetitive support tasks
Design automated remediation pipelines
Implement monitoring-triggered workflows (alert → diagnose → resolve)
Build AI-driven runbooks and intelligent response systems

Move Clients Toward Self-Healing Architecture
Identify automation candidates inside recurring incidents
Implement closed-loop remediation systems
Integrate LLM-based diagnostics and decision support
Reduce ticket volume through engineered automation

Your success is measured not by tickets closed, but by tickets that never happen again.
What Makes This Role Different
You will work on real production systems, not internal sandbox environments
You will design automation that impacts live client operations
You will help architect environments that require less human intervention over time
You will operate at the intersection of Cloud, Data, AI, and Infrastructure
This is operational engineering with strategic depth.
Requirements
What We’re Looking For
Core Technical Foundation
Strong experience with cloud platforms (Azure preferred, AWS/GCP acceptable)
Understanding of networking, identity, and security fundamentals
Experience with monitoring and observability tooling
Scripting ability (Python, PowerShell, Bash, or similar)
Familiarity with infrastructure-as-code (Terraform, Bicep, ARM, etc.)
Automation Mindset
You naturally look for patterns in failures
You think in systems, not isolated fixes
You prefer engineering solutions over repetitive manual work
You care about reliability metrics (SLA, SLO, MTTR, incident frequency)
Bonus Experience
Experience with Microsoft Fabric, data platforms, or AI workloads
Building LLM-powered tooling or agent workflows
CI/CD pipeline experience
Experience in a managed services or DevOps environment
How We Evaluate Engineers
Clear thinking under pressure
Structured debugging
Ownership mentality
Strong documentation habits
Ability to communicate clearly with both technical and non-technical stakeholders
We are not looking for:
Ticket passers
Engineers who fix symptoms but ignore root cause
People who wait to be told what to improve
What Growth Looks Like
Designs full automation frameworks for clients
Reduces client incident volume by 40–60%
Architects intelligent monitoring strategies
Contributes to AI-driven managed service tooling
Becomes a reliability leader within the organization
Why This Role Matters
Most managed services teams scale by hiring more support engineers.
We scale by:
Automating response
Embedding AI into operations
Building resilient, intelligent environments
You will help define what next-generation managed services looks like.
Ideal Candidate Profile
Enjoys debugging complex systems
Thinks about failure modes before they happen
Gets satisfaction from making systems more reliable
Wants to move from reactive support to proactive engineering
Believes AI should augment operations, not replace thinking
If This Sounds Like You
Apply with:
A short note explaining how you’ve automated a recurring operational issue
A GitHub or code sample (if available)
A description of the most complex system you’ve debugged
We’re building systems that improve themselves.
If that excites you — we should talk.
Benefits
What We Offer at The Zig
Meaningful, real-world work
Accelerated growth through ownership
Front-row seat to AI-driven change
Investment in learning
Competitive compensation and benefits