
Microsoft • Taiwan
Role & seniority
Stack/tools
Hardware systems for GPU servers and data-center infrastructure
Telemetry and log data analysis for failure signatures
Root Cause Analysis (RCA), corrective actions, and quality metrics
GPU subsystems, power, cooling (including liquid cooling familiarity)
Data center manufacturing/repair processes and high-volume production environments
Top 3 responsibilities
Develop and implement supplier quality management strategy for data-center hardware
Lead quality issues and improvement tasks to contain, mitigate, and resolve top global data-center quality problems
Conduct debug and failure analysis for GPU subsystems; drive resolutions with partners/suppliers; establish quality readouts and action plans from telemetry
Must-have skills
Master’s degree in Electrical Eng or related field with 3+ years, or Bachelor’s with 5+ years, or equivalent
5+ years of product quality experience in electronics; 5+ years of hardware issue resolution for GPU servers
Experience filtering debugging data (telemetry/logs) to identify failure signatures
Ability to meet Microsoft security screening requirements (Cloud Background Check)
Nice-to-haves
7+ years in large-scale manufacturing/data center environments; advanced degrees or PhDs in related fields
Patent or proven track record of engineering excellence
Experience with liquid cooling systems in data centers
12+ years in modern server architectures; 8+ ye
Overview
Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the team behind Microsoft’s expanding Cloud Infrastructure and is responsible for powering Microsoft’s “Intelligent Cloud” mission. SCHIE delivers the core infrastructure and foundational technologies for more than 200 Microsoft online businesses, including Bing, MSN, Office 365, Xbox Live, Teams, OneDrive, and the Microsoft Azure platform, globally through our server and data center infrastructure, security and compliance, operations, globalization, and manageability solutions. Our focus is on smart growth, high efficiency, and delivering a trusted experience to customers and partners worldwide. We are looking for passionate, high-energy engineers to help achieve that mission.
As Microsoft's cloud business continues to grow, the ability to deploy new offerings and hardware infrastructure on time, in high volume, with high quality, and at the lowest cost is of paramount importance. To achieve this goal, the Hardware, Infrastructure Management, and Fundamentals Engineering (HIFE) team is instrumental in defining and delivering operational measures of success for hardware manufacturing and in improving the planning process, quality, delivery, scale, and sustainability of Microsoft cloud hardware. We are looking for seasoned engineers with a dedicated passion for customer-focused solutions and the insight and industry knowledge to envision and implement future technical solutions that will manage and optimize the Cloud infrastructure.
We are looking for a Senior HW Quality Engineer to join the team.
#azurehwjobs #HIFE
Responsibilities
Develop and implement a robust supplier quality management strategy to ensure that data center hardware is manufactured to the highest quality standards.
Lead the quality issue and improvement task force to contain, mitigate, and resolve the top quality issues impacting global data centers.
Conduct debug and failure analysis for GPU subsystems in the Azure fleet and drive resolution with partners and suppliers.
Drive the continuous improvement process based on Root Cause Analysis (RCA) and identified opportunities.
Deliver quality readouts based on your telemetry data analysis to bring clarity on status, actions across the organization, and next steps for issue resolution.
Establish Critical-to-Quality performance metrics to measure and improve product quality.
Qualifications
Master's Degree in Electrical Engineering or related field AND 3+ years technical engineering experience OR Bachelor's Degree in Electrical Engineering or related field AND 5+ years technical engineering experience OR equivalent experience.
5+ years of work experience in managing product quality in the electronic industry.
5+ years of direct engineering experience in hardware system issue resolution for GPU Servers.
Versed in filtering through applicable debug data, such as telemetry and logs, to identify and investigate HW failure signatures.
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Preferred Qualifications
Bachelor's Degree in electrical and systems engineering, or related field AND 7+ years experience in a large-scale manufacturing and/or data center environment/repair OR Master's Degree in manufacturing, material, mechanical, electrical, and industrial engineering, or related field AND 6+ years experience in a high-volume manufacturing environment OR Doctorate in manufacturing, material, mechanical, electrical, and industrial engineering, or related field AND 3+ years experience in a manufacturing environment/repair OR 9+ years equivalent experience.
Patent or track record of engineering excellence.
Experience with Liquid Cooling Systems in Data Centers.
12+ years of experience working with modern server architectures, including an understanding of GPU and CPU methods for failure analysis, debugging, or validation.
8+ years of system-level server debugging with an understanding of power, system, and network environments.
3+ years of direct GPU-related engineering experience in issue debug/test log review.
Leadership skills and ability to collaborate with diverse teams and drive a call to action.
Experience in root cause analysis and corrective action methods to identify contributing factors of production defects.
Ability to analyze large data sets, extract key insights, and effectively present and communicate the results.
Proficient communication and project management skills.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances.
If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.