Principal Program Manager – Datacenter High Performance Computing

Microsoft’s Cloud Operations & Innovation (CO+I) is the engine that powers our cloud services. As a CO+I Principal Program Manager – Datacenter High Performance Computing, you will play a key role in delivering the core infrastructure and foundational technologies for Microsoft’s online services including Bing, Office 365, Xbox, OneDrive, and the Microsoft Azure platform. As a group, CO+I is focused on the personal and professional development of all employees and offers trainings and growth opportunities including Career Rotation Programs, Diversity & Inclusion trainings and events, and professional certifications.

Our infrastructure comprises a large global portfolio of more than 200 data centers in 32 countries and millions of servers. Our foundation is built upon and managed by a team of subject matter experts working to support services for more than 1 billion customers and 20 million businesses in over 90 countries worldwide.

With environmental sustainability and optimization at the forefront of our data center design and operations, we continue to grow and evolve as we meet the ever-changing business demands that hold Microsoft as a world-class cloud provider.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Successful candidates can be located anywhere in the U.S.

Responsibilities:

  • Own programs that improve the availability, scalability, latency and efficiency of HPC implementations.
  • Manage the portfolio of High Performance Computing (HPC) projects for the regional field operations organization. HPC uses clusters of powerful processors, working in parallel, to process massive multi-dimensional datasets (big data) and solve complex problems at extremely high speeds.
  • Provide technical support, monitoring and engineering of Linux systems, NVIDIA DGX1, A100 servers, and H100 servers.
  • Manage the HPC growth budget, which drives OPEX and CAPEX requirements for deploying HPC platforms, including GPU fiber infrastructure, facility retrofits to support deployments, and workflow development for the complex HPC environments that host robust AI systems.
  • Develop & implement strategies for HPC cluster physical deployment, monitoring, and tools.
  • Work with developers, hardware architects, and other teams actively engaged in enterprise HPC to improve workflows and identify new requirements.
  • Provide technical documentation, feedback, reports, technical findings, root cause analysis, and engineering technical assessments on HPC implementations to improve service delivery.
  • Support a strategic vision and implementation plan by establishing network roadmaps, data analytics, staffing efficiencies, process improvements, and governance for DC Ops strategy that will support our current and future growth.
  • Collaborate with partners to assess long-range plans and determine capacity and resource plans for both project delivery and ongoing operational support.
  • Drive service delivery excellence through benchmarking across industry best practices.
  • Work with the Global Director and all horizontal and vertical operations leaders to streamline operations to ensure flawless service delivery to our customers.
  • Learn, live, and coach the One Microsoft culture and values.

Qualifications:
Required Qualifications:

  • High School Diploma or equivalent AND 3+ years experience supporting IT equipment or related technology
    • OR 4+ years experience supporting IT equipment or related technology.

Other Requirements:

Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

  • 3+ years’ experience working with High Performance Computing systems.
  • 8+ years’ experience in Program and Project Management in large, complex, cross-functional technology or global manufacturing organizations.
  • Bachelor’s degree in Engineering, Information Technology, Computer Science, or related technical field.
  • Experience building large and geographically dispersed infrastructure supporting business-critical HPC and on-premises services.
  • Experience in creating road maps for large projects to gain efficiencies and proven experience in delivering to road map capabilities.
  • Knowledge of TCP/IP, LAN/WAN, MPLS, Routing, Switching, DWDM, and SONET Data Communications Network.
  • Awareness of IT Service Operations fundamentals and principles (ITIL, MOF, OLA/SLA, CSI, etc.).
  • Previous experience with MS Cloud Services and Platforms (e.g., Windows Azure, Office 365, Excel Pivot tables, etc.).
  • Ability to collaborate, openly communicate, present, and negotiate effectively across groups of various levels and disciplines, such as senior company leaders, engineers, suppliers, partners, government entities, technicians, customers, etc.
  • Practical knowledge of ITIL concepts and terminology – with focus on Change, Incident, Problem, and Service Delivery Management functions.
  • Experience with reporting and data analysis systems & platforms (e.g., Power BI).
  • Previous experience in structured process improvement and efficiency gains (e.g., Six Sigma, Kaizen, etc.).

Data Center Operations Management IC6 – The typical base pay range for this role across the U.S. is USD $137,800 – $267,600 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $180,000 – $294,000 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.
