Opportunity Description
NVIDIA DGX Cloud is building the operating model for reliable, scalable GPU infrastructure across internal, partner, and on-prem environments. We are looking for an Engineering Manager to lead a team of software and production engineers focused on Kubernetes-based operations, automation, reliability, and cluster lifecycle tooling. This leader will help run today’s production systems while building the automation and engineering practices needed for the next generation of DGX Cloud infrastructure.
What you’ll be doing:
+ Lead a team of software and production engineers building and operating DGX Cloud infrastructure across NVIDIA Cloud Partner (NCP) and on-prem environments.
+ Drive execution across cluster operations, Kubernetes operability, automation, GitOps, observability, and incident response.
+ Help define team priorities, roadmap, staffing, and operational ownership.
+ Partner with platform, workload, storage, networking, security, and TPM teams to i...
Submit your application for Engineering Manager, DGX Cloud Production Engineering at NVIDIA