Systems Software Engineer, Kubernetes Scale - DGX Cloud
This role involves in-depth performance and scale analysis of the NVIDIA DGX Cloud software stack, with a focus on Kubernetes and NVIDIA components such as GPU Operator, DCGM, and NIM. The engineer will diagnose distributed-systems issues, build automated testing frameworks, and contribute to open-source communities to optimize large-scale AI infrastructure. The work includes continuous performance testing, root cause analysis, and collaboration with AI teams and cloud platforms.