Opportunity Description
The AI Infrastructure Engineer is a platform specialist responsible for architecting, building, and operating high-performance AI infrastructure to support advanced AI workloads, including LLMs, GenAI, Computer Vision, and MLOps. This role will focus on managing GPU clusters (NVIDIA A100/H100), deploying and maintaining Red Hat OpenShift AI (RHOAI, formerly RHODS), and ensuring secure, scalable, and cost-efficient AI platforms across SDD’s Sovereign Cloud and hybrid/multi-cloud environments. The engineer will enable enterprise-grade AI adoption for 200+ government entities.
Key Responsibilities & Deliverables
GPU & AI Platform Architecture
Design and implement GPU-based compute clusters. Define reference architectures for LLM hosting, Vector Databases, MLOps, and high-performance storage/networking.
Fully operational GPU-based AI infrastructure. GPU Cluster Uptime and Performance Utilization. Reduction in Cost per Training/Inference Workload.
GPU Cluster Oper...
Submit your application for AI Infrastructure Engineer at Dautom