ML Infra Engineer (m/f/d)

Halian

Abu Dhabi, Abu Dhabi Emirate, United Arab Emirates | Full-time | June 28, 2026

Opportunity Description

Role Overview

This role focuses on designing and building the infrastructure that enables scalable machine learning development, from training-ready datasets through to validated models deployed in production environments.

The position involves establishing core systems and architectural foundations that will support long-term scalability and performance. Key areas include training infrastructure, distributed learning frameworks, experiment management, model lifecycle management, and reliable pathways from model development to production deployment.

Key Responsibilities

  • Training infrastructure
    Design, deploy, and operate GPU-based training environments across cloud platforms such as AWS and GCP. This includes node provisioning, workload scheduling (e.g., Kubernetes, Slurm), multi-node networking, GPU monitoring, and cost/utilization optimization.

  • Distributed training systems
    Own and optimize di...

Full-time | IT & Technology
