Opportunity Description
We are building real-time conversational AI systems for contact centres, powered by ASR, LLMs, and TTS.
As an LLM Systems Engineer, you will sit within our LLM team and focus on the systems layer that makes production conversational AI work at scale. You’ll design and improve the infrastructure, orchestration, and runtime systems behind low-latency conversational AI workflows.
This role focuses on the technical challenges of delivering real-time AI conversations: coordinating complex AI systems under strict latency and reliability constraints.
What you’ll do
- Design and build systems that enable LLM workflows to maintain real-time responses even under peak load
- Improve latency, throughput, concurrency, and reliability across our production systems
- Build orchestration logic for model calls, services, queues, retries, fallbacks, and routing that balances load management with low response times
Ready to Apply?
Submit your application for Machine Learning Systems Engineer at ConnexAI