We're seeking a Senior Software Engineer (Python, MLOps) to build and scale our multimodal AI models into a real-time API platform. You'll own the inference lifecycle, from infrastructure management and deployment optimization to API design and performance tuning, and you'll collaborate with our research team to productize cutting-edge model architectures and automate key processes.
Responsibilities:
- Build and deploy multimodal AI models into a real-time API platform.
- Manage infrastructure using Terraform (or similar IaC).
- Optimize deployment and CI/CD pipelines for seamless ML model integration.
- Design and build RESTful/WebRTC APIs for model serving (see the serving sketch after this list).
- Identify and eliminate inference lifecycle bottlenecks.
- Collaborate with research to productize new model architectures.
- Automate documentation, processes, and systems.
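
To give a concrete sense of the serving work, here is a minimal sketch of a real-time inference endpoint. It assumes FastAPI and pydantic; the route name, request fields, and the stubbed `run_inference` call are illustrative assumptions, not our actual stack.

```python
# Minimal sketch of a real-time serving endpoint (assumes FastAPI + pydantic;
# the route, fields, and stubbed run_inference call are illustrative only).
import asyncio
import time

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class GenerateRequest(BaseModel):
    prompt: str
    image_url: str | None = None  # optional second modality

class GenerateResponse(BaseModel):
    text: str
    latency_ms: float

async def run_inference(prompt: str, image_url: str | None) -> str:
    # Stand-in for the actual model call; in production this would dispatch
    # to a batched inference worker rather than compute inline.
    await asyncio.sleep(0)
    return f"echo: {prompt}"

@app.post("/v1/generate", response_model=GenerateResponse)
async def generate(req: GenerateRequest) -> GenerateResponse:
    start = time.perf_counter()
    text = await run_inference(req.prompt, req.image_url)
    return GenerateResponse(text=text, latency_ms=(time.perf_counter() - start) * 1000)
```

In the real platform, an endpoint with this shape would sit behind request batching, load balancing, and autoscaling managed through IaC and CI/CD.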
Requirements:
- Production-level experience in MLOps or Python.
- Strong backend fundamentals (concurrency, event-driven architectures, caching).
- Experience scaling software with Docker, Kubernetes, or similar.
- API/SDK design and development experience.
- Experience with cloud (AWS, GCP, Azure) or on-premises infrastructure.
- Basic understanding of LLM inference concepts (KV caching, paged attention); a toy illustration follows this list.
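
For candidates less familiar with the KV-cache idea, the toy sketch below shows the core intuition: keys and values for past tokens are computed once and reused at each decoding step, so only the new token's query needs fresh attention. Dimensions and the attention math are deliberately simplified and are not drawn from our codebase.

```python
# Toy illustration of a per-sequence KV cache for autoregressive decoding
# (simplified single-head attention; numpy only, no real model involved).
import numpy as np

class KVCache:
    def __init__(self) -> None:
        self.keys: list[np.ndarray] = []    # one (head_dim,) vector per past token
        self.values: list[np.ndarray] = []

    def append(self, k: np.ndarray, v: np.ndarray) -> None:
        # Cache the new token's key/value so they are never recomputed.
        self.keys.append(k)
        self.values.append(v)

    def attend(self, q: np.ndarray) -> np.ndarray:
        # Scaled dot-product attention of the new query against all cached keys.
        K = np.stack(self.keys)             # (seq_len, head_dim)
        V = np.stack(self.values)           # (seq_len, head_dim)
        scores = K @ q / np.sqrt(q.shape[-1])
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        return weights @ V                  # (head_dim,)

# Each decoding step adds one key/value pair and attends over the whole cache.
cache = KVCache()
for _ in range(4):
    q, k, v = (np.random.randn(64) for _ in range(3))
    cache.append(k, v)
    context = cache.attend(q)
```

Paged attention builds on the same idea by storing the cache in fixed-size blocks, so memory can be allocated non-contiguously and shared efficiently across concurrent requests.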