About the Role
WHAT YOU'LL DO
- Build low-latency inference pipelines for on-device deployment, enabling real-time next-token and diffusion-based control loops in robotics
- Design and optimize distributed inference systems on GPU clusters, pushing throughput with large-batch serving and efficient resource utilization
- Implement efficient low-level code (CUDA, Triton, custom kernels) and integrate it seamlessly into high-level frameworks
- Optimize workloads for both throughput (batching, scheduling, quantization) and latency (caching, memory management, graph compilation)
- Develop monitoring and debugging tools to guarantee reliability, determinism, and rapid diagnosis of regressions across both stacks

WHAT YOU'LL BRING
- Deep experience in distributed systems, ML infrastructure, or high-performance s
About Genesis AI