AI Engineer, Reinforcement Learning
Job description
About Mytra
We’re creating an entirely new way to solve the most ubiquitous problem in industry: moving and storing material. We’re applying robotics and distributed software to create a new class of product for this $1T market, starting with the supply chain industry. That industry is in a massive bind with the continued growth of e-commerce, a sharp rise in costs, and supply chain disruptions. What has been a “sleepy” industry for decades is now at the epicenter of sustaining the global economy.
Role Overview
Mytra's AI team seeks pioneering engineers to architect the decision-making core of our distributed robotic intelligence system. As an AI Engineer focused on reinforcement learning, you'll push the boundaries of multi-agent coordination by developing novel approaches for real-time adaptive control, hierarchical policy learning, and decision optimization. You'll tackle challenges such as dynamic resource allocation in uncertain environments, multi-robot task scheduling, emergent collective behavior, and real-time policy adaptation under safety-critical constraints. Working at the intersection of reinforcement learning and robotics, you'll implement state-of-the-art algorithms for multi-agent coordination, efficient exploration, and robust policy optimization. You'll also collaborate with a multidisciplinary team to bridge the gap between theoretical advances and production-grade systems that operate reliably in real-world industrial environments.
Example Projects
- Design and implement a multi-agent deep RL model for coordinating robot tasks.
- Implement a new algorithm or architecture from a newly published paper.
- Develop a simulation environment that accurately models real-world warehouse dynamics, including variable payload characteristics, battery management, and per-customer success metrics, to enable faster training and validation of RL policies.
- Implement and adapt state-of-the-art RL algorithms (such as Proximal Policy Optimization or Soft Actor-Critic) to handle partial observability and high-dimensional state spaces.
- Devise a hierarchical RL system that decomposes complex tasks into manageable sub-tasks, enabling more efficient learning and better generalization across different topologies.
- Build a reward shaping mechanism that balances multiple competing objectives.
- Implement a transfer learning approach that carries knowledge from simulation training over to real-world deployment.
The Ideal Candidate
- You are a reinforcement learning and system optimization enthusiast: theoretical yet pragmatic, curious yet focused, always learning yet humble. You are motivated to see your work deployed at scale, solving real-world problems.
- You’re very comfortable writing Python and have experience with PyTorch, Ray and training deep neural networks.
- You have an owner's mindset. You are a doer.
- You are excited about working on new projects, growing together, and shaping the future of the company.
- You’ve played with different types of sensors and are familiar with robot software frameworks.
- Uncertainty doesn’t scare you, and you’re willing to recognize mistakes and pivot.
- You’re familiar with simulators such as Isaac Sim, Gymnasium, MuJoCo, Unity, or other video game environments.
- You can convey complex ideas and technical concepts to a diverse audience.
- Openness, inquisitiveness, and constant ambition are important values.
- You enjoy pair programming and taking time to help others.
A Final Note
If this is your dream role, but you can’t take on every example project in its totality, we encourage you to apply. We work closely and collaboratively at Mytra - no one takes on projects alone. We seek people who are eager to work together, learn new things, and bring unique perspectives.
The pay range for this role is:
180,000 - 220,000 USD per year (South San Francisco)
About the company
Our Story
Resilient, modular, reconfigurable, no-integration-code automated material flow and storage system enabled by robotics and AI