RIVR Industry · Engineering

Senior AI Engineer - VLA Foundation Model

CHF 150'000 – 170'000 / year

ZÜRICH

AI-TITLE · MACHINE LEARNING · DEEP LEARNING · NEURAL NETWORK · REINFORCEMENT LEARNING · SUPERVISED LEARNING · GENERATIVE AI · DIFFUSION MODEL · FOUNDATION MODEL · AI ENGINEER · PYTORCH

Description

Amazon RIVR, an ETH Zurich spin-off acquired by Amazon, is building the next generation of safe, reliable autonomous robots for last-mile delivery.

In this role, you will develop multi-modal Vision-Language-Action (VLA) models to enable robots to autonomously generate actions from demonstrations, real-time sensor data, and natural language commands.

Responsibilities

Develop and implement cutting-edge Vision-Language-Action (VLA) models, generalist robot transformers, and imitation learning algorithms (e.g., diffusion policies) to enable robots to autonomously execute complex tasks.
Design, test, and refine your algorithms to meet the demands of complex real-world autonomy and navigation tasks, with a focus on spatial reasoning and generalization.
Streamline the data collection and training workflow to efficiently expand model capabilities with new tasks and data sources.
Collaborate with the reinforcement learning team to innovate methods that leverage both simulated and real-world data.
Optimize and distill networks for real-time deployment on edge hardware (e.g., NVIDIA Jetson Thor).
Build, lead, and mentor an exceptional team of software engineers.
Provide expert guidance to product managers and executives for strategic decision-making.
Create and maintain documentation, guidelines, and best practices to streamline knowledge sharing.

Qualifications

Master’s degree or higher in a relevant field such as Engineering, Robotics, or Machine Learning.
A minimum of three years of industry or research experience; time spent on a PhD counts toward this.
Strong deep learning fundamentals including supervised learning, self-supervised learning, Transformer-based architectures, policy optimization algorithms, imitation learning, and generative AI techniques (including Diffusion Models).
Proven experience in developing Vision-Language-Action (VLA) models or large-scale generalist robot models (e.g., RT-2, Octo).
Strong background in robotics, including autonomy and navigation.
Experience with deploying artificial neural networks on hardware platforms.
Ability to prototype algorithms and train deep neural networks in Python (PyTorch).
PhD degree in Robotics, Engineering, Computer Science, Machine Learning or a similar discipline, or an equivalent amount of research experience.
Publications at top-tier conferences.
Experience in managing a software team.
Ability to write production-level code in modern C++.
