Reinforcement Learning Algorithm Engineer

60-90K · 14 months' pay per year

Beijing · Master's degree · No experience requirement

Job Description

(Large model team; headcount open at levels P7 through P9; based in Shanghai or Beijing)

**Key Responsibilities**:
- Develop and optimize **RL algorithms** (e.g., PPO, DQN) for real-world applications.
- Train and deploy models using **PyTorch/TensorFlow** in simulation or on real systems.
- Improve the **sample efficiency, generalization, and robustness** of RL agents.
- Collaborate with other teams to integrate RL into products (robotics, games, etc.).

**Requirements**:
- **MSc/PhD in AI/CS or equivalent experience**.
- Strong **RL fundamentals** and hands-on experience with Python and PyTorch/TensorFlow.
- Experience with **RL frameworks (RLlib, Gym, etc.)**.
- Bonus: **multi-agent RL, robotics, or distributed training**.
