Description
About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the role

We're looking for a research engineer who believes that visual and spatial reasoning are core to fully unlocking the capabilities of LLMs. On the Vision team, you'll own the end-to-end process of creating training data and RL environments targeting visual knowledge work: identifying long-horizon and vision-heavy tasks, building evals, designing rewards, and scaling data. This is a unique role that combines applied research with hands-on data work. It's also highly collaborative: you'll partner with external vendors, pretraining, RL, and product teams to make sure the environments you build translate into real-world knowledge work capabilities.

What you'll do:
- Own the data strategy for vision capabilities end-to-end, from building evals to scaling RL environments
- Manage technical relationships with external data vendors, including writing task specifications, evaluating visual data and annotation quality, and iterating on reward design
- Develop and improve QA frameworks that catch reward hacking and ensure environment quality at scale
- Run generalization experiments to measure how data strategy changes improve multimodal capabilities on held-out evaluations
- Partner with pretraining, RL, and product teams, and do the science that shows we’re all rowing in the same direction

You may be a good fit if you:
- Have 7+ years of ML, computer vision, and software engineering experience through industry, academia, or other projects
- Have experience with reinforcement learning, reward design, or training data curation for large language or vision-language models
- Are familiar with the architecture, training, and operation of large vision-language models
- Are comfortable managing technical vendor relationships and iterating quickly on feedback
- Are results-oriented, with a bias towards flexibility and impact
- Care about the societal impacts of your work

Strong candidates may also have experience with:
- Designing evals or benchmarks for LLMs or vision-language models
- Large-scale pretraining, SL, and RL on language models
- Deep learning research on images, video, or other modalities
- Developing complex agentic systems using LLMs
- Large-scale ETL and data pipeline development

Representative projects:
- Writing a vendor-facing specification for a new family of visual RL training tasks, then iterating with the vendor on coverage, quality, and reward design
- Running experiments to determine ideal training data mixes and parameters for a synthetically generated vision dataset
- Finetuning Claude to maximize its performance using a particular set of agent tools/skills

The annual compensation range for this role is listed below.