Anthropic is an AI safety and research company on a mission to build reliable, interpretable, and steerable AI systems. We are looking for passionate AI Research Scientists to join our Safety & Alignment team. You will work on critical challenges in ensuring AI systems behave in accordance with human intentions and values. This role involves theoretical research, empirical validation, and developing novel techniques for AI alignment.

**Responsibilities:**

* Conduct research into fundamental problems of AI safety, interpretability, and alignment.
* Develop and evaluate methods for training and controlling advanced AI systems.
* Design experiments and analyze results to inform safety research directions.
* Collaborate with engineers to integrate safety techniques into deployed AI models.
* Publish research findings in top-tier AI conferences and journals.

**Requirements:**

* PhD in Computer Science, Machine Learning, Statistics, or a related quantitative field, with a focus on AI safety, alignment, interpretability, or related areas.
* Strong theoretical and empirical research skills.
* Proficiency in Python and common ML libraries (e.g., PyTorch, JAX).
* Experience with large language models or other complex AI systems.
* Excellent analytical and problem-solving abilities.

**Benefits:**

Anthropic provides a highly competitive salary, equity options, comprehensive health, dental, and vision insurance, generous PTO, and a stimulating work environment focused on impactful research.

AI Research Scientist, Safety & Alignment

