Senior Software Engineer, Machine Learning Infrastructure
Job Description
Meta is hiring a Senior Software Engineer to join our Machine Learning Infrastructure team. You will play a key role in building and scaling the systems that enable Meta's AI researchers and engineers to develop and deploy world-leading machine learning models. This is an opportunity to work on highly scalable, distributed systems that power one of the largest AI organizations in the world.
Responsibilities:
- Design, develop, and maintain robust and scalable ML infrastructure, including training, inference, and data processing platforms.
- Optimize ML workflows for performance, efficiency, and reliability.
- Collaborate with ML researchers and engineers to understand their needs and build infrastructure that addresses them.
- Implement best practices for software development, testing, and deployment.
- Contribute to the evolution of our internal ML tools and frameworks.
Requirements:
- BS/MS in Computer Science or a related technical field, or equivalent practical experience.
- 5+ years of experience in software engineering, with a focus on distributed systems and infrastructure.
- Strong programming skills in C++ or Python.
- Experience with ML frameworks (PyTorch, TensorFlow) and ML infrastructure concepts (e.g., distributed training, hyperparameter tuning, model serving).
- Familiarity with cloud platforms and containerization technologies (e.g., Kubernetes).
Benefits:
Meta offers competitive salaries, bonuses, stock awards, comprehensive health coverage, 401(k) matching, and a culture that emphasizes innovation and impact.
Skills & Tags
ml infrastructure, software engineering, distributed systems, python, c++