Machine Learning Engineer, AI Infrastructure
Job Description
Google is seeking experienced Machine Learning Engineers to join our AI Infrastructure teams. In this role, you will be instrumental in designing, developing, and deploying cutting-edge ML infrastructure that powers Google's AI advancements across Search, Cloud, YouTube, and more. You'll work on large-scale distributed systems, optimize ML model training and inference performance, and contribute to the next generation of AI technologies.
**Responsibilities:**
- Design, implement, and maintain scalable ML infrastructure for training and serving models.
- Optimize performance of ML workloads on diverse hardware (CPUs, GPUs, TPUs).
- Develop tools and frameworks to streamline the ML development lifecycle.
- Collaborate with ML researchers and product teams to understand and meet their infrastructure needs.
- Troubleshoot and resolve complex issues in distributed ML systems.
**Qualifications:**
- Bachelor's or Master's degree in Computer Science, a related technical field, or equivalent practical experience.
- Strong programming skills in C++, Java, Python, or Go.
- Experience with distributed systems and cloud computing platforms (e.g., Google Cloud, AWS, Azure).
- Familiarity with ML frameworks like TensorFlow, PyTorch, or JAX.
- Experience with containerization technologies (Docker, Kubernetes).
**Benefits:**
- Competitive salary and equity.
- Comprehensive health, dental, and vision insurance.
- Generous paid time off and parental leave.
- Opportunities for professional development and growth.
- Access to cutting-edge AI research and development.
Skills & Tags
machine learning, infrastructure, distributed systems, python, c++