Posted 10 June, 2026
Technical Lead Manager, TorchTPU
London, UK
Full Time
Minimum qualifications:
- Bachelor's degree or equivalent practical experience.
- 8 years of experience with software development in one or more programming languages (e.g., Python, C++ or C).
- 5 years of experience in a technical leadership role overseeing projects.
- 5 years of experience in a people management or supervision/team leadership role.
- Experience with machine learning frameworks, compiler technology, or high-performance computing (HPC).
- Experience leading engineering projects with cross-functional or global stakeholders.
Preferred qualifications:
- Master’s degree or PhD in Engineering, Computer Science, or a related technical field.
- Experience leading teams on compiler stacks or infrastructure, such as Multi-Level Intermediate Representation (MLIR) or Low Level Virtual Machine (LLVM).
- Experience optimizing performance for Generative AI and Large Language Models (LLMs).
- Experience contributing to or maintaining large-scale open-source machine learning projects.
- Background in HPC, GPU workloads, or ML frameworks like JAX, PyTorch, or TensorFlow.
- Proven track record of delivering global projects through cross-functional collaboration.
About the job
Like Google's own ambitions, the work of a Software Engineer goes beyond just Search. Software Engineering Managers not only have the technical expertise to take on and provide technical leadership to major projects, but also manage a team of Engineers. You not only optimize your own code but also make sure Engineers are able to optimize theirs. As a Software Engineering Manager, you manage your project goals, contribute to product strategy, and help develop your team. Teams work all across the company, in areas such as information retrieval, artificial intelligence, natural language processing, distributed computing, large-scale system design, networking, security, data compression, and user interface design; the list goes on and is growing every day. Operating with scale and speed, our exceptional software engineers are just getting started, and as a manager, you guide the way.
With technical and leadership expertise, you manage engineers across multiple teams and locations, manage a large product budget, and oversee the deployment of large-scale projects across multiple sites internationally.
Google Cloud provides organizations with leading infrastructure and enterprise-grade solutions, leveraging Google’s technology to help customers in over 150 countries solve critical business problems.
As a part of the Core ML team, you will develop frameworks and compilers that support the GCP Cloud TPU service. You will provide customers with large-scale access to Google’s first-party ML supercomputers to run training and inference workloads using PyTorch and JAX.
As a part of the PyTorch TPU team, you will be responsible for the PyTorch framework, ecosystem, and model performance, and you will lead engagements with customers to help them achieve massive scale and speed on Google’s TPUs.
The ML, Systems, & Cloud AI (MSCA) organization at Google designs, implements, and manages the hardware, software, machine learning, and systems infrastructure for all Google services (Search, YouTube, etc.) and Google Cloud. Our end users are Googlers, Cloud customers, and the billions of people who use Google services around the world.
We prioritize security, efficiency, and reliability across everything we do, from developing our latest TPUs to running a global network, while working to shape the future of hyperscale computing. Our global impact spans software and hardware, including Google Cloud’s Vertex AI, the leading AI platform for bringing Gemini models to enterprise customers.
Responsibilities
- Lead and manage a team of software engineers, promoting a collaborative culture and psychological safety.
- Coach and mentor engineers to achieve their potential while aligning team execution with TorchTPU priorities and organizational goals.
- Collaborate with global peer managers and teams to drive AI framework development, enabling PyTorch models to run with peak performance on Cloud TPUs.
- Deliver end-to-end compiler performance optimizations and contribute to open-source software, supporting advanced ML frameworks and compilers on Cloud TPUs and GPUs.
- Enable PyTorch models at massive scale for generative models, computer vision, language modeling, and other advanced machine learning applications.

