
AiThority Interview with James Rubin, Product Manager at Google


Please tell us a little bit about your role and responsibilities in your current company.

Since 2022, I’ve been a Product Manager at Google on their core AI/ML team. My passion lies in defining and unlocking AI experiences that deliver exceptional value to ML developers, consumers, and enterprises. Together with a team of ML engineers, researchers, and go-to-market specialists, I focus on optimizing AI systems such as PaLM for Google products, as well as third-party models like Stable Diffusion for industry partners and Cloud customers. I also launched OpenXLA in partnership with my incredible team at Google and 11 other AI/ML industry leaders, including NVIDIA, Hugging Face, AWS, and Graphcore.

Prior to Google, I was an AI-focused PM at Amazon, launching and leading products at Zoox, Alexa, and AWS that spanned the ML stack. Outside of work, I am an advisor to Oddworld Inhabitants, a narrative animation and gaming company backed by Epic Games, and a stealth-mode blockchain startup.

What are the critical challenges impeding the growth and distribution of open source AI and ML projects?

There can be many barriers to growth for open source AI/ML, but two stand out in particular: sustaining community engagement and the fragmented landscape of OSS AI/ML tools. The difficulty of sustaining community engagement in OSS is not unique to the AI/ML sphere. In fact, research indicates that 80% of open source projects falter due to this issue. To address this, projects can focus on fostering the social capital of their community. At OpenXLA, we’ve done this by: (1) assembling a cross-company network of developers with diverse expertise in model architectures, frameworks, compilers, hardware, and cloud; (2) promoting cohesion through virtual and in-person meet-ups featuring presentations of work; and (3) adhering to a vendor-neutral governance model that rewards contributions and encourages transparent, consensus-based decision-making.

The fragmentation of AI/ML tools presents another significant challenge. Many open source AI/ML infrastructure products lack the extensibility, flexibility, and production-readiness needed for broad adoption. Tools tend to be either domain-specific (e.g., hardware type) or generalizable but not production-quality. These limitations often compel developers to create their own siloed infrastructure, further fragmenting an already sprawling landscape of tools.

Despite these obstacles, the open source AI/ML ecosystem is seeing significant growth. As we speak, all of the top 25 trending GitHub repositories from the past month are AI/ML projects.

Many successful OSS AI/ML projects are being productized by start-ups with open core or open source SaaS business models. These include unicorns like Hugging Face, Stability AI (Stable Diffusion), and Databricks (Spark), as well as Deepset (Haystack), Replicate (Cog), Modular (MLIR), and OctoML (TVM), which are backed by firms like Sequoia and Madrona. These start-ups can leverage their OSS developer communities to drive top-of-the-funnel growth for their commercial offerings.

What is the OpenXLA Project? What are the primary objectives of launching the project?

OpenXLA is a suite of ML infrastructure tools that lets developers easily accelerate and scale their models’ performance across different hardware types, such as GPUs and Google TPUs. Its core component is XLA (Accelerated Linear Algebra), a compiler specialized for ML that performs automated model optimizations and generates efficient hardware-specific code.
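
To make that concrete, here is a minimal sketch using JAX, one of the frameworks that compiles through XLA. The toy model, shapes, and names are illustrative, not taken from the interview.

```python
import jax
import jax.numpy as jnp

# A toy model step. When compiled by XLA, the matmul, bias add, and
# ReLU below are fused into a small number of hardware-specific kernels.
def predict(params, x):
    w, b = params
    return jax.nn.relu(x @ w + b)

# jax.jit traces the function once and hands it to XLA, which applies
# optimizations such as operator fusion and layout assignment, then
# emits efficient code for the available backend (CPU, GPU, or TPU).
predict_jit = jax.jit(predict)

key = jax.random.PRNGKey(0)
w = jax.random.normal(key, (512, 512))
b = jnp.zeros(512)
x = jax.random.normal(key, (8, 512))

out = predict_jit((w, b), x)  # first call compiles; later calls reuse the compiled binary
print(out.shape)  # (8, 512)
```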

Our project is driven forward by a few key goals.

Firstly, we want to make models portable across the diverse ecosystem of hardware and frameworks. There’s a plethora of ML accelerators in use today, each with a unique architecture. Optimizing models for these accelerators can require crafting hardware-specific assembly code, a process that demands specialist knowledge and doesn’t scale across hardware types and generations. Via its unified compiler-framework API and hardware-specific compiler optimizations, OpenXLA aims to enable execution of any model from any major framework on a wide array of hardware with minimal code changes.
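
As a rough illustration of that portability, the JAX sketch below (with made-up shapes) runs the same compiled function on every device JAX can see, whether CPU, GPU, or TPU; XLA handles the hardware-specific code generation in each case.

```python
import jax
import jax.numpy as jnp

@jax.jit
def layer(w, x):
    return jnp.tanh(x @ w)

w = jnp.ones((256, 256))
x = jnp.ones((4, 256))

# The same model code executes unchanged on each available backend;
# XLA compiles a device-specific binary for every placement.
for device in jax.devices():
    w_d = jax.device_put(w, device)
    x_d = jax.device_put(x, device)
    print(device, float(layer(w_d, x_d).sum()))
```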

We also want to provide leading out-of-the-box performance and scalability for current and emerging models. In just four years, large model parameter counts have grown 1,800x, imposing substantial compute costs on development teams. These models also vary widely in their characteristics: their algorithms and operations differ (e.g., multi-head attention for Transformers, convolutions for CNNs), as do their input shapes and batch sizes. OpenXLA aims to provide best-in-class performance for a multitude of model types and significantly reduce training and inference costs.

Lastly, we want to let developers extend our compiler platform via its composable and layered stack. Inflexible ML infrastructure has led hardware vendors to develop bespoke toolchains such as MKL and cuDNN. Our compiler platform instead embraces the diverse ecosystem of hardware, frameworks, and pluggable software libraries via extension points and escape hatches that allow vendors to fully customize the compilation flow.

Could you tell us about the role of AI technology leaders in this project and how they would support its growth?

OpenXLA’s members, including Alibaba, Amazon Web Services, AMD, Arm, Apple, Cerebras, Google, Graphcore, Hugging Face, Intel, Meta, and NVIDIA, are the lifeblood of our ecosystem. They help grow the OpenXLA platform by deeply integrating their AI models, frameworks, primitive libraries, and hardware; their expert practitioners infuse our community with unique insights from across the ML stack via code contributions, discussions, and monthly presentations.

Moreover, our members rely on OpenXLA to boost performance for their own products and customers. Meta AI leverages XLA to power their large-scale PyTorch workloads for research on TPUs, while at Google, we employ XLA to compile everything from Bard to Waymo’s on-vehicle models. Cerebras, Graphcore, and AWS use XLA for their custom silicon, owing to its exceptional performance, stability, and compatibility with popular frameworks. MLOps tools like Google Cloud’s Vertex AI and Amazon SageMaker also use XLA for model acceleration. This direct involvement and investment from leading hardware, framework, cloud, and model providers ensures that with OpenXLA, the broader ML community gets top performance for their preferred tools.


Please tell us more about MLIR and the compiler infrastructure it supports.

Multi-Level Intermediate Representation (MLIR) is a framework for building compilers, especially for machine learning. It was launched in 2019 by Google’s TensorFlow team, under the leadership of Swift creator Chris Lattner and TFLite founder Tim Davis, and later donated to the LLVM Project.

MLIR provides an extensible compiler infrastructure that lets developers create, implement, and re-use compiler components like optimization passes and code generators for various hardware types and applications. Its primary objectives are to substantially lower the cost of developing compilers, piece together different intermediate representations, and enhance compilation for a wide range of hardware targets.

OpenXLA is built with MLIR, as are numerous other ML infrastructure projects such as TensorFlow, OpenAI’s Triton, and Modular’s forthcoming AI developer platform.
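
For a feel of what this looks like in practice, here is a small sketch (assuming a recent JAX install): lowering a jitted JAX function prints its StableHLO program, the MLIR dialect that serves as the input into OpenXLA.

```python
import jax
import jax.numpy as jnp

def f(x):
    return jnp.sin(x) * 2.0

# Lowering (without compiling) exposes the intermediate representation:
# an MLIR module in the StableHLO dialect, which OpenXLA then optimizes
# and compiles for a specific hardware target.
lowered = jax.jit(f).lower(jnp.ones((3,)))
print(lowered.as_text())  # MLIR text with ops like stablehlo.sine and stablehlo.multiply
```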

Tech companies are introducing ChatGPT- and GPT-3-based tools and capabilities for IT, marketing, and sales customers. What are your thoughts on this blazing trend?

The surge in text/code generation tools and plug-ins from start-ups like Jasper AI and established players such as Google (Bard) and GitHub (Copilot, powered by OpenAI Codex) is pretty remarkable. Estimates indicate that some of these AI apps are already generating several hundred million dollars in revenue on an annualized basis. Moreover, the market for generative AI conceivably encompasses any activity requiring human creativity and intelligence.

The strong product-market fit for some of these AI tools – as evidenced by their rapid, retentive growth and ability to monetize – is in part attributable to the fundamental needs they address, notably our input/output-bound nature. So many of us desire to produce more with less time and energy. Generative AI provides that step-function improvement in our ability to both assimilate and convey information.

As you noted, sales and marketing has emerged as a popular early use case. A multitude of copywriting apps, such as Jasper AI, Copy.ai, and Twain, are entering the market. The trend is at least in part driven by the inflection in e-commerce growth since COVID and the resulting demand for targeted marketing and ad copy that can be generated at scale. AI copywriting apps promise improved conversion rates, SEO optimization, and, of course, productivity, which can help reduce CAC. Certain marketing orgs have even been able to reduce their labor costs by up to 50% using Jasper AI.

I think the long-term moats for these apps, however, remain to be seen. Many of them offer similar functionality via the same foundation models (GPT-3, T5, PaLM, BLOOM), which may converge in quality over time. I’m personally excited about AI companies in the sales and marketing space building defensibility for their tools by looking beyond single tasks like copywriting and addressing the end-to-end user journey with AI-first experiences.

Telescope AI is a great example of this for sales prospecting. It provides a platform designed from the ground up around AI that not only optimizes and automates sales outreach emails, but also identifies the ideal customer profile, generates leads, and evaluates customer sentiment. If Telescope can achieve wider distribution and use its proprietary user-interaction data to continually improve model performance for prospecting tasks, its vertically fine-tuned solution and end-to-end, user-centric workflow could be difficult for competitors to replicate.

How could technology platforms benefit from leveraging Generative AI?

Assuming “technology platforms” refers to products like Google Search, cloud platforms like Lambda Labs or Google Cloud, or model hosting platforms like Hugging Face, then yes, these businesses stand to benefit substantially from offering and leveraging generative AI.

Users gravitate toward platforms offering low-friction, cost-effective, and reliable ways to transact, interact, or develop and run services at scale. Generative AI has the ability to greatly augment this core value proposition.

For instance, Etsy could pass on time and cost savings to sellers via AI tools for creating targeted detail page descriptions and chatbots to handle time-consuming interactions with buyers. Likewise, Instacart could enrich the buyer’s experience with features like image-to-text grocery list generation based on, for example, a photo of an appealing dish spotted at a restaurant.

In turn, generative AI can itself benefit from tapping into platforms. AI advancements are propelled by a virtuous cycle of more data, improved algorithms, and faster, cheaper compute. Platforms can provide AI models with distribution and a trove of user interaction data sustained by network effects. By re-training models on this data, model makers can enhance performance and UX, leading to stronger technical differentiation and moats for their AI.

Hugging Face exemplifies this strategy with its platform-based ecosystem, which thrives on cross-side network effects between model providers and users. It features a community of 10,000+ organizations and thousands of active contributors who have shared over 160,000 models and 26,000 datasets. The continuous stream of feedback data flowing through the platform – including pull requests, user prompts, model outputs, and fine-tuned child models – empowers providers to further refine their AI systems, ultimately benefiting all stakeholders in the ecosystem.

Thank you, James! That was fun and we hope to see you back on AiThority.com soon.


James is a Product Lead at Google, responsible for the optimization of AI models. He works with ML engineers and researchers to ensure that Alphabet’s end users and Cloud customers receive responsive, accurate, and delightful AI experiences – whether they are chatbots, protein folding algorithms, or autonomous vehicles. He also leads open-source strategy from a product perspective for the OpenXLA Project, an end-to-end ML infrastructure platform used by AWS, Hugging Face, Graphcore, Meta, and many more organizations to accelerate tens of millions of ML workloads a day.

Outside of Google, James serves as an Advisor to Oddworld Inhabitants, a narrative animation and gaming company backed by Epic Games, and a stealth-mode blockchain startup.

Google is a multinational corporation that specializes in Internet-related services and products. Googlers build products that help create opportunities for everyone, whether down the street or across the globe. Bring your insight, imagination and a healthy disregard for the impossible. Bring everything that makes you unique. 
