The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Waymo's Foundation Model for Autonomous Driving with Drago Anguelov - #725

Today, we're joined by Drago Anguelov, head of AI foundations at Waymo, for a deep dive into the role of foundation models in autonomous driving. Drago shares how Waymo is leveraging large-scale machine learning, including vision-language models and generative AI techniques, to improve perception, planning, and simulation for its self-driving vehicles. The conversation explores the evolution of Waymo's research stack, their custom "Waymo Foundation Model," and how they're incorporating multimodal sensor data like lidar, radar, and camera into advanced AI systems. Drago also discusses how Waymo ensures safety at scale with rigorous validation frameworks, predictive world models, and realistic simulation environments. Finally, we touch on the challenges of generalization across cities, freeway driving, end-to-end learning vs. modular architectures, and the future of AV testing through ML-powered simulation. The complete show notes for this episode can be found at https://twimlai.com/go/725.

2025-03-31
Link to episode

Dynamic Token Merging for Efficient Byte-level Language Models with Julie Kallini - #724

Today, we're joined by Julie Kallini, PhD student at Stanford University, to discuss her recent papers, "MrT5: Dynamic Token Merging for Efficient Byte-level Language Models" and "Mission: Impossible Language Models." For the MrT5 paper, we explore the importance and failings of tokenization in large language models (including inefficient compression rates for under-resourced languages) and dig into byte-level modeling as an alternative. We discuss the architecture of MrT5, its ability to learn language-specific compression rates, its performance on multilingual benchmarks and character-level manipulation tasks, and its efficiency. For the "Mission: Impossible Language Models" paper, we review the core idea behind the research, the definition and creation of impossible languages, the creation of impossible language training datasets, and the bias of language model architectures towards natural language. The complete show notes for this episode can be found at https://twimlai.com/go/724.

2025-03-24
Link to episode

Scaling Up Test-Time Compute with Latent Reasoning with Jonas Geiping - #723

Today, we're joined by Jonas Geiping, research group leader at Ellis Institute and the Max Planck Institute for Intelligent Systems, to discuss his recent paper, "Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach." This paper proposes a novel language model architecture which uses recurrent depth to enable "thinking in latent space." We dig into "internal reasoning" versus "verbalized reasoning" (analogous to non-verbalized and verbalized thinking in humans) and discuss how the model searches in latent space to predict the next token and dynamically allocates more compute based on token difficulty. We also explore how the recurrent depth architecture simplifies LLMs, the parallels to diffusion models, the model's performance on reasoning tasks, the challenges of comparing models with varying compute budgets, and architectural advantages such as zero-shot adaptive exits and natural speculative decoding. The complete show notes for this episode can be found at https://twimlai.com/go/723.

2025-03-17
Link to episode

Imagine while Reasoning in Space: Multimodal Visualization-of-Thought with Chengzu Li - #722

Today, we're joined by Chengzu Li, PhD student at the University of Cambridge, to discuss his recent paper, "Imagine while Reasoning in Space: Multimodal Visualization-of-Thought." We explore the motivations behind MVoT, its connection to prior work like TopViewRS, and its relation to cognitive science principles such as dual coding theory. We dig into the MVoT framework along with its various task environments: maze, mini-behavior, and frozen lake. We explore token discrepancy loss, a technique designed to align language and visual embeddings, ensuring accurate and meaningful visual representations. Additionally, we cover the data collection and training process, reasoning over relative spatial relations between different entities, and dynamic spatial reasoning. Lastly, Chengzu shares insights from experiments with MVoT, focusing on the lessons learned and the potential for applying these models in real-world scenarios like robotics and architectural design. The complete show notes for this episode can be found at https://twimlai.com/go/722.

2025-03-10
Link to episode

Inside s1: An o1-Style Reasoning Model That Cost Under $50 to Train with Niklas Muennighoff - #721

Today, we're joined by Niklas Muennighoff, a PhD student at Stanford University, to discuss his paper, "s1: Simple Test-Time Scaling." We explore the motivations behind s1, as well as how it compares to OpenAI's o1 and DeepSeek's R1 models. We dig into the different approaches to test-time scaling, including parallel and sequential scaling, as well as s1's data curation process, its training recipe, and its use of model distillation from Google Gemini and DeepSeek R1. We explore the novel "budget forcing" technique developed in the paper, which allows the model to think longer on harder problems and optimize test-time compute for better performance. Additionally, we cover the evaluation benchmarks used, the comparison between supervised fine-tuning and reinforcement learning, and similar projects like the Hugging Face Open R1 project. Finally, we discuss the open-sourcing of s1 and its future directions. The complete show notes for this episode can be found at https://twimlai.com/go/721.

2025-03-04
Link to episode

Accelerating AI Training and Inference with AWS Trainium2 with Ron Diamant - #720

Today, we're joined by Ron Diamant, chief architect for Trainium at Amazon Web Services, to discuss hardware acceleration for generative AI and the design and role of the recently released Trainium2 chip. We explore the architectural differences between Trainium and GPUs, highlighting its systolic array-based compute design, and how it balances performance across key dimensions like compute, memory bandwidth, memory capacity, and network bandwidth. We also discuss the Trainium tooling ecosystem, including the Neuron SDK, Neuron Compiler, and Neuron Kernel Interface (NKI). We then dig into the various ways Trainium2 is offered, including Trn2 instances, UltraServers, and UltraClusters, and access through managed services like AWS Bedrock. Finally, we cover sparsity optimizations, customer adoption, performance benchmarks, support for Mixture of Experts (MoE) models, and what's next for Trainium. The complete show notes for this episode can be found at https://twimlai.com/go/720.

2025-02-24
Link to episode

π0: A Foundation Model for Robotics with Sergey Levine - #719

Today, we're joined by Sergey Levine, associate professor at UC Berkeley and co-founder of Physical Intelligence, to discuss π0 (pi-zero), a general-purpose robotic foundation model. We dig into the model architecture, which pairs a vision language model (VLM) with a diffusion-based action expert, and the model training "recipe," emphasizing the roles of pre-training and post-training with a diverse mixture of real-world data to ensure robust and intelligent robot learning. We review the data collection approach, which uses human operators and teleoperation rigs, the potential of synthetic data and reinforcement learning in enhancing robotic capabilities, and much more. We also introduce the team's new FAST tokenizer, which opens the door to a fully Transformer-based model and significant improvements in learning and generalization. Finally, we cover the open-sourcing of π0 and future directions for their research. The complete show notes for this episode can be found at https://twimlai.com/go/719.

2025-02-18
Link to episode

AI Trends 2025: AI Agents and Multi-Agent Systems with Victor Dibia - #718

Today, we're joined by Victor Dibia, principal research software engineer at Microsoft Research, to explore the key trends and advancements in AI agents and multi-agent systems shaping 2025 and beyond. In this episode, we discuss the unique abilities that set AI agents apart from traditional software systems: reasoning, acting, communicating, and adapting. We also examine the rise of agentic foundation models, the emergence of interface agents like Claude with Computer Use and OpenAI Operator, the shift from simple task chains to complex workflows, and the growing range of enterprise use cases. Victor shares insights into emerging design patterns for autonomous multi-agent systems, including graph and message-driven architectures, the advantages of the "actor model" pattern as implemented in Microsoft's AutoGen, and guidance on how users should approach the "build vs. buy" decision when working with AI agent frameworks. We also address the challenges of evaluating end-to-end agent performance, the complexities of benchmarking agentic systems, and the implications of our reliance on LLMs as judges. Finally, we look ahead to the future of AI agents in 2025 and beyond, discuss emerging HCI challenges, their potential for impact on the workforce, and how they are poised to reshape fields like software engineering. The complete show notes for this episode can be found at https://twimlai.com/go/718.

2025-02-10
Link to episode

Speculative Decoding and Efficient LLM Inference with Chris Lott - #717

Today, we're joined by Chris Lott, senior director of engineering at Qualcomm AI Research, to discuss accelerating large language model inference. We explore the challenges presented by LLM encoding and decoding (aka generation) and how these interact with various hardware constraints such as FLOPS, memory footprint, and memory bandwidth to limit key inference metrics such as time-to-first-token, tokens per second, and tokens per joule. We then dig into a variety of techniques that can be used to accelerate inference, such as KV compression, quantization, pruning, speculative decoding, and leveraging small language models (SLMs). We also discuss future directions for enabling on-device agentic experiences, such as parallel generation and software tools like Qualcomm AI Orchestrator. The complete show notes for this episode can be found at https://twimlai.com/go/717.

2025-02-04
Link to episode

Ensuring Privacy for Any LLM with Patricia Thaine - #716

Today, we're joined by Patricia Thaine, co-founder and CEO of Private AI, to discuss techniques for ensuring privacy, data minimization, and compliance when using third-party large language models (LLMs) and other AI services. We explore the risks of data leakage from LLMs and embeddings, the complexities of identifying and redacting personal information across various data flows, and the approach Private AI has taken to mitigate these risks. We also dig into the challenges of entity recognition in multimodal systems including OCR files, documents, images, and audio, and the importance of data quality and model accuracy. Additionally, Patricia shares insights on the limitations of data anonymization, the benefits of balancing real-world and synthetic data in model training and development, and the relationship between privacy and bias in AI. Finally, we touch on the evolving landscape of AI regulations like GDPR, CPRA, and the EU AI Act, and the future of privacy in artificial intelligence. The complete show notes for this episode can be found at https://twimlai.com/go/716.

2025-01-28
Link to episode

AI Engineering Pitfalls with Chip Huyen - #715

Today, we're joined by Chip Huyen, independent researcher and writer, to discuss her new book, "AI Engineering." We dig into the definition of AI engineering, its key differences from traditional machine learning engineering, the common pitfalls encountered in engineering AI systems, and strategies to overcome them. We also explore how Chip defines AI agents, their current limitations and capabilities, and the critical role of effective planning and tool utilization in these systems. Additionally, Chip shares insights on the importance of evaluation in AI systems, highlighting the need for systematic processes, human oversight, and rigorous metrics and benchmarks. Finally, we touch on the impact of open-source models, the potential of synthetic data, and Chip's predictions for the year ahead. The complete show notes for this episode can be found at https://twimlai.com/go/715.

2025-01-21
Link to episode

Evolving MLOps Platforms for Generative AI and Agents with Abhijit Bose - #714

Today, we're joined by Abhijit Bose, head of enterprise AI and ML platforms at Capital One, to discuss the evolution of the company's approach and insights on Generative AI and platform best practices. In this episode, we dig into the company's platform-centric approach to AI, and how they've been evolving their existing MLOps and data platforms to support the new challenges and opportunities presented by generative AI workloads and AI agents. We explore their use of cloud-based infrastructure (in this case, on AWS) to provide a foundation upon which they then layer open-source and proprietary services and tools. We cover their use of Llama 3 and open-weight models, their approach to fine-tuning, their observability tooling for Gen AI applications, their use of inference optimization techniques like quantization, and more. Finally, Abhijit shares the future of agentic workflows in the enterprise, the application of OpenAI o1-style reasoning in models, and the new roles and skillsets required in the evolving GenAI landscape. The complete show notes for this episode can be found at https://twimlai.com/go/714.

2025-01-13
Link to episode

Why Agents Are Stupid & What We Can Do About It with Dan Jeffries - #713

Today, we're joined by Dan Jeffries, founder and CEO of Kentauros AI, to discuss the challenges currently faced by those developing advanced AI agents. We dig into how Dan defines agents and distinguishes them from other similar uses of LLMs, explore various use cases for them, and dig into ways to create smarter agentic systems. Dan shares his "big brain, little brain, tool brain" approach to tackling real-world challenges in agents, the trade-offs in leveraging general-purpose vs. task-specific models, and his take on LLM reasoning. We also cover the way he thinks about model selection for agents, along with the need for new tools and platforms for deploying them. Finally, Dan emphasizes the importance of open source in advancing AI, shares the new products they're working on, and explores the future directions in the agentic era. The complete show notes for this episode can be found at https://twimlai.com/go/713.

2024-12-16
Link to episode

Automated Reasoning to Prevent LLM Hallucination with Byron Cook - #712

Today, we're joined by Byron Cook, VP and distinguished scientist in the Automated Reasoning Group at AWS, to dig into the underlying technology behind the newly announced Automated Reasoning Checks feature of Amazon Bedrock Guardrails. Automated Reasoning Checks uses mathematical proofs to help LLM users safeguard against hallucinations. We explore recent advancements in the field of automated reasoning, along with some of the ways it is applied, both broadly and across AWS, where it is used to enhance security, cryptography, virtualization, and more. We discuss how the new feature helps users to generate, refine, validate, and formalize policies, and how those policies can be deployed alongside LLM applications to ensure the accuracy of generated text. Finally, Byron shares the benchmarks they've applied, the use of techniques like "constrained coding" and "backtracking," and the future co-evolution of automated reasoning and generative AI. The complete show notes for this episode can be found at https://twimlai.com/go/712.

2024-12-09
Link to episode

AI at the Edge: Qualcomm AI Research at NeurIPS 2024 with Arash Behboodi - #711

Today, we're joined by Arash Behboodi, director of engineering at Qualcomm AI Research, to discuss the papers and workshops Qualcomm will be presenting at this year's NeurIPS conference. We dig into the challenges and opportunities presented by differentiable simulation in wireless systems, the sciences, and beyond. We also explore recent work that ties conformal prediction to information theory, yielding a novel approach to incorporating uncertainty quantification directly into machine learning models. Finally, we review several papers enabling the efficient use of LoRA (Low-Rank Adaptation) on mobile devices (Hollowed Net, ShiRA, FouRA). Arash also previews the demos Qualcomm will be hosting at NeurIPS, including new video editing diffusion and 3D content generation models running on-device, Qualcomm's AI Hub, and more! The complete show notes for this episode can be found at https://twimlai.com/go/711.

2024-12-03
Link to episode

AI for Network Management with Shirley Wu - #710

Today, we're joined by Shirley Wu, senior director of software engineering at Juniper Networks, to discuss how machine learning and artificial intelligence are transforming network management. We explore various use cases where AI and ML are applied to enhance the quality, performance, and efficiency of networks across Juniper's customers, including diagnosing cable degradation, proactive monitoring for coverage gaps, and real-time fault detection. We also dig into the complexities of integrating data science into networking, the trade-offs between traditional methods and ML-based solutions, the role of feature engineering and data in networking, the applicability of large language models, and Juniper's approach to using smaller, specialized ML models to optimize speed, latency, and cost. Finally, Shirley shares some future directions for Juniper Mist, such as proactive network testing and end-user self-service. The complete show notes for this episode can be found at https://twimlai.com/go/710.

2024-11-19
Link to episode

Why Your RAG System Is Broken, and How to Fix It with Jason Liu - #709

Today, we're joined by Jason Liu, freelance AI consultant, advisor, and creator of the Instructor library, to discuss all things retrieval-augmented generation (RAG). We dig into the tactical and strategic challenges companies face with their RAG systems, the different signs Jason looks for to identify looming problems, the issues he most commonly encounters, and the steps he takes to diagnose these issues. We also cover the significance of building out robust test datasets, data-driven experimentation, evaluation tools, and metrics for different use cases. We also touch on fine-tuning strategies for RAG systems, the effectiveness of different chunking strategies, the use of collaboration tools like Braintrust, and how future models will change the game. Lastly, we cover Jason's interest in teaching others how to capitalize on their own AI experience via his AI consulting course. The complete show notes for this episode can be found at https://twimlai.com/go/709.

2024-11-11
Link to episode

An Agentic Mixture of Experts for DevOps with Sunil Mallya - #708

Today, we're joined by Sunil Mallya, CTO and co-founder of Flip AI. We discuss Flip's incident debugging system for DevOps, which was built using a custom mixture of experts (MoE) large language model (LLM) trained on a novel "CoMELT" observability dataset, which combines traditional MELT data (metrics, events, logs, and traces) with code to efficiently identify root failure causes in complex software systems. We discuss the challenges of integrating time-series data with LLMs and their multi-decoder architecture designed for this purpose. Sunil describes their system's agent-based design, focusing on clear roles and boundaries to ensure reliability. We examine their "chaos gym," a reinforcement learning environment used for testing and improving the system's robustness. Finally, we discuss the practical considerations of deploying such a system at scale in diverse environments and much more. The complete show notes for this episode can be found at https://twimlai.com/go/708.

2024-11-04
Link to episode

Building AI Voice Agents with Scott Stephenson - #707

Today, we're joined by Scott Stephenson, co-founder and CEO of Deepgram, to discuss voice AI agents. We explore the importance of perception, understanding, and interaction and how these key components work together in building intelligent AI voice agents. We discuss the role of multimodal LLMs as well as speech-to-text and text-to-speech models in building AI voice agents, and dig into the benefits and limitations of text-based approaches to voice interactions. We dig into what's required to deliver real-time voice interactions and the promise of closed-loop, continuously improving, federated learning agents. Finally, Scott shares practical applications of AI voice agents at Deepgram and provides an overview of their newly released agent toolkit. The complete show notes for this episode can be found at https://twimlai.com/go/707.

2024-10-28
Link to episode

Is Artificial Superintelligence Imminent? with Tim Rocktäschel - #706

Today, we're joined by Tim Rocktäschel, senior staff research scientist at Google DeepMind, professor of Artificial Intelligence at University College London, and author of the recently published popular science book, "Artificial Intelligence: 10 Things You Should Know." We dig into the attainability of artificial superintelligence and the path to achieving generalized superhuman capabilities across multiple domains. We discuss the importance of open-endedness in developing autonomous and self-improving systems, as well as the role of evolutionary approaches and algorithms. Additionally, we cover Tim's recent research projects such as "Promptbreeder," "Debating with More Persuasive LLMs Leads to More Truthful Answers," and more. The complete show notes for this episode can be found at https://twimlai.com/go/706.

2024-10-21
Link to episode

ML Models for Safety-Critical Systems with Lucas García - #705

Today, we're joined by Lucas García, principal product manager for deep learning at MathWorks, to discuss incorporating ML models into safety-critical systems. We begin by exploring the critical role of verification and validation (V&V) in these applications. We review the popular V-model for engineering critical systems and then dig into the "W" adaptation that's been proposed for incorporating ML models. Next, we discuss the complexities of applying deep learning neural networks in safety-critical applications using the aviation industry as an example, and talk through the importance of factors such as data quality, model stability, robustness, interpretability, and accuracy. We also explore formal verification methods, abstract transformer layers, transformer-based architectures, and the application of various software testing techniques. Lucas also introduces the field of constrained deep learning and convex neural networks and its benefits and trade-offs. The complete show notes for this episode can be found at https://twimlai.com/go/705.

2024-10-14
Link to episode

AI Agents: Substance or Snake Oil with Arvind Narayanan - #704

Today, we're joined by Arvind Narayanan, professor of Computer Science at Princeton University, to discuss his recent works, AI Agents That Matter and AI Snake Oil. In "AI Agents That Matter," we explore the range of agentic behaviors, the challenges in benchmarking agents, and the "capability and reliability gap," which creates risks when deploying AI agents in real-world applications. We also discuss the importance of verifiers as a technique for safeguarding agent behavior. We then dig into the AI Snake Oil book, which uncovers examples of problematic and overhyped claims in AI. Arvind shares various use cases of failed applications of AI, outlines a taxonomy of AI risks, and shares his insights on AI's catastrophic risks. Additionally, we touch on different approaches to LLM-based reasoning, his views on tech policy and regulation, and his work on CORE-Bench, a benchmark designed to measure AI agents' accuracy in computational reproducibility tasks. The complete show notes for this episode can be found at https://twimlai.com/go/704.

2024-10-07
Link to episode

AI Agents for Data Analysis with Shreya Shankar - #703

Today, we're joined by Shreya Shankar, a PhD student at UC Berkeley, to discuss DocETL, a declarative system for building and optimizing LLM-powered data processing pipelines for large-scale and complex document analysis tasks. We explore how DocETL's optimizer architecture works, the intricacies of building agentic systems for data processing, the current landscape of benchmarks for data processing tasks, how these differ from reasoning-based benchmarks, and the need for robust evaluation methods for human-in-the-loop LLM workflows. Additionally, Shreya shares real-world applications of DocETL, the importance of effective validation prompts, and building robust and fault-tolerant agentic systems. Lastly, we cover the need for benchmarks tailored to LLM-powered data processing tasks and the future directions for DocETL. The complete show notes for this episode can be found at https://twimlai.com/go/703.

2024-09-30
Link to episode

Stealing Part of a Production Language Model with Nicholas Carlini - #702

Today, we're joined by Nicholas Carlini, research scientist at Google DeepMind, to discuss adversarial machine learning and model security, focusing on his 2024 ICML best paper winner, "Stealing part of a production language model." We dig into this work, which demonstrated the ability to successfully steal the last layer of production language models including ChatGPT and PaLM-2. Nicholas shares the current landscape of AI security research in the age of LLMs, the implications of model stealing, ethical concerns surrounding model privacy, how the attack works, and the significance of the embedding layer in language models. We also discuss the remediation strategies implemented by OpenAI and Google, and the future directions in the field of AI security. We also cover his other ICML 2024 best paper, "Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining," which questions the use and promotion of differential privacy in conjunction with pre-trained models. The complete show notes for this episode can be found at https://twimlai.com/go/702.

2024-09-23
Link to episode

Supercharging Developer Productivity with ChatGPT and Claude with Simon Willison - #701

Today, we're joined by Simon Willison, independent researcher and creator of Datasette, to discuss the many ways software developers and engineers can take advantage of large language models (LLMs) to boost their productivity. We dig into Simon's own workflows and how he uses popular models like ChatGPT and Anthropic's Claude to write and test hundreds of lines of code while out walking his dog. We review Simon's favorite prompting and debugging techniques, his strategies for sidestepping the limitations of contemporary models, how he uses Claude's Artifacts feature for rapid prototyping, his thoughts on the use and impact of vision models, the role he sees for open source models and local LLMs, and much more. The complete show notes for this episode can be found at https://twimlai.com/go/701.

2024-09-17
Link to episode

Automated Design of Agentic Systems with Shengran Hu - #700

Today, we're joined by Shengran Hu, a PhD student at the University of British Columbia, to discuss Automated Design of Agentic Systems (ADAS), an approach focused on automatically creating agentic system designs. We explore the spectrum of agentic behaviors, the motivation for learning all aspects of agentic system design, the key components of the ADAS approach, and how it uses LLMs to design novel agent architectures in code. We also cover the iterative process of ADAS, its potential to shed light on the behavior of foundation models, the higher-level meta-behaviors that emerge in agentic systems, and how ADAS uncovers novel design patterns through emergent behaviors, particularly in complex tasks like the ARC challenge. Finally, we touch on the practical applications of ADAS and its potential use in system optimization for real-world tasks. The complete show notes for this episode can be found at https://twimlai.com/go/700.

2024-09-02
Link to episode

The EU AI Act and Mitigating Bias in Automated Decisioning with Peter van der Putten - #699

Today, we're joined by Peter van der Putten, director of the AI Lab at Pega and assistant professor of AI at Leiden University. We discuss the newly adopted European AI Act and the challenges of applying academic fairness metrics in real-world AI applications. We dig into the key ethical principles behind the Act, its broad definition of AI, and how it categorizes various AI risks. We also discuss the practical challenges of implementing fairness and bias metrics in real-world scenarios, and the importance of a risk-based approach in regulating AI systems. Finally, we cover how the EU AI Act might influence global practices, similar to the GDPR's effect on data privacy, and explore strategies for closing bias gaps in real-world automated decision-making. The complete show notes for this episode can be found at https://twimlai.com/go/699.

2024-08-27
Link to episode

The Building Blocks of Agentic Systems with Harrison Chase - #698

Today, we're joined by Harrison Chase, co-founder and CEO of LangChain, to discuss LLM frameworks, agentic systems, RAG, evaluation, and more. We dig into the elements of a modern LLM framework, including the most productive developer experiences and appropriate levels of abstraction. We dive into agents and agentic systems as well, covering the "spectrum of agenticness," cognitive architectures, and real-world applications. We explore key challenges in deploying agentic systems, and the importance of agentic architectures as a means of communication in system design and operation. Additionally, we review evolving use cases for RAG, and the role of observability, testing, and evaluation tools in moving LLM applications from prototype to production. Lastly, Harrison shares his hot takes on prompting, multi-modal models, and more! The complete show notes for this episode can be found at https://twimlai.com/go/698.

2024-08-19

Simplifying On-Device AI for Developers with Siddhika Nevrekar - #697

Today, we're joined by Siddhika Nevrekar, AI Hub head at Qualcomm Technologies, to discuss on-device AI and how to make it easier for developers to take advantage of device capabilities. We unpack the motivations for AI engineers to move model inference from the cloud to local devices, and explore the challenges associated with on-device AI. We dig into the role of hardware solutions, from powerful systems-on-chip (SoCs) to neural processors, the importance of collaboration between community runtimes like ONNX and TFLite and chip manufacturers, the unique challenges of IoT and autonomous vehicles, and the key metrics developers should focus on to ensure optimal on-device performance. Finally, Siddhika introduces Qualcomm's AI Hub, a platform developed to simplify the process of testing and optimizing AI models across different devices. The complete show notes for this episode can be found at https://twimlai.com/go/697.

2024-08-12

Genie: Generative Interactive Environments with Ashley Edwards - #696

Today, we're joined by Ashley Edwards, a member of technical staff at Runway, to discuss Genie: Generative Interactive Environments, a system for creating "playable" video environments for training deep reinforcement learning (RL) agents at scale in a completely unsupervised manner. We explore the motivations behind Genie, the challenges of data acquisition for RL, and Genie's capability to learn world models from videos without explicit action data, enabling seamless interaction and frame prediction. Ashley walks us through Genie's core components (the latent action model, video tokenizer, and dynamics model) and explains how these elements collaborate to predict future frames in video sequences. We discuss the model architecture, training strategies, benchmarks used, as well as the application of spatiotemporal transformers and the MaskGIT techniques used for efficient token prediction and representation. Finally, we touch on Genie's practical implications, its comparison to other video generation models like "Sora," and potential future directions in video generation and diffusion models. The complete show notes for this episode can be found at https://twimlai.com/go/696.

2024-08-05

Bridging the Sim2real Gap in Robotics with Marius Memmel - #695

Today, we're joined by Marius Memmel, a PhD student at the University of Washington, to discuss his research on sim-to-real transfer approaches for developing autonomous robotic agents in unstructured environments. Our conversation focuses on his recent ASID and URDFormer papers. We explore the complexities presented by real-world settings like a cluttered kitchen, data acquisition challenges for training robust models, the importance of simulation, and the challenge of bridging the sim2real gap in robotics. Marius introduces ASID, a framework designed to enable robots to autonomously generate and refine simulation models to improve sim-to-real transfer. We discuss the role of Fisher information as a metric for trajectory sensitivity to physical parameters and the importance of exploration and exploitation phases in robot learning. Additionally, we cover URDFormer, a transformer-based model that generates URDF documents for scene and object reconstruction to create realistic simulation environments. The complete show notes for this episode can be found at https://twimlai.com/go/695.

2024-07-30

Building Real-World LLM Products with Fine-Tuning and More with Hamel Husain - #694

Today, we're joined by Hamel Husain, founder of Parlance Labs, to discuss the ins and outs of building real-world products using large language models (LLMs). We kick things off discussing novel applications of LLMs and how to think about modern AI user experiences. We then dig into the key challenge faced by LLM developers: how to iterate from a snazzy demo or proof-of-concept to a working LLM-based application. We discuss the pros, cons, and role of fine-tuning LLMs and dig into when to use this technique. We cover the fine-tuning process, common pitfalls in evaluation (such as relying too heavily on generic tools and missing the nuances of specific use cases), open-source LLM fine-tuning tools like Axolotl, the use of LoRA adapters, and more. Hamel also shares insights on model optimization and inference frameworks and how developers should approach these tools. Finally, we dig into how to use systematic evaluation techniques to guide the improvement of your LLM application, the importance of data generation and curation, and the parallels to traditional software engineering practices. The complete show notes for this episode can be found at https://twimlai.com/go/694.

2024-07-23

Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert's recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We dig into the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines versus end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models which incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

2024-07-17

Decoding Animal Behavior to Train Robots with EgoPet with Amir Bar - #692

Today, we're joined by Amir Bar, a PhD candidate at Tel Aviv University and UC Berkeley, to discuss his research on visual-based learning, including his recent paper, "EgoPet: Egomotion and Interaction Data from an Animal's Perspective." Amir shares his research projects focused on self-supervised object detection and analogy reasoning for general computer vision tasks. We also discuss the current limitations of caption-based datasets in model training, the "learning problem" in robotics, and the gap between the capabilities of animals and AI systems. Amir introduces "EgoPet," a dataset and benchmark tasks which allow motion and interaction data from an animal's perspective to be incorporated into machine learning models for robotic planning and proprioception. We explore the dataset collection process, comparisons with existing datasets and benchmark tasks, the findings on the model performance trained on EgoPet, and the potential of directly training robot policies that mimic animal behavior. The complete show notes for this episode can be found at https://twimlai.com/go/692.

2024-07-09

How Microsoft Scales Testing and Safety for Generative AI with Sarah Bird - #691

Today, we're joined by Sarah Bird, chief product officer of responsible AI at Microsoft. We discuss the testing and evaluation techniques Microsoft applies to ensure safe deployment and use of generative AI, large language models, and image generation. In our conversation, we explore the unique risks and challenges presented by generative AI, the balance between fairness and security concerns, the application of adaptive and layered defense strategies for rapid response to unforeseen AI behaviors, the importance of automated AI safety testing and evaluation alongside human judgment, and the implementation of red teaming and governance. Sarah also shares learnings from Microsoft's "Tay" and "Bing Chat" incidents along with her thoughts on the rapidly evolving GenAI landscape. The complete show notes for this episode can be found at https://twimlai.com/go/691.

2024-07-01

Long Context Language Models and their Biological Applications with Eric Nguyen - #690

Today, we're joined by Eric Nguyen, PhD student at Stanford University. In our conversation, we explore his research on long context foundation models and their application to biology, particularly Hyena and its evolution into the Hyena DNA and Evo models. We discuss Hyena, a convolution-based language model developed to tackle the challenges posed by long context lengths in language modeling. We dig into the limitations of transformers in dealing with longer sequences, the motivation for using convolutional models over transformers, Hyena's training and architecture, the role of FFT in computational optimizations, and model explainability in long-sequence convolutions. We also talk about Hyena DNA, a genomic foundation model pre-trained with context lengths of up to 1 million tokens, designed to capture long-range dependencies in DNA sequences. Finally, Eric introduces Evo, a 7 billion parameter hybrid model integrating attention layers with Hyena DNA's convolutional framework. We cover generating and designing DNA with language models, hallucinations in DNA models, evaluation benchmarks, the trade-offs between state-of-the-art models, zero-shot versus few-shot performance, and the exciting potential in areas like CRISPR-Cas gene editing. The complete show notes for this episode can be found at https://twimlai.com/go/690.

2024-06-25

Accelerating Sustainability with AI with Andres Ravinet - #689

Today, we're joined by Andres Ravinet, sustainability global black belt at Microsoft, to discuss the role of AI in sustainability. We explore real-world use cases where AI-driven solutions are leveraged to help tackle environmental and societal challenges, from early warning systems for extreme weather events to reducing food waste along the supply chain to conserving the Amazon rainforest. We cover the major threats that sustainability aims to address, the complexities in standardized sustainability compliance reporting, and the factors driving businesses to take a step toward sustainable practices. Lastly, Andres addresses the ways LLMs and generative AI can be applied towards the challenges of sustainability. The complete show notes for this episode can be found at https://twimlai.com/go/689.

2024-06-18

Gen AI at the Edge: Qualcomm AI Research at CVPR 2024 with Fatih Porikli - #688

Today we're joined by Fatih Porikli, senior director of technology at Qualcomm AI Research. In our conversation, we cover several of the Qualcomm team's 16 accepted main track and workshop papers at this year's CVPR conference. The papers span a variety of generative AI and traditional computer vision topics, with an emphasis on increased training and inference efficiency for mobile and edge deployment. We explore efficient diffusion models for text-to-image generation, grounded reasoning in videos using language models, real-time on-device 360° image generation for video portrait relighting, a unique video-language model for situated interactions like fitness coaching, a visual reasoning model and benchmark for interpreting complex mathematical plots, and more! We also touch on several of the demos the team will be presenting at the conference, including multi-modal vision-language models (LLaVA) and parameter-efficient fine-tuning (LoRA) on mobile phones. The complete show notes for this episode can be found at https://twimlai.com/go/688.

2024-06-11

Energy Star Ratings for AI Models with Sasha Luccioni - #687

Today, we're joined by Sasha Luccioni, AI and Climate lead at Hugging Face, to discuss the environmental impact of AI models. We dig into her recent research into the relative energy consumption of general purpose pre-trained models vs. task-specific, non-generative models for common AI tasks. We discuss the implications of the significant difference in efficiency and power consumption between the two types of models. Finally, we explore the complexities of energy efficiency and performance benchmarking, and talk through Sasha's recent initiative, Energy Star Ratings for AI Models, a rating system designed to help AI users select and deploy models based on their energy efficiency. The complete show notes for this episode can be found at http://twimlai.com/go/687.

2024-06-04

Language Understanding and LLMs with Christopher Manning - #686

Today, we're joined by Christopher Manning, the Thomas M. Siebel professor in Machine Learning at Stanford University and a recent recipient of the 2024 IEEE John von Neumann medal. In our conversation with Chris, we discuss his contributions to foundational research areas in NLP, including word embeddings and attention. We explore his perspectives on the intersection of linguistics and large language models, their ability to learn human language structures, and their potential to teach us about human language acquisition. We also dig into the concept of "intelligence" in language models, as well as the reasoning capabilities of LLMs. Finally, Chris shares his current research interests, alternative architectures he anticipates emerging beyond the LLM, and opportunities ahead in AI research. The complete show notes for this episode can be found at https://twimlai.com/go/686.

2024-05-27

Chronos: Learning the Language of Time Series with Abdul Fatir Ansari - #685

Today we're joined by Abdul Fatir Ansari, a machine learning scientist at AWS AI Labs in Berlin, to discuss his paper, "Chronos: Learning the Language of Time Series." Fatir explains the challenges of leveraging pre-trained language models for time series forecasting. We explore the advantages of Chronos over statistical models, as well as its promising results in zero-shot forecasting benchmarks. Finally, we address critiques of Chronos, the ongoing research to improve synthetic data quality, and the potential for integrating Chronos into production systems. The complete show notes for this episode can be found at twimlai.com/go/685.

2024-05-20

Powering AI with the World's Largest Computer Chip with Joel Hestness - #684

Today we're joined by Joel Hestness, principal research scientist and lead of the core machine learning team at Cerebras. We discuss Cerebras' custom silicon for machine learning, Wafer Scale Engine 3, and how the latest version of the company's single-chip platform for ML has evolved to support large language models. Joel shares how WSE3 differs from other AI hardware solutions, such as GPUs, TPUs, and AWS' Inferentia, and talks through the homogeneous design of the WSE chip and its memory architecture. We discuss software support for the platform, including support by open source ML frameworks like PyTorch, and support for different types of transformer-based models. Finally, Joel shares some of the research his team is pursuing to take advantage of the hardware's unique characteristics, including weight-sparse training, optimizers that leverage higher-order statistics, and more. The complete show notes for this episode can be found at twimlai.com/go/684.

2024-05-13

AI for Power & Energy with Laurent Boinot - #683

Today we're joined by Laurent Boinot, power and utilities lead for the Americas at Microsoft, to discuss the intersection of AI and energy infrastructure. We discuss the many challenges faced by current power systems in North America and the role AI is beginning to play in driving efficiencies in areas like demand forecasting and grid optimization. Laurent shares a variety of examples along the way, including some of the ways utility companies are using AI to ensure secure systems, interact with customers, navigate internal knowledge bases, and design electrical transmission systems. We also discuss the future of nuclear power, and why electric vehicles might play a critical role in American energy management. The complete show notes for this episode can be found at twimlai.com/go/683.

2024-05-07

Controlling Fusion Reactor Instability with Deep Reinforcement Learning with Aza Jalalvand - #682

Today we're joined by Azarakhsh (Aza) Jalalvand, a research scholar at Princeton University, to discuss his work using deep reinforcement learning to control plasma instabilities in nuclear fusion reactors. Aza explains how his team developed a model to detect and avoid a fatal plasma instability called "tearing mode." Aza walks us through the process of collecting and pre-processing the complex diagnostic data from fusion experiments, training the models, and deploying the controller algorithm on the DIII-D fusion research reactor. He shares insights from developing the controller and discusses the future challenges and opportunities for AI in enabling stable and efficient fusion energy production. The complete show notes for this episode can be found at twimlai.com/go/682.

2024-04-29

GraphRAG: Knowledge Graphs for AI Applications with Kirk Marple - #681

Today we're joined by Kirk Marple, CEO and founder of Graphlit, to explore the emerging paradigm of "GraphRAG," or Graph Retrieval Augmented Generation. In our conversation, Kirk digs into the GraphRAG architecture and how Graphlit uses it to offer a multi-stage workflow for ingesting, processing, retrieving, and generating content using LLMs (like GPT-4) and other Generative AI tech. He shares how the system performs entity extraction to build a knowledge graph and how graph, vector, and object storage are integrated in the system. We dive into how the system uses "prompt compilation" to improve the results it gets from Large Language Models during generation. We conclude by discussing several use cases the approach supports, as well as future agent-based applications it enables. The complete show notes for this episode can be found at twimlai.com/go/681.

2024-04-22

Teaching Large Language Models to Reason with Reinforcement Learning with Alex Havrilla - #680

Today we're joined by Alex Havrilla, a PhD student at Georgia Tech, to discuss "Teaching Large Language Models to Reason with Reinforcement Learning." Alex discusses the role of creativity and exploration in problem solving and explores the opportunities presented by applying reinforcement learning algorithms to the challenge of improving reasoning in large language models. Alex also shares his research on the effect of noise on language model training, highlighting the robustness of LLM architecture. Finally, we delve into the future of RL, and the potential of combining language models with traditional methods to achieve more robust AI reasoning. The complete show notes for this episode can be found at twimlai.com/go/680.

2024-04-17

Localizing and Editing Knowledge in LLMs with Peter Hase - #679

Today we're joined by Peter Hase, a fifth-year PhD student at the University of North Carolina NLP lab. We discuss "scalable oversight", and the importance of developing a deeper understanding of how large neural networks make decisions. We learn how matrices are probed by interpretability researchers, and explore the two schools of thought regarding how LLMs store knowledge. Finally, we discuss the importance of deleting sensitive information from model weights, and how "easy-to-hard generalization" could increase the risk of releasing open-source foundation models. The complete show notes for this episode can be found at twimlai.com/go/679.

2024-04-08

Coercing LLMs to Do and Reveal (Almost) Anything with Jonas Geiping - #678

Today we're joined by Jonas Geiping, a research group leader at the ELLIS Institute, to explore his paper: "Coercing LLMs to Do and Reveal (Almost) Anything". Jonas explains how neural networks can be exploited, highlighting the risk of deploying LLM agents that interact with the real world. We discuss the role of open models in enabling security research, the challenges of optimizing over certain constraints, and the ongoing difficulties in achieving robustness in neural networks. Finally, we delve into the future of AI security, and the need for a better approach to mitigate the risks posed by optimized adversarial attacks. The complete show notes for this episode can be found at twimlai.com/go/678.

2024-04-01

V-JEPA, AI Reasoning from a Non-Generative Architecture with Mido Assran - #677

Today we're joined by Mido Assran, a research scientist at Meta's Fundamental AI Research (FAIR). In this conversation, we discuss V-JEPA, a new model being billed as "the next step in Yann LeCun's vision" for true artificial reasoning. V-JEPA, the video version of Meta's Joint Embedding Predictive Architecture, aims to bridge the gap between human and machine intelligence by training models to learn abstract concepts in a more efficient predictive manner than generative models. V-JEPA uses a novel self-supervised training approach that allows it to learn from unlabeled video data without being distracted by pixel-level detail. Mido walks us through the process of developing the architecture and explains why it has the potential to revolutionize AI. The complete show notes for this episode can be found at twimlai.com/go/677.

2024-03-25

Video as a Universal Interface for AI Reasoning with Sherry Yang - #676

Today we're joined by Sherry Yang, senior research scientist at Google DeepMind and a PhD student at UC Berkeley. In this interview, we discuss her new paper, "Video as the New Language for Real-World Decision Making," which explores how generative video models can play a role similar to language models as a way to solve tasks in the real world. Sherry draws the analogy between natural language as a unified representation of information and text prediction as a common task interface and demonstrates how video as a medium and generative video as a task exhibit similar properties. This formulation enables video generation models to play a variety of real-world roles as planners, agents, compute engines, and environment simulators. Finally, we explore UniSim, an interactive demo of Sherry's work and a preview of her vision for interacting with AI-generated environments. The complete show notes for this episode can be found at twimlai.com/go/676.

2024-03-18
