Agents of Tomorrow

Ep5: Multimodal AI Agents: Benchmarking, Adapting, and Adversarial attacks

Nov 25, 2024 | 00:48:10

In this episode, we dive into multimodal AI agents, starting with the recent release of Runner H and then covering three research papers:
04:15 VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks by Jing Yu Koh et al.
19:18 AdaptAgent: Adapting Multimodal Web Agents with Few-Shot Learning from Human Demonstrations by Gaurav Verma et al.
32:32 Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast by Xiangming Gu et al.

Ep4: AI Agents in Creativity and Design: Minecraft as a 3D Playground, Creating Dream Spaces and 3D Modeling

Nov 18, 2024 | 00:37:47

In this episode, we look at how AI agents are transforming creativity and design. We start with navigating 3D environments in Minecraft, which sets the stage for more complex real-world tasks. Then we explore how AI agents are changing 3D modeling in Blender by generating intricate designs. Finally, we turn to interior design, where spatial reasoning is used to create dream spaces. Subscribe and tune in to discover new agentic applications every week.
Papers covered:
SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code by Ziniu Hu et al.
I-Design: Personalized LLM Interior Designer by Ata Çelen et al.

Ep3: What’s new in AI agents for browser & edge

Nov 12, 2024 | 00:25:40

In this episode, we explore several paradigms that enable AI agents to take actions on the web and on the edge. We delve into the multimodal method used by Magentic-One, recently released by Microsoft, and Agent Q, the reinforcement learning-based navigation method developed by MultiOn. We also discuss TinyAgent from UC Berkeley, which allows agents to run on edge devices and assist users with daily tasks.

Ep2: How AI Agents Are Shaping Hiring, Healthcare and Knowledge Work

Nov 4, 2024 | 00:19:43

In this episode, we dive into LinkedIn’s use of AI agents for hiring, Oracle’s clinical AI agent, and review three papers on AI agents in knowledge work, including applications in machine learning and software engineering.
Papers covered:
(03:56 - 07:07) WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks by Léo Boisvert et al.
(07:08 - 10:12) SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning by Yizhou Chi et al.
(10:13 - 19:30) MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework by Sirui Hong et al.

Ep1: Claude 3.5, AgentForce, Copilot Studio, AgentC, Mixture-of-Agents, Agent-as-a-Judge, On the limits of agency

Oct 26, 2024 | 00:42:44

Applications covered (0:00 - 5:15):
Claude 3.5 (Anthropic)
AgentForce (Salesforce)
Copilot Studio (Microsoft)
AgentC (Celonis)
Papers covered:
(5:16 - 19:28) Mixture-of-Agents Enhances Large Language Model Capabilities by Junlin Wang et al.
(19:29 - 32:10) Agent-as-a-Judge: Evaluate Agents with Agents by Mingchen Zhuge et al.
(32:11 - 42:44) On the limits of agency in agent-based models by Ayush Chopra et al.
