#700: re:Invent 2024 - Swami Sivasubramanian Keynote - AWS Podcast

Summary

AWS Podcast Episode #700 Summary: re:Invent 2024 - Swami Sivasubramanian Keynote

Release Date: December 4, 2024
Host: Amazon Web Services (Simon Elisha and Hawn Nguyen-Loughren)
Podcast: AWS Podcast
Episode: #700 – Special re:Invent Episode featuring Swami Sivasubramanian's Keynote

Introduction: Celebrating 700 Episodes

In Episode #700 of the AWS Podcast, hosted by Amazon Web Services, the hosts express their excitement in reaching this significant milestone. They thank their loyal listeners and set the stage for a deep dive into the keynote delivered by Swami Sivasubramanian, AWS's Vice President of AI and Data, at re:Invent 2024.

“I think we've hit episode 700. Incredible to me... I'm so thrilled for the many of you that have stuck with me through the many, many years of recording.”
— Host, [00:00]

Swami Sivasubramanian's Keynote Overview

Swami Sivasubramanian's keynote focused on Amazon's latest advancements in AI, data processing, and responsible AI practices. The discussion covered new capabilities in Amazon Bedrock, SageMaker enhancements, updates to Amazon Q and QuickSight, and initiatives aimed at promoting responsible AI and educational equity.

Amazon Bedrock: Expanding Capabilities and Enhancements

1. Amazon Bedrock Marketplace

Amazon Bedrock Marketplace brings over 100 publicly available and proprietary foundation models to Amazon Bedrock, in addition to Bedrock's serverless models. Customers deploy these models on SageMaker endpoints, choosing the number of instances and the instance types, and access them through Bedrock's unified APIs.

“Amazon Bedrock Marketplace has brought over 100 models... This lets you have lots of different choices.”
— Swami Sivasubramanian, [02:15]

2. Multimodal Toxicity Detection with Bedrock Guardrails

Bedrock Guardrails now supports multimodal toxicity detection for image content, ensuring generative AI applications handle content responsibly across text and images. This feature is available in preview across 11 AWS regions.

“Amazon Bedrock Guardrails gives you a comprehensive solution enabling detection and filtration of undesirable and potentially harmful image content.”
— Swami Sivasubramanian, [04:30]

3. Prompt Caching for Optimization

Prompt caching can reduce costs by up to 90% and latency by up to 85% for supported models. It caches frequently used prompt content, such as long system prompts and common examples, across API calls so that content does not have to be reprocessed on every request.

“Prompt caching is a new capability that can reduce costs by up to 90%, that's a lot, and latency by up to 85%.”
— Swami Sivasubramanian, [06:10]
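
To make the idea concrete, here is a minimal sketch of why caching a repeated prompt prefix saves work. It is a toy model, not the Bedrock implementation: the `PrefixCache` class, its `submit` method, and the use of whitespace-separated words as a stand-in for tokens are all assumptions made for illustration.

```python
import hashlib


class PrefixCache:
    """Toy model of prompt caching (hypothetical, not the Bedrock API)."""

    def __init__(self):
        self._seen = set()          # hashes of prefixes already processed
        self.tokens_processed = 0   # stand-in for compute actually spent

    def submit(self, prefix: str, suffix: str) -> int:
        """Return how many 'tokens' had to be processed for this request."""
        key = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
        cost = len(suffix.split())       # the new part is always processed
        if key not in self._seen:        # cache miss: process the prefix too
            cost += len(prefix.split())
            self._seen.add(key)
        self.tokens_processed += cost
        return cost


# A long system prompt that is identical across requests (250 words here).
system_prompt = "You are a support assistant. " * 50

cache = PrefixCache()
first = cache.submit(system_prompt, "How do I reset my password?")
second = cache.submit(system_prompt, "Where is my invoice?")
print(first, second, cache.tokens_processed)
```

In this sketch the first request pays for the whole prefix and the second pays only for its own question, which is the effect the episode describes. Real savings depend on the model, the pricing, and how much of each prompt is actually repeated.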

4. Bedrock Data Automation (BDA)

BDA automates the generation of insights from unstructured multimodal content, such as documents, images, video, and audio. This feature streamlines the creation of GenAI-based applications by integrating with Bedrock Knowledge Bases.

“BDA Bedrock Data Automation... lets developers automate the generation of valuable insights from unstructured multimodal content.”
— Swami Sivasubramanian, [08:45]

5. Intelligent Prompt Routing

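This feature dynamically routes each prompt to the most appropriate foundation model within a model family, optimizing for both response quality and cost. Currently in preview, it offers two prompt routers: one between Claude Sonnet 3.5 and Claude Haiku, and one between Llama 3.1 8B and Llama 3.1 70B. More combinations are expected over time.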

“Intelligent prompt routing predicts the performance of each model for each request and dynamically routes each request to the model that it predicts will be the most likely to give the best result at the lowest cost.”
— Swami Sivasubramanian, [10:20]
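
The routing rule in the quote above (predict each model's quality, then prefer the cheaper model when it is expected to do about as well) can be sketched as follows. Everything here is hypothetical: the `Model` records, the prices, the `tolerance` parameter, and especially `predict_quality`, which is a hand-written heuristic standing in for whatever learned predictor the service actually uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_1k_tokens: float   # made-up prices, for illustration only


def predict_quality(model: Model, prompt: str) -> float:
    """Hypothetical stand-in for a learned quality predictor (0.0 to 1.0)."""
    is_hard = len(prompt.split()) > 20 or "prove" in prompt.lower()
    if model.name == "large":
        return 0.9
    return 0.5 if is_hard else 0.88


def route(prompt: str, models: List[Model], tolerance: float = 0.05) -> Model:
    """Pick the cheapest model predicted to be within `tolerance` of the best."""
    scores = {m.name: predict_quality(m, prompt) for m in models}
    best = max(scores.values())
    good_enough = [m for m in models if best - scores[m.name] <= tolerance]
    return min(good_enough, key=lambda m: m.cost_per_1k_tokens)


family = [Model("small", 0.25), Model("large", 3.0)]
easy_choice = route("What is the capital of France?", family)
hard_choice = route("Prove that the sum of two even numbers is even.", family)
print(easy_choice.name, hard_choice.name)
```

The design choice worth noting is the tolerance: with a tolerance of zero the router always picks the top-scoring model, while a larger tolerance trades a little predicted quality for lower cost.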

6. Enhancements to Bedrock Knowledge Bases

Bedrock Knowledge Bases now handles multimodal data, including text and visual content, and supports structured data retrieval through a managed natural-language-to-SQL module. Additionally, GraphRAG support (in preview), backed by Amazon Neptune Analytics, surfaces insights from the relationships between entities across documents.

“Bedrock Knowledge Bases extracts content from both text and visual data and generates semantic embeddings... Additionally, retrieved results now include source attribution for visual data.”
— Swami Sivasubramanian, [12:35]
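
The episode's example for structured data retrieval is an analyst asking for the top five selling products last month. The sketch below shows the kind of SQL such a question could translate to, run against an in-memory SQLite database. The `sales` table, its columns, the sample rows, and the query text are all assumptions made for illustration; this is not output from Bedrock Knowledge Bases, and the month is fixed to November 2024 to keep the example deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, sold_on TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("kettle", "2024-11-03", 40),
        ("toaster", "2024-11-10", 25),
        ("kettle", "2024-11-21", 15),
        ("blender", "2024-11-15", 30),
        ("mixer", "2024-10-30", 99),   # previous month, must be excluded
    ],
)

# Hand-written SQL of the kind a natural-language-to-SQL step might emit for
# "What were my top five selling products last month?" (hypothetical schema).
generated_sql = """
    SELECT product, SUM(quantity) AS units
    FROM sales
    WHERE sold_on >= ? AND sold_on < ?
    GROUP BY product
    ORDER BY units DESC
    LIMIT 5
"""
top_products = conn.execute(generated_sql, ("2024-11-01", "2024-12-01")).fetchall()
print(top_products)
```

According to the episode, the managed module uses the database schema and previous query history to produce queries like this, then either returns the rows or summarizes them in a narrative answer.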

Amazon SageMaker: Empowering Machine Learning Developers

1. SageMaker Partner AI Apps

This new capability allows customers to discover, deploy, and use machine learning applications from leading providers directly within SageMaker. Partner AI apps enhance productivity and reduce time to market by integrating seamlessly with the SageMaker environment.

“With SageMaker partner AI apps you can quickly subscribe to a partner solution, seamlessly integrate the app with your SageMaker development environment and get up and running.”
— Swami Sivasubramanian, [14:50]

2. SageMaker HyperPod: Flexible Training Plans

SageMaker HyperPod introduces flexible training plans that fit training work to timelines and budgets. Customers specify their preferred compute instances, the amount of compute, the workload duration, and the start time; SageMaker then provisions the infrastructure and runs the training workloads, pausing and resuming as the plan moves between capacity blocks.

“SageMaker HyperPod now provides flexible training plans. So you can get predictable training timelines and budget requirements whilst benefiting from resiliency, performance-optimized distributed training and better observability.”
— Swami Sivasubramanian, [16:40]

3. SageMaker HyperPod Recipes

HyperPod recipes simplify training and fine-tuning of publicly available foundation models, allowing users to reach state-of-the-art performance quickly. The recipes include a training stack tested by AWS and support switching between GPU-based and AWS Trainium-based instances with a one-line recipe change.

“SageMaker HyperPod recipes help you get up and running very quickly in terms of training and fine-tuning publicly available foundation models in just minutes with state-of-the-art performance.”
— Swami Sivasubramanian, [19:05]

Amazon Q and QuickSight: Advanced Data Analysis

1. Scenario Analysis in QuickSight (Preview)

Amazon QuickSight now offers scenario analysis capabilities powered by Amazon Q, enabling AI-assisted data analysis. This feature guides users through complex analyses, significantly speeding up decision-making processes.

“Amazon Q in QuickSight simplifies in-depth analysis with step-by-step guidance, which saves hours of manual data manipulation and unlocks data-driven decision making in your organization.”
— Swami Sivasubramanian, [21:20]

2. Enhancements for SageMaker Canvas Users

Amazon Q Developer integrates with SageMaker Canvas, providing generative AI-powered assistance throughout the machine learning development lifecycle, from model preparation to deployment and testing.

“Amazon Q Developer can now guide SageMaker Canvas users through machine learning development... It walks you through that process, which is nice.”
— Swami Sivasubramanian, [23:15]

Responsible AI: Enhancing Transparency and Trust

AWS AI Service Cards

AWS introduces new AI service cards for Amazon Nova Reel, Amazon Canvas, Amazon Nova Micro, Lite, and Pro, Amazon Titan Image Generator, and Amazon Titan Text Embeddings. These cards provide information on intended use cases, limitations, responsible AI design choices, and performance optimization best practices.

“AI service cards are a resource designed to enhance transparency by providing customers with a single place to find information on the intended use cases and limitations...”
— Swami Sivasubramanian, [25:40]

Key Focus Areas:

Fairness
Explainability
Privacy and Security
Safety and Controllability
Veracity and Robustness
Governance and Transparency

AWS Education Equity Initiative: Expanding Access to Learning

AWS announces a five-year commitment to the AWS Education Equity Initiative, pledging up to $100 million in AWS credits and technical support. This initiative supports organizations developing digital learning solutions for underserved learners globally.

“We're announcing a five year commitment of cloud technology and technical support for organizations creating digital learning solutions that expand access for underserved learners worldwide through the AWS Education Equity Initiative.”
— Swami Sivasubramanian, [28:10]

Eligibility:

Socially minded ed techs
Social enterprises
Non-profits
Governments
Corporate social responsibility teams

Application Process:

Organizations must demonstrate how their solutions will benefit students from underserved communities. Applications are now being accepted.

Conclusion: Looking Ahead

The hosts wrap up Episode #700 by highlighting the extensive updates shared in the keynote and tease the upcoming episode with additional insights.

“Some great updates today, one more episode coming tomorrow with a bunch of other stuff as well. It's going to be exciting.”
— Host, [30:00]

Listeners are encouraged to send feedback to the AWS Podcast team and stay tuned for future episodes.

Feedback: awspodcast@amazon.com

This comprehensive summary encapsulates the key points from Swami Sivasubramanian's keynote at re:Invent 2024, covering advancements in Amazon Bedrock, SageMaker, data analysis tools, responsible AI practices, and AWS's commitment to educational equity. Whether you're a developer, IT professional, or an AI enthusiast, these updates provide valuable insights into the evolving landscape of AWS's AI and data services.


Transcript

[00:00]
A
This is episode 700 of the AWS Podcast, released on December 4th, 2024. Hello everyone and welcome back to the AWS Podcast. Simon Elisha here with you. Great to have you back for yet another of our special re:Invent episodes. And just as an aside, I think we've hit episode 700. Incredible to me. I mean, what does 700 mean? It doesn't mean anything in particular to 699 or 701, but it's kind of a nice round number. And I'm so thrilled for the many of you that have stuck with me through the many, many years of recording. And hopefully it's given you some value. So today we're going to cover the latest keynote, and this one was from AWS's vice president of AI and data, who is of course Swami Sivasubramanian, and he shared some really interesting information today that is probably relevant to you. A lot of it, of course, was around AI and the handling of data and the processing of data, the interpretation of data. So let's get into it and let's start with some new capabilities that Bedrock now has. Amazon Bedrock Marketplace has brought over 100 models to Amazon Bedrock. So these are 100 publicly available and proprietary foundation models. And these are in addition to Amazon Bedrock's industry-leading serverless models. So customers deploy these models onto SageMaker endpoints where they can select their desired number of instances and instance types. And Amazon Bedrock Marketplace models can be accessed through Bedrock's unified APIs. And models which are compatible with Bedrock's Converse API can be used with Amazon Bedrock's tools like Agents, Knowledge Bases and Guardrails. I've been talking about the Converse API for a while. It's my favorite way to access Bedrock. Yet again, it benefits you, so this lets you have lots of different choices. So you can incorporate models from highly specialized areas like finance or healthcare, or language translation models for Asian languages, all in one place. And this is supported in a wide variety of regions.
And Amazon Bedrock Guardrails now also supports multimodal toxicity detection for image content in preview. Now this is all about building and scaling your generative AI application responsibly for a wide range of use cases. And Amazon Bedrock Guardrails gives you a comprehensive solution enabling detection and filtration of undesirable and potentially harmful image content whilst retaining safe and relevant visuals. So customers can now use content filters for both text and image data in a single solution, with configurable thresholds to detect and filter undesirable content across categories like hate, insults, sexual and violence, and build generative AI applications based upon responsible AI policies. And this new capability is in preview and it's available with all foundation models on Amazon Bedrock that support images, including fine-tuned foundation models, in 11 AWS regions globally. Now if you start to use generative AI at scale, you realize that optimization is always an important thing, and Amazon Bedrock has now announced support for prompt caching. Now prompt caching is a new capability that can reduce costs by up to 90%, that's a lot, and latency by up to 85%, also a lot, for supported models. By caching frequently used prompts across multiple API calls, it allows you to cache repetitive inputs and avoid reprocessing content such as long system prompts and common examples that help guide the model's response. When cache is used, fewer computing resources are needed to generate the output. It's our old friend caching. As a result, not only can we process your request faster, but we can also pass along the cost savings from using fewer resources. Now prompt caching is available on Claude 3.5 Haiku and Claude 3.5 Sonnet V2 in Oregon and North Virginia, and of course also on the Nova Micro, Nova Lite and Nova Pro models in North Virginia. Now at launch, only a select number of customers will have access to this, but it's going to be rolling out.
If you want to participate in the preview, there is a link in the show notes as well. Now we all know that wrangling data is always fun. Not. And Amazon Bedrock Data Automation is now available in preview, summarised as BDA. Bedrock Data Automation is a new feature that lets developers automate the generation of valuable insights from unstructured multimodal content like documents, images, video and audio to build GenAI-based applications. So these insights can include video summaries of key moments, detection of inappropriate image content, automated analysis of complex documents, and lots more. You can also customize the output to generate specific insights in consistent formats required by your own systems and applications. This is all about saving time and making it easy to go. Now this is also integrated with Bedrock Knowledge Bases, so it's easier for you to generate meaningful information from your unstructured multimodal content as you're building your responses using RAG, or retrieval-augmented generation. This preview is available in US West (Oregon). Something else that's available in preview is Amazon Bedrock Intelligent Prompt Routing. Now this is a really cool little feature here. This routes prompts to different foundation models within a model family, which means you can optimize for quality of responses and cost. So using advanced prompt matching and model understanding techniques, Intelligent Prompt Routing predicts the performance of each model for each request and dynamically routes each request to the model that it predicts will be the most likely to give the best result at the lowest cost. Our old friend, trying to optimize the outcome with the price. Customers can choose from two prompt routers in preview and route requests between Claude Sonnet 3.5 and Claude Haiku, or between Llama 3.1 8B and Llama 3.1 70B. Now this will of course grow over time. We'll have different combinations available, but this is just the preview.
But you can sort of see where this is going. And Amazon Bedrock Knowledge Bases now processes multimodal data. So if you haven't used Knowledge Bases before, it's an easy way to get all your data in one place to use for retrieval-augmented generation, or RAG. With this launch, Bedrock Knowledge Bases extracts content from both text and visual data, generates semantic embeddings using the selected embedding model, and stores them in the chosen vector store. This enables users to retrieve and generate answers to questions derived not only from text but from visual data. Additionally, retrieved results now include source attribution for visual data, so this enhances transparency and builds trust in the generated outputs. So check that one out. And speaking of Amazon Bedrock Knowledge Bases, it now also supports structured data retrieval, so this allows you to have an end-to-end managed workflow for customers to build custom generative applications that can access and incorporate contextual information from a variety of structured and unstructured sources. Using advanced natural language processing, Bedrock Knowledge Bases can transform natural language queries into SQL queries, which means that users can retrieve data directly from the source without needing to move or preprocess the data. So often this can be really a challenge, because often we're trying to train data models to convert natural language to SQL queries and manage governance, et cetera. And Bedrock Knowledge Bases eliminates these hurdles by providing a managed natural language to SQL module, so it's called the NL-to-SQL module, natural language to SQL. So a retail analyst can now simply ask, what were my top five selling products last month? And Bedrock Knowledge Bases automatically translates that query into SQL, executes it against the database and returns the results, or can provide a summarised narrative result. To generate accurate SQL queries,
Bedrock Knowledge Bases leverages database schema, previous query history and other contextual information that it has about the data sources. Pretty cool. You know how much I love SQL. And Amazon Bedrock Knowledge Bases also now supports GraphRAG in preview, so this allows you to incorporate RAG techniques with graph data. Now, previously, customers faced challenges in conducting exhaustive multi-step searches across disparate content. By identifying key entities across documents, GraphRAG delivers insights that leverage relationships within the data, and it gives you improved responses to end users. So for example, users can ask a travel application for family-friendly beach destinations with direct flights and good seafood restaurants. Developers building generative AI applications can enable GraphRAG in just a few clicks by specifying their data sources and choosing Amazon Neptune Analytics as their vector store when creating a knowledge base. That's the secret sauce. This will automatically generate and store vector embeddings in Amazon Neptune Analytics along with a graph representation of entities and their relationships. So GraphRAG with Amazon Neptune is built right in to Amazon Bedrock Knowledge Bases, so there's no additional setup or additional charges beyond the underlying services. It just works. So check out the preview of that one. We're also happy to announce the GenAI Index in Amazon Kendra. Now, Amazon Kendra is an AI-powered search service which lets you build intelligent search experiences and RAG systems to power your GenAI applications. So now you can use a new index. This is the GenAI Index for RAG and intelligent search. With the Kendra GenAI Index, customers get out-of-the-box search accuracy with the latest information retrieval technologies and semantic models. It also supports mobility across AWS generative AI services like Bedrock Knowledge Bases and Q Business, so you get flexibility in where you choose.
It's available as a managed retriever in Bedrock Knowledge Bases, which means you can create knowledge bases powered by this new index as well, and you can also integrate it with other services like Guardrails, Prompt Flows and Agents, et cetera. The GenAI Index supports connectors for 43 different data sources, which means you can get your data from where you need to. So Swami also spoke a lot about Amazon SageMaker today and all the different things it can do for you, and we're happy to announce the concept of Amazon SageMaker partner AI apps. This is a new capability that enables customers to easily discover, deploy and use best-in-class machine learning and GenAI development applications from leading app providers privately and securely, all without leaving Amazon SageMaker AI, so you can develop performant AI models faster. With SageMaker partner AI apps you can quickly subscribe to a partner solution, seamlessly integrate the app with your SageMaker development environment and get up and running. SageMaker partner AI apps are fully managed and run privately and securely in your SageMaker environment, which reduces the risk of data and model exfiltration. At launch you can boost your team's productivity and reduce time to market by enabling things like Comet, Deepchecks, Fiddler and Lakera, just to name a few. And this is available in all currently supported regions except for GovCloud. And Amazon SageMaker HyperPod now provides flexible training plans. So this allows you to train generative AI models within your timelines and your budgets. So you can get predictable training timelines and budget requirements whilst benefiting from things like resiliency, performance-optimized distributed training and better observability. So in just a few quick steps you specify your preferred compute instances, the desired amount of compute resources, the duration of your workload and when you'd like it to start.
Then SageMaker will help you create the most cost-efficient training plan, reducing time to train your model by weeks. Once you create and purchase your training plans, SageMaker automatically provisions the infrastructure and runs the training workloads on these compute resources without requiring any manual intervention. It also automatically takes care of pausing and resuming training between gaps in compute availability as the plan switches from one capacity block to another. If you wish to remove all the heavy lifting of your infrastructure management, that's our friend undifferentiated heavy lifting, you can also create and run training plans using SageMaker fully managed training jobs. And also now available for Amazon SageMaker HyperPod: you now have centralized governance across all generative AI development tasks like training and inference, and you have full visibility and control over compute resource allocation, which means the most critical tasks are prioritized and you can maximize your compute resource utilization. In fact, we found that it can help you reduce model development costs by up to 40% with HyperPod task governance. It allows you to define your priorities for different tasks and set your limits. You can also monitor and audit the tasks that are running or waiting, and it just allows you to smooth out that process. And related to this, we're also really happy to announce SageMaker HyperPod recipes. Now, this helps you get up and running very quickly in terms of training and fine-tuning publicly available foundation models in just minutes with state-of-the-art performance. Now, obviously training is fun, but it can be challenging. As foundation model sizes continue to grow to hundreds of billions of parameters, the process of customizing these models can take weeks of extensive experimentation and debugging.
In addition, performing training optimizations to unlock better price performance is often unfeasible for customers, because you need deep machine learning expertise, and that could also take longer to do. With SageMaker HyperPod recipes, customers of all skill sets can get state-of-the-art performance while quickly getting up and running with publicly available foundation models like Llama 3.1 405B, Mixtral 8x22B and Mistral 7B. And these HyperPod recipes include a training stack tested by AWS, which means you remove weeks of tedious work experimenting with different configurations. And you can quickly switch between GPU-based and AWS Trainium-based instances with a one-line recipe change. And you can also enable automated model checkpointing for improved resiliency. And finally, you can run workloads in production on the SageMaker AI training service of your choice. Now in terms of analyzing your data, we're happy to announce scenario analysis capabilities of Amazon Q in QuickSight in preview. This new capability provides an AI-assisted data analysis experience that helps you make better decisions faster. Amazon Q in QuickSight simplifies in-depth analysis with step-by-step guidance, which saves hours of manual data manipulation and unlocks data-driven decision making in your organization. In fact, it can help users perform complex scenario analysis up to 10 times faster than spreadsheets. You can ask a question or state your goal in natural language, and Amazon Q in QuickSight guides you through every step of advanced data analysis, suggesting analytical approaches, automatically analyzing data, surfacing relevant insights and summarizing findings with suggested actions. This agentic approach breaks down data analysis into a series of easy-to-understand executable steps, helping you find solutions to common problems without specialized skills or tedious, error-prone data manipulation in spreadsheets.
Working on an expansive analysis canvas, you can intuitively iterate your way to solutions by directly interacting with the data, refining analysis steps, or exploring multiple analysis paths side by side. This scenario analysis capability is available from any Amazon QuickSight dashboard, so you can move seamlessly from visualizing data to modeling solutions. And also starting now, Amazon Q Developer can guide SageMaker Canvas users through machine learning development. So this allows you to use generative AI-powered assistance through the entire lifecycle. So model preparation, data preparation, deployment, testing, all the fun tasks you have to do. It walks you through that process, which is nice. Now in using all these new technologies, we need to use them responsibly, and we're happy to announce new AWS AI service cards to advance responsible generative AI. So these are the new service cards for Amazon Nova Reel, Amazon Canvas, Amazon Nova Micro, Lite and Pro, Amazon Titan Image Generator and Amazon Titan Text Embeddings. Now, AI service cards are a resource designed to enhance transparency by providing customers with a single place to find information on the intended use cases and limitations, responsible AI design choices and performance optimization best practices for AWS AI services. So this is a really important part of what we do. This is part of our comprehensive development process to build services in a responsible way. They focus on key aspects of AI development and deployment, including fairness, explainability, privacy and security, safety and controllability, veracity and robustness, governance and transparency. Lots of things to think about here. By providing these cards, we want to empower customers with the knowledge they need to make informed decisions about using AI services in their applications and workflows.
Now our AI service cards will continue to evolve and expand as we engage with our customers and the broader community to get feedback, and we're going to just keep iterating on our approach. And when talking about new technologies like this, skills and access to skills always comes up. It's often the number one conversation I have with executives. And the AWS Education Equity Initiative has been launched to boost education for underserved learners. So we're announcing a five-year commitment of cloud technology and technical support for organizations creating digital learning solutions that expand access for underserved learners worldwide through the AWS Education Equity Initiative. While the use of educational technologies continues to rise, many organizations lack access to the cloud computing and AI resources needed to accelerate and scale their work to reach more learners in need. So Amazon is committing up to $100 million in AWS credits and technical advising to support socially minded organizations as they build and scale learning solutions that utilize cloud and AI technologies. This will help reduce initial financial barriers and provide guidance on building and scaling AI-powered education solutions using AWS technologies. Eligible recipients, including socially minded ed techs, social enterprises, non-profits, governments and corporate social responsibility teams, must demonstrate how their solution will benefit students from underserved communities. And the initiative is now accepting applications. That's pretty cool. So some great updates today, one more episode coming tomorrow with a bunch of other stuff as well. It's going to be exciting. Hope you're enjoying the series. awspodcast@amazon.com is the place to give feedback, and until next time, keep on building.