
Hosted by smallest.ai · EN

What happens when you combine the best of old-school language models and the power of neural networks? You get NN-grams! In this episode, we break down how this new model blends n-grams (which remember word patterns) with neural networks (which can generalize like a pro). The result? More accurate and faster speech recognition. NN-grams are already outperforming traditional models on tasks like Italian speech recognition. Want to know how this hybrid model is changing the speech AI game? Tune in to learn more! Link to research paper: https://arxiv.org/abs/1606.07470 Follow us on social media: LinkedIn: https://www.linkedin.com/company/smallest/ Twitter: https://x.com/smallest_AI Instagram: https://www.instagram.com/smallest.ai/ Discord: https://smallest.ai/discord

In this episode, we dive into the revolutionary Listen, Attend and Spell (LAS) model that transforms how speech-to-text systems work. Unlike traditional methods that separate the process into multiple stages, LAS combines everything into one model that is trained end to end. The system has two key parts: a 'listener', a pyramidal recurrent encoder that processes the audio input, and a 'speller', an attention-based recurrent decoder that turns the listener's output into text one character at a time. Tune in to learn how LAS achieves impressive accuracy without relying on dictionaries or language models! Link to research paper: https://arxiv.org/abs/1508.01211
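
For listeners who like to see the idea in code, here is a minimal sketch of the attention step the speller relies on: score every listener feature against the current decoder state, turn the scores into weights, and average the features with those weights. The function name `attend` and the plain dot-product scoring are assumptions made for illustration, not the exact scoring function used in the paper.

```python
import math

def attend(decoder_state, listener_features):
    """Soft attention (illustrative): weight each listener feature by its
    similarity to the decoder state and return the weighted average."""
    # Dot-product score between the decoder state and each feature vector.
    scores = [sum(s * h for s, h in zip(decoder_state, feat))
              for feat in listener_features]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the listener features.
    dim = len(listener_features[0])
    context = [sum(w * feat[i] for w, feat in zip(weights, listener_features))
               for i in range(dim)]
    return weights, context

# Hypothetical toy inputs: three 2-dimensional listener features.
weights, context = attend([2.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
```

The weights always sum to one, so the speller effectively decides which part of the audio to look at for each character it emits.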

In this episode, we explore how Scheduled Sampling helps Recurrent Neural Networks (RNNs) make better predictions for tasks like machine translation and image captioning. Normally, during training, RNNs use the actual previous word or token to predict the next one. But when making predictions, the model has to use its own previous predictions, which can lead to mistakes building up. Scheduled Sampling solves this by slowly shifting the model from using the correct token during training to using its own predictions, helping it learn more effectively and reduce errors. Tune in to learn how this approach helped improve results in a major image captioning competition! Link to research paper: https://arxiv.org/abs/1506.03099
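
The gradual shift described above can be sketched in a few lines. This is a minimal illustration, assuming the inverse sigmoid decay schedule (one of the schedules proposed in the paper); the function names and the default `k=10.0` are hypothetical choices for the example.

```python
import math
import random

def teacher_forcing_prob(step, k=10.0):
    """Inverse sigmoid decay: close to 1 early in training, falling toward 0.
    Note: math.exp overflows for very large step / k; fine for a sketch."""
    return k / (k + math.exp(step / k))

def pick_previous_token(true_token, model_token, step, rng):
    """Flip a biased coin: feed the ground-truth token with probability eps,
    otherwise feed the model's own previous prediction."""
    eps = teacher_forcing_prob(step)
    return true_token if rng.random() < eps else model_token

rng = random.Random(0)
early = [pick_previous_token("gold", "pred", 0, rng) for _ in range(1000)]
late = [pick_previous_token("gold", "pred", 200, rng) for _ in range(1000)]
```

Early in training almost every input is the ground-truth token; late in training the model is fed almost exclusively its own guesses, which matches what it will face at prediction time.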

How do you speed up deep neural network training and improve its performance simultaneously? Batch Normalization is the answer. By addressing internal covariate shift, it allows models to train in fewer steps and with much higher learning rates. In this episode, we break down how this technique was applied to a state-of-the-art image classification model, matching its accuracy with 14 times fewer training steps and, with an ensemble of batch-normalized networks, exceeding the accuracy of human raters on ImageNet. Tune in to learn how Batch Normalization is transforming deep learning and setting new benchmarks in AI research. Link to research paper: https://arxiv.org/abs/1502.03167
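
Here is a minimal sketch of the normalization step itself, applied to one feature across a mini-batch at training time. The function name `batch_norm` and the toy batch are assumptions for illustration; a real layer would also track running statistics for use at inference time and learn `gamma` and `beta`.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift it."""
    n = len(batch)
    mean = sum(batch) / n
    # Biased (population) variance of the mini-batch.
    var = sum((x - mean) ** 2 for x in batch) / n
    # eps keeps the division stable when the variance is tiny.
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

# Hypothetical toy batch of four activations.
normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
```

After this step the activations have roughly zero mean and unit variance, whatever scale the previous layer produced.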

How does AI learn to predict and generate realistic human motion? In this episode, we dive into the power of Gated Recurrent Units (GRUs) for sequence modeling. Discover how this advanced RNN architecture captures long-term dependencies, predicts motion data point by point, and generates lifelike movements. From speech synthesis to machine translation, GRUs are proving their versatility. Tune in to see how they’re reshaping AI’s ability to understand and create dynamic sequences. Link to research paper: https://arxiv.org/abs/1501.00299

What’s the secret to teaching AI to understand large vocabularies? This week, we’re unpacking the power of Long Short-Term Memory (LSTM) networks in speech recognition. These advanced RNN architectures overcome the limitations of traditional models, like vanishing gradients, to deliver state-of-the-art performance with compact designs. Tune in to learn how LSTMs are changing the game for large-scale acoustic modeling and why they’re a cornerstone of modern AI speech systems. Link to research paper: https://arxiv.org/abs/1402.1128

Can machines teach themselves to listen better? In this episode, we explore how the innovative "noisy student training" method, originally a game-changer for image classification, is now transforming automatic speech recognition. By combining self-training with smart data augmentation, researchers have achieved state-of-the-art word error rates on challenging datasets like LibriSpeech. Tune in to learn how this approach is setting new benchmarks in AI’s ability to understand and process human speech. Link to research paper: https://arxiv.org/abs/2005.09629

Why would an AI engineer intentionally turn off parts of a neural network during training? Sounds counterintuitive, right? In this episode, we’re uncovering the magic of dropout, a technique that randomly omits units on each training case so that neural networks generalize better and avoid overfitting. Join us as we explore how this breakthrough is reshaping AI benchmarks across the board. Link to research paper: https://arxiv.org/abs/1207.0580
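
To make "turning off parts of a network" concrete, here is a minimal sketch of dropout on a list of activations. It assumes the "inverted" variant that is common today, which rescales the surviving units during training; the paper instead rescales the weights at test time. The function name `dropout` and its parameters are hypothetical.

```python
import random

def dropout(activations, p_drop, rng, training=True):
    """Inverted dropout: zero each unit with probability p_drop (p_drop < 1)
    and scale the survivors so the expected activation stays the same."""
    if not training or p_drop == 0.0:
        # At inference time every unit is kept unchanged.
        return list(activations)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
dropped = dropout([1.0] * 1000, p_drop=0.5, rng=rng)
unchanged = dropout([1.0, 2.0, 3.0], p_drop=0.5, rng=rng, training=False)
```

Because a different random subset of units is silenced on every training case, no single unit can rely on specific partners always being present.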

What if AI could learn to create new data that looks just like the real thing? In this episode, we dive into the groundbreaking concept of Generative Adversarial Networks (GANs). Learn how two AI models, one that generates data and another that judges its authenticity, compete in an adversarial game to create realistic images, sounds, and more. We’ll break down how this innovative approach needs no Markov chains or unrolled approximate inference networks and opens up new possibilities for training AI. Tune in to discover how GANs are shaping the future of artificial intelligence and generative models! Link to research paper: https://arxiv.org/pdf/1406.2661
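
The adversarial game can be sketched with two small loss functions. This is an illustration only: the function names are hypothetical, the inputs are the discriminator's probabilities that a sample is real, and the generator loss shown is the non-saturating form that the paper suggests using in practice instead of the raw minimax objective.

```python
import math

def discriminator_loss(d_real, d_fake):
    """The judge wants d_real near 1 (real data) and d_fake near 0 (generated).
    Both arguments are probabilities strictly between 0 and 1."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """The generator wants the judge to score its samples near 1."""
    return -math.log(d_fake)

# A confident, correct judge has a lower loss than one that is guessing.
confident = discriminator_loss(d_real=0.9, d_fake=0.1)
guessing = discriminator_loss(d_real=0.5, d_fake=0.5)
```

Training alternates between lowering the first loss for the judge and lowering the second for the generator, so each model's progress makes the other's job harder.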

In this episode, we dive into the power of conditional adversarial networks and how they’re transforming image-to-image translation. Learn how the Pix2Pix approach not only maps images from one form to another but also learns a loss function to train that mapping, eliminating the need for manually designed loss functions. We’ll explore its success in tasks like synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images. Plus, find out how artists and creators worldwide are embracing Pix2Pix to create stunning visuals without any need for complex tweaks. Tune in to understand how this general-purpose AI is reshaping digital creativity. Link to research paper: https://arxiv.org/abs/1611.07004