Performance Optimization and Software/Hardware Co-design across PyTorch, CUDA, and NVIDIA GPUs - MLOps.community | Wave AI Podcast Notes