
Content provided by EDGE AI FOUNDATION. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by EDGE AI FOUNDATION or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://nl.player.fm/legal.

Optimization Techniques for Powerful yet Tiny Machine Learning Models

59:37
 
 


Can machine learning models be both powerful and tiny? Join us in this episode of TinyML Talks, where we uncover groundbreaking techniques for making machine learning more efficient through high-level synthesis. We sit down with Russell Clayne, Technical Director at Siemens EDA, who guides us through the intricate process of pruning convolutional and deep neural networks. Discover how post-training quantization and quantization-aware training can trim down models without sacrificing performance, making them perfect for custom hardware accelerators like FPGAs and ASICs.
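
As a rough illustration of the pruning and post-training quantization ideas mentioned above, here is a minimal sketch (assuming PyTorch and a toy fully connected network, not the specific Siemens flow discussed in the episode):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy fully connected network standing in for the models discussed in the talk.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

# Magnitude pruning: zero out the 50% smallest weights in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the sparsity into the weight tensor

# Post-training dynamic quantization: weights are stored as int8 and
# activations are quantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(quantized)
```
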
From there, we dive into a practical case study involving an MNIST-based network. Russell demonstrates how sensitivity analysis, network pruning, and quantization can significantly reduce neural network size while maintaining accuracy. Learn why fixed-point arithmetic is superior to floating-point in custom hardware, and how leading research from MIT and industry advancements are revolutionizing automated network optimization and model compression. You'll gain insights into how these techniques are not just theoretical but are being applied in real-world scenarios to save area and energy consumption.
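
To make the fixed-point argument concrete, here is a small numeric sketch (plain Python with illustrative bit widths, not values from the episode) of how a real number is stored as an integer with an implicit binary scale, so multiplies become cheap integer operations in hardware:

```python
def to_fixed(x, total_bits=8, frac_bits=5):
    """Quantize a float to a signed fixed-point integer with saturation."""
    scale = 1 << frac_bits
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, round(x * scale)))

def to_float(q, frac_bits=5):
    """Interpret a fixed-point integer as a real value."""
    return q / (1 << frac_bits)

w, a = 0.73, -1.42                       # example weight and activation
qw, qa = to_fixed(w), to_fixed(a)
# Multiply in the integer domain; the product carries 2 * frac_bits fractional bits.
approx = to_float(qw * qa, frac_bits=10)
print(f"float product: {w * a:.4f}   fixed-point product: {approx:.4f}")
```

The small rounding error is the price of replacing a floating-point multiplier with a much cheaper integer one, which is where the area and energy savings mentioned above come from.
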
Finally, explore the collaborative efforts between Siemens, Columbia University, and Global Foundries in a wake word analysis project. Russell explains how transitioning to hardware accelerators via high-level synthesis (HLS) tools can yield substantial performance improvements and energy savings. Understand the practicalities of using algorithmic C data types and Python-to-RTL tools to optimize ML workflows. Whether it's quantization-aware training, data movement optimization, or the fine details of using HLS libraries, this episode is packed with actionable insights for streamlining your machine learning models.
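
As a sketch of the "fake quantization" idea behind quantization-aware training (again assuming PyTorch and a simple symmetric per-tensor scale; the episode's actual flow uses Siemens HLS tooling and algorithmic C data types, which are not reproduced here):

```python
import torch

def fake_quantize(w, num_bits=8):
    """Round weights to a low-precision grid in the forward pass only."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax                     # symmetric per-tensor scale (assumed)
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    return w + (q - w).detach()                      # straight-through estimator

w = torch.randn(4, 4, requires_grad=True)
fake_quantize(w).sum().backward()
print(w.grad)                                        # all ones: gradient flowed straight through
```

Training against the quantized weights in this way is why quantization-aware training usually recovers more accuracy than purely post-training quantization.
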

Send us a text

Support the show

Learn more about the EDGE AI FOUNDATION - edgeaifoundation.org


Chapters

1. TinyML Talks (00:00:00)

2. Network Pruning and Quantization (00:10:51)

3. Optimizing Quantized Neural Networks (00:21:51)

4. High-Level Synthesis for ML Acceleration (00:37:27)

5. Hardware Design and Optimization Techniques (00:47:06)
