As deep learning models have grown in their complexity and applications, they’ve also grown large and cumbersome.
Large models running in cloud environments demand huge amounts of compute, which drives up cloud costs for developers and poses a major barrier to profitability and scalability. Edge devices, meanwhile, are resource-constrained and often cannot support large and complex models at all.
Whether the model is deployed on the cloud or at the edge, AI developers are often confronted with the challenge of reducing their model size without compromising model accuracy.
Quantization is a common technique for reducing model size, though it can sometimes result in reduced accuracy. Quantization-aware training (QAT) allows practitioners to apply quantization without sacrificing accuracy. QAT is done during the model training process rather than after the fact. The model size can typically be reduced by two to four times, and sometimes even more.
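To make the size reduction concrete, here is a minimal sketch of what quantizing a weight tensor can look like, assuming a simple affine mapping from 32-bit floats to 8-bit integers. It uses plain NumPy rather than SuperGradients, and the helper names `quantize_int8` and `dequantize` are hypothetical, chosen only for illustration. Storing each weight in one byte instead of four is where the roughly fourfold saving comes from; the rounding step is the source of the accuracy loss that QAT is meant to compensate for.

```python
import numpy as np


def quantize_int8(weights):
    """Illustrative affine quantization of a float32 array to int8.

    Hypothetical helper for this sketch; not part of any library.
    """
    w_min, w_max = float(weights.min()), float(weights.max())
    # The scale maps the observed float range onto the 256 int8 levels.
    scale = (w_max - w_min) / 255.0
    if scale == 0.0:
        scale = 1.0  # constant tensor: any non-zero scale works
    # The zero point shifts the range so that w_min lands on -128.
    zero_point = int(round(-128 - w_min / scale))
    q = np.round(weights / scale) + zero_point
    q = np.clip(q, -128, 127).astype(np.int8)
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map int8 values back to approximate float32 values."""
    return (q.astype(np.float32) - zero_point) * scale


# Random weights stand in for a trained layer (an assumption of this sketch).
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.1, size=(256, 256)).astype(np.float32)

q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)

size_ratio = weights.nbytes / q.nbytes  # float32 bytes vs. int8 bytes
max_error = float(np.abs(weights - restored).max())

print(f"size reduction: {size_ratio:.1f}x")
print(f"quantization step: {scale:.6f}, max rounding error: {max_error:.6f}")
```

Applying a mapping like this to an already trained model is the idea behind post-training quantization. In QAT, the same rounding is simulated during training so the weights can adapt to it.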
In this Live Coding session, Harpreet will teach you about quantization-aware training (QAT), compare it with post-training quantization (PTQ), and demonstrate how both methods can be easily performed using Deci’s SuperGradients library.
Don’t miss out on this opportunity to dive into the world of deep learning and see SuperGradients in action!
Register Here: Zoom Registration