Quantization in Deep Learning

September 25, 2023

In recent years, deep learning models have grown enormous, reaching hundreds of billions of parameters, so there is a pressing need to reduce their size. Of course, this has to be accomplished without giving up accuracy. Enter quantization.

Background

As you might know, deep learning models consume numbers, both during training and inference. When the task has to do with images, we just note that images are nothing more than matrices of pixels, so we're already good to go....
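To make the idea concrete before going further, here is a minimal sketch of 8-bit affine quantization in plain NumPy (an illustration of the general technique, not any particular framework's API): float32 values are mapped to int8 through a scale and a zero-point, and an approximation of the originals can be recovered by dequantizing.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Affine (asymmetric) quantization of a float32 tensor to int8."""
    x_min, x_max = float(x.min()), float(x.max())
    # Spread the observed float range over the 256 representable int8 values.
    scale = (x_max - x_min) / 255.0
    if scale == 0.0:          # guard against a constant tensor
        scale = 1e-8
    zero_point = int(np.round(-128 - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Map int8 values back to approximate float32 values."""
    return (q.astype(np.float32) - zero_point) * scale

weights = np.random.randn(4, 4).astype(np.float32)
q, scale, zp = quantize_int8(weights)
print("max abs error:", np.abs(weights - dequantize(q, scale, zp)).max())
```

The int8 tensor takes a quarter of the memory of the float32 original, at the cost of a small, bounded rounding error per value.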
