Abstract

Current periocular and face recognition approaches rely on computationally costly deep neural networks to achieve notable recognition accuracies. Deploying such solutions in applications with limited computational resources requires minimizing their computational demand while maintaining comparable recognition accuracy. Model compression techniques such as model quantization can reduce the computational cost of deep models. While this approach has been widely studied and applied to various machine-learning tasks, it remains under-investigated for biometrics. In this work, we propose to reduce the computational cost of face and periocular recognition models using fixed- and mixed-precision model quantization. Specifically, we first quantize the full-precision models to fixed 8-bit and 6-bit precision, reducing the required memory footprint by 5x while maintaining, to a very large degree, the recognition accuracies. However, our results demonstrate that quantizing the models to extremely low bit-widths, e.g., below 6 bits, causes the accuracies to drop significantly, which motivated our investigation of mixed-precision quantization. We therefore propose an iterative mixed-precision quantization scheme: in each iteration, the least important parameters are selected based on their weight magnitude, quantized to low-bit precision, and the model is then fine-tuned. This process is repeated until all parameters are quantized to low-bit precision, achieving an extreme reduction in memory footprint, e.g., 16x, without significant loss in model accuracy. The effectiveness of fixed- and mixed-precision quantization for biometric recognition models is studied and demonstrated for two modalities, face and periocular, using three different deep network architectures and different bit-width precisions.
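To make the iterative scheme concrete, the following is a minimal PyTorch sketch of magnitude-based iterative mixed-precision quantization. It assumes a symmetric uniform quantizer, a fixed fraction of weights quantized per iteration, and plain SGD fine-tuning with gradients masked on already-quantized weights; the function names, the `loss_fn` closure, and all hyperparameters are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn as nn


def uniform_quantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Symmetric uniform quantization of a tensor to `bits` bits (assumed quantizer)."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale


def iterative_mixed_precision(model: nn.Module, loss_fn, optimizer, bits: int = 4,
                              fraction_per_iter: float = 0.25,
                              finetune_steps: int = 100) -> None:
    """Iteratively quantize the least important (smallest-magnitude) weights
    to low-bit precision, fine-tuning the remaining weights in between."""
    # Boolean masks marking which weights have already been quantized.
    masks = {name: torch.zeros_like(p, dtype=torch.bool)
             for name, p in model.named_parameters() if p.dim() > 1}

    while any(not m.all() for m in masks.values()):
        with torch.no_grad():
            for name, p in model.named_parameters():
                if name not in masks or masks[name].all():
                    continue
                mask = masks[name]
                remaining = int((~mask).sum())
                k = min(max(1, int(fraction_per_iter * p.numel())), remaining)
                # Rank the still-unquantized weights by magnitude; pick the k smallest.
                mag = p.abs().masked_fill(mask, float("inf"))
                idx = torch.topk(mag.flatten(), k, largest=False).indices
                mask.view(-1)[idx] = True
                # Quantize only the newly selected weights to low-bit precision.
                q = uniform_quantize(p, bits)
                p.view(-1)[idx] = q.view(-1)[idx]

        # Fine-tune; gradients of already-quantized weights are zeroed so that,
        # with plain SGD, those weights stay fixed at their quantized values.
        for _ in range(finetune_steps):
            optimizer.zero_grad()
            loss = loss_fn(model)  # hypothetical closure returning a training loss
            loss.backward()
            with torch.no_grad():
                for name, p in model.named_parameters():
                    if name in masks and p.grad is not None:
                        p.grad[masks[name]] = 0.0
            optimizer.step()
```

Note that freezing quantized weights via gradient masking only holds exactly for momentum-free optimizers; with momentum or adaptive methods, the quantized entries would need to be re-projected after each step.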