EDGE AI: QUANTIZATION AS THE KEY TO ON-DEVICE SMARTNESS

Authors

  • Dwith Chenna, Senior Embedded DSP Engineer, Computer Vision, MagicLeap Inc, United States

Keywords:

Edge AI, Deep Learning, Neural Network, Quantization

Abstract

Edge AI has recently come to the forefront. Edge devices, which encompass a wide array of IoT devices and embedded systems, benefit from efficient and compact neural network models. However, these devices are resource-constrained in both computation and memory, which makes deploying neural network models on them challenging. To address this issue, quantization has emerged as an efficient strategy for reducing the computational and memory requirements of these models, enabling their deployment on edge devices. Yet a pivotal concern lies in preserving the performance of quantized models at levels comparable to their original floating-point counterparts. In this paper, we review the details of quantization, including its foundational assumptions, techniques, and the tradeoffs that influence quantization accuracy, with the aim of achieving optimal results while retaining quantization's benefits.

 

 

Published

2023-08-22

How to Cite

EDGE AI: QUANTIZATION AS THE KEY TO ON-DEVICE SMARTNESS. (2023). INTERNATIONAL JOURNAL OF ARTIFICIAL INTELLIGENCE & APPLICATIONS (IJAIAP), 2(1), 58-69. https://mylib.in/index.php/IJAIAP/article/view/IJAIAP_02_01_003