MULTIMODALITY AND EFFICIENCY IN NATURAL LANGUAGE PROCESSING

Authors

  • Vamsi Krishna Thatikonda, Snoqualmie, Washington, United States

Keywords:

Natural Language Processing, Multimodality, Human-Computer Interaction, Data Processing, Machine Learning Algorithms, Sentiment Analysis, Recommendation Systems

Abstract

Progress in Natural Language Processing (NLP) has enabled machines to understand and produce human language, making interaction between humans and computers more intuitive. This study analyzes two significant aspects of NLP: multimodality and efficiency. Multimodal NLP combines several data types, such as text, images, and audio, to improve a machine's ability to interpret human language; Google's Image Search and emotion recognition systems illustrate its practical uses. Efficiency in NLP refers to a system's capacity to perform tasks quickly while consuming minimal computational resources, a key requirement in applications such as voice recognition and language translation. The study also examines notable cases where multimodality and efficiency are combined, such as multimodal sentiment analysis and recommendation systems. As the field evolves, the study anticipates future directions, including the incorporation of further modalities such as scent and touch and the development of smaller, faster models. It concludes with a call for efficient algorithm designs and hardware designs to manage the growing complexity of multimodal NLP systems and to further facilitate human-computer interaction.

Published

2023-08-07