BUILDING RESILIENT AND ADAPTIVE AI MODELS FOR REAL-TIME ANALYTICS IN CLOUD ENVIRONMENTS

Authors

  • Srinivas Kolluri, Quantum Integrators Group LLC, USA

Keywords:

Cloud-Native Resilience Patterns (CNRP), Dynamic Learning Rate Adaptation (DLRA), Real-Time Analytics, Adaptive Performance Enhancement Protocol (APEP)

Abstract

This article presents a framework for developing resilient and adaptive artificial intelligence (AI) models designed for real-time analytics in cloud computing environments. The approach combines a Dynamic Learning Rate Adaptation (DLRA) mechanism with Cloud-Native Resilience Patterns (CNRP) to create self-adjusting models that maintain high performance under varying workloads. The framework implements an Adaptive Performance Enhancement Protocol (APEP) for resource optimization and an Integrity-Preserving Stream Processing (IPSP) mechanism for data consistency. The architecture uses containerized microservices and incorporates a Multi-Stage Recovery Protocol (MSRP) and a Dynamic Resource Allocator (DRA) for system resilience. Experimental results show a reduction in response time, sustained model accuracy, and a reduction in resource utilization compared to traditional approaches. The system maintains performance under high loads and remains available during peak usage. The framework also adapts quickly: the learning rate converges within 5 iterations, and resource allocation is optimized in less than 1 second. These results support the effectiveness of the approach for real-time analytics in cloud environments and provide a foundation for future work in autonomous systems and edge computing applications.

Published

2024-12-26

How to Cite

Srinivas Kolluri. (2024). BUILDING RESILIENT AND ADAPTIVE AI MODELS FOR REAL-TIME ANALYTICS IN CLOUD ENVIRONMENTS. INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING AND TECHNOLOGY (IJCET), 15(6), 1743-1764. https://mylib.in/index.php/IJCET/article/view/IJCET_15_06_149