OPTIMIZING DATA PIPELINES IN ENTERPRISE SYSTEMS USING MICROSERVICES AND DISTRIBUTED COMPUTING

Rajeev Krishna Shetty

Authors

Rajeev Krishna Shetty, Scientific Researcher, India

Keywords:

Data Pipeline Optimization, Microservices Architecture, Distributed Computing, Enterprise Systems, ETL, Apache Kafka, Containerization

Abstract

The exponential growth in the volume and velocity of enterprise data has rendered traditional monolithic extract, transform, load (ETL) pipelines increasingly brittle and inefficient. This paper examines how data pipelines in contemporary enterprise systems can be optimized by combining a microservices architecture with distributed computing frameworks. By decomposing pipeline logic into decoupled, domain-specific microservices and leveraging distributed processing engines, organizations can achieve better scalability, fault tolerance, and resource utilization. The paper presents comparative analyses through graphs and tables, and proposes a reference architecture. Findings indicate that while microservices introduce operational complexity, their combination with distributed computing reduces end-to-end latency by up to 60% and improves throughput by an order of magnitude compared to traditional pipelines.
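The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the paper's reference architecture: the stage functions `parse` and `enrich`, the record layout, and the use of in-process queues in place of a message broker such as Apache Kafka are all assumptions made for illustration. Each stage runs independently and communicates only through its input and output queues, which is the property that lets stages be scaled or replaced separately.

```python
import queue
import threading

SENTINEL = object()  # end-of-stream marker passed from stage to stage


def stage(fn, inbox, outbox):
    """Run one decoupled stage: read from inbox, apply fn, write to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def parse(raw):
    # Hypothetical "extract" service: turn "id,amount" text into a record.
    ident, amount = raw.split(",")
    return {"id": ident, "amount": float(amount)}


def enrich(record):
    # Hypothetical "transform" service: add a derived field.
    return {**record, "amount_cents": round(record["amount"] * 100)}


def run_pipeline(raw_records):
    # The queues stand in for broker topics (an assumption of this sketch).
    q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=stage, args=(parse, q_in, q_mid)),
        threading.Thread(target=stage, args=(enrich, q_mid, q_out)),
    ]
    for worker in workers:
        worker.start()

    for raw in raw_records:
        q_in.put(raw)
    q_in.put(SENTINEL)

    # Hypothetical "load" step: collect results until the stream ends.
    results = []
    while True:
        item = q_out.get()
        if item is SENTINEL:
            break
        results.append(item)

    for worker in workers:
        worker.join()
    return results
```

Calling `run_pipeline(["a,12.50", "b,3.20"])` passes each record through both stages in order. In a deployed system each stage would typically be a separate service with its own scaling policy, and the queues would be durable topics rather than in-memory objects.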

