Running Remote Shuffle Service to Solve a Well-Known Challenge for Apache Spark on Kubernetes - Melody Yang, Amazon & Keyong Zhou, Alibaba Cloud
In an era where data drives everything, Apache Spark has become a popular open-source compute framework for machine learning and data ETL because it handles large-scale data processing efficiently. However, one well-known challenge for Spark on Kubernetes (k8s) is that dynamic resource allocation (DRA) is only partially supported. This talk explores a novel solution to Spark's DRA challenge on k8s that uses an open-source remote shuffle service (RSS). By offloading shuffle data to remote storage outside the Spark application's executor pods, we can decouple storage from compute and so support dynamic scaling. Attendees will gain insight into Spark's dynamic resource allocation on k8s, how the open-source solution can alleviate the resource contention issues Spark faces on k8s, and how it provides a more reliable and scalable solution in the big data field.
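To make the DRA limitation concrete, here is a minimal sketch (not taken from the talk) of the Spark settings involved. Without a remote shuffle service, DRA on k8s typically relies on shuffle tracking, which keeps executors that still hold shuffle data alive and so limits scale-down. The helper names `dra_conf` and `to_submit_args` and the shuffle manager class `com.example.rss.RemoteShuffleManager` are hypothetical placeholders. The real class name and any additional settings depend on the RSS you choose and on your Spark version.

```python
from typing import Dict, List


def dra_conf(use_remote_shuffle: bool,
             min_executors: int = 1,
             max_executors: int = 20) -> Dict[str, str]:
    """Build a dict of Spark settings for dynamic resource allocation (DRA)."""
    conf = {
        "spark.dynamicAllocation.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }
    if use_remote_shuffle:
        # ASSUMPTION: placeholder class name. Substitute the shuffle manager
        # documented by your RSS, plus any extra settings it requires.
        conf["spark.shuffle.manager"] = "com.example.rss.RemoteShuffleManager"
    else:
        # Baseline on k8s: executors holding shuffle data are tracked and
        # kept alive, so they cannot be released while that data is needed.
        conf["spark.dynamicAllocation.shuffleTracking.enabled"] = "true"
    return conf


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render the settings as `--conf key=value` arguments for spark-submit."""
    args: List[str] = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(dra_conf(use_remote_shuffle=False))))
```

The two branches mirror the contrast in the abstract: in the baseline, shuffle data is tied to executor pods, while with an RSS the shuffle data lives outside them, which is what allows idle executors to be released.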