Shuhe Accelerates AI Model Service Deployment with Knative - Peng Li, Alibaba Cloud & Wenzhe Wei, Shanghai Shuhe Information Technology
In Shuhe's (Shanghai Shuhe Information Technology Co., Ltd.) financial business scenarios, AI models are iterated frequently, and multiple versions of a model are deployed online at the same time for evaluation, which incurs high resource costs. Improving the operational efficiency of AI services and reducing resource costs while guaranteeing service quality is a challenging problem. Knative is an open source serverless application framework built on Kubernetes. At present, Shuhe runs 500+ AI model services on Knative, saving 60% of resource costs and shortening the average deployment cycle from 1 day to 0.5 days. In this talk, we will show you how to deploy AI workloads on Knative, including:
● Extending Knative Serving's autoscaling to support precise concurrency-based scaling and predictive scaling (a minimal configuration sketch follows this list).
● How to deploy Stable Diffusion on Knative.
● Shuhe's best practices for AI model services on Knative.
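To illustrate the kind of deployment described above, the sketch below (not taken from the talk; the service name, namespace, image, and scaling targets are illustrative assumptions) creates a Knative Service whose Pod Autoscaler scales on request concurrency, using the official Kubernetes Python client.

# Minimal sketch: deploy a model service as a Knative Service with
# concurrency-based autoscaling via the Kubernetes Python client.
# Names, namespace, image, and target values are placeholders.
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() when running inside a cluster

knative_service = {
    "apiVersion": "serving.knative.dev/v1",
    "kind": "Service",
    "metadata": {"name": "model-service-v2", "namespace": "ai-models"},
    "spec": {
        "template": {
            "metadata": {
                "annotations": {
                    # Knative Pod Autoscaler (KPA), scaling on request concurrency
                    "autoscaling.knative.dev/class": "kpa.autoscaling.knative.dev",
                    "autoscaling.knative.dev/metric": "concurrency",
                    "autoscaling.knative.dev/target": "10",
                    "autoscaling.knative.dev/min-scale": "1",
                    "autoscaling.knative.dev/max-scale": "20",
                }
            },
            "spec": {
                "containers": [
                    {
                        "image": "registry.example.com/models/demo-model:v2",
                        "ports": [{"containerPort": 8080}],
                    }
                ]
            },
        }
    },
}

# Knative Services are custom resources, so they are created through the
# CustomObjectsApi rather than the core apps API.
api = client.CustomObjectsApi()
api.create_namespaced_custom_object(
    group="serving.knative.dev",
    version="v1",
    namespace="ai-models",
    plural="services",
    body=knative_service,
)

With this configuration, Knative keeps roughly 10 in-flight requests per pod and scales replicas between 1 and 20 as load changes; the actual annotations and targets used by Shuhe may differ.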
CNCF Overview (slide)
Scan the QR code to contact us!
The CNCF (Cloud Native Computing Foundation) was founded in December 2015 as a non-profit organization under the Linux Foundation.
The CNCF (Cloud Native Computing Foundation) is dedicated to fostering and sustaining a vendor-neutral open source ecosystem to promote cloud native technology. We democratize state-of-the-art patterns to make these innovations accessible to everyone. Follow the CNCF WeChat official account.