Optimizing AI workloads in Kubernetes: Pruning for efficiency and scale
This session will explore model pruning techniques and Kubernetes-native strategies for optimizing resource-intensive AI workloads, focusing on efficient scheduling, autoscaling, and inference serving in cloud environments.
As AI adoption continues to grow, managing resource efficiency and costs in cloud-native environments becomes increasingly critical. Shashidhar Shenoy and Achyut Sarma Boggaram will discuss the potential of model pruning as an optimization technique and its integration with Kubernetes-native tools. They will cover strategies for resource scheduling, autoscaling configurations, and best practices for deploying pruned AI models in Kubernetes environments. While model pruning is still an emerging practice for AI inference in the cloud, this session will examine its benefits, trade-offs, and technical considerations, providing valuable insights for platform teams seeking to optimize AI workloads. Attendees will gain practical knowledge on how to scale AI applications more efficiently while reducing resource usage and associated costs.
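While the session itself is framework-agnostic, a minimal sketch of what magnitude-based pruning looks like in practice can ground the discussion. The example below uses PyTorch's built-in torch.nn.utils.prune utilities; the toy model, the choice of Linear layers, and the 50% sparsity target are illustrative assumptions rather than anything prescribed by the speakers.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Illustrative model; any torch.nn.Module with Linear/Conv layers would work.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

for module in model.modules():
    if isinstance(module, nn.Linear):
        # Zero out the 50% of weights with the smallest L1 magnitude.
        prune.l1_unstructured(module, name="weight", amount=0.5)
        # Make the pruning permanent by removing the re-parametrization.
        prune.remove(module, "weight")

# Report the resulting per-layer sparsity.
for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        sparsity = (module.weight == 0).float().mean().item()
        print(f"{name}: {sparsity:.0%} of weights pruned")
```

Note that unstructured pruning like this reduces the number of nonzero parameters, but it only translates into lower memory and latency footprints when paired with sparse-aware runtimes or structured pruning; those trade-offs are among the considerations the session promises to examine.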
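On the Kubernetes side, "autoscaling configuration" typically means a HorizontalPodAutoscaler attached to the inference Deployment. The sketch below, kept in Python for consistency with the example above, emits an illustrative autoscaling/v2 HPA manifest; the deployment name, namespace, replica bounds, and 70% CPU target are hypothetical values, not settings from the session.

```python
import yaml  # PyYAML

# Hypothetical names and targets, for illustration only.
hpa = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "pruned-model-hpa", "namespace": "inference"},
    "spec": {
        "scaleTargetRef": {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "name": "pruned-model-server",  # hypothetical Deployment
        },
        "minReplicas": 1,
        "maxReplicas": 8,
        "metrics": [{
            "type": "Resource",
            "resource": {
                "name": "cpu",
                "target": {"type": "Utilization", "averageUtilization": 70},
            },
        }],
    },
}

# Write the manifest so it can be applied with `kubectl apply -f hpa.yaml`.
with open("hpa.yaml", "w") as f:
    yaml.safe_dump(hpa, f, sort_keys=False)
```

Because a pruned model consumes fewer cycles per request, its Deployment can usually declare smaller resource requests, which lets the scheduler bin-pack more replicas per node before the HPA ever needs to scale out.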
Panelist
Achyut Sarma Boggaram
Senior Machine Learning Engineer, Torc AI

Moderator
Shashidhar Shenoy
Tech Lead, Google
Sign up now

