Dok Talks #111 - Scheduled Scaling with Dask and Argo Workflows
https://go.dok.community/slack
https://dok.community/
ABSTRACT OF THE TALK
Complex computational workloads in Python are a common sight these days, especially when processing large and complex datasets. Battle-hardened modules such as NumPy, pandas, and scikit-learn can perform low-level tasks, while tools like Dask make it easy to parallelize these workloads across distributed computational environments. Meanwhile, Argo Workflows offers a Kubernetes-native solution for provisioning cloud resources and triggering workflows on a regular schedule. Being Kubernetes-native, Argo Workflows also meshes nicely with other Kubernetes tools. This talk discusses the combination of these two worlds by showcasing a setup for Argo-managed workflows which schedule and automatically scale out Dask-powered data pipelines in Python.
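To make the scheduling half of that combination concrete, here is a minimal sketch of how a scheduled Argo workflow could be described from Python. It is not taken from the talk: the helper `build_cron_workflow`, the workflow name, the container image, and the `pipeline.py` script are all hypothetical placeholders. Only the manifest layout (a `CronWorkflow` with a `schedule` and a `workflowSpec`) follows Argo's resource format, and the Dask-powered work is assumed to live inside the container being launched.

```python
import json


def build_cron_workflow(name, schedule, image, command):
    """Return an Argo CronWorkflow manifest as a plain dict.

    Hypothetical helper for illustration; the image and command are
    assumed to start a Dask-powered pipeline inside the container.
    """
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "CronWorkflow",
        "metadata": {"name": name},
        "spec": {
            # Standard cron syntax: here, every day at 02:00.
            "schedule": schedule,
            # Do not start a new run while the previous one is still active.
            "concurrencyPolicy": "Forbid",
            "workflowSpec": {
                "entrypoint": "run-pipeline",
                "templates": [
                    {
                        "name": "run-pipeline",
                        "container": {"image": image, "command": command},
                    }
                ],
            },
        },
    }


manifest = build_cron_workflow(
    name="nightly-dask-pipeline",                    # placeholder name
    schedule="0 2 * * *",
    image="example.registry/dask-pipeline:latest",   # placeholder image
    command=["python", "pipeline.py"],               # placeholder script
)

# JSON is valid YAML, so the printed text can be saved as a manifest file.
print(json.dumps(manifest, indent=2))
```

Building the manifest as a dict keeps the schedule and the container details in one place, so the same pipeline image could be reused under a different schedule by changing a single argument.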
BIO
Former academic in the field of renewable energy simulation and energy systems analysis. Currently responsible for architecting and maintaining the cloud and data strategy at ACCURE Battery Intelligence.
KEY TAKE-AWAYS FROM THE TALK
Argo Workflows + Dask is a nice combination for data-processing pipelines. There are a few "gotchas" to be on the lookout for, but it is nevertheless a generally applicable and powerful combination.
https://github.com/sevberg