Unsticking Ourselves from Glue: Migrating PayIt’s Data Pipelines to Argo Workflows and Hera | DoKC Town Hall
Unsticking Ourselves from Glue: Migrating PayIt’s Data Pipelines to Argo Workflows and Hera
Presented by Matt Menzenski, Senior Software Engineering Manager, Payitgov
At PayIt, we’ve been deploying applications to Kubernetes almost since the beginning of the company. Our data workloads, however, have run instead in AWS Glue. This has worked well enough for the reporting use cases that have been the main focus of this team historically. However, at the beginning of 2022, the PayIt data team began building out a new data platform, and in the process, ran into a number of challenges with Glue. In this talk, I will share the difficulties that we encountered with building, deploying, and orchestrating ETL pipelines in AWS Glue, our decision process for moving those workloads into Kubernetes, and the ELT architecture that we’ve arrived at today.

Related Links
DoKC Website - https://dok.community/
DoKC Meetups - https://www.meetup.com/data-on-kubernetes-community/
Join Slack - https://join.slack.com/t/dokcommunity/shared_invite/zt-1vgv7ymz7-YtLFvZicrcLP9fS3o_r2_w
243 episodes
All episodes
Unsticking Ourselves from Glue: Migrating PayIt’s Data Pipelines to Argo Workflows and Hera | DoKC Town Hall 23:17
Repel Boarders! How to find a Kubernetes operator that really protects your data | DoKC Town Hall 19:22
DoK @ Comcast - Deliver Business Outcomes & Improved DevX with Data Services on K8s | DoKC Town Hall 16:43
DoK Talks - What is Kafka? The rise of one of the world's most used streaming data technologies // Abbey Russell 15:28
DoK Talks - (almost) Everything you need to know about stateful cloud native network applications // W Watson 43:39
The Outer Nerd #001 - Dungeons & Dragons - Why should you care? // Abhi Vaidyanatha, Fabian Met & Chase Christensen 58:25

Data-driven Diversity, Equity, and Inclusion // Lisa-Marie Namphy, Melissa Logan, Tiffany Jachja, Audra Montenegro & Cortney Nickerson (DoK Day North America 2022) 19:50
Formula 1 telemetry processing using Apache Kafka on Kubernetes // Paolo Patierno (DoK Day North America 2022) 15:36
Choosing Kubernetes for Stateful Applications // Akshay Ram & Peter Schuurman (DoK Day North America 2022) 18:31
Kubernetes 360º - Data driven observability - from Secrets to logs // Ben Hirschberg (DoK Day North America 2022) 17:11
Shifting Left Stateful Applications In Kubernetes // Viktor Farcic (DoK Day North America 2022) 15:52
Medical - Healthcare Data on Kubernetes // Olyvia Rakshit & Prasad Dorbala (DoK Day North America 2022) 13:41
Highly Available Postgres Clusters In Kubernetes // John Long & Jonathan Gonzalez (DoK Day North America 2022) 15:04
Open Source Databases on Kubernetes - Best Practices // Peter Zaitsev (DoK Day North America 2022) 16:04
Databases on Kubernetes: Why are they important? // With Bhavin Shah, Xing Yang, Gabriele Bartolini & Patrick McFadin (DoK Day North America 2022) 34:51
Architecting Your First Event Driven Serverless Streaming Applications on K8 // Timothy Spann (DoK Day North America 2022) 13:29
Fybrik - A Kubernetes based platform for governed data use // Flora Gilboa-Solomon, Alexey Roytman, Maryna Strelchuk & Barry Hijkoop (DoK Day North America 2022) 20:59
The Challenges of Data Processing On Kubernetes - A look at Spark, Flink, Dask, and Ray // Holden Karau (DoK Day North America 2022) 20:09
Scaling our SaaS offering to thousands of clusters // Dax McDonald (DoK Day North America 2022) 21:04
Why we decided to migrate our Jaeger storage to ClickHouse on Kubernetes // Arul Jegadish Francis (DoK Day North America 2022) 13:48
Building a Digital Factory for the Sheet Metal Industry // Elie Assi (From the DoK Day North America 2022) 20:48
How we built our Big Data Stack (almost) entirely on top of Kubernetes // Neylson Crepalde (From DoK Day NA 2022) 16:00


DoK #152 - Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG // Gabriele Bartolini 1:03:50
DoK Talks #150 - Building a Simple Postgres Async Streaming Cluster // Julian Fischer 1:04:45
DoK Talks #149 - Overcoming challenges with protecting and migrating data in multi-cloud K8s environments // Sebastian Glab & Martin Phan 47:40
DoK Talks #147 - Evaluating Cloud Native Storage Vendors // Dinesh Majrekar 1:00:03
DoK Talks #146 - OpenFeature - Making feature flags a commodity // Oleg Nenashev 1:01:30
DoK Talks #144 - We will Dok You! - The journey to adopt stateful workloads on k8s // Guy Menahem 1:06:30
DoK Talks #142 - Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your Stateful Workload // Peter Schuurman 58:45
DoK Talks #144 - Mastering MongoDB on Kubernetes, the power of operators // Arek Borucki 1:00:50
DoK Talks #141 - Dossier: multi-tenant distributed Jupyter Notebooks // Iacoppo Colonnelli & Dario Tranchitella 1:00:10
DoK Talks #134 - Introducing CloudNativePG // Gabriele Bartolini & Leonardo Cecchi 1:05:20
DoK Talks #132 - Time-series on SQL Server on Kubernetes on ARM64… without SQL Server! // Álvaro Hernández 1:05:15



What we've learned from running a PostgreSQL managed service on Kubernetes (DoK Day EU 2022) // Oleksii Kliukin 11:06
Weathering The Cloud Storm - Modern Data Management Patterns for Reliability and Availability (DoK Day EU 2022) // Denis Magda 10:46
The many uses of Kubernetes cross cluster migration of persistent data (DoK Day EU 2022) // Ryan Kaw 7:39
The future of data on Kubernetes with Adobe and CNCF (DoK Day EU 2022) // Joseph Sandoval, Xing Yang & Sylvain Kalache 17:29
Testing the Mettle - Evaluating data solutions for large-scale production to check who stacks up (DoK Day EU 2022) // Dinesh Majrekar 9:26
Serverless Event Streaming Applications as Functions on K8 (DoK Day EU 2022) // Timothy Spann 8:43
Running a database on local NVMes on Kubernetes (DoK Day EU 2022) // Tomáš Nožička & Maciej Zimnoch 9:42
PV TrashCan - Protection against accidental deletion of PVs or Namespaces (DoK Day EU 2022) // Veda Talakad, Aditya Kulkarni & Aditya Dani 11:07
Protecting data with CSI Volume Snapshots on Kubernetes (DoK Day EU 2022) // Grant Griffiths 11:10
Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Developers (DoK Day EU 2022) // Arsh Sharma, Lapo Elisacci & Ramiro Berrelleza 14:02
Kanister & Kopia - An Open-Source Data Protection Match Made in Heaven (DoK Day EU 2022) // Pavan Navarathna 13:38
Growing up fast - Kubernetes and Real-Time Analytic Applications (DoK Day EU 2022) // Robert Hodges 15:30
From Laptop to Cloud. Developing Cloud-Native Applications with Containerized Databases (DoK Day EU 2022) // Nic Vermandé 17:16
Disaggregated Container Attached Storage - Yet Another Topology with What Purpose (DoK Day EU 2022) // Nick Connolly 9:32
Datashim - a framework for declarative management of datasets on Kubernetes (DoK Day EU 2022) // Srikumar Venugopal 15:36
Autoscaling Stateful Workloads in Kubernetes (DoK Day EU 2022) // Mohammad Fahim Abrar & Md. Kamol Hasan 10:14


DoK Talks #131 - How to win friends and influence businesses // Fabian Met 1:00:48
DoK Talks #130 - Leaning on Kubernetes Portability to Manage Databases Anywhere // Robert Hodges 1:04:45
DoK Talks #126 - Automatically Instrument Kubernetes Apps with OpenTelemetry // James Blackwood-Sewell 1:03:40
DoK Talks #128 - Getting Started with the Kubernetes Secrets Store CSI Driver // Kim Schlesinger 53:10
DoK Talks #127 - Flux for Helm Users! // Scott Rigby 1:21:35
DoK Talks #123 - Can Data Become a Declarative Resource? // Roey Libfeld, Michael Greenberg & Uri Zaidenwerg 1:07:10
DoK Talks #121 - Running Stateful Apps in Kubernetes Made Simple // Steve Buchanan 1:00:40
DoK Talks #120 - A Gentle Introduction to Building Data Intensive Applications // Joe Karlsson 1:01:50
DoK Talks #118 - Troubleshooting ClickHouse Performance // Shiv Lyer 1:02:50
DoK Specials - Ask Us Anything About Postgres // Gabriele Bartolini, Ryan Booz & Álvaro Hernández 1:03:30
DoK Specials - Ask Patrick and Jeff Anything About Data on Kubernetes // Patrick McFadin & Jeff Carpenter 1:02:15
DoK Specials - Unravel the key to your Kubernetes secrets 2:14:47
DoK Talks #116 - Nebula Graph: Open Source Distributed Graph Database // Wey (Siwei) Gu 1:07:10
DoK Special - Show me the money: The business side of DoK // Evan Powell, Brian Schechter & Misha Herscu 57:10
DoK Talks #115 - What More Can I Learn From My OpenTelemetry Traces? // John Pruitt 1:00:45
DoK Talks #112 - Production Postgres Made Easy on Kubernetes // Jonathan Katz 1:02:45
DoK Talks #111 - Scheduled Scaling with Dask and Argo Workflows 1:05:30
DoK Talks #109 - Benchmarking for PostgreSQL workloads in Kubernetes / Part 2 // Gabriele Bartolini 1:04:35
DoK Talks #108 - Postgres on Kubernetes Applied at Scale in Zalando // Álvaro Hernández & Alexander Kukushkin 1:02:20


DoK Talks #106 - Cloud native data warehousing with Kubernetes // Mark Cusack & Matthew Ripley 59:25
DoK Special: Mental Health and Covid-19: Retrospective and Perspective // Andrea Dobson, Erin Grinshteyn and Julia Simon 59:15
DoK Talks #105 - Run Graph Database on K3s with KubeSphere // Feynman Zhou & Wey Gu 1:02:10
DoK Talks #104 - How to enable self-service Infrastructure by shifting your Data left with Kubernetes // Nic Vermande 1:08:30
DoK Talks #103 - Performant and Version-Aware Analytics With Spark & lakeFS on K8s // Itai Admi 39:25
DoK Talks #102 - From Enemy to Evangelist // Rick Vasquez 1:01:40
DoK Talks #100 - CAPE for data backup/restore on kubernetes // Sanjeev Ganjihal 1:02:41
DoK Talks #101 - Redpanda: how to build a storage engine for kubernetes // Alexander Gallego 1:01:25
DoK Talks #99 - ETL/ELT on Kubernetes with Airbyte: K8s Development Insights // Abhi Vaidyanatha 1:00:45
DoK Talks #98 - It's not me, it's you: Migrating between third party storage solutions at scale // Dinesh Majrekar 48:30
DoK Talks #97 - Learn about Developing a Multicluster Operator with K8ssandra Operator // John Sanda 1:02:10
DoK Talks #96 - Persistent Disk or StatefulSet? The right way and the wrong way to make apps persist state inside a K8s cluster // Neil Cresswell 50:35
DoK Talks #95 - I've got 99 Workloads and 53 of Them Are Data // Michelle Gienow 1:02:10
DoK Talks #94 - Security and SRE // Tammy Butow & Prima Virani 1:02:20
DoK Talks #91 - Leveraging Druid Operator to manage Apache Druid on Kubernetes // Adheip Singh 55:50