Airflow and Kubernetes

At Bluecore, we want to run a wide variety of workflows across teams and projects, but we were struggling to find one easy way to manage the growing number of them. In 2017, our research and analysis pointed to Airflow, a platform to create, execute, and monitor workflows, as a good fit for our problems. So we spun up our first Airflow instance as a proof of concept, and it worked well!

I was originally just a user of Airflow at Bluecore, but I was brought onto the project to help make this technology “production-ready”. We had a long list of important workflows we wanted to move over, so we had to make sure that Airflow could scale, correctly monitored successes and failures, and was easy to use. I worked on this project for about 8 months and led it for 6 months of that time.

This post briefly covers how Airflow works at Bluecore, what I learned implementing it, and how I’ve been able to share that knowledge. I’ve also written more in-depth blog posts on particular aspects of Airflow; check them out on my Medium account!

Airflow at Bluecore

We run Airflow on Google Kubernetes Engine, Google’s managed Kubernetes, using an open-source project called kube-airflow. In practice, this means we run the individual pieces of Airflow as separate containers and let Google handle much of the management and scaling for us.

We set up three (nearly identical) Airflow environments: production, QA, and development. We have over ten projects running workflows live in production now. Developers can easily use the QA or dev environments to test their workflows against different data sets and with different permissions.

To run something in Airflow, all you need to do is create a DAG, a directed acyclic graph, out of code that represents each step. At Bluecore, we recommend creating DAGs using our in-house Kubernetes and App Engine Operators.
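Conceptually, a DAG is just a set of steps plus ordering constraints between them. A minimal sketch of that idea using only Python’s standard library (the step names here are hypothetical, not our actual operators or workflows):

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
# Airflow enforces the same "acyclic" property on real DAGs.
workflow = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}

# A valid execution order: every step runs only after its dependencies.
order = list(TopologicalSorter(workflow).static_order())
print(order)  # ['extract', 'transform', 'load', 'notify']
```

In a real Airflow DAG, each of these steps would be an operator (in our case, typically a Kubernetes or App Engine Operator), and Airflow’s scheduler handles the ordering, retries, and monitoring for you.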

Best of all, Airflow isn’t just for the Engineering team! We have worked to make a simple interface that allows teams such as Data Insights (who perform in-depth business analysis) to manage their own workflows with ease.

What I Learned

“Make Airflow production-ready” was maybe the most vague project objective I’d ever been handed. But, it was fun to work on collecting requirements from users and future users, explore the limitations of Airflow, and read about how other people and companies were using the technology.

For this project, I had to grow technically very quickly. I ramped up on all of these topics in about a month:

  • Airflow, workflow tools, and workflow terminology

  • Containers, containerized applications, and container management

  • Scaling, resource estimation, and appropriate monitoring levels

  • Kubernetes and Google Kubernetes Engine

Teaching Others

As the primary developer and de facto evangelist for Airflow at Bluecore, I have given multiple tech talks and trainings, and provided guidance on how Airflow works and on general best practices.

I have had phone calls, email conversations, and office visits with other companies to help set up Airflow within their own teams.

I have also written two blog posts on Airflow. The first is about a tough debugging problem we hit when first trying to make it production-ready. The second is about the correct programming paradigm for Airflow and the in-house Kubernetes Operator I wrote to facilitate it.

Since my blog posts were published, they’ve received 10K+ views and 150+ fans. We have a few more posts to publish and a few things to open-source, and I’m looking forward to sharing more about what we’ve learned!
