Shopify’s Infrastructure Collaboration with Google

We’re always working to deliver the best commerce experience to our merchants and their customers. We provide a seamless merchant experience while shaping the future of retail by building a platform that can handle the traffic of a Kylie Cosmetics flash sale (they sell out in 20 seconds), ship new features into production hundreds of times a day, and process more than double the number of orders year over year.

For Production Engineering to meet these needs, we regularly review our technology stack to ensure we are using the best tools for the job, and our journey to the Cloud is a perfect example. That’s why we are excited to share that Shopify is now building our Cloud with Google. Before sharing the details of this announcement, we want to provide some context on our journey.

Shopify has been a cloud company since day one. We provide a commerce cloud to our merchants, solving their worries about hiring full-time IT staff to manage the infrastructure side of the business. Cloud is part of our DNA and our public cloud connection goes back to 2006, the same year both Shopify and Amazon Web Services (AWS) launched. Early on, we leveraged the public cloud as a small piece of our commerce cloud. It was great for hosting some of our smaller services, but we found the public cloud wasn’t a great fit for our main Rails monolith.

We’re pragmatic about how we evolve and invest in our infrastructure. In our startup days, with a small team, we valued simplicity and chose to focus on shipping the foundations of a commerce platform by deferring more complex infrastructure like database sharding. As we grew in scale and engineering expertise, we took on more complex problems. With each major infrastructure scalability feature we shipped, like database sharding, application sharding, and production load testing, we continued to revisit how to horizontally scale our Rails application across thousands of servers. Over the years, we moved more and more of our supporting services to the Cloud, gaining additional context which fed into our developing monolith Cloud strategy.

Our latest push to the Cloud started over two years ago. Google launched Google Kubernetes Engine (GKE), formerly Google Container Engine, just as we finished production-hardening Docker. In 2014, Shopify invested in Docker to capitalize on the benefits of immutable infrastructure: predictable, repeatable builds and deployments; simpler and more robust rollbacks; and elimination of configuration management drift. Once you’re running containers, the next natural step is to take inspiration from Google’s Borg and start building out a dynamic container management and orchestration system. Being early adopters of Docker meant there weren’t many open-source options available, and the community and codebase were in their infancy and changing rapidly, so we decided to build minimal container management features ourselves. Building these features allowed us to focus on application scalability and resilience while avoiding additional complexity as the Docker community matured.

In 2016, internal discussions began around what Shopify would look like in the future. The infrastructure changes from 2012 to 2016 allowed us to lay the foundation for using the Cloud in a pragmatic way via database sharding, application sharding, performance testing, and automated failovers, but we were still missing an orchestration solution. Luckily, several exciting developments were happening, and the most promising one for Shopify was Kubernetes, an open-source container management system created by the teams at Google that built Borg and GKE.

After 12 years of building and running the foundation of our own commerce cloud with our own data centers, we are excited to build our Cloud with Google. We are working with a company that shares our values around open source, security, performance, and scale. We are better positioned to change the face of global commerce while providing more opportunities to the 600,000+ merchants on our platform today.

Since we began our Google Cloud migration, we have:

  • Built our Shop Mover, a selective database data migration tool that lets us rebalance shops between database shards with an average of 2.5s of downtime per shop
  • Migrated over 50% of our data center workloads, and counting, to Google Cloud
  • Contributed to and leveraged Grafeas, Google’s open-source initiative to define a uniform way of auditing and governing the modern software supply chain
  • Grown to over 400 production services and built a platform as a service (PaaS) to consolidate all production services on Kubernetes
  • Joined the Cloud Native Computing Foundation (CNCF) and participated in the Kubernetes Apps Special Interest Group and Application Definition Working Group

By leveraging Google’s deep understanding of global infrastructure at scale, we’re able to ensure that every engineer we hire focuses on building and shaping the future of commerce on a global scale.

Stay tuned. We’re excited to share more stories about Shopify’s journey to Google Cloud with you.

Dale Neufeld, VP of Production Engineering