I’ve been fielding this question a lot these days: “We’re building a data warehouse – should we build it here or in the cloud?” It’s a fair question, but it’s not the question that should be asked. The more appropriate question is this: “What part of our data warehouse solution should be in the cloud, and how does it work together with our on-premises data?”
I shared a few of my thoughts on this topic a few weeks ago in a podcast interview with Carlos Chacon, when we discussed whether or not the on-premises data warehouse was dead. Without spoiling all of the details of that conversation, my short answer is that the on-premises data warehouse is alive and well but is no longer the only DW option.
As recently as three years ago, the cloud was still relatively new and not yet in wide use at most organizations. At the same time, companies selling cloud services were in the midst of a massive marketing effort to direct customers to the cloud. Microsoft famously declared itself all-in on the cloud well before the market was ready to follow. Many IT leaders and technologists bristled at the thought of being forced into the cloud at the expense of tried-and-true on-premises solutions.
However, in the past couple of years the message from cloud providers has softened. No more is it “cloud or bust”. Rather, cloud services companies – and Microsoft in particular – have reshaped the message into one in which the cloud is just one piece of a heterogeneous architecture that may include on-prem, PaaS, IaaS, and SaaS solutions. At the same time, consumers are realizing the value of cloud-based solutions for parts of their architecture. Although I rarely have a client that wants to build an all-cloud infrastructure, almost everyone I work with is at least exploring, if not actively using, cloud services for a portion of their data systems.
Cloud services are here to stay. No, the cloud absolutely will not take over on-premises data storage and processing. Rather, cloud offerings will be one more option for managing data and the code around it. So the question is not whether you should be in the cloud – the answer is yes (or it soon will be). The more practical question is how to best leverage cloud services as part of a hybrid strategy to minimize development time and total cost of ownership.
This post originally appeared in the Data Geek Newsletter.