We’ve moved quite a bit as a family within the last 15 years – from moving between cities, different countries to different continents. Each move was different.
The first one was pretty easy since we didn’t have a lot of stuff. All we needed was a medium-sized pickup truck. The next move was to a different country. At this point, we’ve accumulated several personal belongings. But because the move was no longer a driving distance away, we had to be very intentional about what we want to bring with us. Two adults, two toddlers and four large suitcases were all that made it to the new country. That meant discarding a lot of unnecessary baggage that we won’t be needing in our new home.
The last major move was to a different continent. It required a large crate, six large suitcases and all four of us. The suitcases went with us while the crate had to be shipped separately and would arrive 2 months after. We still had to discard a lot of unnecessary baggage. The amount of personal belongings that we had were just a by-product of a growing family.
Asking The Right Question
Your databases are also moving on a regular basis – be it to a newer version of the database engine, a larger machine, a larger storage capacity, a hypervisor or the public cloud. I bet you’ve done a fair amount of moving databases around in your career.
A standard practice for moving databases has always been to move everything. That’s why a backup-copy-restore process is pretty common for database upgrade and migration. Whether you’re using log shipping, database mirroring or Availability Groups in SQL Server to move databases, you still need to do the standard backup-copy-restore process.
A common question we ask when moving databases is this: “where are we moving?” Because it’s always been the question that everyone ask.
I think we need to change the question to this: “do we really need everything for the move?” Because asking this question forces us to think about the following:
- Data Lifecycle Management. Far more often, you’ve only considered maintaining databases. You worry about backups, think about optimizing queries and protecting from unauthorized access (recall treating your databases to a S.P.A.) You are too focused on the present that you fail to consider the past and the future. Business intelligence has started to consider the future with analytics. But what about the past? How do you archive old data? How do you deal with old data governed by compliance requirements?
- Operational Efficiency. I’m sure you’ll agree that care and feeding of current data is more important than old data. Some folks categorize this as hot versus cold data. In fact recovery objectives for hot data are different from cold data. In case something goes wrong, hot data should be recovered as fast as you possibly can. Not so much if the data was from five years ago. But when you move all of your data, you will be forced to treat them the same. Hot and cold data is on the same storage subsystem and same machine. Same recovery objectives and service level agreements. To a point where you are sacrificing your SLAs because of cold data.
- Operational Cost. A server with 24-CPU cores, 256 GB of memory and solid state drives. Add SQL Server licensing on top of that. That ain’t cheap. All because you have to manipulate records in a very large table stored in a single filegroup. You’re running index maintenance jobs for a table where majority of the records were from five years ago. It’s like paying someone to clean your storage facility when you haven’t even touched your personal stuff for years. And I haven’t even covered storage cost for backup, for data replicated in an Availability Group configuration, for buffer pool memory, etc.
What Does This Have to Do With the Cloud?
With on-premises deployments, we barely look at the cost of doing something. That’s because the SAN has already been purchased and the hardware lease extended. It’s the bean counters that get to see how much it costs to deploy something. And it’s not even specific to your database. The cost gets buried under the sum total for all IT costs – the SAN is shared and the physical servers run virtual machines from all business units.
Not so with the cloud. You get a line item for everything that you use. Need fast machines and storage for your database? Larger VM sizes and premium storage cost more. Want additional storage for your backups? You need to pay for additional capacity. Most cloud providers won’t charge you for ingress – getting your data into their facilities. But they will charge for egress – getting your data outside of their facilities.
Asking the question “do we really need everything for the move?” does change the tone of the conversation. As you are designing your hybrid SQL Server high availability and disaster recovery (HA/DR) solution, start thinking about whether or not you want to bring all of your data to the cloud. Because I can guarantee you, this won’t be the last time you’ll move databases around.
Please note: I reserve the right to delete comments that are offensive or off-topic.