Since the introduction of the new security features in SQL Server 2016, one of the most common question I get asked is around the performance impacts of these new features. Whenever someone asks a question about security in relation to performance, I tell them how someone would secure their house or any valuable items:
A house is like any other container that contains valuable items. You can choose to just secure your house or secure the valuables – jewellery, cash, documents, etc. – as well. When you secure the valuables, you add another layer of security that restricts and slows down your access. This explanation is more than enough to convince any technology professional of the impacts of security on performance, be it about SQL Server or something else.
But there are those who require proof. Hence, this blog post. I wanted to show how data is stored on disk when Dynamic Data Masking and Always Encrypted are enabled on the database. For this example, you can use the AdventureWorks2016CTP3 database available from the Microsoft Download Center. The database has already been configured to use the new SQL Server 2016 features.
How Data Is Presented To The User
This example shows Dynamic Data masking in action and what users see if they do not have privileges to see the masked data. I highlighted the two columns that are configured with a masking function.
Notice that the SSN and CreditCardNumber column values are encrypted. These columns use the Always Encrypted feature in SQL Server 2016. Now, going back to our “house” analogy, imagine your database being your house and your data as your valuables. We already have several layers of “locks” in our “house” in terms of logins and permissions. But how do those “valuables” get stored inside the house?
How Data Is Stored On Disk
Let’s use the undocumented DBCC PAGE (unless you consider this Microsoft KB article official) to look at the underlying storage mechanism for our “valuables.” Note that the [Sales].[CustomerPII] table in the AdventureWorks2016CTP3 database does not have a clustered index nor any constraints so you may need to make some adjustments in the schema to easily navigate to the data page that contains the record for CustomerID value 11732. What I did was I modified the schema and defined the CustomerID column as a clustered index to get the page ID value.
Looking at the data, the masked columns appear as they are on disk. This validates Ronit Reger’s statement on his blog post Use Dynamic Data Masking to obfuscate your sensitive data.
* Note that Dynamic Data Masking is not a replacement for access control mechanisms, and is not a method for physical data encryption.
In contrast, the encrypted columns are encrypted on disk and the data types are different on disk compared to how they were defined in the table schema – SSN is defined with nvarchar(11) while CreditCardNumber is defined with nvarchar(25). This means that those “valuables” are even more secured on disk, requiring additional layers of security just to get access to them.
The way the data is stored on disk directly impacts performance, regardless of whether the columns are masked or a GUID value is used as a clustered index. And while these features are primarily security-focused, there is no denying the fact that any layer of security added to a system will impact its overall performance. Now, whether the impacts are negligible or not is a whole different story. And this is where testing plays a very important role.
Please note: I reserve the right to delete comments that are offensive or off-topic.