Two brand new and very exciting security features of SQL Server 2016 are Dynamic Data Masking and Always Encrypted
Dynamic data masking limits (DDM) sensitive data exposure by masking it to non-privileged users. It can be used to greatly simplify the design and coding of security in your application. The purpose of dynamic data masking is to limit exposure of sensitive data, preventing users who should not have access to the data from viewing it. Dynamic data masking does not aim to prevent database users from connecting directly to the database and running exhaustive queries that expose pieces of the sensitive data.
In the example below, assume there are only two roles of users. The first role is Sally in charge of fraud investigation. The second role is John who is in customer service. Sally requires access to all data to complete her job while John only needs a few elements and maybe the part of the customer’s email address to verify identity. Both users query SQL Server for the customer information, but Sally is able to see the entire email address while John can only see certain parts of the email address.
What are the benefits of data masking? What are some specific reasons it should be used?
Data Masking Does |
Data Masking Does NOT |
|
Prevent database users from connecting directly to the database and running exhaustive queries that expose pieces of the sensitive data |
Data Masking Platform Options
Supported |
Platform |
Yes |
SQL Server 2016 |
Yes |
Azure SQL Database (SQL DB) |
No, but it will be supported in the future |
Azure SQL Data Warehouse (SQL DW) |
No, but it will be supported in the future |
Analytics Platform System (APS) |
SQL Server 2016 and Azure SQL DB are the only platforms currently supporting data masking. Azure SQL DW will support this feature in the near future while APS will be supported with AU6.
The key concern I had looking at data masking was whether it would support all formats that I may need. The short answer is that I don’t feel that it does. I think they try to solve the flexibility issue with the partial function, but it isn’t very flexible. I have hopes this will be expanded with future updates. Below are the data masking formats currently available.
To interact with data masking for SQL Server 2016, use the following syntaxes:
— Enable data masking on dbo.DimCustomer.EmailAddress ALTER TABLE dbo.DimCustomer ALTER COLUMN EmailAddress ADD MASKED WITH (FUNCTION = ‘partial(3,”XXXXXX”,4)’); — Grant DataMaskingDemo permission to see the data GRANT UNMASK TO DataMaskingDemo; — Revoke DataMaskingDemo permission to see the data REVOKE UNMASK TO DataMaskingDemo; — Drop the data masking ALTER TABLE dbo.DimCustomer ALTER COLUMN EmailAddress DROP MASKED;
So, this is all fine and dandy and seems like it may work alright, but let’s get to the meat and see what limitations there are.
In the example above, when converting a masked column to another string, all business value disappears and the column is completely masked.
In the example above, although the column has applied data masking, remember that the data still resides unmasked in the database. This means that predicate logic can be applied to infer masked values.
Another example would be John wanting to know Frank’s salary information. The salary column is masked, but this value can be derived using <, >, etc. Eventually John can pinpoint what Frank’s salary is.
The bottom line is that data masking is not the most secure option, so if data needs to be secure, there are other options such as Transparent Data Encryption and Row Level Security and Always Encrypted.
Always Encrypted is a data encryption technology that helps protect sensitive data at rest on the server, during movement between client and server, and while the data is in use, ensuring that sensitive data never appears as plaintext inside the database system.
After you encrypt data, only client applications or app servers that have access to the keys can access plaintext data.
The following illustration depicts the Always Encryption process.
There are 2 encryption types that can be selected, deterministic and random, each of which has different attributes and the business needs will determine which one.
Once the encryption type is chosen, keys will need to be created. These will generally be created by a security and/or compliance officer. These keys are what will be used to encrypt and decrypt the data. The only encryption standard currently available is AEAD_AES_256_CBC_HMAC_SHA_256.
Now that concepts are a little better understood, it’s time to actually implement our chosen security model.
2. Run the Always Encrypted wizard
3. Choose the table column to encrypt
4. Choose the Encryption Type
• Deterministic
• Randomized
5. Create/Select Keys
Finally, the easy part
Whew, that’s a mouthful. It’s clear that this is a new technology and a first iteration. Polybase has been around for years, Query Store is a newer version of Extended Events, etc. So while some SQL Server 2016 technologies are relatively complete and stable, Always Encrypted still has some work to do. The limitations will weigh heavily in your architectural approach.
Selecting the Right Technology……or Both?
Is Always Encrypted superior to data masking? Should I should always choose this?
The short answer is that one provides a higher level of security over the other, BUT each business use case will dictate the technology chosen, which sometimes results in them being COMPLEMENTARY!
Combining deterministic encryption AND data masking allows for the highest FUNCTIONAL level of security for data warehouses and relational databases.
If both technologies are selected collectively, be sure to encrypt all columns FIRST, then add Dynamic Data Masking, otherwise: