Data compression in SQL Server 2008

Data compression – SQL Server 2008

I found the Data compression feature in SQL Server 2008 is very handy. This definitely helps in optimizing query performance in terms of I/O cost and additionally it helps in saving storage space.

There are two types of compression in SQL Server 2008

Row Level Compression

When we enable the Row level compression, the data stored in the fixed length data type will be stored in variable length format.

Before I show you the example, it is worth explaining the difference between CHAR and VARCHAR data type

“CHAR is a fixed-length data type; the storage size of the char value is equal to the maximum size for this column. Whereas VARCHAR is a variable-length data type, the storage size of the VARCHAR value is the actual length of the data entered, not the maximum size for this column.”

Okay, now let’s look at the example,

Let’s assume, we have a table with CHAR (15) column. It is not always true that you will store the data in this column which has 15 characters in length; it may vary, but still depends on your application. For example if you are storing “SQL Server 2008” and “SQL” in the CHAR(15) column each value will take 15 characters space to store the data irrespective of the actual number of characters. When the same value stored in variable length column then the storage space will depend on the length of the character stored.

You may ask yourself, why I need a feature “Row level Compression” if I change the CHAR data type to VARCHAR of my table? Yes, absolutely no sense of having such a feature but it is applicable for any fixed length data type and more importantly another advantage of this feature is, it doesn’t take space to store zero and NULL values.

Page Level Compression

Page level compression is very useful functionally which I was expecting for a long time. I was working in analytical processing application, which uses SQL Server 2000,and of course SQL Server 2000 doesn’t have such compression feature. However I have managed to reduce or compress the data size by doing some workaround. I can explain in detail in another post.

Let’s see how we can get benefit by using the Page level compression; Page level compression is nothing but removing the duplicate values within the page (i.e.) irrespective of rows and columns.

For example, when you insert value “SQL Server 2000”, “SQL Server 2008” into a table the SQL Server storage engine will store “SQL Server 200” only once in a page and it will refer this value for all other occurrences.

It is advisable to estimate the benefit before you implement the compression, you can make use of the system stored procedure sp_estimate_data_compression_savings or data compression wizard to estimate how much space you will save by implementing.