Reducing data footprints on test systems


Picture the scene:

You are a DBA. You have to make everyone happy, including DEVs (yes, including them too). Part of that is to ensure that your DEVs have a fresh copy of your production data so that they can test their programming with current data and structures. The issue with that is your production database is multiple hundreds of GB in size and your test system doesn’t have the capacity to store a full copy of your database. Your IT Budget will also not stretch to purchasing more storage! What can you do?

I had a similar situation recently and took a look at the possibilties and here is what I came up with:

1. Try out tools like SQL Virtual Restore to allow you to “mount” a backup from a remote server avoiding the space issue all together. This is, of course, a solution that requires a license (no budget maybe?)

2. Use the great features offered to you by SQL Server Developer Edition: data compression to the rescue!

So of course, I chose door number 2.

As you may know, Developer Edition offers the same features as Enterprise Edition with the caveat that the instance cannot be used for production purposes in any way. This means that the seriously powerful and useful data compression becomes available for your test servers! This counts even if you only use Standard Edition in production – joy of joys! The good thing being that Developer Edition only costs around $50 (or less if you have an MSDN subscription) – even more joys!

If you have never seen/used compression take a quick look over on MSDN to see what it all is (Data Compression). Basically, you can compress data in indexes and tables at the storage level avoiding any query rewrites and still profiting from the storage savings. This can be a major advantage in terms of raw storage needs, but can also benefit you in high read environments with low RAM sizes. The data remains compressed when held in the buffer pool and is only de-compressed when being accessed. This means that you can keep more data in the buffer pool and reduce hard disk I/O (obviously not for free, compressing and de-compressing costs CPU cycles).  This may be acceptable on a test system with extremely constrained storage space.

The usage in my example scenario is now an even better proposition, not only can I reduce my storage footprint, I can also potentially increase test system performance on the I/O side of the equation (who doesn’t have I/O issues, especially on a test box!).

The next hurdle is of course identifying which indexes and tables that are in the database you want to squeeze down. This is possible via SSMS’ object explorer, but only if you want to spend an eternity doing so! The best way is to take a look at the meta-data tables/views/functions to interrogate the system objects. These are really interesting on their own (at least they are to me), but after writing my Index Script Creator, I realised the potential for using the information in these system objects to allow me to automate certain tasks like this one.

Similarly to the Index Script Creator, my Index Compressor Script (uploaded as a doc  but it is a .sql file really – stupid wordpress restrictions!) runs off and investigates all tables and indexes (down to the partition level) and then generates an index rebuild command to compress any currently non-compressed partitions of indexes / tables. The resulting commands can then be run and space can be reclaimed – just be aware that rebuilding indexes does cause log activity.

After compressing your tables and indexes you should have a much smaller amount of data (depending on how well your data compresses) and be able to reclaim some disk space.

I realise that this method will mean that you have to have a large enough disk to have the database in its uncompressed state to begin with, but you can end up with a cost effective solution to a budget-constrained test environment.

Happy zipping!

Advertisements

Good News / Bad News


Given the choice, most people want to hear the bad news before the good:

The Bad News

I recently found out that the owner of Atlantis Interactive has decided to throw in the towel with regards to his absolutely amazing SQL Server Tools and IDE.

Matt Whitfield has produced (on his own) a set of tools that puts a lot of full blown companies to shame! The IDE SQL Everywhere is a complete replacement for SSMS, but boasts a number of improvements/extra features that seem blindingly obvious when you see them in SQL Everywhere. My favourites are “extract as CTE” where you mark some code and allow SQL Everywhere to create a CTE for you, and the smart renaming of variables/aliases etc.

When I heard that he had made this decision, I could understand it, but still find it a shame. He was obviously in a hard market to start with, with some very firm competition. I am still surprised that the tools haven’t made any noticeable headway – there are loads of people looking for good tools, even in the niche of T-SQL development.

The Good News

Matt has not decided to throw away all that he created. He has unlocked all the tools and set them free! His blood, sweat and tears (well, not so much blood – I hope!) have not been wasted.

If you have had enough of the limitations of SSMS intellisense, or general SSMS strangeness, take a good look at SQL Everywhere. Whilst you are there, have a look at the other great tools he created, you will be surprised at what you find.

My hat goes off to him and I hope that this setback doesn’t stop him from giving it another go in the future.

@Matt – if you read this, I’m looking forward to meeting you at SQLBits. I would love to hear about any other ideas you have on the back burner 🙂