Sunday, November 8, 2009

SQL Server 2008 Compression Features

SQL Server 2005 introduced the concept of variable length fields to enable space conservation by using only the storage needed for the value even if the declared storage is more than that. SQL Server 2008 takes storage space saving not only a step further, but a giant leap ahead. This is especially important since databases in today’s times are reaching sizes of Tera bytes and beyond. I will be covering some high level concepts behind the compression features introduced in SQL 2008. For more detailed reading, you can follow the links provided at the bottom of the post.

Essentially, SQL 2008 provides 2 ways to compress the data:

  • Backup Compression
  • Data Compression
    • Row Compression
    • Page Compression

Backup compression enables shrinking the size of the backup files when they are written to the file system. Since almost all important databases have a backup strategy in place, this is a feature that should be applicable in a wide range of scenarios. It can be enabled either on an individual basis using an option in the Backup command or using the system SP sp_configure. Restoring would be the same regardless of whether the backup is compressed or not except the fact that older versions of SQL Server would not be able to read the compressed backups.

Data compression is a slightly more complex topic as compared to Backup compression. Let me first explain the 2 compression techniques in Data compression.

  • Row compression uses variable length storage technique for all data types, including fixed data types. For the application, it would still seem that the fixed types are implemented as such but under the hood, SQL Server stores them using variable storage. Row compression doesn’t impose a very big overhead on the system and should generally be enabled on all tables and indexes
  • Page compression is a slightly more complicated beast. First and foremost, when you enable Page compression, Row compression is enabled automatically. Additionally, Page compression uses something called Column prefix and Page dictionary to provide a greater degree of compression.
    • In Column prefix, SQL Server stores the common values at the beginning of values in a column in the CI(Compression Information) section and the actual cells will have pointers to the CI block
    • For Page dictionary, a 'dictionary' of repeating column values on the given page is created in the CI structure, and again, pointers to the value in the dictionary are left in the original column value location

Page compression involves additional overhead in case of querying as well as inserts and updates. So it must be enabled selectively. Generally, it should be enabled only for tables and indexes with very few updates and/or high number of scans.

SQL Server also provides some DMVs and system SPs to help developers/administrators to get a better understanding of the impact and possible gains from enabling compression. Some of these are:

  • sp_estimate_data_compression_savings - Allows you to estimate the savings you will be able to perceive for a given table/index
  • sys.dm_db_persisted_sku_features - To determine whether a database is using compression
  • data_compression_desc column in the catalog view sys.partitions - To determine what is compressed, and how (row or page)
  • sys.dm_db_index_operational_stats - To determine the number of Scans (S) and Updates (U) on a table

Some additional points about SQL Compression:

  • The data compression feature is available in the Enterprise and Developer editions of SQL Server 2008 only
  • It is recommended to compress one table, index, or partition at a time in most cases because compression requires additional temporary structures to be created which impose space and processing overheads
  • It is also recommended to compress smaller objects before larger ones so that the space freed up by the smaller ones can be utilised as temporary storage when compressing the larger objects.
  • After data compression has completed, the space saved is released to the respective data file(s). However, the space is not released to the file system, because the file size doesn’t reduce automatically as part of data compression. There are ways to release the space to the file system, including using DBCC SHRINKDATABASE and DBCC SHRINKFILE

As mentioned earlier, I tried to cover most of the high level concepts that one would need to understand for using compression. There are some excellent articles on MSDN that cover various implementation details and performance benchmarks regarding compression. I am providing a couple of links below, which you can use for reference.

Data Compression: Strategy, Capacity Planning and Best Practices

Katmai (Sql 2008) - Data Compression (including Backup Compression)

Sunday, August 9, 2009

Modify SharePoint website Quick Launch

So after a series of non Technical posts, I am back at writing some technical ones, starting with this one. I was in a Solution Manager role for the last 6 months so I wasn’t getting my hands dirty with tech stuff as much. But now I am back into a Developer role so hopefully there would be many more tech posts from now on.

To start with, I would like to talk about a very basic requirement in SharePoint – adding new links to the Quick launch or modifying the existing links. For the uninitiated, Quick launch is the set of links we get on every page (as long as Quick launch is enabled for a web page; I will be talking about enabling Quick launch for new pages in another post) on the left hand side, just below the title of the site. Commonly, you would have links like Shared Documents, Tasks, Calendar etc. in here by default when you create a basic site. Now you might not want some of these links to appear because your site won’t be using a particular feature, for example, you don’t need the calendar and hence, having the Calendar link in the Quick launch would be meaningless. Or suppose you have created a new page of your own and want it to appear in the Quick launch so that users can visit this page from any page they are currently browsing. Both these requirements can be easily accomplished by editing the Quick launch section.

Here is how you can play around with the Quick launch:

  1. From any page, click on Site Actions in the top right corner and then click on Site Settings from the drop down that appears(you need to be an admin to have this option available to you)
  2. In the Site settings page, under the Look & Feel section, there would be a link either called Navigation or Quick Launch, depending on the version of SharePoint Server you are working with
  3. On clicking this link, a page would open that would show the existing Quick launch items along with options to add a heading, a new link, edit the existing links, change the order of links etc
  4. After making the changes, click OK to save the changes and you should be seeing the changes on any of the pages with Quick launch from now onwards

Sunday, July 5, 2009

Effects of Internet on developers : Positive or negative?

I have been thinking recently about what effects the explosion of information on the Internet has been having on software developers and how it has been shaping their skills. On the one hand, one can argue that it has helped them immensely in improving their knowledge and broadening their horizons. It has also helped them in getting answers to their problems quicker than ever, specially with forums like StackOverflow and having an abundance of search engines like Bing and Google to get relevant links at the click of a button.

On the flip side though, it has also in some ways made it less likely for the developers to feel the need to investigate into their problems and do thorough debugging and troubleshooting. All they need to do now is to copy paste the error in their favorite search engine and get links to web pages directly giving them the workarounds and resolutions. Even though this does improve their productivity in the short term, over a period of time it always helps for developers to troubleshoot the problems they face. This would enable them to understand the system better, get familiar with the quirks that almost any system typically has and also, the researching that goes on in resolving the problem also helps them to learn more. This blog post also talks about a similar problem being caused by search engines but for people in general, not specifically for developers.

I think that in today’s times, where developers’ time comes at a premium, there has to be a middle ground that needs to be chosen by the developers. It certainly helps and makes sense to make use of the search engines and developer forums to get answers to get unblocked and resolve problems. However, instead of doing a search the moment a problem is encountered, developers must try and see if they can find the resolution themselves and invest time in troubleshooting as much as they can afford to. Only if they cannot get a resolution and time is running out, should they look it up on the Internet.

What do you think developers should be doing? Would trying to troubleshoot it themselves help or should they just give importance to saving time and resolving problems as soon as possible?

Wednesday, January 14, 2009

Review : Windows Live Writer

Microsoft recently announced the final release of Windows Live Wave 3 at CES 2009. The Windows Live Wave contains a suite of applications like Windows Live Messenger, Windows Live Mail, Silverlight etc. It also contains a slightly lesser known application called “Windows Live Writer” which allows you to post articles to your blogs which may be hosted on any of the popular hosting platforms such as Blogger, Wordpress or Windows Live Spaces

I recently downloaded the Beta version of Windows Live Writer after reading some good things being written about it. I am really glad I installed this application because it is a much more efficient and user friendly way of writing, publishing and managing your posts as compared to doing it from Blogger. Setting up Live Writer is pretty straightforward. You just need to select the blogging platform and give the url of your blog along with the credentials which will be used to connect to the blog. Once that is done, Live Writer connects to the blog and downloads all sorts of information such as your blog template, list of tags defined by you etc.As a last (optional) step, it will also publish a dummy post so that you can verify that the posts appear as expected. Once this is done, you can start writing posts in Live Writer and publishing them.

Here are some of the features of Live Writer that I really like :

  1. It works OFFLINE!!!!!!!!!!! Since it is a Windows application, you can run it without Internet connectivity and write all your posts, saving them as drafts. Once you get the connectivity, you can just publish them.
  2. There are a number of options to insert items in your post including Images, Video, Map, Tables etc.

    InsertOptions

  3. There is an “Add a plug in” option which takes you to a website from where you can install plug ins for added functionality. One of the first plug ins I installed was one to format code snippets that you would like to put into your posts.
  4. Another useful option is one which lets you control what happen when a user selects the “Blog This” option from the browser. This setting can be accessed from Tools > Options > Blog This

  5. You can also see a list of the “Drafts” and “Recently posted articles” on the right hand sidebar and work with them by simply clicking the relevant article once.

  6. In addition, it has all the normal features you would expect such as Spell check, ability to publish posts in the future, updating previously published posts etc.

There is 1 feature though that is still missing and I am hoping the Live Writer team incorporates this soon. I am talking about the AutoCorrect feature which comes handy ever so often in editing software like MS Word. It is really tedious to capitalize the words yourself and to insert appropriate apostrophes instead of the software taking care of it.

All in all, it is a very good blog writing software and I am loving it. I will definitely be using it from now onwards, Good bye Blogger editor. Given the overwhelming good things to rave about, I would like to give kudos to the Windows Live Writer team for creating a great app.

Sunday, January 11, 2009

Beware when Copy Pasting

Often times, a developer needs to copy code from one place and paste it at another place, followed by some minor modifications at the destination. Of course, copy pasting the same code at multiple places such that no modification would be needed isn’t a good practice and essentially, the developer must refactor and move the common code into a function and insert function calls at the appropriate places.

There are many scenarios where the former operation is required. Even though it is at times possible to move slightly variant code into a separate function and use parameters for the differing functionality, this might not always suffice for a multitude of reasons. The code might be getting used only in a couple of places for example. In such cases, the best way to reuse code is to copy paste. However, from my experience, most errors (often times so stupid that you won’t suspect them to be the cause of problems) occur when one forgets to make a particular change after pasting the code.

One major reason why one tends to miss one or two modifications is that a very minor/no change is needed when doing such reuse (of course, if major changes were needed then one should prefer rewriting code instead of reusing). So people just assume that no change is needed and leave the code as is. Some of the changes that are needed frequently include,

  1. Change of the value of a enum
  2. Change of index variable
  3. Changing the index value.
  4. Needing to declare a variable which was available for use in the copied code but not present at the place the code has been pasted.

Forget these kind of changes and you would spend another hour trying to find the issue and fixing it. Hence, be extra careful whenever copy-pasting code to be able to make the most of the time saving gained from not having to write the code again and in case some error crops up after you pasted code from somewhere, the first check you should make is whether the minor changes like the ones for variable names etc have been taken care of.

UPDATE : After discussions with Dinkar, one important point came up which was that such errors are more likely in Weakly typed languages for obvious reasons.