Tuesday, October 28, 2008

Importance of sending Status Reports

In this post I would like to highlight the importance of making a habit of sending regular status reports to your manager and, if need be, to your immediate team.

I generally send out the report on Friday evening so that my manager has an idea of what happened during the week before that week ends. One can also send the report on Monday morning, but that would, theoretically at least, mean you are communicating with your manager a tad late. The status reports that I send out contain three main sections:
  1. Activities completed this week: the things you worked on and managed to complete this week (do not include items that you are still working on; those go into the next section)
  2. Action items for next week: the things you plan to work on and/or complete in the following week
  3. Blocking issues: this is the most important section of the report. Clearly communicate the issues that are restricting your progress and follow up on them to get them sorted out as soon as possible

Given these sections, one can either send a simple list of activities for each section or, to be more elaborate, add multiple columns under the first two sections. The columns that I include under the first two sections are:

  1. Activity: the activity that you completed / are going to complete in the following week
  2. Effort: the effort it took to complete an activity (if it falls under the first section) or an estimate of the effort that will be required (for activities in the second section)
  3. Main challenges/points: the main technical issues that had to be / will need to be addressed to complete the stated activity

The big question in your mind would be: why should I spend around half an hour every week sending this to my manager, who might not give it anything more than a cursory glance? Well, here are some of the reasons why that half an hour is time well invested:

  1. Sitting down to jot down the activities gives you an idea of how much work got done and how efficient you are being at what you do. You will be able to catch the early signs when you need to pull up your socks and work harder.
  2. Jotting down the action items for the coming week streamlines your work and thinking.
  3. Highlighting the blocking items gives you a better chance of getting them resolved sooner so that you can get back on track with your work.
  4. If you organise your status reports in a separate folder (like I do using Outlook rules), you can look through them to see all the work you did during any given period. This will prove priceless when you sit down at the year end to fill in your performance review.
  5. Finally, and probably most importantly, it's your chance to show off your efficiency and abilities to your manager. Thanks to this post for highlighting this point.

I agree that some of these advantages are already inherent in software development methodologies like Scrum, but there are plenty of other reasons, as can be seen from the list above, for you to use regular status reports. Feel free to add more advantages or some of the best practices you follow when it comes to status reports. Also, share your views if you think that sending status reports is a waste of time, and why.

Sunday, October 5, 2008

DataSet vs. DataReader

In this post, I will discuss the behavioural characteristics and performance differences of the DataSet and the DataReader, and indicate which scenarios each of them is suited to.

The DataSet is a "disconnected" data store. What this means is that the DataSet object need not maintain a connection with the database at all times; a connection is needed only while fetching data and pushing updates. The DataSet can be populated with data using something like this:

using System.Data;
using System.Data.SqlClient;

SqlConnection conn = new SqlConnection();
conn.ConnectionString = "Data Source=.;Database=TempDB;Integrated Security=true;";
SqlDataAdapter da = new SqlDataAdapter("select * from Temp", conn);
DataSet ds = new DataSet();
da.Fill(ds);    // Fill opens the connection if it is closed and closes it again when done
conn.Close();   // safe to call; no open connection is needed from this point on
//Process data in the DataSet ds, e.g. iterate over ds.Tables[0].Rows

As can be seen from the above snippet, once the data has been read into the DataSet, the connection can be closed immediately. The data can still be accessed from within the DataSet.

The DataReader, on the other hand, is a "connected" data store, which means that a connection to the database has to be maintained in order to access the values in the DataReader. Data can be read through a DataReader using something like this:

using System.Data;
using System.Data.SqlClient;

SqlConnection conn = new SqlConnection();
conn.ConnectionString = "Data Source=.;Database=TempDB;Integrated Security=true;";
conn.Open();
SqlCommand cmd = conn.CreateCommand();
cmd.CommandText = "select * from Temp";
cmd.CommandType = CommandType.Text;
SqlDataReader rdr = cmd.ExecuteReader();
while (rdr.Read())
{
    //Process the current row, e.g. rdr[0] for the first column
}
rdr.Close();    // release the reader before reusing the connection
conn.Close();   // the connection had to stay open for the entire loop

Notice here that the connection is closed only after iterating through the entire record set returned by the query.

As can be seen from above, a DataSet, although providing a lot of flexibility in terms of usage scenarios, is memory intensive. Since it is a disconnected data store, it keeps all the data read from the database in memory, which can really slow down the application if the DataSet is populated with millions of rows. On the other hand, the DataSet provides "random" access: any record stored in it can be accessed directly and in any order, and one can go back and forth through the records. Another important flexibility with DataSets is that the data within them can be modified and the updates can then be pushed back to the database through the associated DataAdapter. Thus, a DataSet is suitable for scenarios in which data needs to be updated or accessed randomly, but it should be used cautiously when huge amounts of data are being fetched from the database.
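To make the update path concrete, here is a minimal sketch of modifying a row in a DataSet and pushing the change back with SqlDataAdapter.Update(). It assumes the same TempDB/Temp table as the snippets above, that the table has a primary key (which SqlCommandBuilder needs in order to generate the update command), and a hypothetical column called Name.

using System.Data;
using System.Data.SqlClient;

SqlConnection conn = new SqlConnection();
conn.ConnectionString = "Data Source=.;Database=TempDB;Integrated Security=true;";
SqlDataAdapter da = new SqlDataAdapter("select * from Temp", conn);
SqlCommandBuilder builder = new SqlCommandBuilder(da);  // generates the insert/update/delete commands
DataSet ds = new DataSet();
da.Fill(ds);

DataRow row = ds.Tables[0].Rows[0];   // random access: any row, in any order
row["Name"] = "New value";            // "Name" is a hypothetical column
da.Update(ds);                        // pushes the modified rows back to the database

Note that nothing reaches the database until Update() is called; until then, the modifications live only in the in-memory DataSet.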

As for the DataReader, it is almost the opposite of the DataSet. Since it maintains an open connection to the database, it does not need to hold all the data in local memory; instead, records are read in chunks on an as-needed basis. Thus, it proves to be quite efficient in terms of memory usage. However, this efficiency comes at a cost: records within a DataReader can be traversed only forwards, and only once. If a record needs to be read a second time, the query has to be executed again, as there is no way to move backwards through the DataReader. Also, the data read through a DataReader is read-only. Thus, a DataReader lends itself to scenarios where huge amounts of data are being read without the need to update them or access them randomly.

Accessing return value of SPs

When working with databases, it is recommended that all database queries be moved to stored procedures (hereafter referred to as SPs). This makes perfect sense because having all the queries in one place makes it easier to debug when the database integration is not working, and it makes changes less cumbersome (since it is known where the changes need to be made and they need not be duplicated across multiple code files). Given this best practice, it is often a requirement to capture the return value of an SP in order to determine whether its execution succeeded or failed.

ADO.NET has the SqlCommand and SqlConnection objects to allow the user to invoke an SP and access the results it returns. Specifically, there are three methods that can be used to execute a query or an SP against the database (a minimal sketch of all three follows the list below):
  1. SqlCommand.ExecuteNonQuery(): used for queries that do not return a result set, i.e. insert, update and delete statements (it returns the number of rows affected)
  2. SqlCommand.ExecuteScalar(): used for queries that return a single value (the first column of the first row of the result set)
  3. SqlCommand.ExecuteReader(): used for queries that return multi-column and/or multi-row result sets
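Here is the promised sketch of all three calls against the same TempDB/Temp table used in the snippets above; the ID column referenced in the delete statement is hypothetical.

using System.Data;
using System.Data.SqlClient;

SqlConnection conn = new SqlConnection();
conn.ConnectionString = "Data Source=.;Database=TempDB;Integrated Security=true;";
conn.Open();

// 1. ExecuteNonQuery: no result set; returns the number of rows affected
SqlCommand del = new SqlCommand("delete from Temp where ID = 1", conn);  // ID is a hypothetical column
int rowsAffected = del.ExecuteNonQuery();

// 2. ExecuteScalar: returns the first column of the first row of the result
SqlCommand cnt = new SqlCommand("select count(*) from Temp", conn);
int total = (int)cnt.ExecuteScalar();

// 3. ExecuteReader: returns the full result set, read row by row
SqlCommand sel = new SqlCommand("select * from Temp", conn);
SqlDataReader rdr = sel.ExecuteReader();
while (rdr.Read())
{
    // process the current row
}
rdr.Close();

conn.Close();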

There is one important difference in the use of these three methods when it comes to accessing return values. When using ExecuteScalar() and ExecuteNonQuery(), the return value can be accessed immediately after the method call, whereas for ExecuteReader() this is not the case. If we try to access the value of the parameter object created for the return value straight away, it will still be null. The return value is set only after the entire result set returned by the reader has been read and the reader closed. A possible reasoning behind this behaviour is as follows: if a result set was returned successfully, the SP succeeded, so there is little point in checking the return value while the rows are still being read. It only really matters to check the return value when no results come back, in which case it lets us determine whether there genuinely was no data in the database for the given query or whether a bug caused incorrect results to be returned.
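Here is a minimal sketch of capturing an SP's return value alongside ExecuteReader(). The stored procedure name GetTempRows is hypothetical; the important part is that the return-value parameter is read only after the reader has been exhausted and closed.

using System.Data;
using System.Data.SqlClient;

SqlConnection conn = new SqlConnection();
conn.ConnectionString = "Data Source=.;Database=TempDB;Integrated Security=true;";
conn.Open();

SqlCommand cmd = new SqlCommand("GetTempRows", conn);   // hypothetical SP that returns rows plus a status code
cmd.CommandType = CommandType.StoredProcedure;

SqlParameter retVal = cmd.Parameters.Add("@ReturnValue", SqlDbType.Int);
retVal.Direction = ParameterDirection.ReturnValue;

SqlDataReader rdr = cmd.ExecuteReader();
// retVal.Value is not populated yet at this point
while (rdr.Read())
{
    // process the current row
}
rdr.Close();    // the return value becomes available only after the reader is closed

int status = (int)retVal.Value;
conn.Close();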

Do keep this slight variation in the behaviour of ExecuteReader() in mind the next time you use it.