In this post, I will be discussing the behavioral characteristics and the performance differences of the DataSet and the DataReader as well as indicate the suitability of use of these objects in various scenarios.
The Dataset is a "disconnected" data store. What this means is that the DataSet object need not maintain a connection with the database at all times, a connection is needed only at the time of fetching data and updating it. The DataSet can be populated with data using something like this ,
SqlConnection conn=new SqlConnection();
conn.ConnectionString="Data Source=.;Database=TempDB;Integrated Security=true;";
SqlDataAdapter da=new SqlDataAdapter("select * from Temp",conn);
DataSet ds=new DataSet();
//Process data in the DataSet ds
As can be seen from the above snippet, once the data has been read into the DataSet, the connection can be closed immediately. The data can still be accessed from within the DataSet.
DataReader, on the other hand, is a "connected" data store which means that there needs to be a connection maintained to the database in order to be able to access the values in the DataReader. The DataReader can be populated with data using something like this ,
SqlConnection conn = new SqlConnection();
conn.ConnectionString = "Data Source=.;Database=TempDB;Integrated Security=true;";
SqlCommand cmd = conn.CreateCommand();
cmd.CommandText = "select * from Temp";
cmd.CommandType = CommandType.Text;
SqlDataReader rdr = cmd.ExecuteReader();
//Process the data read from the DB
Notice here that the connection is closed only after iterating through the entire record set returned by the query.
As can be seen from above, a DataSet, although providing a lot of flexibility in terms of usage scenarios, is memory intensive. Since it is a disconnected data store, it stores all the data read from the database in memory. As can be inferred, this can really slow down the application if the DataSet is populated with millions of rows of data. On the other hand, the DataSet provides "random" access : any record stored in it can be directly accessed and records can be accessed in any order desired. One can also go back and forth through the DataSet records. Another important flexibility with DataSets is that data within it can be modified and the updates will be percolated to the database automatically. Thus, a DataSet is suitable for scenarios in which data update is needed or where random access is needed but should be used cautiously when huge data is being fetched from the database.
As for the DataReader, it is almost an opposite of the DataSet. Since it maintains an open connection to the database at all times, it needn't store data in local memory. Instead records are read in chunks on a need basis. Thus, it proves to be pretty efficient in terms of memory usage. However, this efficiency comes at a cost : records within a DataReader can be traversed only forward and that too, only once. If a record needs to be read a second time, the query needs to be executed again as there is no provision to move backwards through the DataReader. Also the data read through the DataReader is read only. Thus, a DataReader lends itself to scenarios where huge amounts of data are being read without having the need to update them or have random access over that data.