Automatically updating Reporting Services caches: Getting best of both worlds

Reporting services allows you to create advanced reports, based on data which comes from a diversity of data sources. Reports are usually build based on data which accesses many records in the underlying database: often you visualise in the report information that spans multiple days, products, departments,… and you aggregate all the data (sum, average, minimum,…) to get an easy to interpret report. As you know, business data is more about finding the haystack than it is about finding the needle. Because we are addressing a large amount of data when building a report, it is common to use a caching technique to make sure users are served cached reports, in that way avoiding unnecessary access to the database.

Reporting services offers many types of caching. By default, it uses session caching: when the same user over the same IIS session requests the same report with the same query parameters within 10 minutes of the first request, a cached instance is served. However, you can’t control this cache-timeout, nor can a user B use the cached report of user A. That’s why another type of caching is available as well: cached instances. With cached instances, a report is cached on the report server and is made available to all users. Administrators can specify when the cache should expire: either the cache has a specified life expectance (e.g. the cache can become at most 3 hours old), or there is a fixed schedule that expires the caches (e.g. at noon and at 3 p.m. the cache is being flushed). Especially this latter option is interesting in scenarios where reports are created on top of a (relational or OLAP) data warehouse, which in itself is updated periodically from the live OLTP (on-line transactional processing) database server. Because we know when the data warehouse is updated, we know that we need to expire our caches after the update is finished. For example, when our data warehouse is updated weekly at Saturday night 1 a.m., and this update takes at most 3 hours, we know that we can expire all report caches at 4 a.m. Saturday morning.

However, in some scenarios, we don’t know when the data in the relational database will change. For instance, we have a report which describes all our products. New products are not added very frequently, but they are added at random moments in time. This causes a dilemma: either our product report queries the database for every report request (which guarantees an up-to-date report, but causes a high load on the database server), or we use caching (which improves the report creation time, reduces the load on the relational database, but renders outdated reports). To get the best of both worlds, we need a system which gets triggered when the data in the database changes, and which flushes the outdated cache at the report server, so that the next request for this report creates a new cached report. At the database, we do have triggers, which allow for this type of response. However, triggers slow down insert, update and delete statements, and because of that are a potential problem. Another problem is that a trigger fires for each insert, update or delete statement against a table, also for records which are not part of the report. Despite these disadvantages, on non-SQL Server databases, they are the only option we have. But on SQL Server 2005, combined with ADO.NET 2.0 there is a second option: SqlDependencies.

A SqlDependency is a more fine-grained and more efficient triggering mechanism than regular triggers. First, it only fires when specific fields within a table are changed, not just on any insert or update against the table. Second, it only fires once; further changes to the same table don’t cause extra invocations (unless we resubscribe our SqlDependency). Finally, the SQL server doesn’t contact our ADO.NET application within the transaction that inserts, updates or deletes the data; it runs in an asynchronous thread (using the SQL Server 2005 Service Broker), in that way preventing unnecessary blocking.

In the remainder of this article, we demonstrate how to set-up a simple cached report, and create an ADO.NET application (nicknamed ‘WatchDog’), which will flush outdated caches. The report simply lists all courses at a training company, and SQL Server 2005 Management Studio will be used to update this data. The required software for implementing this example is SQL Server 2005, SQL Server 2005 Reporting Services (although this technique will also work for the 2000 edition of Reporting Services), and .NET 2.0.

Creating the sample database and report

Start by creating a blank database for storing the data about the courses, and create a table for storing the data:

CREATE DATABASE Training
CREATE TABLE [dbo].[Course](
[CourseID] [int] NOT NULL,
[CourseCode] [char](10) NULL,
[CourseName] [varchar](50) NULL)

Voeg enkele records in (bijvoorbeeld via de Management Studio, of vanuit Visual Studio).

 

Next, we create a basic report which uses this table as dataset:

We deploy this report to the Report Server. After it has been deployed, we go to the ReportManager (or the Management Studio, connected to the report server), and we change the Execution properties to activate caching (don’t forget to click Apply!):

With this setting, our reports are cached for 5 hours. Check this by rendering the report. After the report has been rendered, add a new entry to the Course table in our database. Refresh the report and verify that the report does not contain the newly entered data.

Creating the Watchdog application

To solve the ‘outdated cache’ problem, we implement a .NET application which flushes the cache as soon as changes in the database are detected. In our example, we implement this as a winform application, though in real life, it seems to make more sence to implement this as a Windows Service. The first step is creating a regular connection to a database and a command which uses this connection:

SqlConnection con = new SqlConnection(connectionBox.Text);
SqlCommand cmd = con.CreateCommand();
cmd.CommandText = queryBox.Text;

Note that in this sample application, we assume that the administrator copies the query used to create the dataset in our report into a textbox (named queryBox). In a more realistic scenario, one could fetch this query from the Reporting Services Web Service or rdl file.

Now we implement the ‘special SQL Server 2005 trick’: we associate a SQL dependency object with this command. This class is defined in the System.Data.SqlClient namespace.

SqlDependency dep = new SqlDependency(cmd,null,18000);
dep.OnChange += new OnChangeEventHandler(dep_OnChange);
con.Open();
SqlDependency.Start(con.ConnectionString);
cmd.ExecuteNonQuery();
con.Close();

In the constructor of the SqlDependency object, we associate it with the SqlCommand object. The third parameter specifies that after 18000 seconds, this SqlDependency will stop to exist. Next, we add an event handler to the dependency object. After that, we open our connection, execute the query and close the connection again. There are two special statements involved. First, we have to ‘start’ our SqlDependency, because a connection needs to be set up between the SqlServer and our application. Second, we do not call the ExecuteReader method but the ExecuteNonQuery. The reason for this is that we are not interested in the data, so we don’t want to waste and bandwidth on it; we only execute this query to get the SqlDependency in place.

Handling changes

Finally, we need to write the event handler which is going to respond to the OnChange event in our dependency object. In our case, we just need to call the FlushCache method defined in the web service of Reporting Services. In order to do this, we first have to add a Web Reference to http://<MachineName>/ReportServer /ReportService2005.asmx, which in my code sample I put in the rs namespace. The code for my event handler then becomes:

void dep_OnChange(object sender, SqlNotificationEventArgs e)
{
rs.ReportingService2005 myService = new
WatchDog.rs.ReportingService2005();
myService.Credentials = System.Net.CredentialCache.DefaultCredentials;
myService.FlushCache(reportLocation);
}

So we only need to create an instance of our web service proxy class (WatchDog is the name of my Winform project, and hence the namespace in which the rs namespace is embedded by default), but before we can call the FlushCache method, we have to set the credentials (recall that Reporting Services uses Windows authentication only). In our case, the code will contact reporting services using the credentials of the user who starts up this windows application, so make sure this user has enough privileges on the report server to flush the cache. Finally, we flush the cache of the report of which the full name (path + report name) has been stored in the reportLocation private string variable.

That is it. Fire up the application, refesh your report to verify that we are still served our old, cached version. Next, make again a change on our Course database table, and refresh the report again. You will see that the new entry is in the report.

Remarks

This article shows the basic principle for getting best of both worlds using Reporting Services and SQL Server 2005. However, to run this app in a real world scenario, there are a few things to take care of:

  1. A SqlDependency will only fire once. As soon as a change has been reported, the system will not fire again on the next change. In most cases this is beneficial, because updates occur in bursts, and we don’t want the cache to be flushed for every change to the database. However, we should make sure that after all changes to the database have been applied, a new SqlDependency subscribes. On approach for implementing all this is to use some sort of polling approach: as soon as the SqlDependency fires, a polling mechanism checks e.g. every minute if changes still happen to the database. If so, we wait a little longer, if not, we subscribe a new SqlDependency.
  2. Another limitation is that, since the SqlDependency object uses the same triggering technique as used by indexed views, it also suffers from the same limitations that apply to this technique. Check Books OnLine for exact details, but the most prominent limitations are that the query must target tables (no views, functions or stored procedures), that two part names should be used for the table (explicit schema name, such as dbo.Course, should be used), and column names must be mentioned explicit (no SELECT * allowed).
  3. As we already mentioned throughout the text, it is better to implement this as a windows service instead of a winform application.

Conclusions

SqlDependecies are useful objects in scenarios where we have cached data. In this article we showed how this technology can help to have on-line reports, even when caching is used to speed up the report server and free resources on the relational database.

Contact me Contact


Contact me Receive U2U Newsletter.
Looking for a challenging job Download Brochure On Site Training Looking for a challenging job
Favorites Favorites

Copyright © 1999-2010 by U2U