MrKWatkins

Improving LINQ To SQL Performance Part 2 - Combining Compiled Queries And Load Options

Welcome to the second part of a series of blog posts discussing methods I've used to improve performance when using LINQ to SQL. In this post I will use the code to change load options from the first part of this series to make working with compiled queries easier.

If you don't know what compiled queries are in LINQ to SQL then you'd probably best go and do a quick search on the internet and find out as there are lots of very good explanations and examples already out there. I've gained some massive performance improvements in my applications by using them to cache the LINQ to SQL query plans. For complex queries often the time taken by LINQ to SQL to generate the plans is much more than the execution of the queries themselves! Compiling complex queries in your applications is definitely something I'd recommend.

One downside with them however is that you have to be careful with load options. Because load options help define the shape of the query they are baked into the query when it is first compiled. Which means if you then come along and use a different set of load options you will get a nice big exception. Combine that with the fact that the load options have to be exactly the same instance, i.e. you can't create a new DataLoadOptions instance that has the same load options, you have to use exactly the same instance and they can start to be a pain. You find that you can't reuse DataContext instances as much because they have to have specific load options and you end up with lots of static fields and load option construction code scattered around your codebase.

However as discussed in the first part of this series it is possible to write extension methods that can change the load options for a DataContext. This makes it easy to ensure you have the correct load options when running your query, just do something like this:

private static readonly Func<DataContext, int, Car>
   SelectCarByIdQuery = CompiledQuery.Compile<DataContext, int, Car>(
      (dataContext, id) => 
         (from car in dataContext.Cars
          where
             car.Id == id
          select car)
         .SingleOrDefault());

private static readonly DataContext SelectCarByIdQueryLoadOptions;

static CarRepository()
{
   SelectCarByIdQueryLoadOptions = new DataLoadOptions();
   SelectCarByIdQueryLoadOptions.LoadWith<Person>(car => car.Owner);
}

// I'm missing the rest of the repository class code here, but you'd probably have
// a DataContext property that I can convieniently use in the following example method...

public Car SelectCarById(int id)
{
   using (DataContext.TemporarilyChangeLoadOptions(SelectCarByIdQueryLoadOptions))
   {
      return SelectCarByIdQuery(DataContext, id);
   }
}

The query will now always use the same load options and we won't get any annoying exceptions. However the above code is a bit long winded. Do we really want to have a static DataLoadOptions instance, code somewhere to create them and a using statement each time we want to use a compiled query? Probably not. How can we improve it? Well the first step would be to have the using statement inside the compiled query function itself. But how? Well if I were a functional programmer then I'd probably do something like this:

public static Func<TDataContext, TArg1, TResult> Compile<TDataContext, TArg1, TResult>(Expression<Func<TDataContext, TArg1, TResult>> query, DataLoadOptions loadOptions)
   where TDataContext : DataContext
{
   var compiledQuery = CompiledQuery.Compile(query);

   return (dataContext, arg1) =>
   {
      using (dataContext.TemporarilyChangeLoadOptions(loadOptions))
      {
         return compiledQuery(dataContext, arg1);
      }
   };
}

The code above constructs a function for us. First it compiles the query we pass in using the normal CompiledQuery.Compile method. It then returns a new function that calls the compiled query function with the load options we have specified. If we put this function into a class called CompiledQueryWrapper then we could change our repository code to something like this:

private static readonly Func<DataContext, int, Car>
   SelectCarByIdQuery = CompiledQueryWrapper.Compile<DataContext, int, Car>(
      (dataContext, id) => 
         (from car in dataContext.Cars
          where
             car.Id == id
          select car)
         .SingleOrDefault(), 
      SelectCarByIdQueryLoadOptions);

private static readonly DataContext SelectCarByIdQueryLoadOptions;

static CarRepository()
{
   SelectCarByIdQueryLoadOptions = new DataLoadOptions();
   SelectCarByIdQueryLoadOptions.LoadWith<Person>(car => car.Owner);
}

public Car SelectCarById(int id)
{
   return SelectCarByIdQuery(DataContext, id);
}

Better, but not perfect. We still need a static field for our LoadOptions instance and code to construct it in the static constructor. Ideally want all the code that constructs our compiled query in one place, including the code to setup the load options. We can do that by getting more functional and adding an overload for our Compile method that accepts a function to make the necessary calls to LoadWith/AssociateWith:

public static Func<TDataContext, TArg1, TResult> Compile<TDataContext, TArg1, TResult>(Expression<Func<TDataContext, TArg1, TResult>> query, Action<DataLoadOptions> loadOptionsBuilder)
   where TDataContext : DataContext
{
   var loadOptions = new DataLoadOptions();
   loadOptionsBuilder(loadOptions);
   return Compile(query, loadOptions);
}

Our repository code can now be rewritten to:

private static readonly Func<DataContext, int, Car>
   SelectCarByIdQuery = CompiledQueryWrapper.Compile<DataContext, int, Car>(
      (dataContext, id) => 
         (from car in dataContext.Cars
          where
             car.Id == id
          select car)
         .SingleOrDefault(), 
       loadOptions => loadOptions.LoadWith<Person>(car => car.Owner));

public Car SelectCarById(int id)
{
   return SelectCarByIdQuery(DataContext, id);
}

Much tidier! The load options are firmly tied to the compiled query.

The only downside to this approach is that we have to create 15 overloads of our CompiledQueryWrapper.Compile method to match the 15 overloads of CompiledQuery.Compile in .NET 4.0... (.NET 3.5 is slightly less work as it has less overloads) However given the core of the methods will be the same we can use T4 to generate the overloads for us. Find a ZIP containing just such a template and the generated code here. The code it generates is a bit 'nicer' than the example code above as it includes parameter checking and some overloads to use the empty load options from the DataContextExtensions in the first part of this series. Feel free to download and use as you please in your own projects. (Just leave the link to this article in the comments please!)

In the next part of this series I will look at some of the drawbacks with using load options and how it's easy to accidently have hundreds of queries run if you're not careful.

Warning! One thing to bear in mind with compiled queries is that they should only be used for fetching data! If you use compiled queries to get entities and then try to update or delete them you will get random errors due to bugs in LINQ to SQL (3.5 at least; haven't tried with .NET 4.0 yet) that stops object tracking working correctly with compiled queries. You have been warned!