Welcome to the second part of a series of blog posts discussing methods I've used to improve performance when using LINQ to SQL. In this post I will use the code to change load options from the first part of this series to make working with compiled queries easier.
If you don't know what compiled queries are in LINQ to SQL then you'd probably best go and do a quick search on the internet and find out as there are lots of very good explanations and examples already out there. I've gained some massive performance improvements in my applications by using them to cache the LINQ to SQL query plans. For complex queries often the time taken by LINQ to SQL to generate the plans is much more than the execution of the queries themselves! Compiling complex queries in your applications is definitely something I'd recommend.
One downside, however, is that you have to be careful with load options. Because load options help define the shape of the query, they are baked into the query when it is first compiled. That means if you then come along and run it with a different set of load options you will get a nice big exception. Worse, the load options have to be exactly the same instance: creating a new DataLoadOptions instance with identical settings is not enough. This quickly becomes a pain. You find that you can't reuse DataContext instances as much because they have to have specific load options, and you end up with lots of static fields and load option construction code scattered around your codebase.
However, as discussed in the first part of this series, it is possible to write extension methods that can change the load options for a DataContext. This makes it easy to ensure you have the correct load options when running your query. Just do something like this:
private static readonly Func<DataContext, int, Car>
    SelectCarByIdQuery = CompiledQuery.Compile<DataContext, int, Car>(
        // dataContext.Cars assumes a generated context type that exposes a Cars table;
        // with the base System.Data.Linq.DataContext you would call dataContext.GetTable<Car>() instead.
        (dataContext, id) =>
            (from car in dataContext.Cars
             where car.Id == id
             select car)
            .SingleOrDefault());

private static readonly DataLoadOptions SelectCarByIdQueryLoadOptions;

static CarRepository()
{
    SelectCarByIdQueryLoadOptions = new DataLoadOptions();
    // LoadWith's type argument is the entity being queried (Car), not the related type.
    SelectCarByIdQueryLoadOptions.LoadWith<Car>(car => car.Owner);
}

// I'm missing the rest of the repository class code here, but you'd probably have
// a DataContext property that I can conveniently use in the following example method...
public Car SelectCarById(int id)
{
    using (DataContext.TemporarilyChangeLoadOptions(SelectCarByIdQueryLoadOptions))
    {
        return SelectCarByIdQuery(DataContext, id);
    }
}
The query will now always use the same load options and we won't get any annoying exceptions. However, the above code is a bit long-winded. Do we really want a static DataLoadOptions instance, code somewhere to create it, and a using statement each time we want to use a compiled query? Probably not. How can we improve it? Well, the first step would be to move the using statement inside the compiled query function itself. But how? Well, if I were a functional programmer then I'd probably do something like this:
public static Func<TDataContext, TArg1, TResult> Compile<TDataContext, TArg1, TResult>(
    Expression<Func<TDataContext, TArg1, TResult>> query,
    DataLoadOptions loadOptions)
    where TDataContext : DataContext
{
    var compiledQuery = CompiledQuery.Compile(query);
    return (dataContext, arg1) =>
    {
        using (dataContext.TemporarilyChangeLoadOptions(loadOptions))
        {
            return compiledQuery(dataContext, arg1);
        }
    };
}
The code above constructs a function for us. First it compiles the query we pass in using the normal CompiledQuery.Compile method. It then returns a new function that calls the compiled query function with the load options we have specified. If we put this function into a class called CompiledQueryWrapper then we could change our repository code to something like this:
private static readonly DataLoadOptions SelectCarByIdQueryLoadOptions;
private static readonly Func<DataContext, int, Car> SelectCarByIdQuery;

static CarRepository()
{
    // Build the load options first. Static field initializers run before the static
    // constructor body, so compiling the query in a field initializer would capture
    // a null load options instance.
    SelectCarByIdQueryLoadOptions = new DataLoadOptions();
    SelectCarByIdQueryLoadOptions.LoadWith<Car>(car => car.Owner);

    SelectCarByIdQuery = CompiledQueryWrapper.Compile<DataContext, int, Car>(
        (dataContext, id) =>
            (from car in dataContext.Cars
             where car.Id == id
             select car)
            .SingleOrDefault(),
        SelectCarByIdQueryLoadOptions);
}

public Car SelectCarById(int id)
{
    return SelectCarByIdQuery(DataContext, id);
}
Better, but not perfect. We still need a static field for our LoadOptions instance and code to construct it in the static constructor. Ideally we want all the code that constructs our compiled query in one place, including the code to set up the load options. We can do that by getting more functional and adding an overload of our Compile method that accepts a function to make the necessary calls to LoadWith/AssociateWith:
public static Func<TDataContext, TArg1, TResult> Compile<TDataContext, TArg1, TResult>(
    Expression<Func<TDataContext, TArg1, TResult>> query,
    Action<DataLoadOptions> loadOptionsBuilder)
    where TDataContext : DataContext
{
    var loadOptions = new DataLoadOptions();
    loadOptionsBuilder(loadOptions);
    return Compile(query, loadOptions);
}
Our repository code can now be rewritten to:
private static readonly Func<DataContext, int, Car>
    SelectCarByIdQuery = CompiledQueryWrapper.Compile<DataContext, int, Car>(
        (dataContext, id) =>
            (from car in dataContext.Cars
             where car.Id == id
             select car)
            .SingleOrDefault(),
        // LoadWith's type argument is the entity being queried (Car).
        loadOptions => loadOptions.LoadWith<Car>(car => car.Owner));

public Car SelectCarById(int id)
{
    return SelectCarByIdQuery(DataContext, id);
}
Much tidier! The load options are firmly tied to the compiled query.
The only downside to this approach is that we have to create 15 overloads of our CompiledQueryWrapper.Compile method to match the 15 overloads of CompiledQuery.Compile in .NET 4.0 (.NET 3.5 is slightly less work as it has fewer overloads). However, given that the core of each method is the same, we can use T4 to generate the overloads for us. Find a ZIP containing just such a template and the generated code here. The code it generates is a bit 'nicer' than the example code above as it includes parameter checking and some overloads that use the empty load options from the DataContextExtensions in the first part of this series. Feel free to download and use as you please in your own projects. (Just leave the link to this article in the comments please!)
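To give a feel for what each generated overload might look like, here is a hand-written sketch of the two-argument version with basic parameter checking. It is not the code from the ZIP: the class name CompiledQueryWrapperSketch is made up, and to keep the snippet self-contained it swaps the load options by writing the private loadOptions field through plain reflection rather than calling the extension methods from part one (which would be the faster and tidier choice in real code).

```csharp
using System;
using System.Data.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical sketch only; not the T4-generated code from the download.
public static class CompiledQueryWrapperSketch
{
    // Assumption: DataContext keeps its options in a private field named "loadOptions".
    private static readonly FieldInfo LoadOptionsField =
        typeof(DataContext).GetField("loadOptions", BindingFlags.Instance | BindingFlags.NonPublic);

    public static Func<TDataContext, TArg1, TArg2, TResult> Compile<TDataContext, TArg1, TArg2, TResult>(
        Expression<Func<TDataContext, TArg1, TArg2, TResult>> query,
        DataLoadOptions loadOptions)
        where TDataContext : DataContext
    {
        if (query == null) throw new ArgumentNullException("query");
        if (loadOptions == null) throw new ArgumentNullException("loadOptions");

        var compiledQuery = CompiledQuery.Compile(query);
        return (dataContext, arg1, arg2) =>
        {
            // Remember the current options, swap in the ones tied to this query,
            // and always restore the originals afterwards.
            var previous = dataContext.LoadOptions;
            LoadOptionsField.SetValue(dataContext, loadOptions);
            try
            {
                return compiledQuery(dataContext, arg1, arg2);
            }
            finally
            {
                LoadOptionsField.SetValue(dataContext, previous);
            }
        };
    }
}
```

Every other arity follows the same shape with more TArg type parameters, which is why a template is a good fit for generating them.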
In the next part of this series I will look at some of the drawbacks of using load options and how easy it is to accidentally have hundreds of queries run if you're not careful.
Warning! One thing to bear in mind with compiled queries is that they should only be used for fetching data! If you use compiled queries to get entities and then try to update or delete them you will get random errors due to bugs in LINQ to SQL (3.5 at least; I haven't tried with .NET 4.0 yet) that stop object tracking from working correctly with compiled queries. You have been warned!
Welcome to the first part of a new series of blog posts where I discuss methods I've used to improve performance when using LINQ to SQL. For the first part I will be discussing changing load options.
If you've used LINQ to SQL then you'll have come across load options. Load options let you specify that you want certain child data loaded along with the parent entities when querying a DataContext. For example, if you had a Car entity with an Owner relationship that points to a Person entity then you could do the following to load the Person at the same time as the Car:
var loadOptions = new DataLoadOptions();
loadOptions.LoadWith<Car>(car => car.Owner);
myDataContext.LoadOptions = loadOptions;
You can do lots more with them such as only load certain sets of data, etc. I won't bore you with details you probably already know and can find out elsewhere on the interweb if not.
Now one of the problems with load options is that you can't change them for a DataContext instance once you've performed a query with that DataContext. The reason you're not allowed to change the options is to maintain a consistent view of the data, which is fair enough but can be a bit of a pain at times. Often when you come to improve the performance of your LINQ to SQL code you'll find one or two cases where you should use slightly different load options. I'm going to neatly sidestep the ethical debate of whether you should be changing the load options for a DataContext or not and just explain how to do it.
A quick peek under the hood of the DataContext class with Reflector shows that you just need to change the value of the private loadOptions field on your DataContext to change the load options. Simple enough! On top of that you should really call the private Freeze method on the LoadOptions to stop them being changed after being attached to the DataContext. I guess you could miss this call out if you wanted to be able to change the LoadOptions instance, however I think it's probably easier (and safer!) to just set the value again with some new load options. Especially if using compiled queries which are very picky about their load options. (See the next part of this series for more about compiled queries)
How do we change the value of the loadOptions field? Well, we could use reflection, but it's a little slow. A much better way is to use the new Expression.Assign method in .NET 4, which allows us to create a compiled assignment expression to set the private field. Apart from the initial hit of the compilation, using assignment expressions is much, much faster than using reflection. The code to create such a function looks something like this:
// Define expressions for the two parameters.
var dataContext = Expression.Parameter(typeof(DataContext), "dataContext");
var loadOptions = Expression.Parameter(typeof(DataLoadOptions), "loadOptions");
// Define an expression to access the private field.
var field = Expression.Field(dataContext, "loadOptions");
// Define an expression to assign a value to the field. (This is the .NET 4 only bit)
var assign = Expression.Assign(field, loadOptions);
// Build a lambda for the assignment.
var lambda = Expression.Lambda<Action<DataContext, DataLoadOptions>>(assign, dataContext, loadOptions);
// Compile it.
Action<DataContext, DataLoadOptions> changeLoadOptions = lambda.Compile();
If you don't have the luxury of being able to use .NET 4 yet then you can always fall back to using reflection instead. A compiled expression can also be used to call the private Freeze method on the LoadOptions; that can be done in both .NET 3.5 and .NET 4.
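For illustration, here is a minimal sketch of the two pieces just described: a reflection-based setter that works on .NET 3.5, and a compiled expression that calls Freeze. It is not the code from the download. The class name LoadOptionsAccessors and its members are made up for this example, and it assumes the non-public members are still named loadOptions and Freeze, as seen in Reflector.

```csharp
using System;
using System.Data.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper; the names here are illustrative only.
public static class LoadOptionsAccessors
{
    // Reflection fallback: look the private field up once and reuse the FieldInfo.
    private static readonly FieldInfo LoadOptionsField =
        typeof(DataContext).GetField("loadOptions", BindingFlags.Instance | BindingFlags.NonPublic);

    // Compiled call to the non-public Freeze method. Expression.Call is available
    // in .NET 3.5, so this part does not need .NET 4.
    public static readonly Action<DataLoadOptions> Freeze = BuildFreeze();

    public static void SetLoadOptions(DataContext dataContext, DataLoadOptions loadOptions)
    {
        // Slower than a compiled assignment expression, but works without Expression.Assign.
        LoadOptionsField.SetValue(dataContext, loadOptions);
    }

    private static Action<DataLoadOptions> BuildFreeze()
    {
        var loadOptions = Expression.Parameter(typeof(DataLoadOptions), "loadOptions");
        MethodInfo freeze = typeof(DataLoadOptions).GetMethod(
            "Freeze", BindingFlags.Instance | BindingFlags.NonPublic);
        var call = Expression.Call(loadOptions, freeze);
        return Expression.Lambda<Action<DataLoadOptions>>(call, loadOptions).Compile();
    }
}
```

A caller would typically freeze the new options first and then assign them, e.g. LoadOptionsAccessors.Freeze(options) followed by LoadOptionsAccessors.SetLoadOptions(myDataContext, options).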
Once we have these basic building blocks we can create a set of extension methods that allow us to change load options. Which is exactly what I've done for you here. Feel free to download and use as you please in your own projects. (Just leave the link to this article in the comments please!)
There are extension methods to change or remove (i.e. replace with an empty set) the load options for a data context:
// Change the load options.
var loadOptions = new DataLoadOptions();
loadOptions.LoadWith<Car>(car => car.Owner);
myDataContext.ChangeLoadOptions(loadOptions);
// Remove the load options.
myDataContext.RemoveLoadOptions();
There are also methods to temporarily change or remove the load options:
// Remove the load options for the duration of the using block.
using (myDataContext.TemporarilyRemoveLoadOptions())
{
    // Queries performed here will not use any load options.
}
// Load options will have been restored at this point.
I recommend you use these temporary methods to change your load options as then you can be quite clear in your code that you're only changing the options for a small period of time, e.g. one query that you need to optimise. Changing the load options willy nilly can easily lead to you forgetting to restore load options and having the wrong load options some of the time, i.e. everything the LINQ to SQL designers stopped happening by not allowing you to change the load options.
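If you're curious how a temporary change like this can work, the usual trick is to return an IDisposable that puts the old options back when the using block ends. The following is only a rough sketch of that idea, not the downloadable implementation: the class TemporaryLoadOptionsSketch and the nested Restorer type are made up, the field is set with plain reflection to keep the example short (the compiled assignment shown earlier would be faster), and it skips the Freeze call.

```csharp
using System;
using System.Data.Linq;
using System.Reflection;

// Hypothetical sketch of the restore-on-dispose pattern.
public static class TemporaryLoadOptionsSketch
{
    // Assumption: DataContext keeps its options in a private field named "loadOptions".
    private static readonly FieldInfo LoadOptionsField =
        typeof(DataContext).GetField("loadOptions", BindingFlags.Instance | BindingFlags.NonPublic);

    public static IDisposable TemporarilyChangeLoadOptions(
        this DataContext dataContext, DataLoadOptions loadOptions)
    {
        if (dataContext == null) throw new ArgumentNullException("dataContext");

        // Capture the current options so they can be restored later.
        var previous = dataContext.LoadOptions;
        LoadOptionsField.SetValue(dataContext, loadOptions);
        return new Restorer(dataContext, previous);
    }

    private sealed class Restorer : IDisposable
    {
        private readonly DataContext dataContext;
        private readonly DataLoadOptions previous;
        private bool disposed;

        public Restorer(DataContext dataContext, DataLoadOptions previous)
        {
            this.dataContext = dataContext;
            this.previous = previous;
        }

        public void Dispose()
        {
            // Restore only once, even if Dispose is called again.
            if (disposed) return;
            LoadOptionsField.SetValue(dataContext, previous);
            disposed = true;
        }
    }
}
```

Because the restore happens in Dispose, the original options come back even if the query inside the using block throws.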
In the next part of this series I'll look at compiled queries and how you can use the extension methods above to make them much less of a pain to use.