Part 1 – Analyzing LINQ to SQL Entities
For the past six months or so I’ve been playing a lot with LINQ to SQL, dissecting its internals to see what makes it tick. It didn’t take long before I became a fan of LINQ to SQL, expression trees and IQueryable<T>. Those guys at Microsoft have really hit the nail on the head with this stuff. I started writing lots of LINQ to SQL code, debugging it and watching SQL Profiler. I had a blast trying to find ways to break it. For the most part, it did really well at producing accurate, if not always concise, SQL statements, although some queries could stand to be refactored. Regardless, LINQ to SQL delivers beyond my expectations. This is the point where I started to look for an elegant way to integrate LINQ to SQL with a classic N-Tier architecture. Unknowingly, I took the “red pill” and dove head first into the rabbit hole. “How hard can this be?” I thought to myself.
In my previous post, I compared the relative merits and challenges of the Active Record pattern with the Unit of Work pattern. The Active Record pattern, in my opinion, creates a natural coding paradigm where business objects expose persistence behaviors as public methods. The challenge is that Active Record implementations ranging from Ruby to SubSonic to Jammer.NET have had issues with cardinality. Since most Active Record implementations do not implement the Identity Map pattern, lazy loading relationships in the object graph can create an infinite loop when recursive persistence logic is invoked. LINQ to SQL avoids these types of issues by implementing the Unit of Work pattern through its DataContext class. Because the DataContext tracks object state, persistence calls are batched with only those objects that need to be updated, and there is no need to traverse the object graph. A challenge with the DataContext lies in cases where objects become disconnected from it. Another potential issue with the DataContext, in my opinion, is code clutter. I personally don’t like having logic to re-attach disconnected objects to the context cluttering up my code.
Weighing both patterns and considering their benefits and disadvantages, I decided to try an experiment to marry the two patterns. The Active Record persistence behaviors can encapsulate the logic required for re-attaching and, in turn, the Unit of Work container can handle tracking of object state and persistence. The only thing missing is the minister and a prenup to make it official. So off I go to try my hand at match-making.
Let’s inspect the entity class that was generated by LINQ to SQL’s code generator (SqlMetal). Firstly, as seen in the following code sample, each generated class is adorned with a TableAttribute to create a mapping to the database table. Additionally, the class is declared partial to allow for customizations to be kept in a separate file. This is nice, but falls short of our desire to create a business layer on top of these data entities. The class then implements a couple of interfaces from the System.ComponentModel namespace to support notification of property changes. It is really nice that it doesn’t need to derive from a concrete class, as this gives us the opportunity to create a custom base class that all entities can derive from should we choose to do so. Unfortunately, the class is not adorned with the SerializableAttribute. The designer allows you to control the “Serialization Mode”, but this does not do what I was expecting it to do. Setting this property to “Unidirectional” causes the properties of the data entities to get adorned with a DataMemberAttribute, which marks them for the data contract serializer that WCF uses rather than making the class [Serializable].
//LINQ to SQL’s mapping attribute implementation
[Table(Name="dbo.Customers")]
public partial class Customer : INotifyPropertyChanging, INotifyPropertyChanged
{
…
}
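Because the class is partial, hand-written members can sit in a second file that the generator never touches. Here is a minimal sketch of that idea. The first half is a simplified, hand-written stand-in for the generated code (no mapping attributes, only INotifyPropertyChanged), and the DisplayName property is a hypothetical customization of my own, not something the generator emits.

```csharp
using System;
using System.ComponentModel;

// Stand-in for the generated half (normally lives in the designer-generated file).
public partial class Customer : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    private string _CompanyName;

    public string CompanyName
    {
        get { return _CompanyName; }
        set
        {
            if (_CompanyName != value)
            {
                _CompanyName = value;
                OnPropertyChanged("CompanyName");
            }
        }
    }

    // Raises PropertyChanged for the named property if anyone is listening.
    protected virtual void OnPropertyChanged(string propertyName)
    {
        var handler = PropertyChanged;
        if (handler != null) handler(this, new PropertyChangedEventArgs(propertyName));
    }
}

// Hand-written half, kept in a separate file so regenerating the model does not overwrite it.
public partial class Customer
{
    // Hypothetical convenience member added for illustration only.
    public string DisplayName
    {
        get { return string.IsNullOrEmpty(CompanyName) ? "(unnamed)" : CompanyName.Trim(); }
    }
}

public static class Program
{
    public static void Main()
    {
        var customer = new Customer();
        customer.PropertyChanged += (s, e) => Console.WriteLine("Changed: " + e.PropertyName);
        customer.CompanyName = "  Contoso  ";
        Console.WriteLine(customer.DisplayName);
    }
}
```

This keeps small helpers close to the entity, but both halves still compile into the same class, so it does not give us a separate business layer.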
Looking further in the class, I found the public property mapping implementation. I was surprised to find that its mapping attributes are applied to the public properties (below). I was surprised because this meant that I could hide the mapping by simply deriving from the class and declaring a new property with the same name. What about polymorphism? I know what you’re thinking: this is supposed to be a data entity, not a domain entity. I agree. However, I’m a believer in code reuse. If the data model entities closely match the domain model, why not derive from those entities to maximize code reuse? Furthermore, your domain entities can keep domain logic completely separate from the data entity without having to model yet another class that implements all of the same properties. In other words, you get all of the data for free, without all of the work. All that’s left to do is to write the business logic in a class that elegantly holds only business layer implementation. So naturally, I was surprised to see the mapping made at the property.
//LINQ to SQL’s mapping attribute implementation
[Column(Storage = "_CompanyName", DbType = "NVarChar(40) NOT NULL", CanBeNull = false)]
public string CompanyName
{
get { … }
}
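To make the hiding concern concrete, here is a small sketch that uses plain reflection. The ColumnAttribute below is a hypothetical stand-in for the real LINQ to SQL attribute so the sample runs on its own, and DomainCustomer is an invented subclass. It only shows what reflection sees on each class, not what the LINQ to SQL provider does internally.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for System.Data.Linq.Mapping.ColumnAttribute.
[AttributeUsage(AttributeTargets.Property)]
public sealed class ColumnAttribute : Attribute
{
    public string Storage { get; set; }
}

public class Customer
{
    private string _CompanyName;

    [Column(Storage = "_CompanyName")]
    public string CompanyName
    {
        get { return _CompanyName; }
        set { _CompanyName = value; }
    }
}

// Hypothetical domain class that hides the mapped property with "new".
public class DomainCustomer : Customer
{
    public new string CompanyName
    {
        get { return base.CompanyName; }
        set { base.CompanyName = value; }
    }
}

public static class MappingProbe
{
    // Counts [Column] attributes on the property declared directly on the given type.
    public static int CountColumnAttributes(Type type, string propertyName)
    {
        PropertyInfo property = type.GetProperty(
            propertyName,
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly);
        if (property == null) return 0;
        return property.GetCustomAttributes(typeof(ColumnAttribute), false).Length;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(MappingProbe.CountColumnAttributes(typeof(Customer), "CompanyName"));
        Console.WriteLine(MappingProbe.CountColumnAttributes(typeof(DomainCustomer), "CompanyName"));
    }
}
```

The base class reports one attribute and the hiding property reports none, which is exactly why a property declared with new in a subclass carries no mapping of its own.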
I later realized that LINQ to SQL deliberately creates mappings this way in order to support deferred execution. Because the public property is the referenced member of the class in the expression tree, a simple reflection call is all that is needed to retrieve the attribute instance. Formatting a SQL command statement becomes possible this way and is, indeed, a very nifty way of storing schema details without the need for an external mapping file (as found with the Hibernate mapping paradigm). I prefer having an inline attribute-based approach to Hibernate’s XML-based mapping files. I’ve always felt that having the mapping definitions inline with code was a cleaner way to implement mappings as the definitions are always only a click or two away. However, I do recognize the simplicity and elegance of an external mapping file as well as having the ability to cache the map in memory for performance.
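Here is a rough sketch of the idea, not the provider’s actual implementation: given an expression such as c => c.CompanyName, the referenced member can be pulled out of the tree and its attribute read with one reflection call. The ColumnAttribute is again a hypothetical stand-in for the real one, and MappingReader and ColumnNameFor are names I made up for the illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for the LINQ to SQL ColumnAttribute.
[AttributeUsage(AttributeTargets.Property)]
public sealed class ColumnAttribute : Attribute
{
    public string Name { get; set; }
    public string DbType { get; set; }
}

public class Customer
{
    [Column(Name = "CompanyName", DbType = "NVarChar(40) NOT NULL")]
    public string CompanyName { get; set; }

    // Deliberately unmapped, to show the failure path.
    public string Nickname { get; set; }
}

public static class MappingReader
{
    // Resolves the column name for a simple property access like c => c.CompanyName.
    public static string ColumnNameFor<TEntity, TValue>(Expression<Func<TEntity, TValue>> selector)
    {
        // The body of a simple property lambda is a MemberExpression.
        var access = selector.Body as MemberExpression;
        if (access == null)
            throw new ArgumentException("Expected a simple property access.", "selector");

        // One reflection call retrieves the mapping attribute from the referenced member.
        var column = (ColumnAttribute)Attribute.GetCustomAttribute(access.Member, typeof(ColumnAttribute));
        if (column == null)
            throw new InvalidOperationException(access.Member.Name + " is not mapped.");

        // Fall back to the member name when the attribute does not name the column.
        return column.Name ?? access.Member.Name;
    }
}

public static class Program
{
    public static void Main()
    {
        string column = MappingReader.ColumnNameFor((Customer c) => c.CompanyName);
        Console.WriteLine("[" + column + "]");
    }
}
```

A real translator walks far more node types than this, but the lookup from member to attribute is the part that makes inline mappings convenient.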
As a quick test, I tried to derive from the LINQ to SQL entity and run a query from the context, without success. As far as I can tell, the mapping lookup does not recognize the derived class or pick up the mapped properties it inherits from the base entity. Bummer. My first thought was to ask Microsoft to add “BindingFlags.FlattenHierarchy” to the GetProperty() call, but that flag only affects static members, so the fix would have to live elsewhere in the mapping code. Microsoft, if you’re reading this, please let the mapping lookup walk the inheritance hierarchy. I think it would be really cool to have business objects that derive from data objects that are context connected. Maybe the answer is to wait for the ADO.NET Entity Framework for this type of support.
Next, we find the associations section. This is where relationships to other entities are made. Here we are introduced to the AssociationAttribute and the EntitySet<T> object. The EntitySet<T> came as a surprise to me because I was expecting a simple Collection<T> or, even simpler, an array of Order objects to handle relationships. According to the MSDN documentation, EntitySet<T> provides deferred loading and relationship maintenance for the collection side of one-to-many and one-to-one relationships in LINQ to SQL applications.
[Association(Name="Customer_Order", Storage="_Orders", OtherKey="CustomerID")]
public EntitySet<Order> Orders
{
get
{
return this._Orders;
}
set
{
this._Orders.Assign(value);
}
}
Another relationship object that is found in the generated code is EntityRef<T>, which is a struct rather than a class. Again, for deferred loading purposes, a specialized type is used to handle the reference to a direct parent.
private EntityRef<Customer> _Customer;
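To illustrate what deferred loading behind such a reference might look like, here is a minimal sketch. LazyRef<T> is a hypothetical stand-in I wrote for this post, not the real EntityRef<T>. It borrows only the Entity and HasLoadedOrAssignedValue member names and assumes the loader is a simple delegate.

```csharp
using System;

// Hypothetical stand-in that illustrates deferred loading; not the real EntityRef<T>.
public struct LazyRef<T> where T : class
{
    private Func<T> _loader;
    private T _entity;
    private bool _loaded;

    public LazyRef(Func<T> loader)
    {
        _loader = loader;
        _entity = null;
        _loaded = false;
    }

    // True once the value has been loaded or explicitly assigned.
    public bool HasLoadedOrAssignedValue
    {
        get { return _loaded; }
    }

    public T Entity
    {
        get
        {
            if (!_loaded)
            {
                // Run the loader at most once, on first access.
                _entity = _loader != null ? _loader() : null;
                _loader = null;
                _loaded = true;
            }
            return _entity;
        }
        set
        {
            // An explicit assignment replaces any pending load.
            _entity = value;
            _loader = null;
            _loaded = true;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        int loads = 0;
        var reference = new LazyRef<string>(() => { loads++; return "ALFKI"; });

        Console.WriteLine(reference.HasLoadedOrAssignedValue);
        Console.WriteLine(reference.Entity);
        Console.WriteLine(reference.Entity);
        Console.WriteLine(loads);
    }
}
```

Because the stand-in is a mutable struct, it has to be stored in a non-readonly field or local and never copied around, otherwise each copy would trigger its own load.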
How do these references get instantiated? Well, with the constructor of course! Yeah, I was confused at first too. The concept to note here is that the constructor instantiates the EntitySet<T> wrapper, not the related entities themselves; those are not loaded until the collection is enumerated or, for an EntityRef<T>, until its Entity property is requested. I’m not sure what to think of this one. I’m guessing they wanted to push the responsibility of tracking whether a related entity has been requested onto another class, in an effort to keep the entity class implementation clean. Certainly, this could also have been achieved by adding more properties to the entity class itself.
public Customer()
{
this._Orders = new EntitySet<Order>(new Action<Order>(this.attach_Orders), new Action<Order>(this.detach_Orders));
…
}
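The two callbacks handed to the EntitySet<Order> constructor are the hook for relationship maintenance: the collection calls them as items come and go, so the parent can keep each child’s back-reference in step. Here is a rough sketch of that pattern with a hypothetical CallbackSet<T> in place of EntitySet<T>. The bodies of attach_Orders and detach_Orders are my simplified assumption of what such callbacks do.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for EntitySet<T>: a list that reports adds and removes.
public class CallbackSet<T>
{
    private readonly List<T> _items = new List<T>();
    private readonly Action<T> _onAdd;
    private readonly Action<T> _onRemove;

    public CallbackSet(Action<T> onAdd, Action<T> onRemove)
    {
        _onAdd = onAdd;
        _onRemove = onRemove;
    }

    public int Count
    {
        get { return _items.Count; }
    }

    public void Add(T item)
    {
        if (_items.Contains(item)) return; // ignore duplicates
        _items.Add(item);
        _onAdd(item);
    }

    public bool Remove(T item)
    {
        if (!_items.Remove(item)) return false;
        _onRemove(item);
        return true;
    }
}

public class Order
{
    public Customer Customer { get; set; }
}

public class Customer
{
    public CallbackSet<Order> Orders { get; private set; }

    public Customer()
    {
        Orders = new CallbackSet<Order>(attach_Orders, detach_Orders);
    }

    // Keep the child's back-reference in step with collection membership.
    private void attach_Orders(Order entity) { entity.Customer = this; }
    private void detach_Orders(Order entity) { entity.Customer = null; }
}

public static class Program
{
    public static void Main()
    {
        var customer = new Customer();
        var order = new Order();

        customer.Orders.Add(order);
        Console.WriteLine(ReferenceEquals(order.Customer, customer));

        customer.Orders.Remove(order);
        Console.WriteLine(order.Customer == null);
    }
}
```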
So far so good. The entities generated by LINQ to SQL are relatively straightforward, and relationships to other entities are handled fairly elegantly. The class does not derive from another concrete class and there is no data access logic to muddy the waters. Inline attributes and the use of custom relationship classes make up the list of less desirable items, but they shouldn’t impede our efforts to create an N-Tier architecture. Out of the box, LINQ to SQL provides an excellent way to quickly model a data layer with rich relationship structures.
In the next post, we’ll take a look at the DataContext class and evaluate how we could best take advantage of its object tracking capabilities. I’ll also explore writing a context manager class that helps ease the pains in dealing with the DataContext.