Build object relationships with LINQ
If you need another reason to start liking LINQ, here is one: it allows you to build relationships between your object collections. I’m not talking about LINQ to SQL or LINQ to XML, where the data source already has an underlying architecture for relationships; I’m talking about in-memory object collections. (For example, in our offices we use powerful Unix-based business layer services, so by the time the data reaches the .NET client it’s wrapped in simple object arrays and devoid of any metadata.)
To see what I mean, we have to look beyond the simple, one-dimensional arrays that most demos have dealt with and assume we want to see aggregated data from a collection of lists. For this example I have a list of Customers and a list of Orders that join on CustomerID, and for my result set I want to see the name of every customer along with the sum of the amounts of their orders. If I were to write this as a SQL statement it would look something like:
SELECT Customers.Name, SUM(Orders.Amount)
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
GROUP BY Customers.Name
Thanks to some guidance from Anders Hejlsberg himself, we know:
“A grouping join produces a hierarchical result that pairs each element from the outer sequence with its corresponding sequence of elements from the inner sequence.”
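Which means that if we join one list to another, each element of the outer list gets a reference to a sequence of elements from the inner list, already filtered on the join. The important trick is the into keyword, which puts those matching elements into a new range variable (co in the query that follows).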
var query = from c in customers
join o in orders on c.CustomerID equals o.CustomerID into co
select new { c.Name, Total = co.Sum(o => o.Amount) };
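For anyone who wants to run the query, here is a minimal, self-contained sketch. The Customer and Order classes, the sample names and the amounts are assumptions made up for illustration; the original data is not shown in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only.
public class Customer
{
    public int CustomerID { get; set; }
    public string Name { get; set; }
}

public class Order
{
    public int CustomerID { get; set; }
    public decimal Amount { get; set; }
}

public static class GroupJoinDemo
{
    // Returns one "Name: Total" line per customer, in customer order.
    public static List<string> CustomerTotals()
    {
        var customers = new List<Customer>
        {
            new Customer { CustomerID = 1, Name = "Alice" },
            new Customer { CustomerID = 2, Name = "Bob" },
            new Customer { CustomerID = 3, Name = "Carol" } // has no orders
        };
        var orders = new List<Order>
        {
            new Order { CustomerID = 1, Amount = 10m },
            new Order { CustomerID = 1, Amount = 15m },
            new Order { CustomerID = 2, Amount = 7m }
        };

        // Same query as in the post: "into co" turns the join into a grouping join.
        var query = from c in customers
                    join o in orders on c.CustomerID equals o.CustomerID into co
                    select new { c.Name, Total = co.Sum(o => o.Amount) };

        return query.Select(r => r.Name + ": " + r.Total).ToList();
    }

    public static void Main()
    {
        foreach (var line in CustomerTotals())
            Console.WriteLine(line);
    }
}
```

With this sample data the program prints Alice: 25, Bob: 7 and Carol: 0. Note one small difference from the SQL version: for a customer with no orders, Sum over the empty sequence gives 0, whereas SUM over a LEFT JOIN with no matching rows gives NULL.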
If you hover over co in the code snippet above in Visual Studio, it will tell you that it is a “(range variable) IEnumerable<Order>”, which we can now use in the select projection. This is the piece I was missing when I first started playing with aggregation: the aggregation functions need to take in a sequence, and c and o are single instances.
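It may help to see the same query in method syntax, because the compiler translates a join ... into clause into a call to Enumerable.GroupJoin, and there the sequence of matches shows up as an ordinary lambda parameter. The sketch below uses anonymous types and made-up sample values, and adds a Count only to show that any sequence operator works on co.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinMethodSyntax
{
    // Sample data is invented for illustration.
    public static List<string> Totals()
    {
        var customers = new[]
        {
            new { CustomerID = 1, Name = "Alice" },
            new { CustomerID = 2, Name = "Bob" }   // has no orders
        };
        var orders = new[]
        {
            new { CustomerID = 1, Amount = 10m },
            new { CustomerID = 1, Amount = 15m }
        };

        var query = customers.GroupJoin(
            orders,
            c => c.CustomerID,   // key taken from each outer element
            o => o.CustomerID,   // key taken from each inner element
            (c, co) => new       // co is the sequence of orders matching c
            {
                c.Name,
                Count = co.Count(),
                Total = co.Sum(o => o.Amount)
            });

        return query
            .Select(r => r.Name + ": " + r.Count + " order(s), total " + r.Total)
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in Totals())
            Console.WriteLine(line);
    }
}
```

Here c is a single customer and co is the whole sequence of that customer’s orders, which is why Sum and Count can be called on co but not on c or o.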
Now for the interesting part: the optimization. When you iterate to the first instance in the outer sequence (customers), the application iterates over every instance in the inner sequence (orders) once and indexes them by the join key. After that it won’t iterate over the inner sequence again while you iterate over the rest of the outer sequence. However, because I have an aggregate function in my projection, the application still has to iterate over each item in the hierarchical result (co) to get the values out. Remember that, because of the join predicate, the hierarchical result for each instance in the outer sequence holds only the matching orders, so it is usually much smaller than the full inner sequence. It can be empty, but it is never null.
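One way to check this behaviour is to wrap the inner sequence in an iterator that counts how many items are pulled from it. The sketch below does that with plain integer IDs; the Counted helper and the sample IDs are hypothetical, and the single pass over the inner sequence is what LINQ to Objects does in the implementations I know of rather than something every LINQ provider promises.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinEnumeration
{
    // Number of inner items handed to the join so far.
    public static int InnerItemsRead;

    // Hypothetical helper: passes items through and counts each one.
    static IEnumerable<int> Counted(IEnumerable<int> source)
    {
        foreach (var item in source)
        {
            InnerItemsRead++;
            yield return item;
        }
    }

    // Returns the number of matching orders for each customer ID.
    public static List<int> MatchCounts()
    {
        InnerItemsRead = 0;
        int[] customerIds = { 1, 2, 3 };
        int[] orderCustomerIds = { 1, 1, 2 };   // customer 3 has no orders

        var query = from c in customerIds
                    join o in Counted(orderCustomerIds) on c equals o into co
                    select co.Count();          // co is empty, not null, for customer 3

        return query.ToList();
    }

    public static void Main()
    {
        var counts = MatchCounts();
        Console.WriteLine("Matches per customer: " + string.Join(", ", counts));
        Console.WriteLine("Inner items read: " + InnerItemsRead);
    }
}
```

With three customers and three orders, the inner sequence is read three times in total (once per order), not nine times, and the customer without orders gets a count of 0 instead of a NullReferenceException.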
In conclusion, we’ve seen how LINQ can help us look at disparate sets of data and dynamically build relationships between them. There might still be valid questions about the size and complexity of the data you can push through LINQ, and the old adage of “garbage in, garbage out” still applies to how much you get out of using this compared to writing your own queries.


