The Missing Linq

Recently I seem to have been finding more and more old code that would benefit from refactoring using Linq. Whilst still valid, much of the code I see can be significantly improved in terms of clarity and conciseness. Let’s look at a simple example.

The scenario I’ll be using here involves filtering a collection of Person objects to find those with an age lower than 30. First let’s add some boilerplate code as a basis to work with.

class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
}

List<Person> people = new List<Person>()
{
    new Person() { Id = 1, Name = "First Person", Age = 67 },
    new Person() { Id = 2, Name = "Second Person", Age = 18 },
    new Person() { Id = 3, Name = "Third Person", Age = 42 },
    new Person() { Id = 4, Name = "Fourth Person", Age = 8 }
};

private static void PrintListOfPeople(string title, IEnumerable<Person> people)
{
    Console.WriteLine(title);
    foreach (var person in people)
        Console.WriteLine("[{0}] {1}, aged {2}", person.Id, person.Name, person.Age);
}

Here we have a Person class, a collection of 4 Person instances and a function to display the collection in the console. All pretty straightforward and nothing too exciting. The aim is to take the collection and filter it so that only the people aged under 30 remain.

To do so without Linq one might use the following approach.

// Without Linq
IList<Person> youngPeople = new List<Person>();
foreach (var person in people)
{
    if (person.Age < 30)
        youngPeople.Add(person);
}
PrintListOfPeople("List without Linq", youngPeople);

This code is pretty simple and it’s probably similar to code you’ve written hundreds of times in the past – initialise a new collection, iterate through the source collection, test each value and if there is a match add it to the new collection. No problems, works very well.

Now consider how the same code could be written using Linq. Note that this code requires a using directive for the System.Linq namespace.

// With Linq
IEnumerable<Person> youngPeopleLinq = people.Where(p => p.Age < 30);
PrintListOfPeople("List with Linq", youngPeopleLinq);

Wow, we’ve managed to condense 7 lines into 2 and arguably produced code that is more readable. When I first saw the Linq syntax it seemed a bit alien, though I think that was more down to the lambda expression than anything else. Let’s dissect this code and see how it works.

Linq Fundamentals

At a high level, Linq is a set of extension methods on IEnumerable<T> that offer additional ways to work with collections. In the above example I used the Where extension method; there are many others, such as Select, OrderBy and Any. So far, so good, but what is going on with that crazy lambda expression? To explain, consider the method signature of Where, which boils down to Where(IEnumerable<TSource> source, Func<TSource, bool> predicate).

For our collection of Person objects, the Where extension method takes a Func<Person, bool> as its parameter. For those unfamiliar with Func, it is a predefined generic delegate type supplied by the .NET Framework (rather than a language feature) with a particular signature. It just dispenses with the need to define your own delegate. In this case our Func takes a parameter of type Person and returns a bool, i.e.

   bool MyFunction(Person p)
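To make the idea of a delegate with that shape more concrete, here is a minimal sketch. The names FuncDemo and IsUnderThirty are made up for this illustration; the point is only that any method taking a Person and returning a bool can be stored in a Func<Person, bool> variable and called through it.

```csharp
using System;

class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
}

static class FuncDemo
{
    // Hypothetical helper: an ordinary method whose shape matches Func<Person, bool>.
    public static bool IsUnderThirty(Person p)
    {
        return p.Age < 30;
    }

    static void Main()
    {
        // Func<Person, bool> reads as "takes a Person, returns a bool".
        Func<Person, bool> test = IsUnderThirty;

        var second = new Person { Id = 2, Name = "Second Person", Age = 18 };
        var third = new Person { Id = 3, Name = "Third Person", Age = 42 };

        // Calling the delegate calls the method it points at.
        Console.WriteLine(test(second)); // True
        Console.WriteLine(test(third));  // False
    }
}
```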

So we have the Where extension method, which needs to be supplied with a function that accepts a Person object and returns a boolean. The Where method will pass each item in the collection to the function, and the boolean returned indicates whether the item being examined should be included in or excluded from the result set. The function itself must apply the test criteria.
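One way to picture what Where does with that function is to write a simplified version by hand. The sketch below is an assumption-laden illustration, not the framework's actual source: MyWhere and SketchExtensions are hypothetical names, and a plain list of ages stands in for the Person collection to keep it short.

```csharp
using System;
using System.Collections.Generic;

static class SketchExtensions
{
    // Hypothetical stand-in for Where. The "this" modifier on the first
    // parameter is what makes it callable as source.MyWhere(...).
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            // The supplied function decides whether this item is included.
            if (predicate(item))
                yield return item;
        }
    }
}

static class MyWhereDemo
{
    static void Main()
    {
        // Same ages as the four people in the example collection.
        var ages = new List<int> { 67, 18, 42, 8 };

        foreach (int age in ages.MyWhere(a => a < 30))
            Console.WriteLine(age); // prints 18, then 8
    }
}
```

Seen this way, the hand-written loop from earlier has not gone away; it has just moved inside a reusable method, leaving only the test itself for the caller to supply.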

We could therefore just create a function which does this test and pass that to the Where method. For example:

IEnumerable<Person> youngPeopleLinq = people.Where(CheckPersonAge);

private static bool CheckPersonAge(Person p)
{
    return p.Age < 30;
}

Whilst this is a perfectly valid approach, the use of a Lambda expression provides a more concise syntax. Let’s look more closely at the lambda used in the original example.

Lambda Expressions 101

A lambda expression is a shorthand syntax for an anonymous function. It uses the => operator, which is usually read as “goes to”. So the expression p => p.Age < 30 can be read as “p goes to p.Age < 30”, or “pass p into the expression p.Age < 30”. Here p is simply the name of the Person parameter passed into the expression, which returns the boolean result of the test p.Age < 30. To summarise, p => p.Age < 30 is equivalent to the CheckPersonAge function shown earlier but with a much more concise syntax.

The lambda syntax confused the heck out of me when I first encountered it. My initial thoughts were “what is ‘p’, it’s not defined anywhere?” The key thing to remember here is that the compiler is expecting a delegate with a parameter of type Person, so it can infer the type of p. All we need to do is provide a name for that parameter so that it can be used inside the target expression.
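If the short form still looks like magic, it may help to see it next to its longer-winded relatives. The following sketch (LambdaForms and Evaluate are names invented for this example) builds the same test four ways, from a named method down to the concise lambda, and checks that they all give the same answer.

```csharp
using System;

class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
}

static class LambdaForms
{
    static bool CheckPersonAge(Person p)
    {
        return p.Age < 30;
    }

    // Hypothetical helper: builds four equivalent delegates and
    // returns the answer each one gives for the same person.
    public static bool[] Evaluate(Person person)
    {
        // 1. A named method.
        Func<Person, bool> named = CheckPersonAge;

        // 2. An anonymous method, the older pre-lambda syntax.
        Func<Person, bool> anonymous = delegate(Person p) { return p.Age < 30; };

        // 3. A lambda with the parameter type and return statement spelled out.
        Func<Person, bool> verbose = (Person p) => { return p.Age < 30; };

        // 4. The short form: the compiler infers that p is a Person.
        Func<Person, bool> concise = p => p.Age < 30;

        return new[] { named(person), anonymous(person), verbose(person), concise(person) };
    }

    static void Main()
    {
        var fourth = new Person { Id = 4, Name = "Fourth Person", Age = 8 };
        Console.WriteLine(string.Join(", ", Evaluate(fourth))); // True, True, True, True
    }
}
```

Any of the four could be passed to Where; the last is simply the least typing.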

Linq-ing the Pieces

Hopefully this whirlwind tour of Linq and Lambdas has provided some food for thought. Linq does not really provide anything you could not achieve by other means. However, it does offer a powerful, concise syntax for working with collections. I would encourage any .NET developer to explore the capabilities of Linq and include it as another option in the developer toolbox.


Posted on July 27, 2012, in c#, linq, programming.
