Recently CodeGuru published the last part of my introductory series of articles on LINQ. Here are the links to the articles:

I hope you’ll enjoy and benefit from the reading.


When I delivered the LINQ presentation at the RONUA meeting in April, I was asked how LINQ performs on big data sets. To answer that, I decided to test LINQ to XML against a 100+MB file. I extracted three different sets of data from this XML file:

  • 1 set representing about 0.5MB of the XML file,
  • 1 set representing about 10MB of the XML file, and
  • 1 set representing about 80MB of the XML file

Of course, I designed some data structures to map onto the data from the XML file and ran three queries against the file, each projecting instances of those data structures. The result was that all three (different) queries took about the same time to execute and generate my internal objects. Each time, the entire file was re-parsed. Each query took about 3.5 seconds. Thus, I can draw two conclusions:

  • LINQ is very performant: it took less than 12 seconds in total to extract 90% of the data from a 100MB file; that is several times faster than what I get in C++ for parsing the file, not to mention that the code is several times simpler;
  • there wasn’t much difference between extracting 0.5MB and extracting more than 100 times that.

I am quite confident that LINQ to SQL is as performant as LINQ to XML, although I have not measured it yet. If I find a really big database, I will query it.
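
A measurement of this kind can be reproduced with System.Diagnostics.Stopwatch. What follows is only a minimal sketch, not the original test code: the QueryTimer class, the element names and the small in-memory document are made up for illustration, and a real test would call XElement.Load on the big file instead of building the document in memory.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Xml.Linq;

public static class QueryTimer
{
    // Hypothetical sample document; a real test would use XElement.Load on the large file.
    public static XElement BuildSample(int count)
    {
        return new XElement("items",
            from i in Enumerable.Range(0, count)
            select new XElement("item",
                new XAttribute("id", i),
                new XElement("value", i * 2)));
    }

    // Runs the query to completion and returns the number of projected objects.
    public static int CountEvenIds(XElement root)
    {
        var query = from e in root.Elements("item")
                    where (int)e.Attribute("id") % 2 == 0
                    select new { Id = (int)e.Attribute("id"), Value = (int)e.Element("value") };

        // ToList forces the deferred query to execute, so the timing includes the projection.
        return query.ToList().Count;
    }

    public static void Main()
    {
        XElement root = BuildSample(100000);

        Stopwatch watch = Stopwatch.StartNew();
        int count = CountEvenIds(root);
        watch.Stop();

        Console.WriteLine("{0} objects in {1} ms", count, watch.ElapsedMilliseconds);
    }
}
```

Timing the load and the query separately would show how much of the total is parsing and how much is projection.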


LINQ to SQL

LINQ to SQL is an API that allows querying relational databases. The samples in this post are based on the concepts of:

  • Entity classes (the classes that represent entity types). Such a class is a regular .NET class that is decorated with the Table attribute, with its properties and fields decorated with the Column attribute
  • DataContext, which is the channel for doing operations with the database; it is used like an ADO.NET connection, and in fact its constructor takes either a connection string or an ADO.NET connection

Considering the known Winner class from the previous posts,

public class Winner
{
     public string Name { get; set; }
     public string Country { get; set; }
     public int Year { get; set; }
}


Decorating it with Table and Column like this

[Table(Name = "Winners")]
public class Winner
{
    [Column]
    public string Name { get; set; }

    [Column]
    public string Country { get; set; }

    [Column(IsPrimaryKey = true)]
    public int Year { get; set; }
}


will create a direct mapping between Winner and the table called Winners, and between the properties Name, Country and Year of the class and the columns with the same names in the table. Both Table and Column have several properties. One of them is Name, which specifies the actual name of the table or column corresponding to the class or property. If Name is not specified, the name of the class or property is used.

Assuming we have a SQL Server database, located at C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\UCL.mdf, with a table Winners that has three columns, Year (which is also the primary key), Name and Country, and that this table is populated with the winners of the UEFA Champions League, we could write the following code to retrieve and show the winners:

public void PrintWinners()
{
     // creates a data context that takes the path of the database
     DataContext dc = new DataContext(@"C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\UCL.mdf");

     // retrieves a Table of Winner
     Table<Winner> winners = dc.GetTable<Winner>();

     // creates a sequence of winners ordered descending by the winning year
     var result = from w in winners
                  orderby w.Year descending
                  select w;

     // prints the sequence of winners
     foreach (var w in result)
     {
          Console.WriteLine("{0} {1}, {2}",
          w.Year, w.Name, w.Country);
     }
}


First, we must instantiate a DataContext, passing as argument the path to the database. DataContext has a generic method, GetTable<T>, that returns a Table<T>. To get the winners we call it with Winner as the type argument. On this table, we can perform a query and show the results:

2006 Barcelona, Spain
2005 Liverpool, England
2004 FC Porto, Portugal
2003 AC Milan, Italy
2002 Real Madrid, Spain
2001 Bayern Munchen, Germany
2000 Real Madrid, Spain
1999 Manchester Utd., England
1998 Real Madrid, Spain
1997 Borussia Dortmund, Germany
1996 Juventus, Italy
1995 AFC Ajax, Netherlands
1994 AC Milan, Italy
1993 Olympique de Marseille, France


It is, however, recommended that we use a so-called strongly-typed version of DataContext, in other words a class derived from DataContext that keeps all the table collections as members. In this case we don’t have to call GetTable<T>() directly:

public class UCLDataContext : DataContext
{
    public Table<Winner> Winners;

    public UCLDataContext(string connection)
      :
      base(connection)
    {}
}


The PrintWinners function would have to change to:

public void PrintWinners()
{
    // creates a data context that takes the path of the database
    UCLDataContext dc = new UCLDataContext(@"C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\UCL.mdf");

    // creates a sequence of winners ordered descending by the winning year
    var result = from w in dc.Winners
                 orderby w.Year descending
                 select w;

    // prints the sequence of winners
    foreach (var w in result)
    {
        Console.WriteLine("{0} {1}, {2}",
            w.Year, w.Name, w.Country);
    }
}


Only querying the database is not enough. DataContext also allows us to submit changes to the database.

The following function shows how to add a winner to the table Winners:

public void AddWinner()
{
     // creates a data context that takes the path of the database
     UCLDataContext dc = new UCLDataContext(@"C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\UCL.mdf");

     // adds a new winner to the table (Add is the Orcas March CTP name;
     // the released .NET 3.5 API calls this method InsertOnSubmit)
     dc.Winners.Add(new Winner { Name = "Manchester United", Country = "England", Year = 2007});

     // submits the changes
     dc.SubmitChanges();
}


If we want, for instance, to remove all the winners from Spain, we can do the following:

public void DeleteWinner()
{
    // creates a data context that takes the path of the database
    UCLDataContext dc = new UCLDataContext(@"C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\UCL.mdf");

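    // removes a sequence of winners (RemoveAll is the Orcas March CTP name;
    // the released .NET 3.5 API calls this method DeleteAllOnSubmit)
    dc.Winners.RemoveAll(from w in dc.Winners
                      where w.Country == "Spain"
                      select w);

    // submits the changes
    dc.SubmitChanges();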
}



On Saturday, April 22, I delivered a presentation on LINQ at a meeting of the RONUA community in Timisoara. The presentation focused on code samples for LINQ to Objects, LINQ to XML and LINQ to SQL. Today I uploaded the presentation and demo programs to the site so that you can download them.

Here is the list of downloads:

  • PowerPoint presentation in Romanian (88.5 KB); requires PowerPoint 2007
  • demo programs: a solution with three VC# projects (16.2 KB): LinqToObjects, LinqToXML and LinqToSQL; requires Visual Studio Orcas March CTP; comments are in English
  • UCL database (135 KB): SQL Server 2005 Express database required by the LinqToSQL project

Make sure that in the LinqToSQL project you use the right path for the UCL.mdf file. Please post any kind of comments you might have about it.


LINQ to XML

LINQ offers an API called LINQ to XML, formerly known as XLinq, that provides support for working with XML. This API resides in the System.Xml.Linq namespace, and you need to add a reference to the assembly with the same name to be able to use it. If you installed the Orcas March CTP bits, the assembly can be found in the folder C:\Windows\Microsoft.NET\Framework\v3.5.20209.

In the namespace you can find classes such as XNode, XElement, XAttribute, XText, etc.

XElement implements an XML element. It has several constructors. The following snippet constructs an empty element called winner.

   XElement root = new XElement("winner");
   Console.WriteLine(root.ToString());


The output is

<winner />


Overloaded constructors can take additional parameters. If we pass a string:

XElement root = new XElement("winner", "Manchester Utd.");


We get:

<winner>Manchester Utd.</winner>


We can also pass an XAttribute:

XElement root = new XElement("winner", "Manchester Utd.",
                             new XAttribute("Year", 1999));


Or

XElement root = new XElement("winner", new XAttribute("Year", 1999),
                             "Manchester Utd.");


In both cases the winner element will have an attribute called Year with the value 1999, and the text of the element will be "Manchester Utd.":

<winner Year="1999">Manchester Utd.</winner>


We can nest all of these to make a hierarchy of XML elements:

XElement root = new XElement("winners",
                              new XElement("winner",
                                   new XElement("Name", "Barcelona"),
                                   new XElement("Country", "Spania"),
                                   new XElement("Year", 2006)
                              ),
                              new XElement("winner",
                                   new XElement("Name", "Liverpool"),
                                   new XElement("Country", "Anglia"),
                                   new XElement("Year", 2005)
                              )
                );


Of course, the XML elements don’t have to be created all at once like that. They can be built dynamically. One way is to use methods such as Add, AddFirst and RemoveNodes, inherited from the XContainer class.

IEnumerable<Winner> winners = UCL.GetWinners();
XElement root = new XElement("winners");

foreach (Winner w in winners)
{
      root.Add(new XElement("winner",
                             new XElement("Name", w.Name),
                             new XElement("Country", w.Country),
                             new XElement("Year", w.Year)));
}


And we can write this to a file with:

root.Save("winners.xml");


However, LINQ to XML offers a better way to generate XML content.

IEnumerable<Winner> winners = UCL.GetWinners();

XElement root = new XElement("winners", 
                    from w in winners
                    select new XElement("winner",
                                        new XElement("Name", w.Name),
                                        new XElement("Country", w.Country),
                                        new XElement("Year", w.Year)));  

The result in this case is the same as above.

XElement has several overloads of Save for writing its content to a file, a TextWriter or an XmlWriter:

public void Save(string fileName);
public void Save(TextWriter textWriter);
public void Save(XmlWriter writer);
public void Save(string fileName, bool preserveWhitespace);
public void Save(TextWriter textWriter, bool preserveWhitespace);


On the other hand, XElement offers several overloaded static methods for loading content from XML files:

public static XElement Load(string uri);
public static XElement Load(TextReader textReader);
public static XElement Load(XmlReader reader);
public static XElement Load(string uri, bool preserveWhitespace);
public static XElement Load(TextReader textReader, bool preserveWhitespace);
public static XElement Parse(string text);
public static XElement Parse(string text, bool preserveWhitespace);
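
Parse is the only method in this list that works on XML text already held in a string rather than on a file or a reader. Here is a minimal sketch of how it might be used; the ParseDemo class and the two-element document are hypothetical and stand in for the content of a real file.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ParseDemo
{
    // Hypothetical XML text standing in for the content of winners.xml.
    public const string Xml =
        "<winners>" +
        "<winner><Name>Barcelona</Name><Year>2006</Year></winner>" +
        "<winner><Name>Liverpool</Name><Year>2005</Year></winner>" +
        "</winners>";

    // Parses the text and returns the name of the winner for the given year, or null.
    public static string NameForYear(string xml, int year)
    {
        XElement root = XElement.Parse(xml);
        return (from e in root.Elements("winner")
                where (int)e.Element("Year") == year
                select (string)e.Element("Name")).FirstOrDefault();
    }

    public static void Main()
    {
        Console.WriteLine(NameForYear(Xml, 2005));
    }
}
```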


The following code loads the content of the file winners.xml and prints it in the console.

XElement root = XElement.Load("winners.xml");
Console.WriteLine(root.ToString());


Assuming that winners.xml contains the list of UCL winners, we can load the content of this XML file and create Winner objects:

IEnumerable<Winner> winners =
             from e in XElement.Load("winners.xml").Elements("winner")
             select new Winner
                    {
                         Name = (string)e.Element("Name"),
                         Country = (string)e.Element("Country"),
                         Year = (int)e.Element("Year")
                    };

foreach (Winner w in winners)
{
     Console.WriteLine("{0} {1}, {2}", w.Year, w.Name, w.Country);
}


XElement.Load() creates an XElement containing all the elements in the file. Elements("winner") returns only the children called winner (in our case all the child elements of the root). After that we project Winner objects, created from the child elements of each winner element in the XML file. The output is:

2006 Barcelona, Spania
2005 Liverpool, Anglia
2004 FC Porto, Portugalia
2003 AC Milan, Italia
2002 Real Madrid, Spania
2001 Bayern Munchen, Germania
2000 Real Madrid, Spania
1999 Manchester Utd., Anglia
1998 Real Madrid, Spania
1997 Borussia Dortmund, Germania
1996 Juventus, Italia
1995 AFC Ajax, Olanda
1994 AC Milan, Italia
1993 Olympique de Marseille, Franta


Now suppose you want to project only the names of the winners. In this case we could write:

var winners =
            from e in XElement.Load("winners.xml").Elements("winner")
            select (string)e.Element("Name");

foreach (var w in winners)
{
     Console.WriteLine("{0}", w);
}


The output of the program is:

Barcelona
Liverpool
FC Porto
AC Milan
Real Madrid
Bayern Munchen
Real Madrid
Manchester Utd.
Real Madrid
Borussia Dortmund
Juventus
AFC Ajax
AC Milan
Olympique de Marseille


This output, however, lists some teams multiple times. If we want each winner listed only once, we can apply the Distinct operator to the result:

var winners =
         from e in XElement.Load("winners.xml").Elements("winner")
         select (string)e.Element("Name");

var winnersDistinct = Enumerable.Distinct(winners);

foreach (var w in winnersDistinct)
{
    Console.WriteLine("{0}", w);
}


The new output would be

Barcelona
Liverpool
FC Porto
AC Milan
Real Madrid
Bayern Munchen
Manchester Utd.
Borussia Dortmund
Juventus
AFC Ajax
Olympique de Marseille
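
Because Distinct is an extension method, it can also be chained directly onto the query instead of being called as Enumerable.Distinct(winners). The sketch below assumes a small in-memory document in place of winners.xml, and the DistinctDemo class is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DistinctDemo
{
    // Returns each winner name once.
    public static List<string> DistinctNames(XElement root)
    {
        return (from e in root.Elements("winner")
                select (string)e.Element("Name"))
               .Distinct()
               .ToList();
    }

    public static void Main()
    {
        // Hypothetical in-memory document used instead of winners.xml.
        XElement root = new XElement("winners",
            new XElement("winner", new XElement("Name", "Real Madrid")),
            new XElement("winner", new XElement("Name", "Bayern Munchen")),
            new XElement("winner", new XElement("Name", "Real Madrid")));

        foreach (string name in DistinctNames(root))
            Console.WriteLine(name);
    }
}
```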



yield is a contextual keyword introduced in C# 2.0, vital for lazy evaluation and for the performance of queries in LINQ. Being a contextual keyword means that yield can still be used as a variable name in C# without any problem. Only when it is put before return (or break) does it act as a keyword.

yield allows one enumerable class to be implemented in terms of another. This makes it possible to delay the execution of queries until the latest possible moment, skipping the generation of intermediate results that would drastically reduce performance. The query operators in LINQ operate on sequences, and the result of a query is often another sequence. Lazy evaluation means that until you iterate over the result of the query, the source of the query is not iterated.
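
To make this concrete, here is a rough sketch of a filtering operator written with yield. It is not how the LINQ operators are actually implemented; LazyDemo, Filter and the Examined counter are hypothetical names used only to show that the source is not touched until the result is iterated.

```csharp
using System;
using System.Collections.Generic;

public static class LazyDemo
{
    // Counts how many source items have been examined so far.
    public static int Examined;

    // A simplified stand-in for a filtering operator, written with yield.
    // Nothing in the body runs until the result is iterated.
    public static IEnumerable<int> Filter(IEnumerable<int> source, Func<int, bool> predicate)
    {
        foreach (int item in source)
        {
            Examined++;
            if (predicate(item))
                yield return item;
        }
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6 };

        IEnumerable<int> evens = Filter(numbers, n => n % 2 == 0);
        Console.WriteLine("Examined before iteration: {0}", Examined);

        foreach (int n in evens)
            Console.WriteLine(n);

        Console.WriteLine("Examined after iteration: {0}", Examined);
    }
}
```

The first message reports 0 examined items, because calling Filter only creates the enumerable; the counter reaches 6 only after the foreach has walked the whole result.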

To show you how yield works, let’s consider the same class I used in my last post, Winner.

public class Winner
{
    string _name;
    string _country;
    int _year;

    public string Name
    {
        get { return _name; }
        set { _name = value; }
    }
    public string Country
    {
        get { return _country; }
        set { _country = value; }
    }
    public int Year
    {
        get { return _year; }
        set { _year = value; }
    }
    public Winner(string name, string country, int year)
    {
       _name = name;
       _country = country;
       _year = year;
    }
}


and create a class WinnersDB that contains a list of UEFA Champions League winners. This class implements IEnumerable and returns an enumerator, WinnerEnumerator, so that we can iterate over the winners.

    public class WinnerEnumerator : IEnumerator
    {
        int pos = -1;
        private Winner[] _winners;
        public WinnerEnumerator(Winner[] winners)
        {
            _winners = winners;
        }

        public void Reset()
        {
            pos = -1;
        }

        public bool MoveNext()
        {
            pos++;
            return (pos < _winners.Length);
        }

        public object Current
        {
            get
            {
                try
                {
                    return _winners[pos];

                }
                catch (IndexOutOfRangeException)
                {
                    throw new InvalidOperationException();
                }
            }
        }
    }

    public class WinnersDB : IEnumerable
    {
        private Winner[] _winners;
        public WinnersDB()
        {
            _winners = new Winner[]
            {
                new Winner("Barcelona", "Spain", 2006),
                new Winner("Liverpool", "England", 2005),
                new Winner("FC Porto", "Portugal", 2004),
                new Winner("AC Milan", "Italy", 2003),
                new Winner("Real Madrid", "Spain", 2002),
                new Winner("Bayern Munchen", "Germany", 2001),
                new Winner("Real Madrid", "Spain", 2000),
                new Winner("Manchester Utd.", "England", 1999),
                new Winner("Real Madrid", "Spain", 1998),
                new Winner("Olimpique Marseille", "France", 1993),
            };
        }

        public IEnumerator GetEnumerator()
        {
            return new WinnerEnumerator(_winners);
        }
    }


The usage of this class would look like this:

    class Program
    {
        static void Main(string[] args)
        {
            WinnersDB db = new WinnersDB();
            foreach (Winner w in db)
            {
                Console.WriteLine("{0}\t{1}, {2}",
                    w.Year, w.Name, w.Country);
            }
        }
    }


and the output of the program

2006    Barcelona, Spain
2005    Liverpool, England
2004    FC Porto, Portugal
2003    AC Milan, Italy
2002    Real Madrid, Spain
2001    Bayern Munchen, Germany
2000    Real Madrid, Spain
1999    Manchester Utd., England
1998    Real Madrid, Spain
1993    Olimpique Marseille, France


So far so good. But with yield, you can let the compiler do all that stuff for you. When you use yield, the compiler generates an enumerator that keeps the current state of the iteration.

class Program
{
    public static IEnumerable WinnersDB()
    {
        Winner [] winners = new Winner[]
        {
                new Winner("Barcelona", "Spain", 2006),
                new Winner("Liverpool", "England", 2005),
                new Winner("FC Porto", "Portugal", 2004),
                new Winner("AC Milan", "Italy", 2003),
                new Winner("Real Madrid", "Spain", 2002),
                new Winner("Bayern Munchen", "Germany", 2001),
                new Winner("Real Madrid", "Spain", 2000),
                new Winner("Manchester Utd.", "England", 1999),
                new Winner("Real Madrid", "Spain", 1998),
                new Winner("Olimpique Marseille", "France", 1993),
        };

        foreach (Winner w in winners)
        {
            yield return w;
        }
    }

    static void Main(string[] args)
    {
        foreach (Winner w in WinnersDB())
        {
            Console.WriteLine("{0}\t{1}, {2}",
                w.Year, w.Name, w.Country);
        }
    }
}


Running this code will produce the same output as the previous one, except that the implementation is much simpler. Perhaps the example is not the best, but should give you a hint of the use of the yield keyword. To see that the source is actually iterated only when the result is iterated, we can modify the WinnersDB method to print a message in the console:

foreach (Winner w in winners)
{
    Console.WriteLine("yield: {0} {1}, {2}", w.Year, w.Name, w.Country);
    yield return w;
}


In this case, the output looks like this:

yield: 2006 Barcelona, Spain
2006    Barcelona, Spain
yield: 2005 Liverpool, England
2005    Liverpool, England
yield: 2004 FC Porto, Portugal
2004    FC Porto, Portugal
yield: 2003 AC Milan, Italy
2003    AC Milan, Italy
yield: 2002 Real Madrid, Spain
2002    Real Madrid, Spain
yield: 2001 Bayern Munchen, Germany
2001    Bayern Munchen, Germany
yield: 2000 Real Madrid, Spain
2000    Real Madrid, Spain
yield: 1999 Manchester Utd., England
1999    Manchester Utd., England
yield: 1998 Real Madrid, Spain
1998    Real Madrid, Spain
yield: 1993 Olimpique Marseille, France
1993    Olimpique Marseille, France
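
The interleaved output shows that the generated enumerator is suspended at each yield return and resumed by the next MoveNext call. One consequence is that a sequence doesn’t even have to be finite, as long as the consumer stops asking. A minimal sketch of that idea, where YearsDemo and YearsFrom are hypothetical names and Take comes from System.Linq:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class YearsDemo
{
    // Hypothetical generator: yields every year starting from 'start', without end.
    public static IEnumerable<int> YearsFrom(int start)
    {
        int year = start;
        while (true)
        {
            yield return year;   // execution pauses here until the next MoveNext call
            year++;
        }
    }

    public static void Main()
    {
        // Take(3) stops asking after three items, so the endless loop is never run further.
        foreach (int year in YearsFrom(2004).Take(3))
            Console.WriteLine(year);
    }
}
```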



In my last post about LINQ I showed you an example of how to use language integrated query to select information about directories. In this post I’ll go further into the syntax and show you something about the functional querying style.

My examples will focus on displaying information about UEFA Champions League winners. Thus, I have created a class called Winner that looks like this:

class Winner
{
    string  _name;
    string  _country;
    int     _year;

    public string Name
    {
        get { return _name; }
        set { _name = value; }
    }
    public string Country
    {
        get { return _country; }
        set { _country = value; }
    }

    public int Year
    {
        get { return _year; }
        set { _year = value; }
    }

    public Winner(string name, string country, int year)
    {
        _name = name;
        _country = country;
        _year = year;
    }
};


Also I have created a utility class that returns a list (incomplete) of UCL winners:

class UCL
{
    public static IEnumerable<Winner> GetWinners()
    {
        Winner [] winners = 
        {
            new Winner("Barcelona", "Spain", 2006),
            new Winner("Liverpool", "England", 2005),
            new Winner("FC Porto", "Portugal", 2004),
            new Winner("AC Milan", "Italy", 2003),
            new Winner("Real Madrid", "Spain", 2002),
            new Winner("Bayern Munchen", "Germany", 2001),
            new Winner("Real Madrid", "Spain", 2000),
            new Winner("Manchester Utd.", "England", 1999),
            new Winner("Real Madrid", "Spain", 1998),
            new Winner("Olimpique Marseille", "France", 1993),
        };

        return winners;
    }
};


Now, let’s see how we could display all this info ascending by the year of winning:

IEnumerable<Winner> winners = UCL.GetWinners();
var result = from w in winners
             orderby w.Year
             select w;

foreach (var w in result)
{
    Console.WriteLine("{0}\t{1}, {2}",
            w.Year, w.Name, w.Country);
}


That lists the following:

1993    Olimpique Marseille, France
1998    Real Madrid, Spain
1999    Manchester Utd., England
2000    Real Madrid, Spain
2001    Bayern Munchen, Germany
2002    Real Madrid, Spain
2003    AC Milan, Italy
2004    FC Porto, Portugal
2005    Liverpool, England
2006    Barcelona, Spain


This SQL-like syntax is, however, only a “shell” over the functional syntax. It’s just like with the foreach statement. To be able to iterate with foreach, the collection must implement IEnumerable, which has a method that returns an object implementing IEnumerator. When it compiles a foreach, the compiler calls GetEnumerator to get an iterator and then emits a while (iterator.MoveNext()) loop that uses Current to access the current object from the collection (Reset() is not called). The same kind of translation happens here with the declarative query syntax.
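
Written out by hand, the translation of a foreach looks roughly like the sketch below. ForeachDemo and Expand are made-up names and the real compiler output differs in detail, but the shape is the same: GetEnumerator, a MoveNext loop reading Current, and a final Dispose. Note that there is no call to Reset anywhere in the expansion.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachDemo
{
    // Roughly what the compiler generates for: foreach (string s in items) result.Add(s);
    public static List<string> Expand(IEnumerable<string> items)
    {
        List<string> result = new List<string>();

        IEnumerator<string> iterator = items.GetEnumerator();
        try
        {
            while (iterator.MoveNext())
            {
                string s = iterator.Current;
                result.Add(s);
            }
        }
        finally
        {
            // Generic enumerators are disposable; foreach disposes them when the loop ends.
            iterator.Dispose();
        }

        return result;
    }

    public static void Main()
    {
        foreach (string s in Expand(new[] { "Barcelona", "Liverpool" }))
            Console.WriteLine(s);
    }
}
```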

Now, let’s suppose we want to list only the winners from England. What we have to do is add a filter:

var result = from w in winners
             orderby w.Year
             where w.Country == "England"
             select w;


What I wrote above is actually the same as:

var result = winners.
               OrderBy(w => w.Year).
               Where(w => w.Country == "England").
               Select(w => w);


Here we used the operators OrderBy, Where and Select. These are three of the query operators that allow you to perform filtering, projection and key extraction. They are built on the concept of lambda expressions, which are similar to CLR delegates. We could rewrite the last query like this:

Func<Winner, bool> filter = w => w.Country == "England";
Func<Winner, int> criteria = w => w.Year;
Func<Winner, Winner> project = w => w;

var result = winners.OrderBy(criteria).Where(filter).Select(project);


OrderBy and OrderByDescending are operators that sort a sequence by a key. The operators ThenBy and ThenByDescending apply additional sorting criteria, but only to sequences that are already ordered, that is, to the result of OrderBy or OrderByDescending (typed as IOrderedEnumerable<T> in the released framework).

Where filters the collection, keeping only the items that satisfy a predicate.

Select and SelectMany are operators for projecting only the fields or information that is wanted.
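
ThenBy and SelectMany are not used in the queries of this post, so here is a small sketch of both. The OperatorsDemo class, its method names and the sample strings are assumptions made for the example; only the operators themselves come from System.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OperatorsDemo
{
    // Sorts by length first and alphabetically among names of equal length.
    public static List<string> SortByLengthThenName(IEnumerable<string> names)
    {
        return names.OrderBy(n => n.Length).ThenBy(n => n).ToList();
    }

    // Flattens a sequence of sequences: every word of every name, in order.
    public static List<string> Words(IEnumerable<string> names)
    {
        return names.SelectMany(n => n.Split(' ')).ToList();
    }

    public static void Main()
    {
        string[] clubs = { "Real Madrid", "AC Milan", "FC Porto", "Juventus" };

        foreach (string c in SortByLengthThenName(clubs))
            Console.WriteLine(c);

        foreach (string w in Words(clubs))
            Console.WriteLine(w);
    }
}
```

Select would have returned one array of words per club; SelectMany concatenates those arrays into a single sequence.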

Now, let’s try something more complicated: grouping the winners by country, and inside each group ascending by the winning year. In declarative syntax that would be like this:

var result = from w in winners
             orderby w.Year
             group w.Name by w.Country into groups
             orderby groups.Key
             select groups;

foreach (var w in result)
{
     Console.WriteLine("\n{0}", w.Key);

     foreach (var e in w)
     {
        Console.WriteLine("{0}", e);
     }
}


England
Manchester Utd.
Liverpool

France
Olimpique Marseille

Germany
Bayern Munchen

Italy
AC Milan

Portugal
FC Porto

Spain
Real Madrid
Real Madrid
Real Madrid
Barcelona


In functional programming syntax, the same query is written as:

var result = winners.
                  OrderBy(w => w.Year).
                  GroupBy(w => w.Country, w => w.Name).
                  OrderBy(w => w.Key).
                  Select(w => w);



The GroupBy operator partitions a sequence of values based on a key extraction function. It returns a sequence of Grouping values, each of which contains both the key and the group of values mapped to that key. The interface of Grouping is:

public sealed class Grouping<K, T> {
  public Grouping(K key, IEnumerable<T> group);
  public Grouping();
  public K Key {get; set;}
  public IEnumerable<T> Group {get; set;}
}


You may wonder why the declarative syntax starts with the from clause. The SQL language has a problem: the order of its clauses is not natural. select comes first, but at the time of selecting you don’t yet know what you are selecting from. The from clause is naturally the first one, and this was rectified in LINQ. You always have to put from as the first clause in a query (when using the declarative syntax). In the functional syntax you can see that you have to apply the operators on a collection, a sequence of items (in our case winners). That is the functional equivalent of from.


LINQ is a valuable feature for database and XML programming, but not only for those. Basically, you can use LINQ with anything that returns an IEnumerable<T>. Here is an example in which I want to list the directories of a parent directory in alphabetical order, grouped by their file attributes (first all those that have only the Directory flag set, then those that have both ReadOnly and Directory, etc.).

First, let’s see how you would write this classically in C# 2.0. Basically you need to:

  • obtain a list of directory names
  • create a sorted dictionary with the key representing the file attributes, and the value a list of directories
  • iterate over the list of names and create a DirectoryInfo object for each
  • add the object to the list corresponding to the key represented by the directory attributes
  • iterate over the dictionary and print its content

The code I wrote for that looks like this:

void ClassicList(string path)
{
    string [] entries = System.IO.Directory.GetDirectories(path);

    SortedDictionary<FileAttributes, List<DirectoryInfo>> dic = 
        new SortedDictionary<FileAttributes, List<DirectoryInfo>>();

    foreach (string entry in entries)
    {
        DirectoryInfo dir = new DirectoryInfo(entry);

        try
        {
            dic[dir.Attributes].Add(dir);
        }
        catch (KeyNotFoundException)
        {
            dic.Add(dir.Attributes, new List<DirectoryInfo>());
            dic[dir.Attributes].Add(dir);
        }
    }

    foreach (KeyValuePair<FileAttributes, List<DirectoryInfo>> group in dic)
    {
        Console.WriteLine("Directories with attributes: {0}", group.Key);

        foreach (DirectoryInfo dir in group.Value)
            Console.WriteLine("  {0}", dir.Name);
    }
}


With LINQ, things get much simpler, as you no longer need to create a dictionary. You can apply SQL-like queries on objects. What you have to do is:

  • obtain a list of directory names
  • create a query to select the directories and group them according to their attributes
  • iterate over the result and print its content

This is how it looks:

void LinqList(string path)
{
    string[] entries = System.IO.Directory.GetDirectories(path);

    var result = from dir in
                     from e in entries select new DirectoryInfo(e)
                 orderby dir.Name
                 group dir by dir.Attributes into dirGroups
                 orderby dirGroups.Key
                 select dirGroups;

    foreach (var group in result)
    {
        Console.WriteLine("Directories with attributes: {0}", group.Key);

        foreach (var dir in group)
            Console.WriteLine("  {0}", dir.Name);
    }
}


Let’s take the query line by line and see what it does:

  • from dir in from e in entries select new DirectoryInfo(e): creates a DirectoryInfo for each element of entries and uses the resulting sequence as the source
  • orderby dir.Name: orders the DirectoryInfo objects by the name of the directories
  • group dir by dir.Attributes into dirGroups: groups the directories by their attributes into dirGroups
  • orderby dirGroups.Key: orders the groups by their key
  • select dirGroups: selects the groups

And that’s all. You don’t have to struggle with creating dictionaries, maintaining the correct sorting and so on. You can do all of that with a simple query.
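
For comparison, the same grouping can be written with the query operators called directly. This is only a sketch: DirectoryQuery and GroupByAttributes are hypothetical names, and the result type IGrouping is the one used by the released framework rather than by the early previews.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class DirectoryQuery
{
    // Same grouping as the LinqList query, expressed with the query operators directly.
    public static IEnumerable<IGrouping<FileAttributes, DirectoryInfo>> GroupByAttributes(string path)
    {
        return Directory.GetDirectories(path)
                        .Select(e => new DirectoryInfo(e))
                        .OrderBy(dir => dir.Name)
                        .GroupBy(dir => dir.Attributes)
                        .OrderBy(g => g.Key);
    }

    public static void Main(string[] args)
    {
        // Uses the current directory unless a path is given on the command line.
        string path = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();

        foreach (var g in GroupByAttributes(path))
        {
            Console.WriteLine("Directories with attributes: {0}", g.Key);
            foreach (DirectoryInfo dir in g)
                Console.WriteLine("  {0}", dir.Name);
        }
    }
}
```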

The following listing shows a second query that displays the directories ascending by their creation time, and for each directory the name, creation time and attributes are listed. The code also exemplifies the use of anonymous types, and the inference of the type from the assigned value, concepts new in C# 3.0:

void LinqList2(string path)
{
    string[] entries = System.IO.Directory.GetDirectories(path);

    var result = from dir in
                     from e in entries select new DirectoryInfo(e)
                 orderby dir.CreationTime
                 select new { 
                            Name = dir.Name, 
                            Attr = dir.Attributes, 
                            Creation = dir.CreationTime };

    foreach(var item in result)
    {
        Console.WriteLine("{0}\n\t{1}\t{2}", 
                          item.Name, item.Creation, item.Attr);
    }
}



Anders Hejlsberg, creator of Turbo Pascal and C#, delivered a great presentation on LINQ on Tuesday. This was actually my first contact with LINQ (which stands for Language INtegrated Query), and it makes me envy the C# and VB.NET programmers, because these are the only two languages that support it. LINQ defines a set of general-purpose standard query operators that allow traversal, filter, and projection operations to be expressed in a direct yet declarative way in any .NET-based programming language. It basically introduces SQL-like queries as first-class citizens of C# and VB.NET.

 Anders Hejlsberg at the MVP Global Summit 2007

A very simple sample provided by Don Box and Anders Hejlsberg in their article “The LINQ Project” at MSDN (http://msdn2.microsoft.com/en-us/library/aa479865.aspx) shows a query on objects in C#:

using System;
using System.Query; // renamed System.Linq in the released .NET 3.5
using System.Collections.Generic;

class app {
    static void Main() {
        string[] names = { "Burke", "Connor", "Frank",
                           "Everett", "Albert", "George",
                           "Harris", "David" };

        IEnumerable<string> expr = from s in names
                                   where s.Length == 5
                                   orderby s
                                   select s.ToUpper();

        foreach (string item in expr)
            Console.WriteLine(item);
    }
}


The output of the program being:

BURKE
DAVID
FRANK

Procedural languages, which express both what to do and how to do it, seem to have reached a point where there is little left to enhance. Removing the "how" part of the equation seems to be the next direction in the development of such languages. One step in that direction is the LINQ project, which simply makes C# and VB.NET more powerful.
