In the book The Blind Watchmaker, Richard Dawkins explains how evolution happens through accumulated small random variations (or mutations) filtered by the natural selection, as opposed to the mainstream misinterpretation that evolution happens in big leaps. He proposes a computer program, called weasel, to demonstrate how the accumulated small improvements (mutations that bring a benefit to the individual so that it is “chosen” by the natural selection) produce faster results.

I don’t know who it was first pointed out that, given enough time, a monkey bashing away at random on a typewriter could produce all the works of Shakespeare. The operative phrase is, of course, given enough time. Let us limit the task facing our monkey somewhat. Suppose that he has to produce, not the complete works of Shakespeare but just the short sentence ‘Methinks it is like a weasel’, and we shall make it relatively easy by giving him a typewriter with a restricted keyboard, one with just the 26 (capital) letters, and a space bar. How long will he take to write this one little sentence?

[...]

We again use our computer monkey, but with a crucial difference in its program. It again begins by choosing a random sequence of 28 letters, just as before … it duplicates it repeatedly, but with a certain chance of random error – ‘mutation’ – in the copying. The computer examines the mutant nonsense phrases, the ‘progeny’ of the original phrase, and chooses the one which, however slightly, most resembles the target phrase, METHINKS IT IS LIKE A WEASEL.

The phrase “METHINKS IT IS LIKE A WEASEL” is from Hamlet:

Hamlet: Do you see yonder cloud that’s almost in shape of a camel?
Polonius: By the mass, and ’tis like a camel, indeed.
Hamlet: Methinks it is like a weasel.

The algorithm as described on Wikipedia, is as follows:

  1. Start with a random string of 28 characters.
  2. Make 100 copies of this string, with a 5% chance per character of that character being replaced with a random character.
  3. Compare each new string with the target “METHINKS IT IS LIKE A WEASEL”, and give each a score (the number of letters in the string that are correct and in the correct position).
  4. If any of the new strings has a perfect score (28), halt.
  5. Otherwise, take the highest scoring string, and go to step 2.

A character is an upper case letter or space, the number of copies per iteration could be 100 and the mutation rate 5%.

The following C# code is an implementation of the algorithm.

class Weasel1
{
   private const string ALLOWED_CHARS = "ABCDEFGHIJKLMNOPQRSTUVXYWZ ";
   private const string TARGET = "METHINKS IT IS LIKE A WEASEL";
   Random rand = new Random();

   private int Fitness(string candidate)
   {
      int score = 0;
      for (int i = 0; i < candidate.Length; ++i)
      {
         if (candidate[i] == TARGET[i])
            score++;
      }

      return score;
   }

   private string Mutate(string parent)
   {
      StringBuilder builder = new StringBuilder();
      for (int i = 0; i < parent.Length; ++i)
      {
         bool mutate = rand.Next(20) == 10;
         builder.Append(
            mutate ?
            ALLOWED_CHARS[rand.Next(ALLOWED_CHARS.Length)] : parent
            [i]);
      }
      return builder.ToString();
   }

   public void Run(int copies)
   {
      StringBuilder builder = new StringBuilder();
      for (int i = 0; i < TARGET.Length; ++i)
      {
         builder.Append(ALLOWED_CHARS[rand.Next(ALLOWED_CHARS.Length)]);
      }
      string parent = builder.ToString();

      int step = 1;
      Console.WriteLine("{0:D3} {1}", step, parent);

      do
      {
         step++;

         string[] children = new string[copies];
         for (int i = 0; i < copies; ++i)
         {
            children[i] = Mutate(parent);
         }

         int indexMax = 0;
         int scoreMax = 0;
         for (int i = 0; i < copies; ++i)
         {
            int score = Fitness(children[i]);

            if (score > scoreMax)
            {
               scoreMax = score;
               indexMax = i;
            }
         }

         parent = children[indexMax];

         Console.WriteLine("{0:D3} {1}", step, parent);

      } while (parent != TARGET);
   }
}

class Program
{
   static void Main(string[] args)
   {
      Weasel1 weasel = new Weasel1();
      weasel.Run(100);
   }
}

Function Fitness() computes how similar an offspring is to the parent, function Mutate() makes a copy of the parent with a 5% chance for each character to mutate. Function Run() runs the algorithm until the target string is reached. A possible output of the program is shown below. Different runs require different number of iterations (even more than 100).

001 E PVBQVRYRNQSRBCDLOCFOGEIB J
002 E PVBQVRYRNQSRBCDLOCFOWEIB J
003 E PVIQVRYRNQSRBCDLOCFOWEIB J
004 EEPLIQVUYRNQSRBCDLOCFOWEIB J
005 EEPLIQVU RNQSRBCDLOCFOWEIB J
006 EEPHIQLU RNQSRBCDLOCFKWEIB J
007 EEPHIQLU RNQSRHCDKOCFKWEIB J
008 EEPHIQLU RNQSSHCDKOCFKWEIB J
009 EEPHIQLU RNQSS CDKOCFKWEIB J
010 EEPHIQLU RNQSS CDKOCWKWEKB L
011 EEPHIQLU RNQSS CDKOCOKWEKB L
012 EEPHIQLU RN SS CDKOGOKWEKB L
013 EEPHIQLU RN SS LDKOGOKWEKB L
014 EEPHIQLU RT SS LDKOGOKWEKB L
015 EEPHIQLU RT SS LDKOGOKWEKBEL
016 EEPHIQLU RT SS LPKOGOKWEKBEL
017 EEPHIQLU RT SS LPKOGOKWEKSEL
018 EEPHIQLU IT SS LPKOGOKWEKSEL
019 EEPHIQLU IT SS LIKOGOKWEKSEL
020 EEPHIQLU IT SS LIKOGOKWEKSEL
021 EEKHIQLS IT SS LIKOGOKWEKSEL
022 EEKHIQLS IT SS LIKOGOKWEKSEL
023 EEKHIQLS IT SS LIKO OKWEKSEL
024 EEKHIQLS IT SS LIKO OFWEKSEL
025 EEKHIQLS IT ES LIKO OFWENSEL
026 EEKHINLS IT ES LIKO OFWENSEL
027 EEKHINLS IT ES LIKO O WENSEL
028 EEKHINLS IT ES LIKO O WECSEL
029 EEKHINLS IT BS LIKO O WEHSEL
030 EEKHINLS IT QS LIKO O WEASEL
031 EETHINSS IT QS LIKO O WEASEL
032 EETHINSS IT QS LIKO O WEASEL
033 EETHINSS IT IS LIKO O WEASEL
034 EETHINSS IT IS LIKO O WEASEL
035 EETHINBS IT IS LIKO O WEASEL
036 XETHINBS IT IS LIKO O WEASEL
037 XETHINBS IT IS LIKO O WEASEL
038 XETHINBS IT IS LIKO O WEASEL
039 XETHINBS IT IS LIKO O WEASEL
040 XETHINBS IT IS LIKK O WEASEL
041 XETHINBS IT IS LIKK O WEASEL
042 XETHINBS IT IS LIKK O WEASEL
043 XETHINKS IT IS LIKK O WEASEL
044 XETHINKS IT IS LIKK   WEASEL
045 XETHINKS IT IS LIKK   WEASEL
046 XETHINKS IT IS LIKK   WEASEL
047 XETHINKS IT IS LIKE   WEASEL
048 XETHINKS IT IS LIKE   WEASEL
049 METHINKS IT IS LIKE   WEASEL
050 METHINKS IT IS LIKE   WEASEL
051 METHINKS IT IS LIKE A WEASEL

We can change this algorithm to randomize the chance of mutation. The changes to the code are small (basically function Mutate changes to take not only the parent string but only the chance for the mutation rate), but this increases the number of iterations needed to reach the target string.

class Weasel2
{
   const string TARGET = "METHINKS IT IS LIKE A WEASEL";
   const string ALLOWED_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ";
   Random rand = new Random();

   private double Fitness(string candidate)
   {
      int score = 0;
      for (int i = 0; i < candidate.Length; ++i)
      {
         if (candidate[i] == TARGET[i])
            score++;
      }

      return score;
   }

   private string Mutate(string parent, double rate)
   {
      StringBuilder builder = new StringBuilder();
      foreach(char c in parent)
      {
         builder.Append(
            rand.NextDouble() < rate ?
            c:
            ALLOWED_CHARS[rand.Next(ALLOWED_CHARS.Length)]);
      }

      return builder.ToString();
   }

   public void Run(int copies)
   {
      StringBuilder builder = new StringBuilder();
      for (int i = 0; i < TARGET.Length; ++i)
      {
         builder.Append(ALLOWED_CHARS[rand.Next(ALLOWED_CHARS.Length)]);
      }
      string parent = builder.ToString();

      int step = 1;
      Console.WriteLine("{0:D4} {1}", step, parent);

      do
      {
         string[] children = new string[copies];
         for(int i = 0; i < copies; ++i)
         {
            children[i] = Mutate(parent, rand.NextDouble());
         }

         int indexMax = 0;
         double scoreMax = 0;
         for (int i = 0; i < copies; ++i)
         {
            double score = Fitness(children[i]);

            if (score > scoreMax)
            {
               scoreMax = score;
               indexMax = i;
            }
         }

         parent = children[indexMax];

         Console.WriteLine("{0:D4} {1}", step, parent);
         step++;

      } while (parent != TARGET);
   }
}

Here is the output, but because of the bigger number of iterations, only every 20th iteration is shown:

0001 BXZZBDJSZTVYFWRSGNKYEULISWRK
0020 MSTCFNBSTITVIS L KPLEDBRASEC
0040 METCINQSDITVIS LIKW A IRISEW
0060 METLINKSOITJIS LJKW A IGASEW
0080 METLINKSOITJIS LJKX A WEAUEL
0100 METLINKSTKTXIS LIKK A WEAOEL
0120 METHINKS KTXIS LIKH A WEAOEL
0140 METHINKSDKTXIS LIKA A WEASEL
0160 METHINKSEGTXIS LIKA A WEASEL
0180 METHINKSEGTEIS LIKE A WEASEL
0200 METHIMKSEGTMIS LIKE A WEASEL
0220 METHIMKSTITNIS LIKE A WEASEL
0240 METHIAKSTIT IS LIKE A WEASEL
0260 METHIPKSTIT IS LIKE A WEASEL
0280 METHJKKSTOT IS LRKE A WEASEL
0300 METHYQKSCOT IS LRKE A WEASEL
0320 METHFPKSCOT IS LRKE A WEASEL
0340 METHFPKSCGT IS LRKE A WEASES
0360 METHINKS XT IS LMOE A WEAIEL
0380 METHINKS YT IS LMZE A WEAIEL
0400 METHINKS IT IS LMHE A WEAIEL
0420 METHINKS IT IS LEDE A WEANEL
0440 METHINKS IT IS LIPE A WEANYL
0460 METHINKS IT IS LIPE A WEAOYL
0480 METHINKS IT IS LIWE A WEARWL
0500 METHINKS IT IS LIWEDA WEAFWT
0520 METVINKS IT IS LILE A WEABBG
0540 METVINKS IT IS LIJE A WEABTG
0560 METHINKS IT IS LIJE A WEAMRV
0580 METHINKS IT IS LIZE A WEAMRV
0600 METHINKS IT IS LIZE A WEA OV
0620 METHINKS IT IS LIZE A WEAHE
0640 METHINKS IT IS LIJE A WEAHEH
0660 METHINKS IT IS LIFE A WEAHEH
0680 M THINKS IT IS LINE I WXAHEL
0700 MQTHINKS IT IS LIAE I WXASEL
0720 MZTHIRKS IT IS LIKE T WXASEL
0740 METHINKR IT IS LIKE T WRASEL
0760 METOINKI IT IS LIKE T WRASEL
0780 METMINKM IT IS LIKE A WQASEL
0800 METMINKS IT IS LIKE A WQASEL
0820 METJIXKS IT IS LIKE A WQASEL
0840 METJILKS IT IS LIKE A WUASEL
0860 METJIPKS IP IS LIKE A WAASEL
0880 METJIPKS IW IS LIKE A WLASEL
0900 METIIBKS QT IS LIKE A SEASEL
0920 METIIPKS PT IS LIKE A SEASEL
0940 METKINKS PT IS LIKE A TEASEL
0960 METKINKS BT IS LIKE A QEASEL
0980 METWINKS IT IS LIKE A QEASEL
1000 METOINKS IT IS LIKE ANQEASEL
1020 METUINKS IT IS LIKE AHTEASEL
1040 METUINKS IT IS LIKE AMWEASEO
1060 METQINKS IT IS LIKE AMWEASEO
1080 MZTYINKS IT ISULIKE AKWEASEL
1100 MZTYINKS IT ISULIKE APWEADEL
1120 MJTHINKS IT ISULIKE AEWEAQEL
1140 MJTHINKS IT ISULIKE AFWEAUEL
1160 MJTHINKS IT ISCLIKP AVWEABEL
1180 MJTHINKS IT IS LIKP AVWEAGEG
1200 MNTHINKS IT IS LIKM AVWEACE
1220 MNTHINKS IT IS LIKM AQWEACEL
1240 MNTHINKS IT IS LIKMXAKWEAAEL
1260 MNTHINKS CT IS LIKMGAPWEASEL
1280 MPTHINKS IX IS EIK GA WEASEL
1300 METHINKS IX IS LIK RA WEASEL
1320 METHINKS IZ IS LIKEWA WEASEL
1340 METHINKS IZ ISILIKEWA WEASEL
1360 METHINKS IJ ISILIKE A WEASEL
1380 METHINKS IM ISELIKE A WEASEL
1400 METHINKS IM ISELIKE A WEASEL
1420 METHINKS IU ISELIKE A WEASEL
1440 METHINKS IY ISELIKE A WEASEL
1460 METHINKS IT ISVLIKE A WEASEL
1480 METHINKS IT ISCLIKE A WEASEL
1500 METHINKS IT ISCLIKF A WEASEL
1520 METHINKS IT ISTLIKF A WEASEL
1540 METXINKS IT ISOLIKF A WEASEL
1560 MZTXINKS IT ISOLIKC A WEASEL
1580 MATXINKS IT IS LIKC A WEASEL
1600 MATXINKS IT IS LIKE A WEDSEL
1620 METHINKSGITVIC LIKE A WEASEL
1640 METHINKSGITQIC LIKE A WEASEL
1660 METHINLSAITQIC LIKE A WEASEL
1680 METHINLSAITQIH LIKE A WEASEL
1700 METHINGS ITAIH LIKE A WEASEL
1720 METHINGS ITRIS LIKE W WEASEL
1740 METHINVS IT IS LIJE A WEASEL
1760 METHINVS IT IS LIGE A WEASEL
1780 METHINVS IT IS LIGE A WEASEL
1800 METHINVS IT IS LI E A WEASYL
1820 METHIN S IT IS LIKE A WTASYT
1840 METHINQS IT IS LIKE A STASYT
1860 METHINWA IT IS LIKE A IFASES
1880 METHINKA IT IS LIKE A IFASEJ
1900 METHINKA IT YS LIKE O IEAAEL
1920 METHINKU IT YS LIKE O WEAAEL
1940 METHOJKM IT JS LIKE O WEAPEL
1960 METHOJKF IT IS LIKE  RWEATEL
1980 METHFGKF IT IS LIKE ARWBATEL
2000 METHYNKG IT IS LIKE ARWEALEL
2020 METHUNKD IT IS LIKE AAWEAWEL
2040 METHIWKN IT IS LIKE AYWEADEL
2060 METHIUKN IT IS LIKE AKWEAQEL
2080 METHIUKN IT IS LIKE A WEAQEL
2100 METHIAKN IT IS LIKE N WEAQEL
2120 METHINKK IT IS LIKE N WEEQEL
2140 METHDNKQ IT IS LIKE A WEBSEL
2160 METHDNKQ IT IS LIKE A WEBSEL
2180 METHDNKQ IT IS LIKE A WEBSEL
2200 METHMNKQ IT IS LIKE A WEBSEL
2220 MEKHBNKX IT IS LIKE A WEBSEL
2240 MEBHQNKS IT IS LIKE A WEBSEL
2260 METHQNKS IT IS LIKEZA WEFSEL
2280 METHGNKS IT IS LIKEZA WESSEL
2300 METHGNKS IT IS LIKEZA WEWSEL
2320 METHLNKS IT IP BIKE A WEOSEL
2340 METHYNKS IT IB SIKE A WEQSEL
2360 METHINKS ITQIJ LIKE A WEQSEL
2380 METHINKS ITIIJ LIKE A WEYSTL
2400 METQINKS ITIIJ LIKE A WEASGL
2420 METHINKS ITIIJJLIKE A WEASGL
2440 METHINKS ITQIWJLIKE A WEAS L
2460 METHINKS ITZIWJLIKE A WEASCL
2480 METHINKS IT IDLLIZE A UEASCL
2500 METHINKS IT ID LIZE A IEASZL
2520 METHINKS IT ID LIZE A IEASZL
2540 METHINKS IT IB LIZE A XEASOL
2560 METHINKS IT IB LIAE A  EASOL
2580 METHINKS IT IBVLICE A  EASEL
2600 METHINKS IT IBVLIKE A  EASEL
2620 METHINKS IT IKDLIKE A  EASEL
2640 METHINKS IT IKNLIKE A HEASEL
2660 METHINKS IT IKNLIKE A HIASEL
2680 METHINKS IT IZNLIKE A HIASEL
2700 METHINKS IT IZNLIKE A WOASEL
2720 METHINKS IT ISNLIKE A WOASEL
2740 METHINKS IT ISNLIKE M WSASEL
2760 METHINKS IT ISVLIKE M WXASEL
2780 METHINKS IT IS LIKE S WEASEL
2785 METHINKS IT IS LIKE A WEASEL

Notice that matching letters are not immutable. A letter, even if it matches one in the target string can change at any time.

You can find similar implementations in several programming languages on the Rosetta Code website.

, , , Hits for this post: 1274 .

I recently found a piece of code that can be summarized by the following sample:

interface I
{
   void F1();
   void F2();
}

class X
{
   public void F2() { Console.WriteLine("F2"); }
}

class A : X, I
{
   public void F1() { Console.WriteLine("F1"); }
}

As you can see there is an interface I that has two methods, F1 and F2. A is derived from X, that has a method F2, and also implements I, but only contains F1. I was puzzled at first, because I was expecting that A was explictitly implementing all the methods defined in the interface I. But F2 was implemented in X, its base class. After thinking a little bit it all become clear. This was a normal behavior of the compiler.

When a class A implements an interface I it guarantees that it supports (implements) the entire contract that the interface defines. But it does not assert that it will explicitly implement all the interface members within its explicit definition. I’m stressing on the explicit word here, because A extends (is derived from) X. That means A is an X. Everything that X exposes (i.e. what is visible to its derived classes) is part of A too.

In our case, F2, implemented in X, is also available to A, because A is an X. Since both F1 and F2 are members of A, then it means A fully implements I, which makes the code compile just fine.

How is this helpful? Suppose you have several interfaces that all define one ore several members with the same meaning.

interface I1
{
  void F1();
  void F2();
  int ErrorCode { get; }
}

interface I2
{
  void G1();
  void G2();
  int ErrorCode { get; }
}

interface I3
{
  void H1();
  int ErrorCode { get; }
}

Instead of providing the same implementation several times, like in the following code, you can have only one implementation for the common functionality.

class A : I1
{
  private int m_errorCode;

  public void F1() {}
  public void F2() {}
  public int ErrorCode { get {return m_errorCode;} }
}

class B : I2
{
  private int m_errorCode;

  public void G1() {}
  public void G2() {}
  public int ErrorCode { get {return m_errorCode;} }
}

class C : I3
{
  private int m_errorCode;

  public void H1() {}
  public int ErrorCode { get {return m_errorCode;} }
}

We can create one class that provides the implementation for ErrorCode and let the others extend it and implement the corresponding interface.

class X
{
  protected int m_errorCode;

  public int ErrorCode { get {return m_errorCode;} }
}

class A : X, I1
{
  public void F1() {}
  public void F2() {}
}

class B : X, I2
{
  public void G1() {}
  public void G2() {}
}

class C : X, I3
{
  public void H1() {}
}

, , , Hits for this post: 2234 .

.NET provides two classes for image transformations: Matrix, used for geometric transformations, and ColorMatrix, used for color transformations.

One of such color transformations is inverting or negating. This means subtracting each color component from 255. Black (0,0,0) becomes White (255, 255, 255), and Green (0, 255, 0) becomes Magenta (255, 0, 255).

You can find many examples on the web that look like this:

public Bitmap Transform(Bitmap source)
{
    //create a blank bitmap the same size as original
    Bitmap newBitmap = new Bitmap(source.Width, source.Height);

    //get a graphics object from the new image
    Graphics g = Graphics.FromImage(newBitmap);

    // create the negative color matrix
    ColorMatrix colorMatrix = new ColorMatrix();
    colorMatrix.Matrix00 = colorMatrix.Matrix11 = colorMatrix.Matrix22 = -1f;
    colorMatrix.Matrix33 = colorMatrix.Matrix44 = 1f;

    // create some image attributes
    ImageAttributes attributes = new ImageAttributes();

    attributes.SetColorMatrix(colorMatrix);

    g.DrawImage(source, new Rectangle(0, 0, source.Width, source.Height),
                0, 0, source.Width, source.Height, GraphicsUnit.Pixel, attributes);

    //dispose the Graphics object
    g.Dispose();

    return newBitmap;
}

Using this code one can get a negative image.

Original image

Original image

Negative image

Negative image

This runs fine on Windows XP. But when I ran it on Windows 7, I was getting only a black image. All the pixels were ARGB(255, 0, 0, 0). This was how it looked:

Incorrectly transformed image

Incorrectly transformed image

I was surprised to learn that it worked on Windows XP, but not on Windows 7. I don’t have Windows Vista to test but I guess it’s the same as with Windows 7. I thought it must be something in the GDI+ library, because building with .NET 3.5 SP1 or 4.0 Beta 2 didn’t change a thing.

After trying different things, I figured out what the problem was: the color matrix was incorrect. It must be defined like this:

ColorMatrix colorMatrix = new ColorMatrix(
   new float[][]
   {
      new float[] {-1, 0, 0, 0, 0},
      new float[] {0, -1, 0, 0, 0},
      new float[] {0, 0, -1, 0, 0},
      new float[] {0, 0, 0, 1, 0},
      new float[] {1, 1, 1, 0, 1}
   });

With this change the Transform function produces a correct negative image, regardless the operating system or the .NET framework version.

However, what I don’t know yet, is why it worked on Windows XP. The only conclusion I can draw is that the GDI+ implementation has a fault there, that was later corrected. That’s why an incorrect color matrix produced a correct transformation on Windows XP.

, , , , Hits for this post: 2457 .

.NET 3.0 provides some support for working with ZIP files. However, it has an important drawback: it only works for packages that are conformant to the Open Packaging Convention standard. Most of the ZIP files are not. Codeplex features a library called DotNetZip that provides support for packing and unpacking in C#, VB.NET or any other .NET language, but also any COM environment, including Javascript, VBSCript, VB6, VBA, PHP, Perl. In addition to the basic packing and unpacking operations, it supports password protection, UNICODE filenames, ZIP64 and AES encryption, comment, and others. Here are some simple samples.

Creating a ZIP file:

      public static void PackZip(string source, string targetzip)
      {
         if(File.Exists(source))
         {
            PackFile(source, targetzip);
         }
         else if(Directory.Exists(source))
         {
            PackFolder(source, targetzip);
         }
         else
         {
            Console.WriteLine("Source does not exists!");
         }
      }

      public static void PackFile(string file, string targetzip)
      {
         using (ZipFile zipfile = new ZipFile())
         {
            zipfile.AddFile(file, String.Empty);
            zipfile.Save(targetzip);
         }
      }

      public static void PackFolder(string folder, string targetzip)
      {
         using (ZipFile zipfile = new ZipFile())
         {
            zipfile.AddDirectory(folder);
            zipfile.Save(targetzip);
         }
      }

Unpacking to a target folder:

      public static void UnpackZip(string zippath, string targetdir)
      {
         if(ZipFile.IsZipFile(zippath))
         {
            using (ZipFile zipfile = ZipFile.Read(zippath))
            {
               foreach (ZipEntry zipentry in zipfile)
               {
                  try
                  {
                     zipentry.Extract(targetdir, ExtractExistingFileAction.OverwriteSilently);
                  }
                  catch (ZipException ex)
                  {
                     if (ex.InnerException == null)
                     {
                        Console.WriteLine(ex.Message);
                     }
                     else
                     {
                        Console.WriteLine("{0}: {1}", ex.Message, ex.InnerException.Message);
                     }
                  }
               }
            }
         }
         else
         {
            Console.WriteLine("Not a zip file!");
         }
      }

Display the content of a ZIP archive:

      public static void ShowZipContent(string zippath)
      {
         if(ZipFile.IsZipFile(zippath))
         {
            using(ZipFile zipfile = ZipFile.Read(zippath))
            {
               foreach(ZipEntry zipentry in zipfile)
               {
                  Console.WriteLine("{0}", zipentry.FileName);
               }
            }
         }
         else
         {
            Console.WriteLine("Not a zip file!");
         }
      }

You can find many more examples on Codeplex: C# and VB.NET.

Hits for this post: 1979 .

Microsoft has made available a first beta version of an experimental version of .NET 4.0, called .NET Framework 4.0 Beta 1 Enabled for Software Transactional Memory v1.0. Since that is quite a long name, the short one is STM.NET. This is a special version of .NET 4.0 that enables software transactional memory for C#. It allows programmers to demarcate regions of code as operating in an atomic, isolated transaction from other code running concurrently. The means to do this is a delegate called Atomic.Do, or try-catch blocks. Might be that in the future an ‘atomic’ block will be added to the language(s).

This first version of the framework, also comes with additional tools:

  • tooling (debugging, ETW tracing)
  • lock interoperability
  • interoperability with traditional transactions
  • annotations (how methods run in transactions, suppressed transactions on methods, etc.)
  • static and dynamic checking of annotations

On the other hand there are some limitations:

  • only works for C# for now
  • cannot be installed on a machine with VS 2010, nor the opposite
  • there is only a 32-bit version

More information about it can be found at the STM team blog or MSDN DevLabs.

, , , Hits for this post: 5072 .

.NET allows you to expose components as COM and consume them from unmanaged code. There are many references on how to this (and you can only start with MSDN), and I will not talk about that part. What I want to explain here is something different. Suppose you have this interface:

[Guid("2F8433FE-4771-4037-B6B2-ED5F6585ED04")]
[InterfaceType(ComInterfaceType.InterfaceIsIDispatch)]
public interface IAccounts
{
      [DispId(1)]
      string[] GetUsers();
}

Method GetUsers() returns an array on string representing the user names. But what if you also wanted the user passwords or addresses? Since this is exposed as COM, you cannot return an array of User. But you can return multiple arrays of string. So, how would you deal with out string[]? This is what I want to show you in this tutorial.

This is a .NET interface exposed to COM. It has two methods, GetUsers() that returns an array of string representing user names, and GetUsers2() that returns an array of strings as an output parameters and a bool as return type, indicating whether any user was found.

namespace SampleLibrary
{
   [Guid("2F8433FE-4771-4037-B6B2-ED5F6585ED04")]
   [InterfaceType(ComInterfaceType.InterfaceIsIDispatch)]
   public interface IAccounts
   {
      [DispId(1)]
      string[] GetUsers();

      [DispId(2)]
      bool GetUsers2(out string [] users);
   }
}

And this is the implementation:

namespace SampleLibrary
{
   [Guid("C4713144-5D29-4c65-BF9C-188B1B7CD2B6")]
   [ClassInterface(ClassInterfaceType.None)]
   [ProgId("SampleLibrary.DataQuery")]
   public class Accounts : IAccounts
   {
      List< string > m_users;

      public Accounts()
      {
         m_users = new List< string > {
            "marius.bancila",
            "john.doe",
            "anna.kepler"
         };
      }

      #region IDataQuery Members

      public string[] GetUsers()
      {
         return m_users.ToArray();
      }

      public bool GetUsers2(out string[] users)
      {
         users = m_users.ToArray();

         return users.Length > 0;
      }

      #endregion
   }
}

Note: If you are trying this example make sure you set the ComVisible attribute to true, either for each type or per assembly (in AssemblyInfo.cs)

[assembly: ComVisible(true)]

Second, you have to check the “Register for COM interop” setting in the Build page of the project properties.

The first thing to do in C++ is importing the .TLB file that was generated by regasm.exe.

#import "SampleLibrary.tlb"
using namespace SampleLibrary;

If we look in the .TLB file, we can see how the IAccounts interface looks like:

struct __declspec(uuid("2f8433fe-4771-4037-b6b2-ed5f6585ed04"))
IAccounts : IDispatch
{
    //
    // Wrapper methods for error-handling
    //

    // Methods:
    SAFEARRAY * GetUsers ( );
    VARIANT_BOOL GetUsers2 (
        SAFEARRAY * * users );
};

The following C++ functions, GetUsers1() retrieves the users users list using method GetUsers() from IAccounts. It puts the users in a CStringArray (notice that this container does not have an assignment operator, so the only way to return such an array is with a reference in the parameters list).

void GetUsers1(CStringArray& arrUsers)
{
   IAccountsPtr pAccounts(__uuidof(Accounts));

   SAFEARRAY* sarrUsers = pAccounts->GetUsers();

   _variant_t varUsers;
   varUsers.parray = sarrUsers;
   varUsers.vt = VT_ARRAY | VT_BSTR;

   UnpackBstrArray(varUsers, arrUsers);
   SafeArrayDestroy(sarrUsers);

   pAccounts->Release();
}

UnpackBstrArray() is a function (see below) that extracts the elements of a SAFEARRAY and adds them to a CStringArray.

Function GetUsers2() uses the second method, GetUsers2() from IAccounts. This needs the address of a pointer to a SAFEARRAY (i.e. SAFEARRAY**) that will hold the values returned by the COM method. This time we have to create an empty SAFEARRAY and then pass its address to the COM method. The rest is similar to the previous case.

void GetUsers2(CStringArray& arrUsers)
{
   IAccountsPtr pAccounts(__uuidof(Accounts));

   SAFEARRAYBOUND aDim[1];
   aDim[0].lLbound = 0;
   aDim[0].cElements = 0;

   SAFEARRAY* sarrUsers = SafeArrayCreate(VT_BSTR, 1, aDim);

   VARIANT_BOOL ret = pAccounts->GetUsers2(&sarrUsers);
   if(ret != VARIANT_FALSE)
   {
      _variant_t varUsers;
      varUsers.parray = sarrUsers;
      varUsers.vt = VT_ARRAY | VT_BSTR;
      UnpackBstrArray(varUsers, arrUsers);
   }

   SafeArrayDestroy(sarrUsers);

   pAccounts->Release();
}

The helper method UnpackBstrArray() used previous looks like this:

void UnpackBstrArrayHelper(VARIANT* pvarArrayIn, CStringArray* pstrarrValues)
{
   if (!pstrarrValues || !pvarArrayIn || pvarArrayIn->vt == VT_EMPTY)
      return;

   pstrarrValues->RemoveAll();

   VARIANT* pvarArray = pvarArrayIn;
   SAFEARRAY* parrValues = NULL;

   SAFEARRAYBOUND arrayBounds[1];
   arrayBounds[0].lLbound = 0;
   arrayBounds[0].cElements = 0;

   if((pvarArray->vt & (VT_VARIANT|VT_BYREF|VT_ARRAY)) == (VT_VARIANT|VT_BYREF) &&
      NULL != pvarArray->pvarVal &&
      (pvarArray->pvarVal->vt & VT_ARRAY))
   {
      pvarArray = pvarArray->pvarVal;
   }

   if (pvarArray->vt & VT_ARRAY)
   {
      if (VT_BYREF & pvarArray->vt)
         parrValues = *pvarArray->pparray;
      else
         parrValues = pvarArray->parray;
   }
   else
      return;

   if (parrValues != NULL)
   {
      HRESULT hr = SafeArrayGetLBound(parrValues, 1, &arrayBounds[0].lLbound);
      hr = SafeArrayGetUBound(parrValues, 1, (long*)&arrayBounds[0].cElements);
      arrayBounds[0].cElements -= arrayBounds[0].lLbound;
      arrayBounds[0].cElements += 1;
   }

   if (arrayBounds[0].cElements > 0)
   {
      for (ULONG i = 0; i < arrayBounds[0].cElements; i++)
      {
         LONG lIndex = (LONG)i;
         CString strValue = _T("");

         VARTYPE vType;
         BSTR bstrItem;

         ::SafeArrayGetVartype(parrValues, &vType);
         HRESULT hr = ::SafeArrayGetElement(parrValues, &lIndex, &bstrItem);

         if(SUCCEEDED(hr))
         {
            switch(vType)
            {
            case VT_BSTR:
               strValue = (LPCTSTR)bstrItem;
               break;
            }

            ::SysFreeString(bstrItem);
         }

         pstrarrValues->Add(strValue);
      }
   }
}

void UnpackBstrArray( const _variant_t &var, CStringArray &strarrValues  )
{
   UnpackBstrArrayHelper( &(VARIANT)const_cast< _variant_t & >(var), &strarrValues );
}

Attached you can find a demo project (C# and C++) with the complete example show in this tutorial.

[download id="4"]

Hits for this post: 6941 .

In a recent post I wrote about Code Contracts in .NET. Now you can find a more detailed article on this topic at sharparena.com. In this article I’m providing more information and examples on:

  • pre-conditions
  • post-conditions
  • object invariants
  • asserts and assumptions
  • quantifiers

In additions, you should check the official user documentation, which can be found here.

, , , , Hits for this post: 4893 .

Visual Studio 2010 has support for code contracts that allow to express pre-, post-conditions and invariants to your .NET code.

Let’ say you want to create a function to return a random value in a range. This could look like it:

    class Program
    {
        Random rng = new Random();

        public int GetRandom(int min, int max)
        {
            return rng.Next(min, max);
        }

        static void Main(string[] args)
        {
            Program p = new Program();
            int n1 = p.GetRandom(10, 20);
            int n2 = p.GetRandom(10, 10);
        }
    }

However, at a rough analysis one can find two problems:

  • Second call to GetRandom(), is not well formed, because the range is 0
  • Radnom.Next returns a value greater or equal to the first argument, and lower than the second.

What code contracts provide is a mean to check that some statements, like:

  • maximum value of the range should always be greater than the minimum value
  • returned value should always be in the interval, equal or greater than the minimum, and equal or less than then maximum

The first is a pre-requisite, and the second is a post-requisite. We can specify those with:

        public int GetRandom(int min, int max)
        {
            Contract.Requires(max > min);
            Contract.Ensures(Contract.Result() >= min &&
                             Contract.Result() <= max);

            return rng.Next(min, max);
        }

The Contract class is available in namespace System.Diagnostics.Contracts. To enable the static checking, you have to go to Project Properties > Code Contracts and select "Perform Static Contract Checking."

Code Contracts Property Page

Code Contracts Property Page

When you build, you get the following warnings:

Code Contracts warnings

Code Contracts warnings

The first says that the call GetRandom(10, 10) does not match the pre-condition. The second warning indicates that the post-condition is not met. It isn't possible to know whether Random.Next() returns a value that hods the post-condition. But if you check the "Perform Runtime Contract Checking" it asserts at runtime, when the return value is outside the interval (not possible with this code sample).

You can read more about code contracts on the BCL team's blog. It features a list of possible constructs for pre- and post-requisites, but also object invariants.

Code Contracts are also available for Visual Studio 2008. For downloads and additional information check the following links:

, , , , Hits for this post: 8322 .

One of the new features in .NET 4.0 is the BigInteger class. This was suppose to be part of 3.5 framework, but after the CTP it was gone. Now it’s back. This big integer is a really large integer representation. However, I was not able to find the precision documented anywhere.

The class is available in the System.Numerics namespace and provides overloaded operators, parsing methods and other utility features for bit integer numbers.

Not very long ago, on codexpert.ro we launched a contest for finding the biggest pair of amicable numbers. One of the members (crystyce) that submited a solution found this big pair (2724918040393706557785752240819405848576, 2724918040396184856306258038787235905536) with a C++ implementation using a 3rd party BigInt class. I decided to put that in C# 4.0 and use BigInteger. Maybe you can find things to optimize in the below code, but that’s not the point here. I’m showing you this code just to get a feeling of BitInteger.

using System.Numerics;

namespace cs_test
{
   static class Amicable
   {
      public static BigInteger Pow2(BigInteger exp)
      {
         BigInteger res = 1;

         for (BigInteger i = 0; i < exp; i++)
            res *= 2;

         return res;
      }

      public static BigInteger ExpMod(BigInteger value, BigInteger exp, BigInteger mod)
      {
         BigInteger ullResult = 1;
         BigInteger ullValue = value;

         while (exp > 0)
         {
            if (exp % 2 != 0)
            { // odd
               ullResult *= ullValue;
               ullResult %= mod;
            }

            ullValue *= ullValue;
            ullValue %= mod;
            exp /= 2;
         }

         return (ullResult);
      }

      static public bool IsMillerRabinPrime(BigInteger prime)
      {
         // randomWitness = witness that the "prime" is NOT composite
         // 1 < randomWitness < prime-1
         long totalWitness = 15;
         BigInteger [] randomWitness = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 51001, 351011};
         BigInteger primeMinusOne = prime - 1;
         BigInteger oddMultiplier;
         long bitLength;
         long i, j;
         BigInteger result;
         // find oddMultiplier, defined as "prime - 1 = (2^B) * M"
         // get bitLength (B) and find the oddMultiplier (M)

         // init value multiplier
         oddMultiplier = primeMinusOne;

         bitLength = 0; // reset
         while(oddMultiplier % 2 == 0)
         {
            oddMultiplier /= 2;
            bitLength++;
         }
         for(i = 0; i < totalWitness; i++)
         {
            if (randomWitness[i] == prime)
               return true;

            // init value of result = (randomWitness ^ oddMultiplier) mod prime
            result = ExpMod(randomWitness[i], oddMultiplier, prime);

            // is y = 1 ?
            if(result == 1)
               continue; // maybe prime

            // is y = prime-1 ?
            if(result == primeMinusOne)
               continue; // maybe prime

            // loop to get AT LEAST one "result = primeMinusOne"
            for(j = 1; j <= bitLength; j++)
            {
               // square of result
               result = ExpMod(result, 2, prime);

               // is result = primeMinusOne ?
               if(result == primeMinusOne)
                  break; // maybe prime
            }

            if(result != primeMinusOne)
               return false; // COMPOSITE
         }

         // it may be pseudoprime/prime
         return true;
      }

   }

   class Program
   {
      const int STARTM = 1;
      const int STOPM = 100;
      const int STOPN = 30;

      static void Main(string[] args)
      {
         DateTime start = DateTime.Now;

         for (BigInteger m = STARTM ; m <= STOPM ; m++)
         {
            for (BigInteger n = m+1 ; n <= m+STOPN ; n++)
            {
               BigInteger p = Amicable.Pow2(n) + Amicable.Pow2(m) - 1;
               if (!Amicable.IsMillerRabinPrime(p)) continue;

               BigInteger q = Amicable.Pow2(2 * n - m) + Amicable.Pow2(n) - 1;
               if (!Amicable.IsMillerRabinPrime(q)) continue;

               BigInteger r = Amicable.Pow2(3 * n - m) +
                              Amicable.Pow2(n + m) +
                              Amicable.Pow2(2 * n + 1) - 1;
               if (!Amicable.IsMillerRabinPrime(r)) continue;

               BigInteger f1 = Amicable.Pow2(n) * p * q;
               BigInteger f2 = Amicable.Pow2(n) * r;

               DateTime end = DateTime.Now;
               Console.WriteLine(
                  "{2} ... {0}, {1}",
                  f1,
                  f2,
                  (end - start).Milliseconds / 1000.0);
            }
         }
      }
   }
}
0.017 ... 220, 284
0.021 ... 2172649216, 2181168896
0.028 ... 17296, 18416
0.040 ... 9363584, 9437056
0.138 ... 2.7249180403937065577857522408E+39, 2.7249180403961848563062580388E+39

As you can see, it was very fast finding the same big pair. Well it was at least on my 2 core machine.

The only thing I don't understand about the implementation is the presence of some helper functions in the BigInteger class.

public static BigInteger Abs(BigInteger value);
public static BigInteger Add(BigInteger left, BigInteger right);
public static BigInteger GreatestCommonDivisor(BigInteger left, BigInteger right);
public static BigInteger Divide(BigInteger dividend, BigInteger divisor);
public static BigInteger DivRem(BigInteger dividend, BigInteger divisor, out BigInteger remainder);
public static double Log(BigInteger value);
public static double Log(BigInteger value, double baseValue);
public static double Log10(BigInteger value);
public static BigInteger Max(BigInteger left, BigInteger right);
public static BigInteger Min(BigInteger left, BigInteger right);
public static BigInteger ModPow(BigInteger value, BigInteger exponent, BigInteger modulus);
public static BigInteger Multiply(BigInteger left, BigInteger right);
public static BigInteger Negate(BigInteger value);
public static BigInteger Pow(BigInteger value, int exponent);
public static BigInteger Remainder(BigInteger dividend, BigInteger divisor);
public static BigInteger Subtract(BigInteger left, BigInteger right);

That belongs to System.Math, or maybe to a System.Numerics.Math class. They are all helper functions, not specific to the implementation of the big integer; and the implementation does not depend on the existing of these functions. So they are clearly utility functions. They should go somewhere else. I have submitted this as feedback to Microsoft Connect. If you agree with it, vote for it here. Of course, if you have arguments for the current implementation, you can vote against it too.

, , , Hits for this post: 5901 .

So far, C#, unlike C++, did not support optional arguments. For instance, suppose you need a function to print log a message, that can add a new line or not after writing the message. Most of the times you want a new line, so you don’t want to specify that for most of the calls. Until now, the only possibility was using overloaded functions, with different parameters.

class Logger
{
   public void LogMessage(string error, bool addNewLine)
   {
      Console.Write(error);
      if (addNewLine) Console.WriteLine();
   }

   public void LogMessage(string error)
   {
      LogMessage(error, true);
   }
};

class Program
{
   static void Main(string[] args)
   {
      Logger log = new Logger();
      log.LogMessage("trying to connect...", false);
      // do more stuff here
      log.LogMessage("SUCCESS");
   }
}
trying to connect...SUCCESS
Press any key to continue . . .

C# 4.0, that will be released with Visual Studio 2010, and is for now available with the Visual Studio 2010 CTP, supports, just like C++, optional arguments. So instead of using overloaded functions, you can specify a default value for a parameter. When doing the call, if you don’t provide a value for the argument, the default one will be used.

The Logger class below is identical from the functionality point of view with the one above. (The previous Main() function doesn’t have to change.)

class Logger
{
   public void LogMessage(string error, bool addNewLine = true)
   {
      Console.Write(error);
      if (addNewLine) Console.WriteLine();
   }
};

The only restriction is that optional parameters have to appear in the function after all required parameters. In other words, this is not legal:

class foo
{
   public void func(int a, int b = 0, int c)
   {
   }
}
error CS1737: Optional parameters must appear after all required parameters

Considering this implementation of foo

class foo
{
   public void func(int a, int b = 0, int c = 1)
   {
   }
}

the following calls can be made:

var f = new foo();
f.func(42, 3, 4);          // f.func(42, 3, 4)
f.func(42);                // f.func(42, 0, 1)
f.func(42, 100);           // f.func(42, 100, 1)

But this is not all. C# 4.0 brings another feature: named parameters (and this does not exist in C++). It means that when you make a call, you can specify an argument by its name, not by position. In this case you use the parameter’s name followed by ‘:’ and the value. Here are several examples:

var f = new foo();
f.func(42, c: 4);          // f.func(42, 0, 4)
f.func(42, c: 4, b: 3);    // f.func(42, 3, 4)
f.func(c: 3, a: 1, b: 2);  // f.func(1, 2, 3)

Of course, this can be used regardless the function has optional parameters or not. However, I would not use this feature for anything else than specifying an optional parameter when I would have to first suply values for other several optional parameters. In other words:

var f = new foo();
f.func(42, c: 4);          // this is good, skip b, provide value for c
f.func(42, c: 4, b: 3);    // don't like this, rather call f.func(42, 3, 4)
f.func(c: 3, a: 1, b: 2);  // don't like this, rather call f.func(1, 2, 3)

, , , , , Hits for this post: 5995 .