In the book The Blind Watchmaker, Richard Dawkins explains how evolution happens through accumulated small random variations (or mutations) filtered by the natural selection, as opposed to the mainstream misinterpretation that evolution happens in big leaps. He proposes a computer program, called weasel, to demonstrate how the accumulated small improvements (mutations that bring a benefit to the individual so that it is “chosen” by the natural selection) produce faster results.

I don’t know who it was first pointed out that, given enough time, a monkey bashing away at random on a typewriter could produce all the works of Shakespeare. The operative phrase is, of course, given enough time. Let us limit the task facing our monkey somewhat. Suppose that he has to produce, not the complete works of Shakespeare but just the short sentence ‘Methinks it is like a weasel’, and we shall make it relatively easy by giving him a typewriter with a restricted keyboard, one with just the 26 (capital) letters, and a space bar. How long will he take to write this one little sentence?

[...]

We again use our computer monkey, but with a crucial difference in its program. It again begins by choosing a random sequence of 28 letters, just as before … it duplicates it repeatedly, but with a certain chance of random error – ‘mutation’ – in the copying. The computer examines the mutant nonsense phrases, the ‘progeny’ of the original phrase, and chooses the one which, however slightly, most resembles the target phrase, METHINKS IT IS LIKE A WEASEL.

The phrase “METHINKS IT IS LIKE A WEASEL” is from Hamlet:

Hamlet: Do you see yonder cloud that’s almost in shape of a camel?
Polonius: By the mass, and ’tis like a camel, indeed.
Hamlet: Methinks it is like a weasel.

The algorithm as described on Wikipedia, is as follows:

  1. Start with a random string of 28 characters.
  2. Make 100 copies of this string, with a 5% chance per character of that character being replaced with a random character.
  3. Compare each new string with the target “METHINKS IT IS LIKE A WEASEL”, and give each a score (the number of letters in the string that are correct and in the correct position).
  4. If any of the new strings has a perfect score (28), halt.
  5. Otherwise, take the highest scoring string, and go to step 2.

A character is an upper case letter or space, the number of copies per iteration could be 100 and the mutation rate 5%.

The following C# code is an implementation of the algorithm.

class Weasel1
{
   private const string ALLOWED_CHARS = "ABCDEFGHIJKLMNOPQRSTUVXYWZ ";
   private const string TARGET = "METHINKS IT IS LIKE A WEASEL";
   Random rand = new Random();

   private int Fitness(string candidate)
   {
      int score = 0;
      for (int i = 0; i < candidate.Length; ++i)
      {
         if (candidate[i] == TARGET[i])
            score++;
      }

      return score;
   }

   private string Mutate(string parent)
   {
      StringBuilder builder = new StringBuilder();
      for (int i = 0; i < parent.Length; ++i)
      {
         bool mutate = rand.Next(20) == 10;
         builder.Append(
            mutate ?
            ALLOWED_CHARS[rand.Next(ALLOWED_CHARS.Length)] : parent
            [i]);
      }
      return builder.ToString();
   }

   public void Run(int copies)
   {
      StringBuilder builder = new StringBuilder();
      for (int i = 0; i < TARGET.Length; ++i)
      {
         builder.Append(ALLOWED_CHARS[rand.Next(ALLOWED_CHARS.Length)]);
      }
      string parent = builder.ToString();

      int step = 1;
      Console.WriteLine("{0:D3} {1}", step, parent);

      do
      {
         step++;

         string[] children = new string[copies];
         for (int i = 0; i < copies; ++i)
         {
            children[i] = Mutate(parent);
         }

         int indexMax = 0;
         int scoreMax = 0;
         for (int i = 0; i < copies; ++i)
         {
            int score = Fitness(children[i]);

            if (score > scoreMax)
            {
               scoreMax = score;
               indexMax = i;
            }
         }

         parent = children[indexMax];

         Console.WriteLine("{0:D3} {1}", step, parent);

      } while (parent != TARGET);
   }
}

class Program
{
   static void Main(string[] args)
   {
      Weasel1 weasel = new Weasel1();
      weasel.Run(100);
   }
}

Function Fitness() computes how similar an offspring is to the parent, function Mutate() makes a copy of the parent with a 5% chance for each character to mutate. Function Run() runs the algorithm until the target string is reached. A possible output of the program is shown below. Different runs require different number of iterations (even more than 100).

001 E PVBQVRYRNQSRBCDLOCFOGEIB J
002 E PVBQVRYRNQSRBCDLOCFOWEIB J
003 E PVIQVRYRNQSRBCDLOCFOWEIB J
004 EEPLIQVUYRNQSRBCDLOCFOWEIB J
005 EEPLIQVU RNQSRBCDLOCFOWEIB J
006 EEPHIQLU RNQSRBCDLOCFKWEIB J
007 EEPHIQLU RNQSRHCDKOCFKWEIB J
008 EEPHIQLU RNQSSHCDKOCFKWEIB J
009 EEPHIQLU RNQSS CDKOCFKWEIB J
010 EEPHIQLU RNQSS CDKOCWKWEKB L
011 EEPHIQLU RNQSS CDKOCOKWEKB L
012 EEPHIQLU RN SS CDKOGOKWEKB L
013 EEPHIQLU RN SS LDKOGOKWEKB L
014 EEPHIQLU RT SS LDKOGOKWEKB L
015 EEPHIQLU RT SS LDKOGOKWEKBEL
016 EEPHIQLU RT SS LPKOGOKWEKBEL
017 EEPHIQLU RT SS LPKOGOKWEKSEL
018 EEPHIQLU IT SS LPKOGOKWEKSEL
019 EEPHIQLU IT SS LIKOGOKWEKSEL
020 EEPHIQLU IT SS LIKOGOKWEKSEL
021 EEKHIQLS IT SS LIKOGOKWEKSEL
022 EEKHIQLS IT SS LIKOGOKWEKSEL
023 EEKHIQLS IT SS LIKO OKWEKSEL
024 EEKHIQLS IT SS LIKO OFWEKSEL
025 EEKHIQLS IT ES LIKO OFWENSEL
026 EEKHINLS IT ES LIKO OFWENSEL
027 EEKHINLS IT ES LIKO O WENSEL
028 EEKHINLS IT ES LIKO O WECSEL
029 EEKHINLS IT BS LIKO O WEHSEL
030 EEKHINLS IT QS LIKO O WEASEL
031 EETHINSS IT QS LIKO O WEASEL
032 EETHINSS IT QS LIKO O WEASEL
033 EETHINSS IT IS LIKO O WEASEL
034 EETHINSS IT IS LIKO O WEASEL
035 EETHINBS IT IS LIKO O WEASEL
036 XETHINBS IT IS LIKO O WEASEL
037 XETHINBS IT IS LIKO O WEASEL
038 XETHINBS IT IS LIKO O WEASEL
039 XETHINBS IT IS LIKO O WEASEL
040 XETHINBS IT IS LIKK O WEASEL
041 XETHINBS IT IS LIKK O WEASEL
042 XETHINBS IT IS LIKK O WEASEL
043 XETHINKS IT IS LIKK O WEASEL
044 XETHINKS IT IS LIKK   WEASEL
045 XETHINKS IT IS LIKK   WEASEL
046 XETHINKS IT IS LIKK   WEASEL
047 XETHINKS IT IS LIKE   WEASEL
048 XETHINKS IT IS LIKE   WEASEL
049 METHINKS IT IS LIKE   WEASEL
050 METHINKS IT IS LIKE   WEASEL
051 METHINKS IT IS LIKE A WEASEL

We can change this algorithm to randomize the chance of mutation. The changes to the code are small (basically function Mutate changes to take not only the parent string but only the chance for the mutation rate), but this increases the number of iterations needed to reach the target string.

class Weasel2
{
   const string TARGET = "METHINKS IT IS LIKE A WEASEL";
   const string ALLOWED_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ";
   Random rand = new Random();

   private double Fitness(string candidate)
   {
      int score = 0;
      for (int i = 0; i < candidate.Length; ++i)
      {
         if (candidate[i] == TARGET[i])
            score++;
      }

      return score;
   }

   private string Mutate(string parent, double rate)
   {
      StringBuilder builder = new StringBuilder();
      foreach(char c in parent)
      {
         builder.Append(
            rand.NextDouble() < rate ?
            c:
            ALLOWED_CHARS[rand.Next(ALLOWED_CHARS.Length)]);
      }

      return builder.ToString();
   }

   public void Run(int copies)
   {
      StringBuilder builder = new StringBuilder();
      for (int i = 0; i < TARGET.Length; ++i)
      {
         builder.Append(ALLOWED_CHARS[rand.Next(ALLOWED_CHARS.Length)]);
      }
      string parent = builder.ToString();

      int step = 1;
      Console.WriteLine("{0:D4} {1}", step, parent);

      do
      {
         string[] children = new string[copies];
         for(int i = 0; i < copies; ++i)
         {
            children[i] = Mutate(parent, rand.NextDouble());
         }

         int indexMax = 0;
         double scoreMax = 0;
         for (int i = 0; i < copies; ++i)
         {
            double score = Fitness(children[i]);

            if (score > scoreMax)
            {
               scoreMax = score;
               indexMax = i;
            }
         }

         parent = children[indexMax];

         Console.WriteLine("{0:D4} {1}", step, parent);
         step++;

      } while (parent != TARGET);
   }
}

Here is the output, but because of the bigger number of iterations, only every 20th iteration is shown:

0001 BXZZBDJSZTVYFWRSGNKYEULISWRK
0020 MSTCFNBSTITVIS L KPLEDBRASEC
0040 METCINQSDITVIS LIKW A IRISEW
0060 METLINKSOITJIS LJKW A IGASEW
0080 METLINKSOITJIS LJKX A WEAUEL
0100 METLINKSTKTXIS LIKK A WEAOEL
0120 METHINKS KTXIS LIKH A WEAOEL
0140 METHINKSDKTXIS LIKA A WEASEL
0160 METHINKSEGTXIS LIKA A WEASEL
0180 METHINKSEGTEIS LIKE A WEASEL
0200 METHIMKSEGTMIS LIKE A WEASEL
0220 METHIMKSTITNIS LIKE A WEASEL
0240 METHIAKSTIT IS LIKE A WEASEL
0260 METHIPKSTIT IS LIKE A WEASEL
0280 METHJKKSTOT IS LRKE A WEASEL
0300 METHYQKSCOT IS LRKE A WEASEL
0320 METHFPKSCOT IS LRKE A WEASEL
0340 METHFPKSCGT IS LRKE A WEASES
0360 METHINKS XT IS LMOE A WEAIEL
0380 METHINKS YT IS LMZE A WEAIEL
0400 METHINKS IT IS LMHE A WEAIEL
0420 METHINKS IT IS LEDE A WEANEL
0440 METHINKS IT IS LIPE A WEANYL
0460 METHINKS IT IS LIPE A WEAOYL
0480 METHINKS IT IS LIWE A WEARWL
0500 METHINKS IT IS LIWEDA WEAFWT
0520 METVINKS IT IS LILE A WEABBG
0540 METVINKS IT IS LIJE A WEABTG
0560 METHINKS IT IS LIJE A WEAMRV
0580 METHINKS IT IS LIZE A WEAMRV
0600 METHINKS IT IS LIZE A WEA OV
0620 METHINKS IT IS LIZE A WEAHE
0640 METHINKS IT IS LIJE A WEAHEH
0660 METHINKS IT IS LIFE A WEAHEH
0680 M THINKS IT IS LINE I WXAHEL
0700 MQTHINKS IT IS LIAE I WXASEL
0720 MZTHIRKS IT IS LIKE T WXASEL
0740 METHINKR IT IS LIKE T WRASEL
0760 METOINKI IT IS LIKE T WRASEL
0780 METMINKM IT IS LIKE A WQASEL
0800 METMINKS IT IS LIKE A WQASEL
0820 METJIXKS IT IS LIKE A WQASEL
0840 METJILKS IT IS LIKE A WUASEL
0860 METJIPKS IP IS LIKE A WAASEL
0880 METJIPKS IW IS LIKE A WLASEL
0900 METIIBKS QT IS LIKE A SEASEL
0920 METIIPKS PT IS LIKE A SEASEL
0940 METKINKS PT IS LIKE A TEASEL
0960 METKINKS BT IS LIKE A QEASEL
0980 METWINKS IT IS LIKE A QEASEL
1000 METOINKS IT IS LIKE ANQEASEL
1020 METUINKS IT IS LIKE AHTEASEL
1040 METUINKS IT IS LIKE AMWEASEO
1060 METQINKS IT IS LIKE AMWEASEO
1080 MZTYINKS IT ISULIKE AKWEASEL
1100 MZTYINKS IT ISULIKE APWEADEL
1120 MJTHINKS IT ISULIKE AEWEAQEL
1140 MJTHINKS IT ISULIKE AFWEAUEL
1160 MJTHINKS IT ISCLIKP AVWEABEL
1180 MJTHINKS IT IS LIKP AVWEAGEG
1200 MNTHINKS IT IS LIKM AVWEACE
1220 MNTHINKS IT IS LIKM AQWEACEL
1240 MNTHINKS IT IS LIKMXAKWEAAEL
1260 MNTHINKS CT IS LIKMGAPWEASEL
1280 MPTHINKS IX IS EIK GA WEASEL
1300 METHINKS IX IS LIK RA WEASEL
1320 METHINKS IZ IS LIKEWA WEASEL
1340 METHINKS IZ ISILIKEWA WEASEL
1360 METHINKS IJ ISILIKE A WEASEL
1380 METHINKS IM ISELIKE A WEASEL
1400 METHINKS IM ISELIKE A WEASEL
1420 METHINKS IU ISELIKE A WEASEL
1440 METHINKS IY ISELIKE A WEASEL
1460 METHINKS IT ISVLIKE A WEASEL
1480 METHINKS IT ISCLIKE A WEASEL
1500 METHINKS IT ISCLIKF A WEASEL
1520 METHINKS IT ISTLIKF A WEASEL
1540 METXINKS IT ISOLIKF A WEASEL
1560 MZTXINKS IT ISOLIKC A WEASEL
1580 MATXINKS IT IS LIKC A WEASEL
1600 MATXINKS IT IS LIKE A WEDSEL
1620 METHINKSGITVIC LIKE A WEASEL
1640 METHINKSGITQIC LIKE A WEASEL
1660 METHINLSAITQIC LIKE A WEASEL
1680 METHINLSAITQIH LIKE A WEASEL
1700 METHINGS ITAIH LIKE A WEASEL
1720 METHINGS ITRIS LIKE W WEASEL
1740 METHINVS IT IS LIJE A WEASEL
1760 METHINVS IT IS LIGE A WEASEL
1780 METHINVS IT IS LIGE A WEASEL
1800 METHINVS IT IS LI E A WEASYL
1820 METHIN S IT IS LIKE A WTASYT
1840 METHINQS IT IS LIKE A STASYT
1860 METHINWA IT IS LIKE A IFASES
1880 METHINKA IT IS LIKE A IFASEJ
1900 METHINKA IT YS LIKE O IEAAEL
1920 METHINKU IT YS LIKE O WEAAEL
1940 METHOJKM IT JS LIKE O WEAPEL
1960 METHOJKF IT IS LIKE  RWEATEL
1980 METHFGKF IT IS LIKE ARWBATEL
2000 METHYNKG IT IS LIKE ARWEALEL
2020 METHUNKD IT IS LIKE AAWEAWEL
2040 METHIWKN IT IS LIKE AYWEADEL
2060 METHIUKN IT IS LIKE AKWEAQEL
2080 METHIUKN IT IS LIKE A WEAQEL
2100 METHIAKN IT IS LIKE N WEAQEL
2120 METHINKK IT IS LIKE N WEEQEL
2140 METHDNKQ IT IS LIKE A WEBSEL
2160 METHDNKQ IT IS LIKE A WEBSEL
2180 METHDNKQ IT IS LIKE A WEBSEL
2200 METHMNKQ IT IS LIKE A WEBSEL
2220 MEKHBNKX IT IS LIKE A WEBSEL
2240 MEBHQNKS IT IS LIKE A WEBSEL
2260 METHQNKS IT IS LIKEZA WEFSEL
2280 METHGNKS IT IS LIKEZA WESSEL
2300 METHGNKS IT IS LIKEZA WEWSEL
2320 METHLNKS IT IP BIKE A WEOSEL
2340 METHYNKS IT IB SIKE A WEQSEL
2360 METHINKS ITQIJ LIKE A WEQSEL
2380 METHINKS ITIIJ LIKE A WEYSTL
2400 METQINKS ITIIJ LIKE A WEASGL
2420 METHINKS ITIIJJLIKE A WEASGL
2440 METHINKS ITQIWJLIKE A WEAS L
2460 METHINKS ITZIWJLIKE A WEASCL
2480 METHINKS IT IDLLIZE A UEASCL
2500 METHINKS IT ID LIZE A IEASZL
2520 METHINKS IT ID LIZE A IEASZL
2540 METHINKS IT IB LIZE A XEASOL
2560 METHINKS IT IB LIAE A  EASOL
2580 METHINKS IT IBVLICE A  EASEL
2600 METHINKS IT IBVLIKE A  EASEL
2620 METHINKS IT IKDLIKE A  EASEL
2640 METHINKS IT IKNLIKE A HEASEL
2660 METHINKS IT IKNLIKE A HIASEL
2680 METHINKS IT IZNLIKE A HIASEL
2700 METHINKS IT IZNLIKE A WOASEL
2720 METHINKS IT ISNLIKE A WOASEL
2740 METHINKS IT ISNLIKE M WSASEL
2760 METHINKS IT ISVLIKE M WXASEL
2780 METHINKS IT IS LIKE S WEASEL
2785 METHINKS IT IS LIKE A WEASEL

Notice that matching letters are not immutable. A letter, even if it matches one in the target string can change at any time.

You can find similar implementations in several programming languages on the Rosetta Code website.

, , , Hits for this post: 1295 .