In this post I want to show how you can implement common list operations: union, intersection, difference and concatenation.

Concatenation is the simplest of them all, because type List already has a function call append that does everything for you.

let concat left right =
    List.append left right

The union of two lists is a list containing all the distinct elements from the two original lists. We can implement this operation by concatenating the two lists firsts, and the filtering the distinct elements.

let union left right =
    List.append left right |> Seq.distinct |> List.ofSeq

The intersection of two lists is a list containing all the elements of the first list that also appear in the second list. We can implement this by interating through the elements of the first list and checking whether the element appears in the second list. To do this in the shortest possible time, with constant lookup time, we can use a HashSet collection. The result is an O(m+n) complexity instead of O(m*n) if you used brute force.

let intersection (left:list< 'a >) (right:list< 'a >) =
    let cache = HashSet< 'a >(right, HashIdentity.Structural)
    left |> List.filter (fun n -> cache.Contains n)

The difference of two lists is a list containing all the elements from the first list that are not part of the second list. The implementation of difference is very similar to the implementation of the intersection. All that differs is the lambda used for the filtering.

let difference (left:list< 'a >) (right:list< 'a >) =
    let cache = HashSet< 'a >(right, HashIdentity.Structural)
    left |> List.filter (fun n -> not (cache.Contains n))

Let’s see all these put to a use:

let main() =
    let c = concat [4;3;2;1] [2;3;5]
    printfn "%A" c

    let u = union [4;3;2;1] [2;3;5]
    printfn "%A" u

    let i = intersection [4;3;2;2;1] [2;3;5]
    printfn "%A" i

    let d = difference [4;3;2;1] [2;3;5]
    printfn "%A" d

main()

This program yields the following output:

[4; 3; 2; 1; 2; 3; 5]
[4; 3; 2; 1; 5]
[3; 2; 2]
[4; 1]

Notice that these operations work with unsorted lists. You don’t have to sort the lists first to apply them.

In order to use the HashSet, you need to add a reference to the FSharp.PowerPack.dll assembly. This sample was build with F# 1.9.7.8 for Visual Studio 2008.

UPDATE: You can read about similar implementations but using operators on this post by Jason Kikel. He also deals with repetitions, regular expression binding operator and null coalescing binding operator.

, Hits for this post: 2614 .

FParsec is an F# adaptation of Parsec, a free monadic parser combinator library for Haskell. It can parse context-sensitive, infinite look-ahead grammars, has complete support for unicode input and large files (> 4 GB) and produces excellent error messages.

You can download and install it from the above link. To use it in an application, you must add a reference to FParsec.dll and FParsecCS.dll.

In this post I will use FParsec to parse ranges of numerical values, such as [1-100], and IPs, such as [10.5.10.0-10.5.255.255].

I will start by defining some types:

type IPAddr = {C1: int64; C2:int64; C3: int64; C4: int64;}
type TValue =
   | Num of int64
   | IP of IPAddr

type Range = {Min : TValue; Max : TValue;}

Having that, I can define a parser for numerical ranges like this:

let p_numrange =
   parse
      {
         do! skipChar '['
         let! a = pint64
         do! skipChar '-'
         let! b = pint64
         do! skipChar ']'
         return {Min = Num a; Max = Num b;}
      }

That can be read like this: make sure it stars with [ byt skip that, parse an numerical value and let that be a, make sure - follows but skip that, parse a second numerical value and let that be b, make sure ] follows but skip that, and it if successful return a Range of a and b.

To print a Range value I’ll use this function:

let printip (ip:IPAddr) =
   printf "%d.%d.%d.%d" ip.C1 ip.C2 ip.C3 ip.C4
   ()

let printrange (r:Range) =
   match (r.Min, r.Max) with
   | Num(l1),Num(l2) -> printfn "Range: %d-%d" l1 l2
   | IP(ip1),IP(ip2) ->
      printf "Range: "
      printip ip1
      printf "-"
      printip ip2
      printfn ""
   | _ -> ()
   ()

Here is a test

   let result = run p_numrange "[1-100]"
   match result with
   | Success r -> printrange r
   | Failure (msg, err) -> printf "%s\n" msg
   ()

That outputs:

Range: 1-100

Using “(1-100)” instead of “[1-100]” yields the following error message:

Error in Ln: 1 Col: 1
(1-100)
^
Expecting: '['

Let's move forward and define an IP parses. Based on what I did so far this should be pretty easy to understand.

let p_ip =
   parse
      {
         let! c1 = pint64
         do! skipChar '.'
         let! c2 = pint64
         do! skipChar '.'
         let! c3 = pint64
         do! skipChar '.'
         let! c4 = pint64
         return {C1 = c1; C2 = c2; C3 = c3; C4 = c4;}
      }

And an IP range parser would look just like the first range parser, except that pint64 would be replaced by p_ip:

let p_iprange =
   parse
      {
         do! skipChar '['
         let! a = p_ip
         do! skipChar '-'
         let! b = p_ip
         do! skipChar ']'
         return {Min = IP a; Max = IP b;}
      }

Here are some helper functions for testing these parses:

let testrange range =
   let result = run (p_numrange .>> eof) range
   match result with
   | Success r -> printrange r
   | Failure (msg, err) -> printf "%s\n" msg
   ()

let testip ip =
   let result = run (p_ip .>> eof) ip
   match result with
   | Success ip -> printip ip; printfn ""
   | Failure (msg, err) -> printf "%s\n" msg
   ()   

let testiprange range =
   let result = run (p_iprange .>> eof) range
   match result with
   | Success r -> printrange r
   | Failure (msg, err) -> printf "%s\n" msg
   ()

You probably noticed the construction “(p_numrange .>> eof)”. That reads: parse a range, and then the end of the file and return the range. That helps us making sure nothing follows the range, like “[1-100] more here”.

Executing

testrange "(1-100)"
testrange "[1-100]"
testrange "[1]"

yields

Error in Ln: 1 Col: 1
(1-100)
^
Expecting: '['

Range: 1-100
Error in Ln: 1 Col: 3
[1]
  ^
Expecting: '-'

Executing

testip "192.168"
testip "192.168.0.1"
testip "192.168.0.1 "

yields

Error in Ln: 1 Col: 8
192.168
       ^
Expecting: '.'

192.168.0.1
Error in Ln: 1 Col: 12
192.168.0.1
           ^
Expecting: end of file

and executing

testiprange "[10.5.0.1-10.5.255.255]"
testiprange "[10-10.5.255.255]"

yields

Range: 10.5.0.1-10.5.255.255
Error in Ln: 1 Col: 4
[10-10.5.255.255]
   ^
Expecting: '.'

But why defining two range parser? Isn’t possible to have just a single one and specify which parser to use for the values? Yes, it is possible, and that would look like this:

let p_range p =
   parse
      {
         do! skipChar '['
         let! a = p
         do! skipChar '-'
         let! b = p
         do! skipChar ']'
         return {Min = a; Max = b;}
      }

Having that, we can re-define the p_numrange and p_iprange like shown bellow:

let p_ip =
   parse
      {
         let! c1 = pint64
         do! skipChar '.'
         let! c2 = pint64
         do! skipChar '.'
         let! c3 = pint64
         do! skipChar '.'
         let! c4 = pint64
         return IP {C1 = c1; C2 = c2; C3 = c3; C4 = c4;}
      }

let p_int =
   parse
      {
         let! v = pint64
         return Num v
      }

let p_numrange = p_range p_int
let p_iprange = p_range p_ip

Everything above would work, except for testip function that needs change to:

let testip ip =
   let result = run (p_ip .>> eof) ip
   match result with
   | Success ip -> match ip with
                   | IP v -> printip v; printfn ""
                   | _ -> printfn "error"
   | Failure (msg, err) -> printf "%s\n" msg
   ()

That’s about it about parsing ranges with FParsec.

Hits for this post: 6533 .

A very old unsolved problem in numbers theory, known as Goldbach’s conjecture, says that any even number greater than 5 can be written as the sum of 2 prime numbers. It hasn’t been solver yet, not I will try to solve it. But I decided to write some F# code to display all the possibilities of writing an even number as the sum of two primes. This turned to be quite simple actually:

First, I wrote a function that checks if a number is prime:

let is_prime n =
   not ([2..n/2] |> Seq.filter (fun x -> (n % x) = 0) |> Seq.nonempty)

Then it was all about checking all the numbers from 2 to N/2 to see if k and N-k are both primes and if so print them:

let check n =
   [2..n/2] |>
   Seq.filter (fun x -> (is_prime x) && (is_prime (n-x))) |>
   Seq.iter (fun x -> printfn "%d & %d" x (n-x))

For instance, if we consider number 200, the check function would print:

3 & 197
7 & 193
19 & 181
37 & 163
43 & 157
61 & 139
73 & 127
97 & 103

And that’s about it; though you can imagine optimizing this by caching the prime numbers already computed. But that wasn’t something I wanted to consider for this problem.

Hits for this post: 6283 .

I recently found an interesting problem on the web, about reducing words, letter by letter until only one letter remains. Here is a formal definition:

We define word reduction as removing a single letter from a word while leaving the remaining letters in their original order so that the resulting sequence of characters is also a word. A good word can be reduced step-by-step until all that is left is a or i. Your program will answer the question: what is the biggest word in the given dictionary that can be reduced?

A typical example, not very long is this:

planets
plants
pants
pant
ant
an
a

So I though this could be a good exercise for F#. Several good dictionaries, both small and big, can be found here.

My approach was to read the dictionary and build a list for each word length: one for 1-letter words, one for 2-letter words, etc. This could be a Dictionary<int, list>, and let’s call it simply dictionary. Then these lists could be traversed and create a second set of lists (let’s call this reducedwords), but only with the words that meet the defined reduction.

An good approach would be to take each list of words from the initial dictionary, starting with the list with words of 2 letters, and for each word to delete one letter at a time. Then check to see if the resulting word already exists in the list from reducedwords corresponding to a length smaller with 1. If it’s there, that this word could be reduced and should be added to the list from reducedwords corresponding to the current length. In other words, the algorithm would be:

copy dictionary[1] to reducedwords[1]

for length = 2 to maxwordlength do
  foreach word in dictionary[length]
    foreach letter in word
      * delete the letter
      * if new word exists in reducedwords[length-1] then
          * add new word to reducedwords[length]

Here is a function for reading a dictionary file:

let readWords (filename:string) =
   let dictionary = new Dictionary< int, list< string > >()
   let reader = new StreamReader(filename)
   let word = String.Empty
   let fileend = ref false
   while (!fileend = false) do
      let word = reader.ReadLine()
      if word = null then
         fileend := true
      else
         let len = String.length word
         let ok, words = dictionary.TryGetValue(len)
         if ok then dictionary.[len] <- words@[word]
         else dictionary.[len] <- [word]
   done
   dictionary.[1] <- ["a";"e";"i";"o";"u"]
   dictionary

One I have the dictionary read, I can apply the algorithm and generate a second Dictionary structure. The following function also returns length of the longest word(s) in the reduced dictionary. This is useful for printing.

let findReducedWords (dictionary:Dictionary< int, list< string > >) =
   let reducedwords = new Dictionary< int, list< string > >()
   reducedwords.[1] <- dictionary.[1]

   let notdone = ref true
   let i = ref 2
   while (!notdone = true) do
      let ok, words = dictionary.TryGetValue(!i)
      if ok <> true then
         notdone := false
      else
         let added = ref false
         let ok, reducedpre = reducedwords.TryGetValue(!i - 1)
         reducedwords.[!i] <- []
         words |> List.iter (fun word ->
            for j = 0 to word.Length-1 do
               let trimmedword = word.Remove(j, 1)
               if reducedpre.Exists(fun x -> x = trimmedword) then
                  reducedwords.[!i] <- reducedwords.[!i]@[word]
                  added := true
            done;
         )
         reducedwords.[!i] <- reducedwords.[!i] |> Set.of_list |> List.of_seq
         if !added then
            i := !i + 1
         else
            reducedwords.Remove(!i) |> ignore
            notdone := false
   done

   (reducedwords, !i-1)

Since the problem is about printing only the longest such reductions, I'll only consider starting from the list of reduced words that has the longest words. That's why I needed findReducedWords to return the length of longest reducible word.

To print these paths I apply the same algorithm as before. The only difference is that I build a list with the words in the reducing path, starting with the longest word and ending with a 1-letter word.

let rec printSequence
   (word:string)
   (reducedwords:Dictionary< int, list< string > >)
   (path:list< string >) =
   match word.Length with
   | 1 ->
      path@[word] |> List.iter (fun x -> printf "%s " x);
      printfn ""
   | _ ->
      let ok, reducedpre = reducedwords.TryGetValue(word.Length-1)
      if ok then
         for j = 0 to word.Length-1 do
            let trimmedword = word.Remove(j, 1)
            if reducedpre.Exists(fun x -> x = trimmedword) then
               printSequence trimmedword reducedwords (path@[word])
         done;

let printSequences
   (reducedwords:Dictionary< int, list< string > >)
   (maxlen:int) =
   let ok, words = reducedwords.TryGetValue(maxlen)
   if ok then
      words |> List.iter (fun word ->
         printSequence word reducedwords [])

The only thing left to do is calling all these functions:

let main()=
   printfn "reading dictionary..."
   let dictionary = readWords "huge_dict.txt"

   printfn "building structures..."
   let reducedwords, max = findReducedWords dictionary

   printfn "printing matches..."
   printSequences reducedwords max

   Console.WriteLine("Press any key to continue...")
   Console.ReadKey()

main()

My results for this dictionary were:

complecting completing competing compting comping coping oping ping pig pi i
complecting completing competing compting comping coping oping ping pin in i
complecting completing competing compting comping coping oping ping pin pi i

Hits for this post: 8749 .

The Game of Life is a cellular automaton devised by the John Horton Conway in 1970. It is the best-known example of a cellular automaton. It consists of a collection of cells which, based on a few mathematical rules, can live, die or multiply. Depending on the initial conditions, the cells form various patterns throughout the course of the game.

Here you can learn more about the game:

I decided to implement this in F#. With all the code involving the user interface, it was nearly 300 lines of code. Pretty neat!

Using the game

The life’s arena can take several sizes:

  • Tiny: 15 x 10
  • Small: 30 x 20
  • Medium: 60 x 40
  • Large: 120 x 80
  • Huge: 150 x 100

The size can be changed from the Game > Size menu.

The following commands are available from the menu:

  • Reset (Ctrl + R): Resets the game, all cells die
  • Randomize (Ctrl + G): Randomly initialize alive cells
  • Start/Stop (Ctrl + B): Starts or stops continuos creation of new generations
  • Step (Ctrl + N): Creates a new generation of cells

In addition the followin commands are available from the File menu:

  • Save (Ctrl + S): saves game to a bitmap file called gameolife_.bmp; this file is created in the working folder
  • Save As… (Ctrl + Shift + S): saves the game to a file, giving you the possibility to chose the location and format (bmp, jpg, gif and png)

Note: The state of the game can be changed (killing cells, making others alive) simply by clicking with the mouse on the game panel.

Implementation in F#

Some comments on the implementation of the game. For more have a look at the code.

I defined two records, one called Cell and one called World. The cell reprezents a cell and has a flag called Alive (which is self explanatory), and World represents the ‘arena of life’, containing a matrix of Cells.

type Cell =
 { Alive : bool;}

type World =
 { Width : int;
   Height : int;
   Cells: Cell[,];}

Various arena sizes are defined in a descriminated union and the mapping between that and values for width and height are done in this function:

type GameSize =
   | Tiny
   | Small
   | Medium
   | Large
   | Huge

let sizeMapping (size:GameSize) =
   match size with
   | Tiny -> (15,10)
   | Small -> (30, 20)
   | Medium -> (60, 40)
   | Large -> (120, 80)
   | Huge -> (150, 100)

A next generation of cells is computed based on the rules of the game. For each cell the number of alived neighbors is computed and then the state of the cell is changes (if necessary).

let nextGeneration (world:World) =
   let alive = {new Cell with Alive = true}
   let dead = {new Cell with Alive = false}
   let newcells = Array2.create world.Width world.Height dead
   for i = 0 to world.Width - 1 do
      for j = 0 to world.Height - 1 do
         let ncount = neighbors world j i
         match world.Cells.(i,j).Alive with
         | true -> if ncount = 2 || ncount = 3 then newcells.(i,j) <- alive
                   else newcells.(i,j) <- dead
         | _ -> if ncount = 3 then newcells.(i,j) <- alive
                  else newcells.(i,j) <- dead
      done
   done
   { world with Cells = newcells}

For computing the number of neighbors, I considered 9 cases:

  • cell is one of the four corners
  • cell is on one of the four edges (not a corner)
  • cell is anywhere else in the matrix

Depending on this position a list of neighbors is created and folded to compute the number of alive neighboring cells.

let sum (neighbors:Cell list) =
   neighbors |> List.fold_left (fun sum x -> if x.Alive then sum+1 else sum) 0

let neighbors (world:World) row col =
   match row, col with
   | _,_ when row = 0 && col = 0 -> sum [world.Cells.(1,0);world.Cells.(1,1);world.Cells.(0,1)]
   | _,_ when row = 0 && col = world.Width-1 -> sum [world.Cells.(world.Width-2,0);world.Cells.(world.Width-2,1);world.Cells.(world.Width-1,1)]
   | _,_ when row = world.Height-1 && col = 0 -> sum [world.Cells.(0, world.Height-2);world.Cells.(1, world.Height-2);world.Cells.(1,world.Height-1)]
   | _,_ when row = world.Height-1 && col = world.Width-1 -> sum [world.Cells.(world.Width-1, world.Height-2);world.Cells.(world.Width-2, world.Height-2);world.Cells.(world.Width-2,world.Height-1)]
   | _,_ when row = 0 && col > 0 && col < world.Width-1 -> sum [world.Cells.(col-1,0);world.Cells.(col-1,1);world.Cells.(col,1);world.Cells.(col+1,1);world.Cells.(col+1,0);]
   | _,_ when row = world.Height-1 && col > 0 && col < world.Width-1 -> sum [world.Cells.(col-1,row);world.Cells.(col-1,row-1);world.Cells.(col,row-1);world.Cells.(col+1,row-1);world.Cells.(col+1,row);]
   | _,_ when col = 0 && row > 0 && row < world.Height-1 -> sum [world.Cells.(0,row-1);world.Cells.(1,row-1);world.Cells.(1,row);world.Cells.(1,row+1);world.Cells.(0,row+1);]
   | _,_ when col = world.Width-1 && row > 0 && row < world.Height-1 -> sum [world.Cells.(col,row-1);world.Cells.(col-1,row-1);world.Cells.(col-1,row);world.Cells.(col-1,row+1);world.Cells.(col,row+1);]
   | _,_ -> sum [world.Cells.(col-1,row-1);world.Cells.(col,row-1);world.Cells.(col+1,row-1);world.Cells.(col+1,row);world.Cells.(col+1,row+1);world.Cells.(col,row+1);world.Cells.(col-1,row+1);world.Cells.(col-1,row);]

The rest is basically user interface code. But I made a parallel version of the game, when then computation of the next generation of cells is parallelized with the Parallel FX framework. Here is how the function looks.

let nextGeneration (world:World) =
   let alive = {new Cell with Alive = true}
   let dead = {new Cell with Alive = false}
   let newcells = Array2.create world.Width world.Height dead
   Parallel.For(0, world.Width, (fun i ->
      for j = 0 to world.Height - 1 do
         let ncount = neighbors world j i
         match world.Cells.(i,j).Alive with
         | true -> if ncount = 2 || ncount = 3 then newcells.(i,j) <- alive
                   else newcells.(i,j) <- dead
         | _ -> if ncount = 3 then newcells.(i,j) <- alive
                  else newcells.(i,j) <- dead
      done
   ))
   { world with Cells = newcells}

Source code

Here are the available downloads

, , , Hits for this post: 8820 .

In this post I’ll show some F# constructs, all put together in a simple application that modifies file names that match a criteria. This would be an application that is started from a console with the following command line options:

filesmod.exe -f < folder > [-r] -p < pattern > [-pre < prefix >] [-suf < suffix >]
  -f < folder>   specifies the folder where the files are located
  -r            indicates that the specified folder should be parsed
                recursively
  -p < pattern>  indicates a pattern used for filtering files
  -pre < prefix> indicats a prefix to the added to all files
                that match the criteria
  -suf < suffix> indicats a suffix to the added to all files
                that match the criteria

Reading command line

The command line arguments can be retrieved using the Environment class from the .NET framework. This class has a static method called GetCommandLineArgs() that returns a list of the passed arguments.
We can define a type that contains all the parsed arguments.

type CommandOptions =
    { mutable Folder : string;
      mutable Recursive : bool;
      mutable Pattern : string;
      mutable Prefix : string;
      mutable Suffix : string;}

This mutable record can be instantiated, and the value can be mutated while parsing the arguments. This is how you instantiate it:

   let cmdops =
      { new CommandOptions
        with Folder = String.Empty
        and Recursive = false
        and Pattern = String.Empty
        and Prefix = String.Empty
        and Suffix = String.Empty }

Parsing the command line arguments can be done with pattern matching. This is the equivalent of switches in C+/C#/Java, only more powerful.
Basically, I’m checking each argument, and if it’s a flag in the command line (-f, -r, -p, -pre, -sub) I take the next argument and put it in the appropriate property of the record.

   try
      let args = Environment.GetCommandLineArgs()
      for i = 0 to args.Length-1 do
         match args.(i) with
            | "-f" when i+1 <= args.Length-1 -> cmdops.Folder <- args.(i+1)
            | "-r" -> cmdops.Recursive <- true
            | "-p" when i+1 <= args.Length-1 -> cmdops.Pattern <- args.(i+1)
            | "-pre" when i+1 <= args.Length-1 -> cmdops.Prefix <- args.(i+1)
            | "-suf" when i+1 <= args.Length-1 -> cmdops.Suffix <- args.(i+1)
            | _ -> ()
      done
   with e -> printfn "%s" e.Message

There are two things you could notice here. The first is the try … with block that makes sure any possible exception is caught.
The second is the quarding the rules with the contidion that the current argument is not the last one in the list. (-f should be followed by a folder, -suf by a suffix, etc.)
You can see what in the when statement:

when i+1 <= args.Length-1

Getting the files in a directory

We can get all files in a folder, using the following algorithm:

  • get all the files in the current folder
  • get all the sub-folders in the current folder and for each of them apply the algorithm again

That is spelled "recursion"!. Our function should take several arguments: the path of a folder, a pattern for mathing filenames and a flag indicating whether sub-folders should be parsed or not.

let rec allFiles dir pattern r =
    seq
        { for file in Directory.GetFiles(dir, pattern) do
            yield file
          if r then
            for subdir in Directory.GetDirectories(dir) do
                for file in allFiles subdir pattern r do
                    yield file }

The above function is recursive and returns a sequence of file names. Sequences are lazy, which means that successive elements are computed and returned on demand, when they are needed.
That is the opposite of a list or array, whose elements are created at once. The keyword 'yield' here is used to return a new value as the sequence is iterated.

Processing the files

To process the files, we simply iterate over the sequence of files from the specified folder, match it against the provided parttern, and if there is a match, apply the prefix and/or suffix transformation.

   for name in (allFiles cmdops.Folder "*.*" cmdops.Recursive) do
      let file = new FileInfo(name)
      if(Regex.IsMatch(file.Name, cmdops.Pattern, RegexOptions.Singleline)) then
        let filename = file.Name.Substring(0, file.Name.LastIndexOf('.'))
        let newname = file.Directory.FullName+"\\"+cmdops.Prefix+filename+cmdops.Suffix+file.Extension
        System.IO.File.Move(file.FullName, newname)
        printfn "%s -> %s" file.FullName newname
   done

Well, I have two cores on my machine, and since the Parallel FX framework is available, I like to use it. So here is the parallel version of that:

       try
          Parallel.ForEach(allFiles cmdops.Folder "*.*" cmdops.Recursive, fun name ->
             let file = new FileInfo(name)
             if(Regex.IsMatch(file.Name, cmdops.Pattern, RegexOptions.Singleline)) then
                let filename = file.Name.Substring(0, file.Name.LastIndexOf('.'))
                let newname = file.Directory.FullName+"\\"+cmdops.Prefix+filename+cmdops.Suffix+file.Extension
                System.IO.File.Move(file.FullName, newname)
                printfn "%s -> %s" file.FullName newname)
       with e -> printfn "%s" e.InnerException.Message

The provided (via command line) pattern is a regular expression. Initially, the folder is checked for all files and then these files are matched against this regular expression.

As I was saying in a previous post, if you use PFX, you have to add a reference to the System.threading.dll assembly, which requires a reference to the System.Core.dll assembly.
That should be specified at the project's propertyes.

-r C:\WINDOWS\assembly\GAC_MSIL\System.Core\3.5.0.0__b77a5c561934e089\System.Core.dll -r "C:\Program Files\Microsoft Parallel Extensions Dec07 CTP\System.Threading.dll"

Putting all together

All that put together looks like this:

#light

open System
open System.IO
open System.Text.RegularExpressions

open System.Threading

let rec allFiles dir pattern r =
    seq
        { for file in Directory.GetFiles(dir, pattern) do
            yield file
          if r then
            for subdir in Directory.GetDirectories(dir) do
                for file in allFiles subdir pattern r do
                    yield file }

let showUsage() =
    printfn "filesmod.exe -f < folder > [-r] -p < pattern > [-pre < prefix >] [-suf < suffix >]"
    printfn "  -f < folder >\tspecifies the folder where the files are located"
    printfn "  -r\t\tindicates that the specified folder should be parsed\n\t\trecursively"
    printfn "  -p < pattern >\tindicates a pattern used for filtering files"
    printfn "  -pre < prefix >\tindicats a prefix to the added to all files\n\t\tthat match the criteria"
    printfn "  -suf < suffix >\tindicats a suffix to the added to all files\n\t\tthat match the criteria"

type CommandOptions =
    { mutable Folder : string;
      mutable Recursive : bool;
      mutable Pattern : string;
      mutable Prefix : string;
      mutable Suffix : string;}

let main()=
   let cmdops =
      { new CommandOptions
        with Folder = String.Empty
        and Recursive = false
        and Pattern = String.Empty
        and Prefix = String.Empty
        and Suffix = String.Empty }

   try
      let args = Environment.GetCommandLineArgs()
      for i = 0 to args.Length-1 do
         match args.(i) with
            | "-f" when i+1 <= args.Length-1 -> cmdops.Folder <- args.(i+1)
            | "-r" -> cmdops.Recursive <- true
            | "-p" when i+1 <= args.Length-1 -> cmdops.Pattern <- args.(i+1)
            | "-pre" when i+1 <= args.Length-1 -> cmdops.Prefix <- args.(i+1)
            | "-suf" when i+1 <= args.Length-1 -> cmdops.Suffix <- args.(i+1)
            | _ -> ()
      done
   with e -> printfn "%s" e.Message

   if ((String.IsNullOrEmpty(cmdops.Prefix) && String.IsNullOrEmpty(cmdops.Suffix)) ||
        String.IsNullOrEmpty(cmdops.Pattern) ||
        String.IsNullOrEmpty(cmdops.Folder)) then
        showUsage()
   else
       try
          Parallel.ForEach(allFiles cmdops.Folder "*.*" cmdops.Recursive, fun name ->
             let file = new FileInfo(name)
             if(Regex.IsMatch(file.Name, cmdops.Pattern, RegexOptions.Singleline)) then
                let filename = file.Name.Substring(0, file.Name.LastIndexOf('.'))
                let newname = file.Directory.FullName+"\\"+cmdops.Prefix+filename+cmdops.Suffix+file.Extension
                System.IO.File.Move(file.FullName, newname)
                printfn "%s -> %s" file.FullName newname)
       with e -> printfn "%s" e.InnerException.Message

   Console.WriteLine("Press any key to continue...")
   Console.ReadKey()

main()

Of course, the options available in this application (on file name changes) are pretty limited, but that can be extended at will.

, , , , , Hits for this post: 10499 .

In my last post I was writing about parallelizing loops with Parallel.For in C#. Today I though it would be nice to try that in F#. So, here is the benchmarking of the matrix multiplication and the bubblesort algorithm in F4.

Matrices Multiplication

I started with a create_matrix function that creates and randomly initializes a matrix of doubles.

let create_matrix rows columns =
   let rnd = System.Random()
   Array2.init rows columns (fun i j -> rnd.NextDouble())

Sequential multiplication could look like this:

let multiply_sequential (m1:float[,]) (m2:float[,]) =
   let rows1 = Array2.length1 m1
   let cols1 = Array2.length2 m1
   let rows2 = Array2.length1 m2
   let cols2 = Array2.length2 m2
   let result = Array2.create rows1 cols2 0.0

   if(cols1 <> rows2) then
      failwith "Matrices size incorrect for multiplication!"

   for i = 0 to rows1-1 do
      for j = 0 to cols2-1 do
         for k = 0 to cols1-1 do
            result.[i,j] <- result.[i,j] + m1.[i,k] * m2.[k,j]
         done
      done
   done
   result

Parallelizing it only implies replacing the outer loop with Parallel.For.

let multiply_parallel (m1:float[,]) (m2:float[,]) =
   let rows1 = Array2.length1 m1
   let cols1 = Array2.length2 m1
   let rows2 = Array2.length1 m2
   let cols2 = Array2.length2 m2
   let result = Array2.create rows1 cols2 0.0

   if(cols1 <> rows2) then
      failwith "Matrices size incorrect for multiplication!"

   Parallel.For(0, rows1, (fun i->
      for j = 0 to cols2-1 do
         for k = 0 to cols1-1 do
            result.[i,j] <- result.[i,j] + m1.[i,k] * m2.[k,j]))
   result

We can test those function and get the same output as I had in my previous post with this:

let main() =
   let step = 100
   let size = ref step
   while (!size <= step*10) do
      let m1 = create_matrix !size !size
      let m2 = create_matrix !size !size
      printfn "Matrices size: %dx%d" !size !size

      printf "Sequential...\t"
      let starts = DateTime.Now
      let ms = multiply_sequential m1 m2
      printfn "%a" output_any (DateTime.Now - starts)

      printf "Parallel...\t"
      let startp = DateTime.Now
      let ms = multiply_parallel m1 m2
      printfn "%a" output_any (DateTime.Now - startp)

      size := !size + step
   done   

main()

Before running, you have to make sure you add System.Threading.dll to the referred assemblies. And since this one depends on System.Core.dll, you also have to add this one. In you are using Visual Studio and a F# project, you can add the two references from the project properties.

-r c:\Windows\assembly\GAC_MSIL\System.Core\3.5.0.0__b77a5c561934e089\System.Core.dll -r "C:\Program Files\Microsoft Parallel Extensions Dec07 CTP\System.Threading.dll"

The results are shown below:

Matrices size: 100x100
Sequential...   00:00:00.1250000
Parallel...     00:00:00.1250000
Matrices size: 200x200
Sequential...   00:00:00.9218750
Parallel...     00:00:00.6093750
Matrices size: 300x300
Sequential...   00:00:03.1093750
Parallel...     00:00:01.9375000
Matrices size: 400x400
Sequential...   00:00:07.5000000
Parallel...     00:00:04.7343750
Matrices size: 500x500
Sequential...   00:00:15.1562500
Parallel...     00:00:09.3125000
Matrices size: 600x600
Sequential...   00:00:25.7031250
Parallel...     00:00:16.5312500
Matrices size: 700x700
Sequential...   00:00:41.9843750
Parallel...     00:00:26.4375000
Matrices size: 800x800
Sequential...   00:01:03.5781250
Parallel...     00:00:39.8281250
Matrices size: 900x900
Sequential...   00:01:32.1093750
Parallel...     00:00:57.3125000
Matrices size: 1000x1000
Sequential...   00:02:07.0468750
Parallel...     00:01:18.9687500

Array Sorting

First, I created a function, create_array, that creates and randomly initializes an array of doubles.

let create_array size =
    let rnd = new Random()
    let arr = Array.create size 0.0
    for i = 0 to arr.Length-1 do
        arr.(i) <- rnd.NextDouble()
    arr

The sequential bubblesort implementation is quite straight forward, of course.

let bubblesort_seq (arr : double array) =
    for i = 0 to arr.Length-1 do
        for j = 0 to arr.Length-1 do
            if (arr.(i).CompareTo(arr.(j)) < 0) then
                let temp = arr.(j)
                arr.(j) <- arr.(i)
                arr.(i) <- temp
    arr

Parallelizing it, again, only means replacing the outer for loop with Parallel.For.

let bubblesort_parallel (arr : double array) =
    Parallel.For(0, arr.Length, (fun i ->
            for j = 0 to arr.Length-1 do
                if (arr.(i).CompareTo(arr.(j)) < 0) then
                    let temp = arr.(j)
                    arr.(j) <- arr.(i)
                    arr.(i) <- temp))

    arr

And this is how the two functions were used:

let main()=
    let step = 5000
    let size = ref step
    while (!size <= step*10) do
        let arr = create_array !size
        printfn "Array size: %d" arr.Length

        printf "Sequential...\t"
        let starts = DateTime.Now
        let arrs = bubblesort_seq arr
        printfn "%a" output_any (DateTime.Now - starts)

        printf "Parallel...\t"
        let startp = DateTime.Now
        let arrp = bubblesort_parallel arr
        printfn "%a" output_any (DateTime.Now - startp)

        size := !size + step
    done

main()

The output for the program is:

Array size: 5000
Sequential...   00:00:00.2343750
Parallel...     00:00:00.1562500
Array size: 10000
Sequential...   00:00:00.8593750
Parallel...     00:00:00.5156250
Array size: 15000
Sequential...   00:00:01.9531250
Parallel...     00:00:01.1718750
Array size: 20000
Sequential...   00:00:03.3125000
Parallel...     00:00:02.1562500
Array size: 25000
Sequential...   00:00:05.4062500
Parallel...     00:00:03.5312500
Array size: 30000
Sequential...   00:00:07.4062500
Parallel...     00:00:05.0312500
Array size: 35000
Sequential...   00:00:10.6562500
Parallel...     00:00:06.8906250
Array size: 40000
Sequential...   00:00:13.2343750
Parallel...     00:00:08.9375000
Array size: 45000
Sequential...   00:00:17.6406250
Parallel...     00:00:11.4687500
Array size: 50000
Sequential...   00:00:20.8281250
Parallel...     00:00:14.2187500

If you compare the output with the one from C#, you'll notice that the times are smaller. It looks like F# is faster than C#. Of course it can get faster if I replace the call to CompareTo() with operator <.

if (arr.(i).CompareTo(arr.(j)) < 0) then
if (arr.(i) < arr.(j)) then

In this case the results look like this:

Array size: 5000
Sequential...   00:00:00.1093750
Parallel...     00:00:00.1093750
Array size: 10000
Sequential...   00:00:00.4843750
Parallel...     00:00:00.2343750
Array size: 15000
Sequential...   00:00:01.1093750
Parallel...     00:00:00.4687500
Array size: 20000
Sequential...   00:00:01.9062500
Parallel...     00:00:00.8437500
Array size: 25000
Sequential...   00:00:03.0156250
Parallel...     00:00:01.2500000
Array size: 30000
Sequential...   00:00:04.3437500
Parallel...     00:00:01.8906250
Array size: 35000
Sequential...   00:00:05.9062500
Parallel...     00:00:02.4375000
Array size: 40000
Sequential...   00:00:07.7656250
Parallel...     00:00:03.3593750
Array size: 45000
Sequential...   00:00:09.8281250
Parallel...     00:00:04.0312500
Array size: 50000
Sequential...   00:00:12.2031250
Parallel...     00:00:05.2343750
Hits for this post: 8000 .

Here is a list of, what I consider, good F# resources. Hopefully you’ll find them helpful.

Official Documentation

Forums and Wikis

  • F# wikicontains articles, tips and sample code
  • hubFSa very good forum focused on F#

Blogs

Hits for this post: 6108 .

Yesterday I wrote about list in F#. Today I’ll write about arrays, which unlike lists are a mutable flat storage and cannot be resized. That means you have to create a new array if you want to remove or add elements. Advantages include constant look-up time and the fact that they can store a large amount of data.

You can create a literal array in a similar way with the lists, placing the elements between [| |]:

let data1 = [|1;2;3;4|]
printfn "data1: %a" output_any data1
data1: [|1; 2; 3; 4|]

The empty literal array is [||].

To create an array you can either use Array.create or Array.init. They both create and initialize an array, but the second makes a lambda expression, which allows advance initialization possibilities. The following creates an array with 10 elements initialized to 1:

let data2 = Array.create 10 1
printfn "data2: %a" output_any data2

Here is the output:

data2: [|1; 1; 1; 1; 1; 1; 1; 1; 1; 1|]

The same can be achieved using Array.init:

let data3 = Array.init 10 (fun x -> 1)
printfn "data3: %a" output_any data3
data3: [|1; 1; 1; 1; 1; 1; 1; 1; 1; 1|]

But we can use Array.init to initialize the elements from 1 to N for instance:

let data4 = Array.init 10 (fun x -> x+1)
printfn "data4: %a" output_any data4
data4: [|1; 2; 3; 4; 5; 6; 7; 8; 9; 10|]

The arrays are mutable data structures. Elements are accessed with .[] or .(). The following code shows how to set the elements of an array:

let data5 = Array.create 10 0
for i = 0 to (Array.length data5)-1 do
   data5.[i] <- i+1

printfn "data5: %a" output_any data5
data5: [|1; 2; 3; 4; 5; 6; 7; 8; 9; 10|]

You can iterate over the elements of an array with Array.iter and Array.iteri, the second also providing access to the index of the elements.

data4 |> Array.iter (fun x -> printf "%d " x)
printfn ""

data4 |> Array.iteri (fun i x -> printfn "data4(%d) = %d" i x)
1 2 3 4 5 6 7 8 9 10
data4(0) = 1
data4(1) = 2
data4(2) = 3
data4(3) = 4
data4(4) = 5
data4(5) = 6
data4(6) = 7
data4(7) = 8
data4(8) = 9
data4(9) = 10

Retrieving the length of the array can either be done with Array.length arr or with arr.Length.

for i = 0 to data4.Length-1 do
   printfn "data4(%d) = %d" i data4.(i)
data4(0) = 1
data4(1) = 2
data4(2) = 3
data4(3) = 4
data4(4) = 5
data4(5) = 6
data4(6) = 7
data4(7) = 8
data4(8) = 9
data4(9) = 10

Like the lists, arrays provide mapping that creates a new array by applying a function on all the elements of an array (with Array.map) or two arrays (with Array.map2).

let data6 = data4 |> Array.map (fun x -> x*2)
printfn "data6: %a" output_any data6

let data7 = Array.map2 (fun x y -> x+y) data4 data6
printfn "data7: %a" output_any data7
data6: [|2; 4; 6; 8; 10; 12; 14; 16; 18; 20|]
data7: [|3; 6; 9; 12; 15; 18; 21; 24; 27; 30|]

A copy of an array can be done with Array.copy.

let data7 = Array.copy data6
printfn "data7: %a" output_any data7
data7: [|2; 4; 6; 8; 10; 12; 14; 16; 18; 20|]

Appending elements to an array is also possible with Array.append, but the result is a new array, created by concatenating two arrays.

let data8 = Array.append data7 [|100|]
printfn "data8: %a" output_any data8
data8: [|2; 4; 6; 8; 10; 12; 14; 16; 18; 20; 100|]

The last operation of arrays I'm going to mention here is the folding, which allows applying a function to all the elements of an array, threading an accumulator argument in the process. The following example shows how to compute the sum of the elements of an array.

let data9 = [|1;2;3;4|]
let sum1 = (Array.fold_left (fun acc x-> x + acc) 0 data9)
let sum2 = (Array.fold_right (fun acc x-> x + acc) data9 0)
printfn "sum1 = %d" sum1
printfn "sum2 = %d" sum2
sum1 = 10
sum2 = 10

Hits for this post: 10984 .

In this post I will talk about the lists in F#, one of the fundamental concepts of the language. What should be said from the very beginning is that list are imutable single linked list. That means whenever you change a list, a new list is created.

You can declare a list in the following ways:

let list1 = [1;2;3;4]
let list2 = 5::6::7::8::[]

To print the content of the list you can do this:

printfn “list1: %a” output_any list1
printfn “list2: %a” output_any list2

list1: [1; 2; 3; 4]
list2: [5; 6; 7; 8]

You can concatenate two lists with operator @:

let list3 = list1 @ list2
printfn "list3: %a" output_any list3

list3: [1; 2; 3; 4; 5; 6; 7; 8]

and you can append elements to the beginning of the list with operator ::

let list4 = -1::0::list3
printfn "list4: %a" output_any list4

list4: [-1; 0; 1; 2; 3; 4; 5; 6; 7; 8]

You can also use the List (defined in Microsoft.FSharp.Code) functionality to print a list by iterating over its elements:

list3: [1; 2; 3; 4; 5; 6; 7; 8]

1 2 3 4

The same can be achieved using the pipe operator:

list1 |> List.iter (fun x -> printf "%d " x)

You can also iterate and get the index of the list elements, with List.iteri:

list1 |> List.iteri (fun i x -> printfn "list1[%d] : %d " i x)

list1[0] : 1
list1[1] : 2
list1[2] : 3
list1[3] : 4

List have a special representation, a head followed by a tail, that is in turn another list (including empty list []). Let's consider the list [1;2;3]. It has the head 1, and the tail [2;3]. The tail, in turn, has the head 2 and the tail [3]. This tail has the head 3 and the tail [], which is the empty list.
You can see the head and tail of a list with List.hd and List.td:

printfn "head list1: %a" output_any (List.hd list1)
printfn "tail list1: %a" output_any (List.tl list1)

The ouput for list1 [1;2;3] is:

head list1: 1
tail list1: [2;3]

Enough with basic things. Let's try working with lists.

1. Minimum and maximum from a list

We can compute the maximum (or minimum) of a list using the following algorithm:

  • if the list is empty, indicate error
  • if the list has only one element, that is the maximum (or minimum)
  • if the list has at least to elements, compute the maximum between that element and the maximum from the rest of the list

That sounds like a recursive operation, which can be simply put in F# like this:

let rec greatest_element l =
    match l with
    | [] -> failwith "empty list"
    | [x] -> x
    | x::rest -> max x (greatest_element rest)

let rec smallest_element l =
    match l with
    | [] -> failwith "empty list"
    | [x] -> x
    | x::rest -> min x (smallest_element rest)

We can use that like this:

let list1 = [1;2;3;4;-4;-3;-2;-1]
let list2 = []   

try
   printfn "maximum from list1: %d" (greatest_element list1)
   printfn "minimum from list1: %d" (smallest_element list1)

   printfn "maximum from list2: %d" (greatest_element list2)
   printfn "minimum from list2: %d" (smallest_element list2)
with
   Failure msg ->
      printfn "Error: %s" msg

and the output would be:

maximum from list1: 4
minimum from list1: -4
Error: empty list

2. Reversing a list

How would we reverse a list? We should take the last element and append to it the one before the last. To the new list we append the one before the one before the end, etc. That again sounds recursive.

let rec revert_list l =
   match l with
   | [] -> []
   | x::rest -> (revert_list rest) @ [x]

let list1 = [1;2;3;4;-4;-3;-2;-1]

printfn "list1: %a" output_any list1
printfn "list2: %a" output_any (revert_list list1)

And here is the output:

list1: [1; 2; 3; 4; -4; -3; -2; -1]
list2: [-1; -2; -3; -4; 4; 3; 2; 1]

3. Inserting in a list

So how could we insert an element in a list, before or after a specified element? We can use the following algorithm:

  • if the list is empty, the new list has one element (the one to insert)
  • else, if the head is the element we are looking for, create a list, with the new element either before the head, or between the head and the tail
  • else, if the head is not the element we are looking for, append the head to a list created by inserting the new element in the tail.

You got that right, recursion again.

let rec insert_after elem newelem l =
    match l with
    | [] -> [newelem]
    | x::rest -> if x = elem then
                    (x::newelem::rest)
                 else
                     x::(insert_after elem newelem rest)

let rec insert_before elem newelem l =
    match l with
    | [] -> [newelem]
    | x::rest -> if x = elem then
                    (newelem::x::rest)
                 else
                    x::(insert_before elem newelem rest)        

let list1 = [1;2;3;4;-4;-3;-2;-1]
let list2 = insert_after 4 6 list1
let list3 = insert_before 6 5 list2

printfn "list1: %a" output_any list1
printfn "list2: %a" output_any list2
printfn "list3: %a" output_any list3

And the output is:

list1: [1; 2; 3; 4; -4; -3; -2; -1]
list2: [1; 2; 3; 4; 6; -4; -3; -2; -1]
list3: [1; 2; 3; 4; 5; 6; -4; -3; -2; -1]

4. Removing elements from a list

As a last exercise, let's consider the removing of elements from a list. The following steps can be used to remove elements:

  • if the list is empty, return an empty list
  • if the list is not empty and the head meets the removing criteria, return a list obtained by reiterating the algorithm on the tail of the list
  • if the list is not empty and the head does not meet the removing criteria, return a list obtained by appending the head to a list optained by reiterating the algorithm on the tail of the list

let rec remove_if l predicate =
    match l with
    | [] -> []
    | x::rest -> if predicate(x) then
                    (remove_if rest predicate)
                 else
                     x::(remove_if rest predicate)

The great thing about this implementation is that we can pass a lambda expression as a predicate, and use it to specify the criteria for removing elements. We can remove like that, for instance, the odd elements, or the even elements, or the negative elements. Here is some sample code:

let list1 = [1;2;3;4;-4;-3;-2;-1]

let list2 = remove_if list1 (fun x -> (abs x &&&1) = 1)
let list3 = remove_if list1 (fun x -> (abs x &&&1) = 0)
let list4 = remove_if list1 (fun x -> x < 0)

printfn "%a" output_any list1
printfn "%a" output_any list2
printfn "%a" output_any list3
printfn "%a" output_any list4

The output for this sample is:

[1; 2; 3; 4; -4; -3; -2; -1]
[2; 4; -4; -2]
[1; 3; -3; -1]
[1; 2; 3; 4]

I hope this will help you to get a grip on how you can work on lists in F#.

Hits for this post: 11034 .