Applying File Transformations with F#

In this post I’ll show some F# constructs, all put together in a simple application that renames files whose names match a criterion. It is a console application driven by command line options (-f, -r, -p, -pre, -suf).

Reading command line

The command line arguments can be retrieved using the Environment class from the .NET framework. This class has a static method called GetCommandLineArgs() that returns an array of strings with the passed arguments; the first element is the name of the executing program.
We can define a type that contains all the parsed arguments.

This mutable record can be instantiated, and its fields can be mutated while parsing the arguments. This is how you instantiate it:
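As a minimal sketch, such a record and its instantiation could look like the code below. The type name Options, the field names, the default values and the reading of -r as a recursion switch are assumptions made for illustration.

```fsharp
open System

// Hypothetical record for the parsed options; the type and field names are assumptions.
type Options =
    { mutable Folder : string      // value of -f
      mutable Recursive : bool     // -r, assumed here to be a switch without a value
      mutable Pattern : string     // value of -p
      mutable Prefix : string      // value of -pre
      mutable Suffix : string }    // value of -suf

// Instantiate the record with default values (the defaults are assumptions).
let options =
    { Folder = "."
      Recursive = false
      Pattern = ".*"
      Prefix = ""
      Suffix = "" }

// Mutable fields are updated with the <- operator.
options.Folder <- @"C:\temp"

// GetCommandLineArgs returns a string array; element 0 is the program name.
let args = Environment.GetCommandLineArgs()
```

Every field is marked mutable so that the parser can fill the record in place, one flag at a time.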

Parsing the command line arguments can be done with pattern matching. This is the equivalent of switch statements in C++/C#/Java, only more powerful.
Basically, I’m checking each argument, and if it’s a flag in the command line (-f, -r, -p, -pre, -suf) I take the next argument and put it in the appropriate field of the record.

There are two things worth noticing here. The first is the try … with block, which makes sure any possible exception is caught.
The second is that the rules are guarded with the condition that the current argument is not the last one in the list (-f should be followed by a folder, -suf by a suffix, etc.).
You can see that in the when guard:
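A sketch of such a parser is shown below. The function name parseArgs and the Options record are hypothetical, and -r is assumed to be a switch that takes no value.

```fsharp
// Hypothetical record for the parsed options; the names are assumptions.
type Options =
    { mutable Folder : string
      mutable Recursive : bool
      mutable Pattern : string
      mutable Prefix : string
      mutable Suffix : string }

// Walks the argument array by index and fills in the record.
// Returns false if an exception was raised while parsing.
let parseArgs (args : string[]) (options : Options) : bool =
    try
        for i in 0 .. args.Length - 1 do
            match args.[i] with
            // Each guard checks that the flag is not the last argument,
            // so args.[i + 1] is known to exist.
            | "-f" when i < args.Length - 1 -> options.Folder <- args.[i + 1]
            | "-p" when i < args.Length - 1 -> options.Pattern <- args.[i + 1]
            | "-pre" when i < args.Length - 1 -> options.Prefix <- args.[i + 1]
            | "-suf" when i < args.Length - 1 -> options.Suffix <- args.[i + 1]
            // Assumed to be a plain switch.
            | "-r" -> options.Recursive <- true
            // Values, unknown arguments and trailing flags are ignored.
            | _ -> ()
        true
    with ex ->
        eprintfn "Could not parse the arguments: %s" ex.Message
        false
```

In this sketch the loop also visits the value that follows a flag; it simply falls through to the wildcard rule. A flag that appears last fails its guard and is ignored in the same way.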

Getting the files in a directory

We can get all files in a folder, using the following algorithm:

  • get all the files in the current folder
  • get all the sub-folders in the current folder and for each of them apply the algorithm again

That is spelled “recursion”! Our function should take several arguments: the path of a folder, a pattern for matching file names and a flag indicating whether sub-folders should be parsed or not.

The above function is recursive and returns a sequence of file names. Sequences are lazy, which means that successive elements are computed and returned on demand, when they are needed.
That is the opposite of a list or an array, whose elements are all created up front. The keyword ‘yield’ is used here to return a new value as the sequence is iterated.
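A recursive, lazy function of this kind might be sketched as follows. The name getFiles and its parameter names are hypothetical, and the pattern is assumed to be a wildcard search pattern such as *.* that is passed to Directory.GetFiles.

```fsharp
open System.IO

// Hypothetical helper: lazily yields the files of a folder and,
// if requested, the files of all its sub-folders.
let rec getFiles (folder : string) (pattern : string) (recursive : bool) : seq<string> =
    seq {
        // Files located directly in the current folder.
        for file in Directory.GetFiles(folder, pattern) do
            yield file
        // Apply the same algorithm to each sub-folder.
        if recursive then
            for subFolder in Directory.GetDirectories(folder) do
                yield! getFiles subFolder pattern recursive
    }
```

Here yield returns a single element, while yield! splices in all the elements of the sequence produced by the recursive call. Nothing is read from disk until the sequence is iterated.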

Processing the files

To process the files, we simply iterate over the sequence of files from the specified folder, match each file name against the provided pattern, and if there is a match, apply the prefix and/or suffix transformation.
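The loop could be sketched like this. The names transformName and processFiles are hypothetical, and placing the suffix before the file extension is an assumption about what the transformation should do.

```fsharp
open System.IO
open System.Text.RegularExpressions

// Hypothetical helper: builds the new path by adding the prefix and the suffix
// around the file name (the suffix goes before the extension; this is an assumption).
let transformName (prefix : string) (suffix : string) (path : string) : string =
    let dir = Path.GetDirectoryName(path)
    let name = Path.GetFileNameWithoutExtension(path)
    let ext = Path.GetExtension(path)
    Path.Combine(dir, prefix + name + suffix + ext)

// Renames every file whose name matches the regular expression.
let processFiles (files : seq<string>) (pattern : string) (prefix : string) (suffix : string) =
    let regex = Regex(pattern)
    for file in files do
        if regex.IsMatch(Path.GetFileName(file)) then
            File.Move(file, transformName prefix suffix file)
```

Keeping the name transformation in a separate pure function makes it easy to check without touching the file system.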

Well, I have two cores on my machine, and since the Parallel FX framework is available, I like to use it. So here is the parallel version of that:
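As a sketch only: the code below uses Parallel.ForEach from the System.Threading.Tasks namespace of the released Task Parallel Library, which is an assumption and may differ from the API of the Dec07 CTP this post was written against. The function name processFilesParallel is hypothetical, and the suffix is assumed to go before the file extension.

```fsharp
open System
open System.IO
open System.Text.RegularExpressions
open System.Threading.Tasks

// Hypothetical parallel variant: matching files are renamed on worker threads.
let processFilesParallel (files : seq<string>) (pattern : string) (prefix : string) (suffix : string) =
    let regex = Regex(pattern)
    // Renames a single file if its name matches the regular expression.
    let renameIfMatch (file : string) =
        if regex.IsMatch(Path.GetFileName(file)) then
            let newName =
                prefix + Path.GetFileNameWithoutExtension(file) + suffix + Path.GetExtension(file)
            File.Move(file, Path.Combine(Path.GetDirectoryName(file), newName))
    // Parallel.ForEach partitions the sequence and runs the action on several threads.
    Parallel.ForEach(files, Action<string>(renameIfMatch)) |> ignore
```

Each file is renamed independently of the others, so the iterations share no mutable state and can safely run in parallel.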

The pattern provided on the command line is a regular expression. The folder is first scanned for all files, and the file names are then matched against this regular expression.

As I was saying in a previous post, if you use PFX, you have to add a reference to the System.Threading.dll assembly, which requires a reference to the System.Core.dll assembly.
These references should be specified in the project’s properties.

-r C:\WINDOWS\assembly\GAC_MSIL\System.Core\\System.Core.dll -r "C:\Program Files\Microsoft Parallel Extensions Dec07 CTP\System.Threading.dll"

Putting it all together

All that put together looks like this:

Of course, the options available in this application (on file name changes) are pretty limited, but that can be extended at will.
