<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Marius Bancila's Blog &#187; pattern</title>
	<atom:link href="http://mariusbancila.ro/blog/tag/pattern/feed/" rel="self" type="application/rss+xml" />
	<link>http://mariusbancila.ro/blog</link>
	<description>Sharing my opinions and ideas!</description>
	<lastBuildDate>Fri, 06 Apr 2012 13:45:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Applying File Transformations with F#</title>
		<link>http://mariusbancila.ro/blog/2008/05/06/applying-file-transformations-with-f/</link>
		<comments>http://mariusbancila.ro/blog/2008/05/06/applying-file-transformations-with-f/#comments</comments>
		<pubDate>Tue, 06 May 2008 20:10:57 +0000</pubDate>
		<dc:creator>Marius Bancila</dc:creator>
				<category><![CDATA[F#]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[files]]></category>
		<category><![CDATA[parallel]]></category>
		<category><![CDATA[pattern]]></category>
		<category><![CDATA[record]]></category>
		<category><![CDATA[recursive]]></category>
		<category><![CDATA[regex]]></category>

		<guid isPermaLink="false">http://mariusbancila.ro/blog/?p=121</guid>
		<description><![CDATA[In this post I&#8217;ll show some F# constructs, all put together in a simple application that modifies file names that match a criterion. This application is started from a console with the following command line options: filesmod.exe -f < folder > [-r] -p < pattern > [-pre < prefix >] [-suf [...]]]></description>
			<content:encoded><![CDATA[<p>In this post I&#8217;ll show some F# constructs, all put together in a simple application that modifies file names that match a criterion. This application is started from a console with the following command line options:</p>
<pre class="console">
filesmod.exe -f < folder > [-r] -p < pattern > [-pre < prefix >] [-suf < suffix >]
  -f < folder >     specifies the folder where the files are located
  -r                indicates that the specified folder should be searched
                    recursively
  -p < pattern >    indicates a pattern used for filtering files
  -pre < prefix >   indicates a prefix to be added to all files
                    that match the criterion
  -suf < suffix >   indicates a suffix to be added to all files
                    that match the criterion
</pre>
<h3>Reading command line</h3>
<p>The command line arguments can be retrieved using the Environment class from the .NET Framework. This class has a static method called GetCommandLineArgs() that returns an array of strings holding the arguments; the first element is the name of the executable.<br />
We can define a record type that holds all the parsed arguments.</p>
<pre class="prettyprint">
type CommandOptions =
    { mutable Folder : string;
      mutable Recursive : bool;
      mutable Pattern : string;
      mutable Prefix : string;
      mutable Suffix : string;}
</pre>
<p>This mutable record can be instantiated, and its fields can be mutated while parsing the arguments. This is how you instantiate it:</p>
<pre class="prettyprint">
   let cmdops =
      { Folder = String.Empty;
        Recursive = false;
        Pattern = String.Empty;
        Prefix = String.Empty;
        Suffix = String.Empty }
</pre>
<p>Parsing the command line arguments can be done with pattern matching. This is the equivalent of the switch statement in C++/C#/Java, only more powerful.<br />
Basically, I&#8217;m checking each argument, and if it&#8217;s a flag that takes a value (-f, -p, -pre, -suf) I take the next argument and put it in the appropriate field of the record. The -r flag takes no value; it just sets Recursive to true.</p>
<pre class="prettyprint">
   try
      let args = Environment.GetCommandLineArgs()
      for i = 0 to args.Length-1 do
         match args.[i] with
            | "-f" when i+1 <= args.Length-1 -> cmdops.Folder <- args.[i+1]
            | "-r" -> cmdops.Recursive <- true
            | "-p" when i+1 <= args.Length-1 -> cmdops.Pattern <- args.[i+1]
            | "-pre" when i+1 <= args.Length-1 -> cmdops.Prefix <- args.[i+1]
            | "-suf" when i+1 <= args.Length-1 -> cmdops.Suffix <- args.[i+1]
            | _ -> ()
   with e -> printfn "%s" e.Message
</pre>
<p>There are two things to notice here. The first is the try &#8230; with block, which makes sure any exception is caught.<br />
The second is that the rules are guarded with the condition that the current argument is not the last one in the list (-f should be followed by a folder, -suf by a suffix, etc.).<br />
You can see that in the when guard:</p>
<pre class="prettyprint">
when i+1 <= args.Length-1
</pre>
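<p>As a variation on the loop above, the arguments could also be consumed as a list, so that each flag that takes a value is matched together with the element that follows it and no index guard is needed. This is only a sketch: Options, defaults and parse are names made up for this illustration, and the record is immutable here, so each rule returns an updated copy instead of mutating a field.</p>
<pre class="prettyprint">
// Hypothetical immutable counterpart of the CommandOptions record.
type Options =
    { Folder : string
      Recursive : bool
      Pattern : string
      Prefix : string
      Suffix : string }

let defaults =
    { Folder = ""; Recursive = false; Pattern = ""; Prefix = ""; Suffix = "" }

// Walk the argument list; a flag that takes a value consumes the next element too.
let rec parse args opts =
    match args with
    | "-f" :: value :: rest -> parse rest { opts with Folder = value }
    | "-r" :: rest -> parse rest { opts with Recursive = true }
    | "-p" :: value :: rest -> parse rest { opts with Pattern = value }
    | "-pre" :: value :: rest -> parse rest { opts with Prefix = value }
    | "-suf" :: value :: rest -> parse rest { opts with Suffix = value }
    | _ :: rest -> parse rest opts   // unknown argument, or a flag with no value after it
    | [] -> opts

let opts = parse [ "-f"; "C:\\temp"; "-r"; "-p"; "\\.txt$"; "-pre"; "old_" ] defaults
printfn "%A" opts
</pre>
<p>With real input, the list would come from the array returned by GetCommandLineArgs(), for instance through List.ofArray.</p>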
<h3>Getting the files in a directory</h3>
<p>We can get all files in a folder, using the following algorithm:</p>
<ul>
<li>get all the files in the current folder</li>
<li>get all the sub-folders in the current folder and for each of them apply the algorithm again</li>
</ul>
<p>That is spelled "recursion"! Our function takes three arguments: the path of a folder, a pattern for matching file names, and a flag indicating whether sub-folders should be traversed.</p>
<pre class="prettyprint">
let rec allFiles dir pattern r =
    seq
        { for file in Directory.GetFiles(dir, pattern) do
            yield file
          if r then
            for subdir in Directory.GetDirectories(dir) do
                for file in allFiles subdir pattern r do
                    yield file }
</pre>
<p>The above function is recursive and returns a sequence of file names. Sequences are lazy, which means that successive elements are computed and returned on demand, when they are needed.<br />
That is the opposite of a list or an array, whose elements are all created up front. The keyword 'yield' returns the next value as the sequence is iterated.</p>
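<p>To see the laziness at work, here is a small sketch that has nothing to do with files. The names numbers and firstThree are made up for this illustration, and Seq.take and Seq.toList are the names used by current versions of the F# core library. The printfn call inside the sequence runs only for the elements that are actually requested.</p>
<pre class="prettyprint">
// A long sequence; nothing is computed when it is defined.
let numbers =
    seq { for i in 1 .. 1000000 do
            printfn "producing %d" i
            yield i }

// Asking for three elements computes only what is needed to deliver them,
// not the whole million.
let firstThree = numbers |> Seq.take 3 |> Seq.toList
printfn "%A" firstThree
</pre>
<p>The same holds for allFiles: a sub-folder is listed only when the iteration reaches it.</p>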
<h3>Processing the files</h3>
<p>To process the files, we simply iterate over the sequence of files from the specified folder, match each file name against the provided pattern, and if there is a match, apply the prefix and/or suffix transformation.</p>
<pre class="prettyprint">
   for name in (allFiles cmdops.Folder "*.*" cmdops.Recursive) do
      let file = new FileInfo(name)
      if Regex.IsMatch(file.Name, cmdops.Pattern, RegexOptions.Singleline) then
        // GetFileNameWithoutExtension also copes with names that have no extension
        let filename = Path.GetFileNameWithoutExtension(file.Name)
        let newname = Path.Combine(file.Directory.FullName, cmdops.Prefix+filename+cmdops.Suffix+file.Extension)
        System.IO.File.Move(file.FullName, newname)
        printfn "%s -> %s" file.FullName newname
</pre>
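<p>The computation of the new name can also be pulled out into a small function of its own, which makes it possible to check the result without touching any file on disk. This is a sketch under my own assumptions: renamed is a hypothetical helper, and it expects a path that has a directory part (GetDirectoryName returns null for a root path, which Path.Combine would reject).</p>
<pre class="prettyprint">
open System.IO

// Hypothetical helper: builds the new full path for a file,
// given the prefix and the suffix to apply to its name.
let renamed (prefix : string) (suffix : string) (path : string) =
    let dir = Path.GetDirectoryName(path)
    let stem = Path.GetFileNameWithoutExtension(path)
    let ext = Path.GetExtension(path)   // includes the leading dot, or "" if there is none
    Path.Combine(dir, prefix + stem + suffix + ext)

printfn "%s" (renamed "old_" "_v2" (Path.Combine("docs", "report.txt")))
printfn "%s" (renamed "old_" "_v2" (Path.Combine("docs", "README")))
</pre>
<p>The loop body then reduces to a call to File.Move with the original path and the value returned by this function.</p>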
<p>Well, I have two cores on my machine, and since the Parallel FX framework is available, I like to use it. So here is the parallel version of that:</p>
<pre class="prettyprint">
       try
          Parallel.ForEach(allFiles cmdops.Folder "*.*" cmdops.Recursive, fun name ->
             let file = new FileInfo(name)
             if Regex.IsMatch(file.Name, cmdops.Pattern, RegexOptions.Singleline) then
                // GetFileNameWithoutExtension also copes with names that have no extension
                let filename = Path.GetFileNameWithoutExtension(file.Name)
                let newname = Path.Combine(file.Directory.FullName, cmdops.Prefix+filename+cmdops.Suffix+file.Extension)
                System.IO.File.Move(file.FullName, newname)
                printfn "%s -> %s" file.FullName newname)
       // InnerException can be null, so fall back to the outer message
       with e -> printfn "%s" (match e.InnerException with null -> e.Message | inner -> inner.Message)
</pre>
<p>The pattern supplied on the command line is a regular expression. All the files in the folder are enumerated first (with the "*.*" wildcard), and their names are then matched against this regular expression.</p>
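<p>Since the pattern is a regular expression and not a shell wildcard, something like *.txt would not work here; the regular expression for "ends with .txt" is \.txt$. The following sketch shows the difference an anchor makes. The list of names and the matching function are made up for this illustration.</p>
<pre class="prettyprint">
open System.Text.RegularExpressions

let names = [ "report.txt"; "notes.TXT"; "image.png"; "txt.bak" ]

// Keep only the names matched by the given regular expression.
let matching (pattern : string) =
    names |> List.filter (fun n -> Regex.IsMatch(n, pattern, RegexOptions.Singleline))

printfn "%A" (matching @"\.txt$")   // ["report.txt"]
printfn "%A" (matching "txt")       // ["report.txt"; "txt.bak"]
</pre>
<p>The match is case sensitive, which is why notes.TXT is left out; adding RegexOptions.IgnoreCase would include it.</p>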
<p>As I said in a previous post, if you use PFX, you have to add a reference to the System.Threading.dll assembly, which in turn requires a reference to the System.Core.dll assembly.<br />
These references are specified in the project's properties.</p>
<blockquote><p>
-r C:\WINDOWS\assembly\GAC_MSIL\System.Core\3.5.0.0__b77a5c561934e089\System.Core.dll -r "C:\Program Files\Microsoft Parallel Extensions Dec07 CTP\System.Threading.dll"
</p></blockquote>
<h3>Putting it all together</h3>
<p>All that put together looks like this:</p>
<pre class="prettyprint">
#light

open System
open System.IO
open System.Text.RegularExpressions

open System.Threading

let rec allFiles dir pattern r =
    seq
        { for file in Directory.GetFiles(dir, pattern) do
            yield file
          if r then
            for subdir in Directory.GetDirectories(dir) do
                for file in allFiles subdir pattern r do
                    yield file }

let showUsage() =
    printfn "filesmod.exe -f < folder > [-r] -p < pattern > [-pre < prefix >] [-suf < suffix >]"
    printfn "  -f < folder >\tspecifies the folder where the files are located"
    printfn "  -r\t\tindicates that the specified folder should be searched\n\t\trecursively"
    printfn "  -p < pattern >\tindicates a pattern used for filtering files"
    printfn "  -pre < prefix >\tindicates a prefix to be added to all files\n\t\tthat match the criterion"
    printfn "  -suf < suffix >\tindicates a suffix to be added to all files\n\t\tthat match the criterion"

type CommandOptions =
    { mutable Folder : string;
      mutable Recursive : bool;
      mutable Pattern : string;
      mutable Prefix : string;
      mutable Suffix : string;}

let main()=
   let cmdops =
      { Folder = String.Empty;
        Recursive = false;
        Pattern = String.Empty;
        Prefix = String.Empty;
        Suffix = String.Empty }

   try
      let args = Environment.GetCommandLineArgs()
      for i = 0 to args.Length-1 do
         match args.[i] with
            | "-f" when i+1 <= args.Length-1 -> cmdops.Folder <- args.[i+1]
            | "-r" -> cmdops.Recursive <- true
            | "-p" when i+1 <= args.Length-1 -> cmdops.Pattern <- args.[i+1]
            | "-pre" when i+1 <= args.Length-1 -> cmdops.Prefix <- args.[i+1]
            | "-suf" when i+1 <= args.Length-1 -> cmdops.Suffix <- args.[i+1]
            | _ -> ()
   with e -> printfn "%s" e.Message

   if ((String.IsNullOrEmpty(cmdops.Prefix) &#038;&#038; String.IsNullOrEmpty(cmdops.Suffix)) ||
        String.IsNullOrEmpty(cmdops.Pattern) ||
        String.IsNullOrEmpty(cmdops.Folder)) then
        showUsage()
   else
       try
          Parallel.ForEach(allFiles cmdops.Folder "*.*" cmdops.Recursive, fun name ->
             let file = new FileInfo(name)
             if Regex.IsMatch(file.Name, cmdops.Pattern, RegexOptions.Singleline) then
                // GetFileNameWithoutExtension also copes with names that have no extension
                let filename = Path.GetFileNameWithoutExtension(file.Name)
                let newname = Path.Combine(file.Directory.FullName, cmdops.Prefix+filename+cmdops.Suffix+file.Extension)
                System.IO.File.Move(file.FullName, newname)
                printfn "%s -> %s" file.FullName newname)
       // InnerException can be null, so fall back to the outer message
       with e -> printfn "%s" (match e.InnerException with null -> e.Message | inner -> inner.Message)

   Console.WriteLine("Press any key to continue...")
   // ReadKey returns the key that was pressed; we only wait for it
   Console.ReadKey() |> ignore

main()
</pre>
<p>Of course, the file name changes this application supports are pretty limited, but they can be extended at will.</p>
]]></content:encoded>
			<wfw:commentRss>http://mariusbancila.ro/blog/2008/05/06/applying-file-transformations-with-f/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

