matthew ephraim

Portland Pictures

March 30th, 2009

Little Boxes

Seaweed

Geometry

Seattle/Portland

Seattle Pictures

March 29th, 2009

Flying over Seattle

Moss Handle

Windows XP

Seattle/Portland

Dynamic Rake Tasks

February 5th, 2009

This is a quick little Ruby trick that I thought was pretty neat. If you have a Rake file and you have a bunch of similar tasks that you don’t want to have to write by hand, you can dynamically generate the Rake tasks by using a little bit of Ruby code.

For example, maybe I want to change the contents of a few different directories on my file system. And say that I want to have a different rake task for each directory, so that I can type something like this:

Command line
rake modify:music

Or like this:

Command line
rake modify:documents

With a small number of directories, it wouldn’t be a big deal to write a task for each directory. But, if I have 100 directories, that might get to be a little bit tedious. So, instead of writing the tasks manually, I’ll have Ruby do it:

rakefile.rb
namespace :modify do
	[:music, :documents, :application, :home].each do |directory|
		task directory do
			change_contents(directory)
		end
	end
end

In the code above I’m taking an array of 4 directories and using that array to call the Rake task function to create a new task for each directory. I’m also passing the name of the directory on to a function that changes the contents of the specific directory.
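
Here's a slightly fuller sketch of the same idea that can be run as a plain Ruby script. A couple of things in it are assumptions for illustration only: the body of change_contents is a stand-in that just records the directory name, and the modify:all task is an extra that isn't in the Rakefile above.

```ruby
require "rake"

# Hypothetical stand-in for change_contents: it only records
# which directories were "modified"
MODIFIED = []
def change_contents(directory)
  MODIFIED << directory
end

DIRECTORIES = [:music, :documents, :application, :home]

namespace :modify do
  # Generate one task per directory, each with its own description
  DIRECTORIES.each do |directory|
    desc "Modify the contents of the #{directory} directory"
    task directory do
      change_contents(directory)
    end
  end

  # Illustrative extra: a task that depends on every generated task
  desc "Run every generated modify task"
  task all: DIRECTORIES
end

# Invoke from Ruby; from a shell this would be `rake modify:all`
Rake::Task["modify:all"].invoke
```

Because each task has a desc, the generated tasks should also show up in the output of rake -T when this lives in a real Rakefile.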

It’s a pretty simple trick, but it’s definitely come in handy for me.

Syntax Highlighting is Fun

February 3rd, 2009

A Simple C# Wrapper for Ghostscript

January 6th, 2009
Update:

This post has become somewhat popular (relative to my other posts anyway), so I decided to take the code and release it as an open source library. More information here

PDF thumbnails with Ghostscript

I’ve been looking for a while now for a simple solution for generating thumbnail images from PDF files. I wanted something that would let me programmatically load in a PDF file, choose a page, and generate a thumbnail from that page. As far as I can tell, there are only a few open source options and of those options I haven’t been able to find one that I could get working with C#.

After seeing it recommended a few times, I decided to take a look at Ghostscript. Ghostscript is an open source interpreter for PostScript and PDF files. Among other things, Ghostscript allows you to generate images from PDF pages, which is exactly what I needed.

Ghostscript is a tool that can be used from the command line, which is how most of the examples I’ve found online have used it. Unfortunately, this is what a call to Ghostscript looks like:

gs -q -dQUIET -dPARANOIDSAFER -dBATCH -dNOPAUSE \
-dNOPROMPT -dMaxBitmap=500000000 -dFirstPage=1 \
-dAlignToPixels=0 -dGridFitTT=0 -sDEVICE=jpeg \
-dTextAlphaBits=4 -dGraphicsAlphaBits=4 -r100x100 \
-sOutputFile=output.jpg input.pdf

Not pretty. Luckily, I needed to automate the task of creating the thumbnails, so I wouldn’t need to manually generate the parameters to be passed to the command line tool. However, I still felt like there might be a better way to hook into Ghostscript’s functionality. So, I decided to take advantage of the API provided by Ghostscript by writing a simple C# wrapper for the API to use in my current ASP.NET project.

A simple Ghostscript wrapper

The first thing I needed was the Windows version of the Ghostscript DLL, which can be obtained here. Once I included the DLL in my project, I needed to expose the unmanaged API functions to my C# wrapper function.

C#
[DllImport("gsdll32.dll", EntryPoint = "gsapi_new_instance")]
private static extern int CreateAPIInstance(out IntPtr pinstance, 
                                        IntPtr caller_handle);

[DllImport("gsdll32.dll", EntryPoint = "gsapi_init_with_args")]
private static extern int InitAPI(IntPtr instance, int argc, IntPtr argv);

[DllImport("gsdll32.dll", EntryPoint = "gsapi_exit")]
private static extern int ExitAPI(IntPtr instance);

[DllImport("gsdll32.dll", EntryPoint = "gsapi_delete_instance")]
private static extern void DeleteAPIInstance(IntPtr instance);

Above, I complained about the long list of parameters that need to be passed to the Ghostscript command line tool. Those same parameters need to be passed to the API, so the next thing I did was create a function that wrapped up the functionality for building the list of parameters. For simplicity, I left in a lot of default parameters, but the function could be expanded later on to allow more specific parameters.

C#

private string[] GetArgs(string inputPath, string outputPath, 
                         int firstPage, int lastPage, int width, int height)
{
    return new[]
    {
        // Keep gs from writing information to standard output
        "-q",                     
        "-dQUIET",
       
        "-dPARANOIDSAFER", // Run this command in safe mode
        "-dBATCH", // Keep gs from going into interactive mode
        "-dNOPAUSE", // Do not prompt and pause for each page
        "-dNOPROMPT", // Disable prompts for user interaction           
        "-dMaxBitmap=500000000", // Set high for better performance
        
        // Set the starting and ending pages
        String.Format("-dFirstPage={0}", firstPage),
        String.Format("-dLastPage={0}", lastPage),   
        
        // Configure the output anti-aliasing, resolution, etc
        "-dAlignToPixels=0",
        "-dGridFitTT=0",
        "-sDEVICE=jpeg",
        "-dTextAlphaBits=4",
        "-dGraphicsAlphaBits=4",
        // Note: -r sets the output resolution in DPI (XxY), not pixel dimensions
        String.Format("-r{0}x{1}", width, height),

        // Set the input and output files
        String.Format("-sOutputFile={0}", outputPath),
        inputPath
    };
}

Once I had a way of creating a list of parameters, I could start using the Ghostscript API functions. I created a function called CallAPI that would accept an array of parameters and use them to call the Ghostscript API.

The function I created for building a list of arguments returned an array of strings, but to use the API I needed to convert each of those parameters into an ANSI null-terminated byte array (I added the code I used to do this to the bottom of this post). Then I needed to allocate some space in memory for each of those arguments and get pointers to each one of them.

C#
var argStrHandles = new GCHandle[args.Length];
var argPtrs = new IntPtr[args.Length];

// Create a handle for each of the arguments after 
// they've been converted to an ANSI null terminated
// string. Then store the pointers for each of the handles
for (int i = 0; i < args.Length; i++)
{
    argStrHandles[i] = GCHandle.Alloc(StringToAnsi(args[i]), GCHandleType.Pinned);
    argPtrs[i] = argStrHandles[i].AddrOfPinnedObject();
}

// Get a new handle for the array of argument pointers
var argPtrsHandle = GCHandle.Alloc(argPtrs, GCHandleType.Pinned);

Then, to use the newly converted parameters, I needed to create an instance of the Ghostscript API and pass them into the initialization function.

C#
// Get a pointer to an instance of the Ghostscript API 
// and run the API with the current arguments
IntPtr gsInstancePtr;
CreateAPIInstance(out gsInstancePtr, IntPtr.Zero);
InitAPI(gsInstancePtr, args.Length, argPtrsHandle.AddrOfPinnedObject());

The call to InitAPI runs Ghostscript and generates any requested files at the output path.

Now the only remaining thing I needed to do was clean up the memory that was allocated for the API. To handle this, I wrote a cleanup function that takes in the items that need to be cleaned up. The API provides some cleanup functions, so I called those in the cleanup function as well.

C#
private void Cleanup(GCHandle[] argStrHandles, GCHandle argPtrsHandle, 
                                       IntPtr gsInstancePtr)
{
    for (int i = 0; i < argStrHandles.Length; i++)
        argStrHandles[i].Free();

    argPtrsHandle.Free();

    ExitAPI(gsInstancePtr);
    DeleteAPIInstance(gsInstancePtr);
}

One last thing I added to the wrapper was a simple function for generating thumbnails from a source PDF file. Technically, I could have just used the CallAPI function to do that, but I wanted to hide the details of working with the API from code outside of the wrapper class.

C#

public void GeneratePageThumbs(string inputPath, string outputPath, 
                              int firstPage, int lastPage, int width, int height)
{
    CallAPI(GetArgs(inputPath, outputPath, firstPage, lastPage, width, height));
}

The GeneratePageThumbs function doesn't do anything other than call the CallAPI function. However, in the future, I'd like to provide other functions that use the Ghostscript API as well. If anyone has any ideas for improving the code, drop me a line.

Update: Here is the code I used to convert the arguments to null-terminated byte arrays. There might be a better way to do this in .NET; this is just the quick solution I'm using.

C#
public static byte[] StringToAnsi(string original)
{
    var strBytes = new byte[original.Length + 1];
    for (int i = 0; i < original.Length; i++)
        strBytes[i] = (byte)original[i];

    strBytes[original.Length] = 0;
    return strBytes;
}

Update: This code has been open sourced

Treating C# Like A Scripting Language

January 2nd, 2009

Creating code on the fly

One thing that I like about scripting languages is their ability to dynamically modify code during runtime. Ruby and JavaScript, for example, both give you the ability to load in code directly from a string and execute it as part of your program. While that sort of thing can be dangerous, it also gives you access to some really fun metaprogramming techniques.
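
For comparison, here's a minimal sketch of what that looks like in Ruby. The class and method names are borrowed from the C# example later in this post. The source string is an assumption standing in for code that might otherwise be read from a file.

```ruby
# Source code held in a plain string, as if it had been read from a file.
# The single-quoted heredoc keeps Ruby from interpolating anything in it.
source = <<~'RUBY'
  class MyClass
    def message(text)
      text
    end
  end
RUBY

# Evaluate the string at the top level so MyClass becomes a normal class
eval(source, TOPLEVEL_BINDING)

# Look the class up by name and call the method dynamically
my_class = Object.const_get(:MyClass).new
result = my_class.public_send(:message, "Hello World!")
puts result
```

The usual caveat applies: eval runs whatever is in the string, so this is only reasonable for code you trust.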

While working on a simple DSL for one of my ASP.NET sites, I started to wonder if C# had some similar functionality that I could take advantage of. In particular, I wanted to load in C# code from a file and execute the code inside of it. What I found was that C# does indeed have the ability to accomplish this task, albeit in sort of an ugly way.

Using C# to compile C#

Most scripting languages give you a function that allows you to directly evaluate a block or raw string of code as soon as it’s encountered. Because C# is a compiled language, it’s a little bit more complicated. The C# code needs to be compiled into an assembly before it can be used. And then classes from the compiled code can be instantiated directly from the assembly.

C# code can be compiled on the fly with an instance of the CSharpCodeProvider class (there’s a similar class for VB.Net as well). Additionally, you can create an instance of the CompilerParameters class, which contains a collection of parameters that will be used when compiling your code. In the example below, I’m creating a new C# compiler and a set of parameters that will tell it not to create an assembly file, but to instead compile the new assembly in memory. I also tell the compiler to include System.dll as a reference assembly.

C#
// Create a new instance of the C# compiler
var compiler = new CSharpCodeProvider();

// Create some parameters for the compiler
var parms    = new System.CodeDom.Compiler.CompilerParameters
{
    GenerateExecutable      = false,
    GenerateInMemory        = true
};
parms.ReferencedAssemblies.Add("System.dll");

Once a C# compiler has been created, you can use it to compile raw source into an assembly. CSharpCodeProvider allows you to compile code from a variety of sources. In the example below, I’m using the CompileAssemblyFromSource method to compile my code directly from an array of strings. CompileAssemblyFromSource will look at the code provided and return an instance of the CompilerResults class.

C#
// Try to compile the string into an assembly
var results = compiler.CompileAssemblyFromSource(parms, new string[]
{@" using System;

    class MyClass
    {
        public void Message(string message)
        {
            Console.Write(message);
        }               
    }"});

One thing to note is that the compilation method will complete regardless of whether or not the code has compiled successfully. To make sure your code has compiled, you need to check the Errors collection that is part of the CompilerResults instance returned by CompileAssemblyFromSource. If there were no errors, the code was compiled successfully and you can begin using the assembly.

Using the compiled code

Once your code is compiled into an assembly, you can use that assembly to create instances of classes from your source code and use reflection to invoke methods and get and set properties of those classes. In the example below, I’m creating an instance of MyClass and storing it as an object. I’m then using reflection to invoke the Message method on the class.

C#
// If there weren't any errors get an instance of "MyClass" and invoke
// the "Message" method on it
if (results.Errors.Count == 0)
{
    var myClass = results.CompiledAssembly.CreateInstance("MyClass");
    myClass.GetType().
            GetMethod("Message").
            Invoke(myClass, new []{ "Hello World!" });
}

It’s not exactly pretty, but it gets the job done. Scripting languages make it much easier to accomplish this sort of task, but it’s still nice to see that it can be done in C#.

A Quick Little Extension to the Spreadsheet Gem

December 31st, 2008

The problem

One of the most tedious tasks I have to complete at work is importing excel spreadsheets into a database. Often, a client will give me a spreadsheet of data that needs to be imported into an existing database. And, if I’m really lucky, the data needs to be reformatted before it can be imported. I tried a few different methods of accomplishing this task before I settled on using the Ruby Spreadsheet gem to import and format the spreadsheet data. I then use the Ruby DBI library to import the data into a database.

Spreadsheet works great for reading data from a spreadsheet file, but one thing that always annoyed me was that I needed to know the index of a column before I could get the value from it. So, for example, if a spreadsheet stored the value for First Name in the 3rd column, I would need to know that the 3rd slot in each row array represented the first name value. I thought it would be really nice if I could access that value for a row by saying something like row[:first_name]. I had a little bit of extra time over the holidays, so I decided to see if it would be possible to make this functionality happen.

My solution

The first thing I did was create my own row class for the Spreadsheet library. I decided that I would create a new class called HashRow that would allow you to access the values of each row by using symbols that represented each column header or by using the original method of accessing row values with an index number. For simplicity, I assumed that the first row of the spreadsheet was the header row. I also simplified values for each column header by stripping out any characters that couldn’t be represented by a symbol. So, a header value of First Name becomes :first_name.

I also added a few convenience methods to HashRow. The header? method returns true if the row is a header row and the empty? method returns true if the row is completely empty.

Ruby
# Wraps Spreadsheet::Excel::Row row array with extra functionality
class Spreadsheet::HashRow < Spreadsheet::Excel::Row
	attr_reader :index
	
	# Keeps the original row value array 
	# and also creates a hash of values 
	def initialize(row, col_hash, index)
		@val_array = row
		@val_hash  = get_val_hash(col_hash)
		@index 	   = index
	end
	
	# Is this row the first row in the spreadsheet?
	def header?
		@index == 0
	end
	
	# Checks if every cell in the row is set to nil
	def empty?
		@val_array.compact.length == 0
	end
			
	# Returns the value in the row based on the index 
	# or key passed in. Integer values returns the row value 
	# by index in the array and symbols return the value 
	# for the symbol or string
	def [](value)
		if value.is_a? Integer
			@val_array[value]
		else
			@val_hash[value.to_s.downcase]
		end
	end
	
	private 
	
	# Uses a hash of columns to build another hash for the 
	# values in the array with keys for the column heads
	def get_val_hash(col_hash)
		col_hash.keys.inject({}) do |acc, key|
			acc.merge(key => @val_array[col_hash[key]])
		end
	end
end

Once I had my HashRow class, I needed to open the Spreadsheet::Excel::Worksheet class so that I could replace the row method with a new method that returned an instance of the HashRow class. I aliased the old row method and used it inside my new row method to pass in a reference to the original row array. I also added a private method to help determine the index of each column and a private method to format the column names.

Ruby
# Extends Spreadsheet::Excel::Worksheet so that the Rows become HashRows
class Spreadsheet::Excel::Worksheet		
	# Override the original row method with a new method 
	# that returns the custom HashRow class instead of an array
	alias_method :old_row, :row
	def row(value)
	    Spreadsheet::HashRow.new(old_row(value), get_col_indexes, value)
	end
		
	private
		
	# Returns a hash that contains key/value pairs for the column 
	# headers and the index of each header
	def get_col_indexes
	    @col_indexes ||= old_row(0).inject({}) do |hash, cell|
	        hash.merge(get_col_key(cell.to_s) => hash.length)
	    end
	end
			
	# Converts the name of a column header to a 
	# specially formatted string
	def get_col_key(col)
	    col.gsub(/[\(\)]+/, "").
	         gsub(/\s/, "_").
	         downcase
	end
end

Once I included my new spreadsheet extension file, I could use the Spreadsheet library much as I had before, only now I could get at row values either by column index or by column header name.

Ruby
Spreadsheet.open(FILE).worksheet(0).each do |row|
	unless row.empty? || row.header?
		puts row[:first_name]
		puts row[:last_name]
	end
end
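
The header lookup can also be tried without the Spreadsheet gem or an Excel file. The sketch below is a standalone approximation, not the extension itself: plain arrays stand in for worksheet rows, the sample data is made up, and the helper names (col_key, col_indexes, cell) are hypothetical.

```ruby
# Same normalization as get_col_key: drop parentheses, replace
# whitespace with underscores, lowercase everything
def col_key(header)
  header.to_s.gsub(/[()]+/, "").gsub(/\s/, "_").downcase
end

# Map each normalized header to its column position
def col_indexes(header_row)
  header_row.each_with_index.map { |header, i| [col_key(header), i] }.to_h
end

# Look a value up by integer index or by header key;
# unknown keys return nil
def cell(row, indexes, key)
  return row[key] if key.is_a?(Integer)

  position = indexes[key.to_s.downcase]
  position && row[position]
end

# Made-up sample data: the first row is the header row
rows = [
  ["First Name", "Last Name", "Phone (Home)"],
  ["Ada", "Lovelace", "555-0100"]
]

indexes = col_indexes(rows.first)

puts cell(rows[1], indexes, :first_name)
puts cell(rows[1], indexes, :phone_home)
puts cell(rows[1], indexes, 1)
```

Note that only parentheses are stripped from the headers, so a header containing other punctuation would keep it in its key.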

Screwing Around With Screw.Unit

November 26th, 2008

I just started using the delightfully named Screw.Unit for JavaScript testing, and I have to say that, so far, I’m pretty pleased with it. Screw.Unit is a JavaScript testing framework that allows you to easily create and run unit tests for your JavaScript code. For example, here is a really simple hello world type test written with Screw.Unit:

JavaScript/Screw.Unit
describe("A simple hello world test", function()
{
    it("is a test for truth example", function()
    {
        expect(true).to(equal, true);
    });
});

If I were to run the test on a page that had the framework loaded in, I’d get a nice pretty rundown of my tests, showing that the hello world test had indeed passed (true does, in fact, equal true).

One thing that I immediately wondered about was how the keywords in the framework worked. At first, I assumed they were global variables and was a little put off by that idea. Would functionality like “describe”, “expect” and “it” interfere with other code that happened to use those same function names or variables? Fortunately, I was surprised to find this code in the framework:

JavaScript/Screw.Unit
var contents = fn.toString().match(/^[^\{]*{((.*\n*)*)}/m)[1];
var fn = new Function("matchers", "specifications",
  "with (specifications) { with (matchers) { " + contents + " } }"
);

fn.call(this, Screw.Matchers, Screw.Specifications);

Located in the initialization portion of Screw.Unit, this cryptic looking piece of code is actually doing something very clever. It’s using the Function.toString() method to extract the content of the function wrapper around all of your tests. It’s then taking the contents of that function and creating a new function that is using the “with” keyword to execute your code in the context of the framework’s keywords and functions.

In other words, it’s taking your tests and tricking JavaScript into thinking that your test functions have access to the properties and methods stored in Screw.Matchers and Screw.Specifications. This gives you access to the framework functionality without leaking any variables into the global namespace.

That sort of cleverness gives me hope that JavaScript has a bright future ahead.

A Historic Night

November 10th, 2008

On November 4th, 2008, I was given an amazing opportunity to witness history in Chicago’s Grant Park. I stood 50 feet away as Barack Obama gave his first speech as the president-elect of the United States of America.

A Light

First Family

Front Row

Hope

Speaking of Browser Bugs…

October 2nd, 2008

I keep breaking Google Chrome. Without even trying.