Thursday, December 28, 2006

Evolution:: (Rails,Ruby) -> Haskell

Apparently, there is a renewed / growing interest in functional languages like Haskell, OCaml, etc. and I am again being accidentally trendy... :-). Here's a bit on how I managed to stumble upon Haskell and how its affecting me as a developer...

Last year, along with thousands of others, I discovered Ruby on Rails, and built the core product for my company using it. It was (and continues to be) a great programming experience. Like everyone else who uses Rails for more than a couple of minutes, I fell in love with Ruby as a programming language. Coming from a background in C/C++ yada yada, Ruby opened up this wonderful world of blocks, closures, lambdas, etc. - concepts that despite 25+ years of programming were new to me. I was so excited, in fact, that I ended up helping to create the Houston Ruby on Rails user group.

Since most of the Rails group members came from similar backgrounds to mine, none of us really had anything more than a superficial understanding the whole lambda thing, so I was elected to give a talk on it. It was in doing my research for the presentation that I came across functional programming (FP) and Haskell.

I've just begun to scratch the surface of Haskell and FP, but its already changed the way I think about things in Ruby. I've been doing OOAD for many, many years, and I approach Ruby from that perspective. I tend to think in terms of OO abstractions, and methods on classes that alter state, i.e., alter the values of variables in an object. In FP, and Haskell in particular since it is the "purest" of the FP languages, there is no such thing (ignoring monads for the time being) as state. "Variables" are really more like constants in other languages - you can assign (aka bind) values to them, but that's it. Functions in FP take inputs and produce outputs, with no "side effects", meaning that they do not alter state or global variables. Since variables can only be assigned to once (like constants), new approaches are required for just about everything. The simplest example is iterating over a collection. In C++, you just do something like

int x[3] = {1,2,3}
for(int i = 0; i < 2; ++i)
{
x[i] = x[i] * 2;
}

and you're done. This can't be done in Haskell, since you would be reassigning values to the variable i. UPDATE: For the sake of clarity, I am not trying to say that you can't iterate in this style - but rather that incrementing a variable (i++) is not allowed in Haskell (although I understand that there are monadic ways of doing this?). In Haskell, you would use a function (map) that applies another function (*2 in this case) to each element in the array or list (x in this case), like this:
map (*2) [1,2,3]

I'm fudging slightly here - the C++ example is of an array, while the Haskell example is really a list. The concept is the same, though.

There are many new concepts in Haskell and FP that I intend to explore over time, but how has this one aspect changed my Ruby programming? I already used Ruby iterators (such as map, inject, etc.) with associated blocks to manipulate arrays. But I also used x.each frequently, essentially taking the same approach that I might have using C. This is not bad, of course, however the FP approach made me take a second look at some of the code that used that idiom, and I quickly realized that I was

(a) not taking full advantage of the Ruby higher level iterators such as map, and

(b) creating local variables all over the place, sometimes without a very good
reason.

Local variables aren't bad either, per se, of course. Sometimes they simply help make the code more readable and easier to debug. But every line of code has a potential for creating a bug, so there is a balance to be sought. So......here's an example of some code from our application that I reworked based on my new found FP sensibilities...

The original code:
html.each do |line|
@search_terms.each do |term|
line.gsub!(/#{term}/i,'<span class="found_text">;\&</span>') unless line =~ /(<|Storage|Towslip)/
end
@results_html

This of course is not bad, and is pretty readable. The general structure is to iterate over each line, then each term, and do a global substitution using a regex. Each line is modified in place using "gsub!". For each term, though, there are actually (potentially) two regex calls - one to determine if the line should be modified, and the other to search for terms to modify. So how would I approach it from a more FP point of view? This is admittedly a naive approach, but just thinking "How could I treat the html as whole and NOT modify values (as in the line.gsub! call)?" proved interesting. The result is below. The regexs are combined into one making it more efficient (although for the non-regex initiated it may seem pretty messy). There are NO explicit intermediate variables.

def highlight_html(html,ht, et)
html.map do |line|
line.gsub(/(?!#{et.join("|")})#{ht.join("|")}/i) \
{ |m| "<span class='found_text'>#{m}</span>" }
end.to_s
end

Some good references


Exploring functional programming, Bruce Tate

4 comments:

Eric Mertens said...

You are correct that the equivalent of i++ exists in Haskell and that it would typically done in a Monad.

Anonymous said...

Yeah, in general you would never do the equivalent of "i++" in haskell. You'd more than likely pass (i+1) recursively or use [1..] or something..
but if you really wanted to, you would (assuming ST monad) use "modifySTRef i (+1)"
or, borrowing from Bulat's nice Ref sugar, you can do like http://hpaste.org/1185

Anonymous said...

Why use a block in the regex? Am I naïve in thinking that the following gsub produces the same result:

line.gsub(/(?!#{et.join("|")})(#{ht.join("|")})/i, '[span class="found_text"]\1[/span]')

IIRC (?!...) does not capture anything.

Keith Lancaster said...

@anonomous - While I have to admit that I don't remember off hand WHY I used the block, the formulation just doing a straight gsub does not work. I'll play around with it a bit and post back here.