So, what's a closure and why should I care?
As Bob Dylan said, "...the times they are a changin'." The strange world of functional programming, once restricted to more esoteric languages such as LISP, Erlang and Haskell, is now invading the mainstream. Languages such as C# and Java are now incorporating concepts once unique to functional programming languages. These changes should bring simplicity to these more mainstream languages, but I find they are often misunderstood (by myself for certain) and thereby misused or not taken advantage of. I've had a fair understanding of closures for a while, but wanted to develop a more solid understanding, ergo this post.
One aspect of functional languages, called closures, have been talked about since .Net 2.0 came on the scene. Of course, what many presented as closures in 2.0 really weren't -- for a number of reasons that I don't care to get into at this time. More recently with the 3.0 framework (and the same will hopefully be true with Java 7), we see the introduction of lambda expressions; and finally have true support for closures (remember, not all anonymous functions are closures). You may ask, however, what the heck are closures anyway? Well, I'll do my best to give an accurate illustration.
First off, what are they. As the Wikipedia link above stated, "a closure is a function that is evaluated in an environment containing one or more bound variables." Neal Gafter, during a Google Talk given in January 2007, put it another way by saying that a closure is "a function that refers to free variables in its lexical context." We all should know what functions are...they are constructs that take a number of parameters and return a result (this includes more than just returning void). A free variable (or bound variable) is defined as a name (or symbol) that has a definition outside of the closure itself. The "lexical context" can be thought of as the context of the caller. I'll give an example later.
One thing that I found particularly interesting about Gafter's presentation was how he helped define closure by describing what they should support. Among other things, he mentioned that closures should have access to private members of the "lexical context" in which they are used. In C# and Java, they can also access the "this" instance of the caller. They should also support return values and support throwing exceptions (the latter is more of a Java concern). He also pointed out that although they are executed within the context of the caller, they can only return control to the caller (and not force the caller to return).
Gafter also pointed out that closures -- as proposed in Java at least -- are methods that act as new statements in the language and can be used to extend the native API (he calls these "control abstraction"). He described them as "concise function literals" that should behave as built-in statement, such as constructs such as for, while, lock, using, etc. in C# but without the problems associated with anonymous delegates and the like. Again, they are not anonymous class instances (such as delegates, which are not pointers to methods, but rather classes used to box methods).
Okay, so we have at least a rudimentary definition of closures; now let's see how they can be useful. To again steal from Gafter's presentation, I'll use a couple of his examples that I found useful.
First off, what's one of the problems closures can address? Gafter was describing a bit of code that was used repeatedly throughout an application at Google to gather run-time performance metrics on various operations. He argued that closures could be used to extend the native API and offer a more elegant syntax. Here's what the code looked like:
Granted this is only 8 lines of code, but only one of the 8 lines relates to the business logic of the application. The additional code (the other 7 lines) ends up cluttering the code and thereby obscures the surrounding business code. It would be nice if this code could be replaced by something like this (this would probably be a static method from some util class):
Which with the Java's proposed new method invocation syntax would mean the same thing as:
Neat, huh? With this example, the time() function is used in a manner similar to "for", "using", and other such operations native to the API. This is an instance where using a closure could be used to wrap common functionality around business logic in a more sensible way. Since the time method is a closure, we could also access doSomething() is it was a private method, such as this:
Pretty groovy huh? (pun intended)
From the horse's mouth...
*WARNING: If you have a burning lack of curiosity or can't fathom looking at code other than code written in your favorite language DO NOT continue reading. Reading LISP code without an open mind can lead to a general sense of disaffection and in more extreme cases unreasonable zealotry for your preferred language.
As stated recently I've been digging deeper into LISP lately. In LISP such features are native (and natural) to the language. For some reason I tend to think that closures look for magical (as well as cleaner) in LISP programs. I'll give another example of closures, but this time with one of the parent languages to all this closure goodness.
So here's the scenario...which, by the way, is an adaptation of the practical exercise in Chapter 3 of Practical Common Lisp. In the example I'm writing a simple little application to help break down the living things in my yard into a truncated taxonomy. Once finished, I'll be able to look up groups of living things by either kingdom or phylum. Alternately, I can look up a living thing by name and see to which kingdom and phylum they belong. Bear in mind I probably would never bother with this in real life...
Setting things up:
First off, I'll create a global symbol for my collection of living things, a function that creates a new living thing, and a function to add a new living thing to my global collection. Now I know that I should be using the CLOS, but I don't want to confuse anyone with slots and whatnot.
That was simple enough. We see that *living-things* is initialized as an empty list (or null set). The function "make-living-thing" uses the "list" operator to create a special kind of list called a "property list"...think of it as a hashmap. The "add-living-thing" function simply pushes whatever is passed to it onto the *living-things* collection (which I'm using as a stack in this case).
Now I'll walk around my yard and add the living things that I find...
Okay, so I left Bermuda Grass out...most of the grass in my yard is dead anyway. I can print out the contents of *living-things* and see the raw data:
Now to make things look a little prettier (only a little, mind you), I'll add a few functions for displaying the data in a more readable format:
And now I have something like this (output is truncated):
And now for something completely different...
Okay, now we are nearing where closures enter the example. Next I want be able to pull data from *living-things* using a SQL-like syntax. Don't ask why, I blame it on too much HQL.
Since I have only one entity type, I'll just copy the list passed to the from clause and return it (nothing to it...copy-list came free with Common LISP). The select statement takes a data source (ultimately using the from function) and n comparison predicates. The function "remove-if-not" is a Common LISP function that removes items from the list (passed as the second parameter) whenever the predicate evaluates to false/nil (the first parameter).
If you run the code from the comments you get the following:
Here you see the first use of a lambda expression to make an anonymous function (a function not associated with a symbol). The inner function, getf, is just a getter function that gets the value associated with the symbol :kingdom. The variable x in the getf is the current list item. For instance, the following gets the value associated with :kingdom for the first item in *living-things*:
The equal function returns true (T) if the value for :kingdom at x is equal to 'PLANTAE. That's all that there is to the lambda expression...it simply serves as the comparison predicate for "remove-if-not". There's still no closure here...any symbol could be used instead of :kingdom and the expression would still be valid:
See? return NIL, no problem.
End game...
Now we'll work on the where clause. Passing lambda expressions is kind of ugly, though, so we'll want to create an operator that will handle the where clause. Once finished, we should be able to make something like the following call and get all the matching entities back:
(select (from *living-things*) (where :kingdom 'ANIMALIA :phylum 'ARTHOPODA))
This part gets a little funky because we're going to use macros. Macros in LISP are similar to macros in C, but pumped up on steroids. Macros in LISP are used for run-time code generation (but much more elegant than IL generation in .Net). I'll skip over some of the details on macros but will attempt to get at the spirit of this whole thing:
This part's a little hard to follow. We are focusing on the "where" macro, but I think we'll start with the simpler predicate functions.
Birds eye view: First the "where" macro utilizes the "make-comparisons-list" function to generate a comparison predicate from a set of criteria. The "make-comparisons-list" function utilizes "make-comparison" for create a part of the overall comparison predicate using a field/value pair (such as :kingdom 'ANIMALIA).
The "make-comparison" function works just like the lambda function in the example above. First it gets the value associated with the specified symbol (such as :kingdom) and then checks that value against value that was passed as a parameter. Here's where we see a closure, where "living-thing" appears out of nowhere. First off, the backquote before the list containing "equal" is used to evaluate the list as code, the commas latter in the list mark variables. Without the backquote and commas we'd have an error because "living-thing" is not a bound variable (and that's bad). This backquote facility is very important in that it allows the coder to easily define code that can be evaluated later. So, if you execute "make-comparison" you get a piece of the code that the macro will ultimately execute. Again, I'm sure you noticed that "living-thing" isn't a bound variable...yet. We'll get back to that.
Since we want to support more than one comparison predicate, we've also created the "make-comparison-list" function with iterates over a list of fields and values (or properties and their associated values). The "loop while x collecting y" construct is an iterator with an accumulator. Basically it is iterating over all the field/value pairs (x), popping the field then the value and passing them to "make-comparison", and finally appending the resulting code (y) to a new list (this is a Common LISP feature). The result is a list of comparison predicate expressions.
Here the macro takes everything following the "where" operator and constructs the appropriate of comparison predicates. The where function will iterate over the list returned from the "from" function and pass each list item to the lambda expression. The variable "living-thing" is introduced as a parameter in the lambda expression, which will be bound by the time the code resulting from "make-comparison" is evaluated (where the closure is). The "and" function is used for the conjunction between all the comparisons resulting from calling "make-comparisons-list". Ideally, I'd use a more generic name for the bound variable used in "make-comparison", so that I could create predicates for any other property list.
To illustrate the code that's generated I'll make a few calls to the function macroexpand-1 (it's kind of like reflecting over the macro):
So, now you've seen the final lambda expression generated by the macro when using both multiple field/value paris and when using only one. Now we'll put together the entire select statement and test it...I'll get all animals from *living-things*.
...and now we'll select all plants and use the "friendlier" display:
So, as far as using closures, the closure on "make-comparison" was handy for supporting the lambda expression in the "where" function. I think that LISP is nice to clearly illustrate how closures work. One doesn't NEED to use closures...but at times they are quite useful for making code a bit simpler and easier to read. I hope this post was helpful in elucidating the nature of closures and how they might be used. I also hope it helps show how fresh a 50+ year old language can be.
For more information on closures:
This article seemed like a decent illustration of their use in C#.
If you can make it through the near 2 hour presentation, Neal Gafter's discussion of Closures in Java about a year and a half ago was an interesting illustration of what they might look like in Java. I also used some of his discussion as an example in this article. (It's also fun to watch the original crowd of near 60 people dwindle down to about 10 by the end of the presentation)
Then of course there are the scores of books and sites about LISP, Erlang, Haskell, Ruby, JavaScript and other languages that feature closures as a central feature of the language.
Happy Coding!
Joshua