![]() | Monkeys at Keyboards: The Javanomicon © Michael James Heron | ||||
| Topic: Java Programming Level: 2 Version: delta | |||||
29 - Recursion | |||||
| Previous | Table of Contents | Next |
| Forum |
| Chapter Objectives |
By the end of this chapter, the reader will be able to:
|
Going hand in hand with multithreading as a concept that has the potential to melt the unwary mind is recursion. The idea is deceptively simple, and the syntax is trivial. However, getting a mental image of how it works for anything other than simple problems is very difficult and requires thinking about a problem on many different levels. It is a very powerful technique that can usefully be applied to a range of different problems. However, it comes with significant drawbacks that limit its usefulness as a general purpose tool. Mastering the idea of recursion is a key to developing optimal solutions to certain kinds of problems. Often the solutions to problems involving recursion are expressed in a much clearer way than they could be with other techniques. In this chapter we'll be looking at a couple of simple (and not so simple) applications of recursion to explore the power behind this complex and confusing aspect of Java programming.
Recursion is all around us - it appears in almost all walks of life, from mathematics and physics to engineering and game-play. Simply put, recursion relates to the way certain things are defined in terms of themselves. Russian dolls are one of the classic examples of this idea - each doll contains a smaller version of itself, until eventually the smallest doll is reached. Another famous example is that of the mathematical idea of a factorial, which we'll look at as our first example of recursion in code. The book Godel Escher and Bach covers a wide range of topics, one of which is recursion and how it underpins much of the way we think about problems. Its author, Douglas Hofstatdter uses the example of GOD to explain the idea. GOD is an entity invoked by a pair of characters, and it turns out to be a genie that they ask to grant a wish. The genie is unable to comply and so pulls out a lamp of its own so it can consult its own genie - a meta-genie. The meta-genie pulls out its own lamp and so this continues on and on and on. The title GOD is defined as 'GOD over Djinn' - a recursive definition that precisely maps the chain of invocations of genies. The idea as applied to programming is somewhat more simple, and falls into one of two categories:
We've already seen the idea of indirect recursion when we spoke about animation - the paintComponent method called repaint, which called update, which called paintComponent. In this chapter we're going to be looking at direct recursion only. Recursive methods are made up of two components:
As is the case with all methods that we write, we are trying to solve a problem when we make use of recursion. Usually we pass in parameters to our methods - the trick in recursion is to pass a smaller version of the problem onto each successive recursive call, so that the problem reduces inevitably towards the termination clause.
There is one example that is always used as an introduction to recursion, and that is the example of the factorial. It is the classic example because the mathematical definition of a factorial (which is inherently recursive) is mapped precisely onto the method code that implements it. A factorial is written as a number followed by an exclamation mark... 5!. This is then read as 'the factorial of five'. Mathematically, a factorial is defined as follows:
So, the factorial of five is expressed as:
And the factorial of four is expressed as:
And so on, until we get to the termination clause:
Only when we get to the termination clause can we start to provide an answer to the original question. Determining the value of a factorial is inherently recursive - each factorial is defined in terms of the factorial that precedes it, except in the termination case. This means it is an ideal first introduction to recursion as a programming tool. We have our termination clause (0! = 1), so all we need to do is write a method that takes a number (the number we want the factorial of), and then recurse around that number, passing a smaller version of the problem onto each successive method call:
It is important to realise that recursion is really just a special kind of loop - there's no reason that we have to implement this method as a recursive structure. We can just as easily implement it as a normal for loop:
The recursive structure is somewhat neater, and more precisely maps the way factorials are defined mathematically. However, recursion comes with some downsides which we'll discuss later in this chapter. From this simple example, we can see how easy recursion is to use, but it's not immediately obvious why we should bother learning how to use it.
Recursion works on the Machiavellian principle of Divide and Conquer. Each time we make a recursive call, we make the problem simpler. Each recursive call must lead inexorably towards the termination clause, whereupon we can start actually providing answers to the original question. In our recursive method above, every time we have the method make a call to itself, it passes a smaller number as a parameter. In this way, we're guaranteed to reach the termination clause when we finally pass 0 as a parameter:
The diagram above shows the information flowing between each call to the recursive method - we keep recursing until we get to the method that calls getFactorial (0). The factorial of 0 is defined in terms of an absolute answer, and so it doesn't need to recurse beyond that. It can then return and pass the answer on to the method that called it, which can then pass the answer onto the method that called it, which... You get the idea, I'm sure.
At a more technical level, we need to understand a little bit about what happens when we make a method call in Java. This is a simplified explanation of the process - there are many books on the subject that the interested reader is directed towards for a more complete (and accurate!) explanation. When a Java program is executed, Java allocates space in memory for each line of code, and each line of code has a memory location that goes with it. If we look at the simple example of our factorial, we may get a memory map that looks something like this:
Obviously this is a gruesomely simplified example, but it should illustrate the idea. Whenever we make a method call, the Java Virtual Machine creates what is called an activation record that contains the information relating to the call. It contains the value of any parameters passed, the value of any internal variables, the value of what the method returns, and the memory location it should go back to when the method has finished executing. So, Java starts off with a main method, and starts executing each line in sequence. It puts the value 5 into a variable (called number), and then makes a call to getFactorial passing in this variable as a parameter. This call causes Java to create an activation record holding the appropriate information. The method call is made at location 1300, and so when Java has finished with the getFactorial method it will return to location 1300 and continue executing:
Java then moves to the memory location associated with the getFactorial method (1500), and starts executing each line in order. It checks to see if the parameter is equal to 0, which it isn't. It never executes the code that goes with that if statement, and instead gets to a return statement that has another invocation of getFactorial. So Java creates another activation record:
Java then starts executing through getFactorial again, and gets to another recursive call:
And it will continue this recursion until it gets to a case where the termination clause is invoked, and thus a definite return value is found that is not defined recursively:
The location where all of this occurs is known as the runtime stack, which describes a kind of data structure stored in memory. As we discussed in the chapter on data structures, a stack works by adding things to the end and then taking things off the end when required. Like a stack of dishes, the last one you add to the stack is the first one you take off - for this reason it is known as a LIFO structure. Last In, First Out. Once we have an actual return value, Java starts removing records from the stack, passing our definite answer back to the method that created the last activation record:
And then:
And so on, until we get to the definite answer:
At which point, the method returns control back to the main method with an absolute answer. We don't need to understand all of this to make use of recursion, but it hopefully helps in making sense of exactly why recursion can work.
So far we've been looking at recursion in a very simple context, and one that doesn't best show off its powers. We're also going to look at a slightly more complex example where recursion makes a simple job of what would otherwise be a rather difficult task - listing all the subdirectories of a particular directory. The termination case for this is when all directories in a subdirectory have been listed. Each time we call the method, we get a list of all the files in the directory we pass as a parameter, and then call our list method (called getDirectories) on each directory we find there. At each point, we return an ArrayList that contains all the directories we have found within that particular method call:
Let's look at how this works with a simplified example of a directory structure, with a directory called C:\example:
As you can see from the code above, we pass a file reference as our parameter - this file reference will be to the base directory:
So, we pass c:\example as our initial directory. The first thing our getDirectories method does is use the listFiles method on the File object to get a list of all the files in that directory. In this case, it will pick up all file references, including those that point to subdirectories. If in addition to our directories we have a file called Example.txt, the array we get back from the listFiles method call will hold the following file references:
The method iterates over each of these, calling the method isDirectory on each. If this method call returns true, it then adds it to the list of directories found, and then passes that file reference to getDirectory:
If an element doesn't return true for the isDirectory call, it is ignored. Element 0 and Element 1 of the array will return true, so those become the passed parameters for their own getDirectories methods. Each time the method has finished iterating over the list of directories it has, it returns an ArrayList containing all the directories it found, and all the directories found in those directories. Then, it simply outputs this list to a text area on the application. This is something that is very easy to do with recursion, but not quite so simple to do with a normal loop structure. Recursion is a very powerful way of dealing with problems of this nature, and it can achieve very complex ends with very simple code. The only difficult part is thinking recursively enough to implement the algorithm. As I'm sure you're beginning to realise, this is the difficult part about recursion!
Any solution that can be written recursively can also be written iteratively. This begs the question, why bother using recursion at all? Sometimes, as in the case of our Factorial example, it's just neater. There is no real benefit other than the fact it more precisely models the mathematical definition. For other solutions, it's just easier - such as with listing subdirectories of a directory. An iterative solution to this, while possible, is quite a bit more complex. However, neither of these are reasons enough in themselves to make use of recursion. Let's return to a simple mathematical recursive definition - that of Fibonacci numbers. We can define these mathematically as follows:
This is almost identical to our factorial example, and so the code is almost the same:
We can then run this for N number of terms, and get the right answer out of it. The equivalent for loop is somewhat uglier:
This is a case where implementing with recursion is obviously cleaner and simpler. But there is a problem - a fairly significant problem that we can only really see when we start computing larger Fibonacci numbers. Let's look at an example with some simple output:
For five terms, we get the following output:
For five terms, it entered the fibb method a total of nine times... that means it's executing the method almost twice as many times as it should have to. For six terms:
And for seven:
That's over three times as many invocations of the method as the number of terms we want to compute! Let's write a quick testing harness to let us get some simple metrics:
Running this program gives the following output:
That's pretty appalling - let's plot it on a graph to really see how bad the recursive solution performs with larger terms:
The number of method calls required is exponential - the bigger a number we want to compute, the longer it's going to take. Oh no! Our for loop suffers from no such problem - it is several orders of magnitude more efficient. The reason for the recursion inefficiency is in the way the terms are computed - each call to a preceding term requires it to be computed afresh each time. We can solve this by writing some caching to ensure that once a particular number has been computed we don't do it again (in fact, we'll do exactly this in chapter 31) - but this greatly complicates what should be a simple implementation. And here is where we hit the big problem with recursion - efficiency. For most cases, the efficiency of an iteration versus a recursion will be roughly similar. The recursion will be a shade slower, but not so much that it should keep you up at nights. However, in cases where a recursive method is making multiple recursive calls as part of its execution, the performance penalty is very severe.
Despite the fact recursion is all around us, it is often very difficult to apply it as a technique to solving computer problems. There are various organisations that do studies as to what concepts people find most difficult to learn in programming, and recursion consistently comes out near the top of the lists. As with most things in programming, it becomes easier with practise... but recognising the situations where recursion is suitable is still a difficult task. In this section we're going to look at an example where recursion is the ideal solution for a problem that may not instantly be an obvious candidate for its gentle caresses. We're going to use recursion to develop a piece of code that will shuffle the letters of a word and return all the possible permutations of that word. This isn't quite the same as generating anagrams, because we're not going to check the validity of these by comparing them against a dictionary. It does form the basis of a solution to a number of similar problems. If we want to compute all possible permutations of a word (let's say bing), we follow a standard procedure. We need to find all of the permutations that begin with b, all of the permutations that begin with i, and with n, and with g. However, in order to work this out for B, we need to find all of the permutations that can be made with B, which requires working out all the permutations that begin with i, and with n, and with g. And so on... until we find all the permutations of a single letter. Once we have this, we can start working backwards. Recursion! The method we are going to use makes use of start and end strings - we need some way of telling where we are through a string, and the start string contains those letters which we're not going to permute. The end string is the collection of letters that we still need to permute:
This is a very complex algorithm that needs careful reading in order to understand what is going on. The following diagram shows a subset of the processing that occurs with just the word bing:
During the execution of the method, we create two variables, newEnd and newStart that represent the start point of a recursive call of the method. newEnd is made up of the current value of the end parameter minus the first letter. NewStart is made up of the current value of the start parameter plus the first letter of end. So if we have start as being empty and end as being bing, we get the following result:
And then these two parameters (newEnd and newStart) are passed off to our generateShuffle recursive call. This method returns an ArrayList in the same way our subdirectories method does, so it then adds all the elements that got returned from the method call to our current list of permutations. Once we've done this, we rotate our end variable so that every character moves one place to the left and do the whole thing again:
So, the next time around the loop, start is still empty and end is ingb:
And so on, until we've passed in all possible starting letters for new permutations. This continues until we get to the termination clause
At the end of the process, we get all of the possible permutations of a word. Implementing a solution to this problem using iteration is absurdly difficult compared to the recursive solution... but as I'm sure you can see, it's not immediately apparent exactly why the recursive solution works. Sitting down and working your way through it with a pen and piece of paper would be a useful exercise in tracking the flow of logic, even if it makes you go ever-so-slightly funny in the head.
It is important to realise that recursion is never the only solution to a problem. It is only one of a number of tools you have available, and sometimes it will be the best - however, it's not the only one. If you can't achieve a particular aim with recursion, you will be able to do the same thing with iteration. It may be considerably more complex in terms of the code, but it may be correspondingly easier to envisage in your head. There are some problems, particularly related to data structure manipulation, where recursion is the ideal solution and one that is clearly superior to any other implementation. Usually this is a feature of the problem rather than of the inherent benefits of recursion in and of itself. A simple example is a piece of code that computes the value of a calculation following the rules of operator precedence:
In real life, we solve this recursively by working out the answers between the brackets before combining them back into the full formula. As such, it's an ideal candidate for solving with a recursive algorithm, and solving it iteratively would be counter-intuitive. Recursion is used in a number of areas in computing:
One thing these fields all have in common is that they are traditionally regarded as hard programming - not necessarily in terms of syntax, but in terms of the complexity of the thinking required. All of these fields have numerous instances where recursion is the best solution to any given problem, but a firm understanding of the problem domain is required first.
The answer to this question is simple - most of the time. Recursion is slower than iteration - sometimes dramatically so. Recursion is considerably more complex than iteration, not in terms of syntax but in terms of the difficulty of thinking through a solution. Recursion is harder to debug than iteration - not as hard as debugging multithreading since the path of execution through a recursive solution is still deterministic, but it's still a lot trickier than debugging an iteration, especially if the error is subtle. So, unless you have a problem where the answer is clearly to use recursion, it is best avoided. Experience will guide you as to when it is the most effective item you have in your toolkit.
We have had an opportunity to discuss searching algorithms within the course of this book, but we've relied very much on Java's inbuild Arrays and Collections classes to provide our sorting abilities. There are a range of sorting algorithms. Some of these can be implemented with nothing more complex than a pair of for loops (the bubble sort, for example). Some however are best modelled with recursion, and these tend to be the ones that are most scalable and efficient. This seems to fly in the face of what was said before - recursion is slower than iteration. However, we're not comparing a straight performance of a loop against a recursion - we're comparing two entirely different ways of approaching a particular kind of problem. One of the most efficient algorithms for sorting is called the quicksort algorithm. It is an inherently recursive process - implementations using iteration are possible, but substantially more complex to code. We'll discuss the quicksort algorithm in this chapter as a further example of recursion, but bear in mind that the standard Java sort implementation already provides an effective quicksort mechanism. You shouldn't feel pressured into incorporating this into anything you do. We begin with two index pointers - one points to the start of the list, and the second points to the end of the list:
We partition this list into two parts. One contains all the numbers smaller than the one pointed to by the start pointer, and the other contains all the numbers larger than the one pointed to by the start pointer. So in this case, one array contains everything that is smaller than 54, and the other contains everything that is larger than 54. We step through each of the elements in the list applying a simple routine, switching between the two pointers according to a simple metric:
So, first step over the array... we move start on by one:
11 is smaller than 54, so we increase the value of start again:
88 is larger than 54, so we stop working with Start and switch control over to End. We reduce the value of End by one:
16 is smaller than 54, and so we swap the two elements around:
Start is still smaller than end, so we start increasing start once again:
17 is smaller than 54, so we increase the valie of Start again:
67 is greater than 54, so we start decreasing End:
59 is greater than 54, so we keep decreasing end:
98 is greater than 54, so we keep decreasing end:
66 is greater than 54, so we keep decreasing end:
70 is greater than 54, so we keep decreasing end:
89 is greater than 54, so we keep decreasing end:
32 is less than 54, and so we swap the numbers around:
Start is less than End, so control swaps back to start, and so we move it on a space:
67 is greater than 54, and so we swap control over to End. We reduce it by one:
32 is less than 54, so we'll stop decreasing end. Here, start is no longer less than end, and so we stop the current process. The final aspect here is that we swap the first number (54) with the number currently pointed to by end:
By the end of this process, we have assured that everything to the left of the end pointer is less than 54, and everything to the right is greater than 54. So, you may be asking... where does the recursion come in? Well, so far no-where... it's what we do next that brings in the Hot Recursion Action. Having ended up with the list sorted into two distinct parts, we then apply exactly the same process to each section of the array. We begin doing the same thing with the left hand side of the list:
We sort this array into two parts - one containing elements larger than 32, and one containing elements smaller than 31. Follow the process above and you end up with:
And then recursively again:
Becomes:
And:
Becomes:
After sorting the left part of the array until we're attempting to a sort an array of one element, we attempt to sort the right part of each of the smaller arrays until we end up with:
So we begin recursively sorting the right hand side:
After following this lengthy process, we end up with a perfectly sorted array. It seems like a very complicated process, and it might look as if it would be a nightmare to code. Luckily recursion takes most of the sting out of it for us. Consider a simple implementation that works only on integers (writing one that works for general objects is left as an exercise for the reader).
As you can see, that's an awful lot of complicated functionality handled in only a small fistful of lines of code. Such is the power of recursion - use it wisely, young jedi.
Recursion is a very difficult topic - not in terms of its syntax, but in terms of the design process. For simple definitions of fundamental mathematical concepts, it is both simple and self-evident. As a general purpose programming tool, it is less intuitive and very often the problems it is best applied to aren't 'obviously' recursive in nature. It is only by spending time thinking about the problem that the applicability of recursion becomes clear. Recursion is inefficient compared to a similar iterative loop - this is not a general maxim that is always true, but it is sufficiently true for it to be a useful guiding principle. Iteration enjoys a much reduced complexity and much greater applicability that means it is a more effective tool for solving most problems. However, recursion is sufficiently powerful that when used properly it can express dramatically improvements over iterative solutions in terms of compactness of code and ease of maintainability. Only practise will give you the experience required to recognise these situations when they occur. So get practising! Further ReadingThe following table details further reading on the topic in this chapter, and also any external resources that you may find useful.
|
| Previous | Table of Contents | Next |
© 2004-2006 Michael James Heron