Monkeys at Keyboards: The Javanomicon
© Michael James Heron
Topic: Java Programming
Level: 2
Version: delta

If an elderly but distinguished scientist says that something is possible he is almost certainly right, but if he says that it is impossible he is very probably wrong.
Arthur C. Clarke

7 - Hooray for Arrays

PreviousTable of ContentsNext
Forum


Chapter Objectives

By the end of this chapter, the reader will be able to:

  • Be able to use arrays and ArrayLists to solve programming problems.
  • Be able to use iterators to step over elements of an ArrayList.
  • Make use of sorting algorithms, and apply it to lists of data.


7.1

Introduction

In previous chapters we've been looking at the skills we require in order to develop simple Java applets. We've looked at the fundamental building blocks of the language and how to set up graphical user interfaces using the Swing GUI architecture. In this chapter, we're going to take a look at one of the more complex structures provided by Java for representing data.

Primarily we will be looking at the idea of arrays and some of the specialisations of this concept provided by the Java object library - mainly the ArrayList class. I know that many of you will already be familiar with the idea of arrays, but is is a subject of sufficient complexity to warrant a further discussion within the text of this book

7.2

The problem with single variables

The variables we have used so far are individual units of data. They live a lonely, solitary life by themselves - unconnected and unrelated to even conceptually similar information. In real life however we must often deal with 'clumps' of related data. It is very useful for us to be able to refer to whole sets of data in a single statement... we can do this in Java through the medium of arrays.

Arrays are conceptually-linked lists of data - instead of having multiple pieces of data we have a single piece of data that contains a number of different compartments into which information can be inserted. Arrays are homogenous - they have multiple compartments, but all the data that is placed in them must be of the same type.

The benefit of arrays comes when we're dealing with many instances of the same kind of data. Consider a program that lets us store the ages of five people. We can implement the interface to this as a text field where we enter ages and a button that lets us add an age to our current list.

To store five ages, we need to have five integer variables. We also need to keep track of how many ages we've entered so that we know into which variable we are going to put a particular age:

import java.awt.*; 
import java.applet.*;
import javax.swing.*;
import java.awt.event.*;


public class WhyWeNeedArrays extends JApplet implements ActionListener {
JTextField myText;
JButton myButton;
int age1, age2, age3, age4, age5;
int numOfAges;

public void init() {
setLayout (null);
myText = new JTextField (10);
add (myText);
myText.setBounds (100, 10, 100, 30);
myButton = new JButton ("Add Age");
add (myButton);
myButton.setBounds (10, 50, 50, 30);
myButton.addActionListener (this);
numOfAges = 0;
}


public void actionPerformed (ActionEvent e) {
int temp;
numOfAges++;
temp = Integer.parseInt (myText.getText());
if (numOfAges == 1) {
age1 = temp;
}

else if (numOfAges == 2) {
age2 = temp;
}

else if (numOfAges == 3) {
age3 = temp;
}

else if (numOfAges == 4) {
age4 = temp;
}

else {
age5 = temp;
}

}

}

This is a workable solution - for five ages. What if we wanted to store ten, or twenty, or a hundred, or a thousand?

Obviously the code for this is not very extensible, but this is only the beginning. The problem comes when we want to manipulate this data in order to get some meaningful information out of it. Consider if we want the average:


public double findAverage() {
int total = num1 + num2 + num3 + num4 + num5;
return total /5;
}

Again, this isn't a problem - for five ages. It's more of a problem for one hundred. Consider now finding the highest age in the set of data:


public int findOldest() {
int currentOldest = 0;
if (age1 > currentOldest) {
currentOldest = age1;
}

if (age2 > currentOldest) {
currentOldest = age2;
}

if (age3 > currentOldest) {
currentOldest = age3;
}

if (age4 > currentOldest) {
currentOldest = age4;
}

if (age5 > currentOldest) {
currentOldest = age5;
}

return currentOldest;
}

Imagine having to write that for a hundred ages - not only is it a nightmare to code with huge amounts of repetition, it's also crying out for a transcription error when you lose track of where you are and assign the wrong value to the wrong variable.

7.3

Introduction to Arrays

Instead of writing such unwieldy code using single variables, we make use of arrays. In many ways, they work identically to other data types except that they have a slightly different syntactical notation.

Arrays are assigned a single name - this is the name of the variable. Arrays are broken into a number of compartments, which are called elements. Each of these elements has an index, which is a numerical indication of in which compartment they are:

Arrays indexes and elements
Fig 7.1: Arrays indexes and elements

The Java programming language provides two ways of referencing to an array in code. One is to refer to the whole array, which is done by referring to the variable name as with other data types - when we do this, we are making reference to the entire set of data rather than any of the actual data elements it contains.

The other way we have of accessing an array is by specifying a particular index - this lets us access the data contained within an element. We do this by using a square bracket notation system - we give the name of the array, and then the index contained within square brackets. With our example above, if we wanted the element stored in position 2 of the array, we would use the following notation:

int temp = myAges[2]; 

You probably noticed that the first element in the array has an index of 0 - this is a feature of Java and other C-type languages. For arrays we start counting from zero. This is sometimes quite hard to get used to, but eventually it becomes second nature.

If we wanted to change the data container within an element, we use the same square bracket notation:

myAges[2] = 32; 

After this assignment statement, our array is changed so that the element at position 2 is overwritten with the new value:

Setting an array element
Fig 7.2: Setting an array element

So, why do we want to use arrays? Mainly because they are flexible - there is no difference in terms of code complexity in using an array that has ten, or a hundred, or a thousand elements. This is a marked improvement over using separate variables to hold each instance of data.

They are coherent - conceptually-linking related pieces of information allows for us to perform operations on the whole set of information without having to worry about making sure we're referring to everything we should be and without the risk of missing out variables that should be included in the operation.

They are simple to manipulate - their structure is perfectly suited to manipulation through the use of standard programming constructs such as for loops. We'll see some examples of this later in the chapter.

7.4

Declaring Arrays

We've seen the notation for getting and setting elements of data within an array, but we haven't yet seen how we create them in code.

To create an instance of an array, we need three pieces of information:

  • The name we are going to give the array
  • The type of data that is going to be stored within the array
  • The number of elements the array is going to have.

There are two stages to setting up an array (actually, these are the same two steps in setting up any variable, but understanding this is of particular importance when using arrays):

  • Declaring to Java that we need some space in memory to hold a variable.
  • Putting something into that declared memory.

The first step we have seen before when we have created the class wide variables to hold GUI components:

JButton myButton; 

What this line of code says to Java is 'Hey, we're going to need some memory to store a button - could you set enough memory aside for this for me? Later on, we'll refer to the chunk of memory you give us with the identifier myButton'.

At this point, there is nothing in the memory - there is no actual button there, there's just a reference to a chunk of memory large enough to store a button. If you try to call a method on this variable before something is put into the memory, you get what is called a null pointer exception.

Before you can actually do anything with this variable, you need to put something into it... we do this with the new keyword:

myButton = new JButton ("An Example Button"); 

Once this line of code is executed, Java creates the button itself and stores it in the space in memory that was reserved.

These two steps are often combined into one line of code:

JButton myButton = new JButton ("An Example Button"); 

But even though this is now one single statement, there are still two discreet steps involved in setting up the variable.

So, when declaring an array we need to keep this in mind. For the first part of the declaration we need to indicate to Java the name, the type, and the number of dimensions of the array (more on the dimensions part later). To create the array we looked at above, first we need to declare some space in memory:

int myAges[]; 

Notice that we don't give the number of elements here - we do that when actually putting an array into the space of memory we've declared:

Declaring an array
Fig 7.3: Declaring an array

And that's it, we've created our array. At this point, the array looks like this:

An empty array
Fig 7.4: An empty array

We can then set individual elements of the array in the way we saw above:

myAges[0] = 10; 
myAges[2] = 30;
myAges[3] = 15;

Array after modifications
Fig 7.5: Array after modifications

There is a second way to perform the second step of the array declaration, and that is to give Java the list of elements that should be contained within - from this, Java will work out how large the array needs to be itself:

myAges = { 10, 20, 30, 40, 50 }; 

This creates the following array:

Array with starting values
Fig 7.6: Array with starting values

We can use this exact same notation to declare arrays of integers, floating point numbers or booleans. We use exactly the same notation to declare arrays of more complex data types also, like Strings, or JButtons:

JButton myButtons[] = new JButton[100]; 

There is a slight difference in how Java populates an initial array. With primitive data types, all the elements of the array are set to 0. With complex data types, all of the elements of the array as set to null. We need to put something into an element of the array before we can do anything:

myButtons[0] = new JButton ("Press Me"); 

If we don't put something into the element with the new keyword, a null pointer exception will be encountered any time we try to call a method:

myButtons[1].setBounds (10, 20, 30, 40); 

This is because although we have declared that we want an array of complex data types (like buttons), we've only created the container for the set of these objects - we haven't actually created the objects themselves. We must do this before we can perform any operations on the elements on the array.

Java tip

Sometimes the overhead of using an Array can make things more complicated than using single variables - however, these situations are quite rare. Consider any situation in which you have multiple objects or pieces of data acting in a similar way, and determine whether you have scope for making use of an array.


7.5

Methods and Arrays

Like most things in Java, arrays are actually objects. We've yet to look into this idea in any depth, but the key point is that like JButtons, JComboBoxes and JScrollBars - arrays in Java have methods and properties that go with them. Much of the functionality for dealing with the day to day operations of an array is handled by the provided methods.

For example, it was once a common coding task to write a sorting algorithm... a piece of code that would go through an unordered list and turn it into an ordered list. For example, consider an unordered list of integers:

Unordered list of integers
Fig 7.7: Unordered list of integers

Through the application of code, this would be turned into an ordered list of integers:

Ordered list of integers
Fig 7.8: Ordered list of integers

It was once necessary to write the handling code to do this, but with the advent of object oriented programming most of this functionality is provided in the object libraries of languages like Java. To sort a list of unordered elements, we simply call the sort method of the Arrays class and pass our array as an argument:

int myIntArray[] = { 5, 6, 1, 60, 212, 4, 7, 11, 920 }; 
Arrays.sort (myIntArray);

The Arrays class is found in java.util, so the line import java.util.*; must be placed at the top of your code before you can make use of it.

The case is the same for searching, which has previously been another vehicle useful for explaining array manipulation. Searching has an associated method in the Arrays class. We can call the method binarySearch, passing our array and the element we wish to search for. This method will return the index of where this element may be found in the array:

int temp; 
temp = Arrays.binarySearch (myIntArray, 212);
System.out.println ("The value 212 is found at position " + temp + " in the array.");

However, the binarySearch method will only work on a fully sorted list. If you want to make use of it, you must ensure your array has had the sort method called on it. There is still a requirement for developers to write searching code to locate data elements that are not stored in an ordered list. We'll talk a little more about searching later in the chapter.

The array itself has a property called length that tells us how many elements it contains - obviously we will already know this since we have to pass the number of elements as part of the declaration of an array. It comes in useful however when we pass arrays to and from methods, as we shall see in a few moments.

int numberOfElements = myIntArray.length; 

If myIntArray contains nine elements, then numberOfElements will then have the value 9.

7.6

Multi-dimensional Arrays

Now we've looked at the idea of single dimensional arrays - we can look a little at the idea of multi-dimensional arrays. Java makes it simple to extend arrays in any number of dimensions without any difficulty (beyond the conceptual).

If one dimensional arrays are like tables of data, then two dimensional arrays are like boards:

Multi-dimensional Array
Fig 7.9: Multi-dimensional Array

We can declare two dimensional arrays in a very similar way to how we declare one dimensional arrays. We simply need to add an extra set of square brackets to indicate dimensionality:

int board[][]; 
board = new int [5][5];

These two statements will create a two dimensional board of integers that extends five elements in the X dimension and five elements in the Y dimension.

We can then 'pinpoint' elements in this array by passing both an x and y index:

int temp = board[3][4]; 

Multi-dimensional Array Access
Fig 7.10: Multi-dimensional Array Access

We can set elements in the array in the same way, by pinpointing which element we wish to access:

Board[2][3] = 9; 

Multi-dimensional Array Access
Fig 7.11: Multi-dimensional Array Access

We can increase the number of dimensions by adding an extra set of square brackets for each dimension. For example, a three dimensional array:

int cube[][][]; 
cube = new int [10][10][10];

To pinpoint elements, we simply pass an index for each axis of dimensionality:

cube[2][3][4] = 25; 

The only difficult in extending arrays in this way is conceptual - it is very difficult for humans to envisage a data structure that contains more than the standard four dimensions. If you find it necessary to implement arrays of this level of complexity it is often an indication that you should consider a more concise representation of your data.

7.7

Passing arrays to methods

We can pass arrays to and from methods in the same way we do with primitive data types - all we have to do is indicate to Java that the variables we are dealing with are arrays in the method signature. The syntax for passing an array into a method is unchanged:

int bing = 5; 
someMagicMethodOne (bing);
int bong[] = { 1, 2, 3, 4, 5 };
someMagicMethodTwo (bong);

We can indicate that an array is expected as a parameter to a method by using the square bracket notation - we don't need to know in advance how many elements, just the number of axis of dimensionality. For a method that accepts a one dimensional array as a parameter:


public static void someMagicMethod (int[] bing) {

If we wish to pass in an array of two dimensions, we must also indicate this:


public static void someMagicMethod (int[][] bing) {

To show an example of this in practise, we will look at an applet that creates a one dimensional array of integers, fills it full of random numbers, and then displays the largest number in the array in a message dialog when a button is pressed.:

import java.awt.*; 
import java.awt.event.*;
import java.util.*;
import javax.swing.*;
import java.applet.*;


public class ArraysExample extends JApplet implements ActionListener {
JButton myButton;

public void init() {
myButton = new JButton ("Press me to go!");
add (myButton, BorderLayout.CENTER);
myButton.addActionListener (this);
}

/*
This method goes through the array that is passed in, iterating over each element
with a for loop. It puts a random number between one and ten in each position.Note
the use of the length property of the array to determine the number of iterations
- although we don't know how many elements will be in the array that is passed ,
we know we can get that information from the length property.
*/


public static void randomiseArray (int[] array) {
int i = 0;
for (i = 0; i < array.length; i++) {
array[i] = (int) (Math.random() * 10);
}

}

/*
The findLargest method uses a standard algorithm for finding the largest positive
number. The variable largest is initially set to something smaller than anything
in the array can be, then every element of the array is compared against the largest
variable. If the current element is larger than the value of largest, then it is
assigned as the value of largest.At the end of the iteration, largest has the value
of the largest number in the array. We'll have cause to look at this algorithm again
in a future chapter.
*/


public static int findLargest (int[] array) {
int i = 0, largest=0;
for (i = 0; i < array.length; i++) {
if (array[i] > largest) {
largest = array[i];
}

}

return largest;
}

/*
This method goes through the array and appends each element to a string before displaying
it in a message dialog - effectively it just prints out the contents of the array.
*/


public void displayArray (int[] array) {
String temp = "";
for (int i = 0; i < array.length; i++) {
temp += "" + (i) + ": " + array[i] + "\n";
}

JOptionPane.showMessageDialog (null, temp);
}


public void actionPerformed (ActionEvent e) {
int largestNumber;
int[] myArray = new int [10];
randomiseArray (myArray);
displayArray (myArray);
largestNumber = findLargest (myArray);
JOptionPane.showMessageDialog (null, "The largest number is " largestNumber);
}

}

Java tip

Arrays are objects, and so they are passed by reference in the Java language. This means that within a method you are actually working on a live set of the data, and any changes you make to that array within that method will be reflected in the rest of the program. Proceed with caution!


7.8

ArrayLists

As we have seen, arrays are very powerful programming constructs. Unfortunately, they have a disadvantage in that we must know in advance how many elements are going to be stored within. This is not always something that can be anticipated, and it is not possible for arrays to be easily resized in Java.

It is possible to just give an insanely large number of elements and simply keep track of where we are in the array with a counter, but that's not an ideal solution.

To deal with this problem, Java provides an alternate array structure called an ArrayList. It is one of a family of structures called collections... we will have cause to look at some more of these throughout the book.

At its most basic, an ArrayList is simply an expanding array - it will increase in size as you add elements to it. It provides the functionality for adding and removing elements, as well as allowing elements to be easily inserted. This flexibility gives us a structure that can be used to deal with the uncertainty that comes with modern software development.

As with arrays, elements within an ArrayList are indexed by an integer value. However, we do not use the square bracket notation with an ArrayList to access its elements. ArrayLists are classes, and as such they are manipulated with method calls.

The ArrayList class is found in the java.util package, so your code must import this:

import java.util.*; 

Once you have this import statement, we can create an ArrayList in the same way we can create our GUI components - by creating a variable to hold it and then putting something in to the variable with the new keyword. However, here is where it starts to get a little bit tricky. Newer versions of Java (namely Java 1.5) have introduced a more complex syntax for dealing with collection classes.

In addition to declaring that we are going to be using an ArrayList, we must indicate what kind of objects are going to go onto it. We do this by using an angle bracket notation. Let's look for example at an ArrayList that stores String objects - when we create the object with the new keyword, we indicate that it is an ArrayList that will hold String objects:

ArrayList<String> myList = new ArrayList<String>(); 

Once we've done this, we have an empty instance of the ArrayList class- notice that we don't have to declare a number of elements. Our strange looking angle brackets indicate that this is an ArrayList designed for String objects.

Historical note:

The Generic syntax is introduced in Java 1.5 - previous versions of the language have no checks on the type of objects that go onto an ArrayList or any other collection. More importantly, they have no checks on the objects that come off - this has consequences for the syntax.


We can add elements onto an ArrayList by using the add method, but we must be sure that we are adding elements of the appropriate type. To our myList ArrayList, we can add Strings without any problems:

myList.add ("Hello!"); 
myList.add ("There!");
myList.add ("Sailor!");

At the end of these three invocations of the add method, we have an ArrayList that contains three strings, stored in the order in which they were added:

ArrayList state
Fig 7.12: ArrayList state

We will however encounter a problem if we try to add numbers to the ArrayList we have declared. The simple attempt to add a number will trigger a compile time error:

myList.add (10); 

If we wish to make an ArrayList that allows for the use of integers, we must create one using the angle bracket notation. Now, here's where things get a little complicated, so hold on tight. It doesn't matter particularly if you follow this explanation - the ramifications are somewhat subtle and most likely won't be an issue for some time down the line.

Java has two different kinds of variables. One of these variable types is known as primitive, and the other is known as reference. Java treats these variables differently.

Primitive data-types are the simplest of these, and they consist of int, float, char, double, and various other single units of data. Reference data types are the more complex data types, and are always represented by objects in Java. Strings and Arrays are examples of reference data types.

Now, to further complicate the issue, each of the primitive data-types has a corresponding reference data type called a Wrapper Class, and these are used when primitive data-types must be used in a complex context. We'll see why this is important when we discuss polymorphism in a later chapter.

The result of all of this is that it isn't immediately obvious how to create an ArrayList that can hold a simple int value. Intuitively we would expect something like:

ArrayList<int> myList = new ArrayList<int>(); 

This would seem to make perfect sense, but Java will throw a compile-time error because of the distinction between primitive and reference data types. ArrayLists can only work with objects, and an int is not an object.

In older versions of Java, it was necessary to explicitly wrap a primitive before placing it onto an ArrayList:

Integer myInt = new Integer (10); 
myArrayList.add (myInt);

As of Java 1.5, this is no longer necessary. ArrayLists now support a system called auto-boxing that handles all of this for you. All you need to do is create an ArrayList that will support the relevant wrapper class:

ArrayList<Integer> myList = new ArrayList<Integer>(); 

Then we add int values (or Integer objects) as needed:

myList.add (10); 

Every primitive data-type in Java has a corresponding wrapper. The table below shows what these are:

ArrayList and Wrapper Classes
Fig 7.13: ArrayList and Wrapper Classes

So if we wanted an ArrayList that would hold a collection of double values:

ArrayList<Double> myNumbers = new ArrayList<Double>myNumbers.add (10.9); 

In terms of changing the data within the ArrayList, We can also make use of the set method to change the contents at a particular index - so if we called this method on the String ArrayList defined above:

myList.set (2, "Aaawooga"); 

This overwrites the contents of the element at the specified index and replaces it with what we pass as a second parameter:

ArrayList after modifications
Fig 7.14: ArrayList after modifications

Of course, putting things into a data structure is only one part of the functionality - we also need to be able to get things out of it. We do this using the get method of the ArrayList. To use this we pass the numerical index of the element we wish to retrieve:

ArrayList < String > myListOfStrings =new ArrayList < String >; 
myListOfStrings.add ("Blah");
myListOfStrings.add ("Blue");
myListOfStrings.add ("Bang");
for (int i = 0; i <myListOfStrings.size(); i++) {
String str = myListOfStrings.get (i);
System.out.println (str);
}

The introduction of generic types to Java also complicates how we write methods - if we must also specify a type to go along with a collection, then how do we write methods that can accept the right kind of collections as parameters? We can use the standard method, but that gives us the same problems that generics were introduced to solve - that of heterogeneous arrays and mandatory type-casting:


public aMagicMethod() {
tmp = new ArrayList < String >();
testMethod (tmp);
}


public void testMethod (ArrayList tmp) {
String str = (String) tmp.get (0);
}

We need to make use of a new syntax to deal with this. We can declare a generic collection as a parameter, but that limits us only to those collections that polymorphically match the typing (more on polymorphism later in the book). Often we want a method to be able to deal with a range of suitable objects. The solution to this is to develop generic methods. We can make use of wildcards within the generic type passed to the method... we indicate this by a question mark:


public void iterateOverCollection (ArrayList<?> obs) {
for (Object ob : obs) {
System.out.println (ob);
}

}

We can further specify wildcards to ensure that we can still maintain some kind of typing structure:

public void iterateOverCollection (ArrayList<? extends JComponent> comp) 
{
}

Unfortunately, passing a generically typed ArrayList as a parameter doesn't have the type-checking compiler protection you might assume. For example, the following is legal Java from a compiler perspective but it will throw a ClassCastException during the running of the program:


public void ClassCastExample() {
tmp = new ArrayList < String >();
tmp.add ("Bing!");
iterateOverCollection (tmp);
}


public void iterateOverCollection (ArrayList < JComponent > comp) {
for (JComponent c : comp) {
}

}

The moral of the story is - don't rely on the type-checking. It may not work the way you would expect. The extra level of support provided at the compiler level is not enforced in all generically typed interactions, and so you should still be careful when polymorphically casting or passing generic types to methods.

7.9

Stepping over Collections

Collections such as ArrayLists and Arrays provide a concise representation for lists of data - and as such, we often need to step over each element in such a collection with an eye to performing some operation on everything contained within. Java provides us with an excellent structure called the foreach loop to allow for that.Collections such as ArrayLists and Arrays provide a concise representation for lists of data - and as such, we often need to step over each element in such a collection with an eye to performing some operation on everything contained within. Java provides us with an excellent structure called the foreach loop to allow for that.

The foreach loop is syntactically similar to a standard for loop - it even uses the same keyword. However, we use a slightly modified syntax for indicating how we want the loop to behave:

ArrayList < String > myListOfStrings = new ArrayList < String >(); 
myListOfStrings.add ("Hello");
myListOfStrings.add ("World");
for (String s : myListOfStrings) {
System.out.println (s);
}

The first part of the for loop indicates that we want to refer to each element in turn with the variable name s, and that each of these elements will be a String (we must declare this in the body of the for loop - we cannot use an existing variable). The second part is the collection we wish to step over (in this case, myListOfStrings). The first time around the loop, s refers to the first element of the ArrayList. The second time around, it refers to the second element, and so on until it has stepped over every element in the ArrayList.

We can also use this structure with standard arrays in exactly the same way:

int[] myNumbers = new int[100]; 
...
for (int i : myNumbers) {
System.out.println (i);
}

The foreach structure is syntactically compact and allows for a simple way for developers to provide a mechanism for performing an operation on every element of data within a collection - none too shabby!

Historical note:

Don't look to find the foreach loop in versions of Java prior to 1.5 - it didn't exist back then. Instead, the method of stepping over each element of data in a collection was handled by specialised iterator objects.


7.10

Sorting an ArrayList

As with standard arrays, Java gives us a class that allows us to perform common manipulation tasks on an ArrayList. This class is called Collections, and like the Arrays class is found in the java.util package. Both the sort and binarySearch methods are available in this class, and both work on the ArrayList class:

Collections.sort (myList); 
Collections.binarySearch (myList, "Hello!");

There are a large number of other methods supported by this class, but we will only cover these within this chapter. The interested student is invited to examine the Java documentation which will provide further information on what is possible.

7.11

Searching

A fuller discussion of sorting gains us nothing but Masochism Points at this juncture - the mechanisms provided by Java are more than ample for most of our needs. However, we still need a mechanism for searching unordered lists. We should also look a little more at how a binary search works so that we can understand the need for the list to be sorted. We can of course use the methods indicated above without knowing any of these things, but this is not a book about Java - it's a book about programming and so we will often take detours through material that is useful for gaining a broader appreciation of the subject.

Searching techniques will work equally well for arrays and ArrayLists - they are very similar data structures. The only thing that differs is in the syntax.

The easiest kind of search to implement is called a linear search. We simply step over every element in the structure until we find the one we want. What we do after that depends on the exact requirements of our search - if we have a method devoted to the search, we may want to return the matching element or we may wish to return an integer indicating where it is to be found in the structure.

But wait! If we know what we're looking for, why do we need to search for it? Obviously such a technique is of limited usefulness if we are, for example, searching for a String. We know what the String is - we can just recreate it. However, for more complex objects we may only know a portion of the full set of information - we may only know enough to distinguish between different kinds of objects.

Consider for example if Java provided us with a class that let us store all the information about a student. There is no such class... in a few chapters, we'll learn how to write it ourself.

Let's say this class provides ways of setting an ID number, a name, a list of grades, an address, a phone number, and all sorts of other useful information. And then let's say we have an ArrayList that contains two thousand students.

If all we have is the ID number, then we need a way of finding out the rest of the information - this is where a good search will come in. For example, consider the following linear search on an ArrayList called listOfStudents:


public Student findStudent (String id) {
Student temp;
for (int i = 0; i < listOfStudents.size(); i++) {
temp = listOfStudents.get (i);
if (temp.getID() .equals (id)) {
return temp;
}

}

return null;
}

This is the simplest kind of search - alas, it is also the least efficient. We must assume that, on average, that we must search at least half of the list to find the element we want. This means that the larger the number of elements, the more code has to be executed. In technical terms, it does not scale well... what may work well for ten students will not work nearly as well for ten thousand.

Java provides us with a better alternative - a binary search. This works according to a simple heuristic - if we know that a list is sorted, then we know in what direction we must search to find the thing we're looking for. Consider if we have the following list:

List being searched (part one)
Fig 7.15: List being searched (part one)

If we're looking for a number bigger than our current number, then we look towards the end of the list. If we're looking for a number smaller than our current number, then we look towards the start of the list.

A binary search begins searching at the middle point of a list. Let's say we're looking to find where in the list we could find the number 10. The list is nine elements long, and so we begin at element five:

List being searched (part two)
Fig 7.16: List being searched (part two)

The number ten is greater than the number we currently have... we know therefore that we have to look towards the end of the list rather than towards the start. We can therefore discount everything that occurs before element five... in effect, we only search the following list:

List being searched (part three)
Fig 7.17: List being searched (part three)

So with this list, we then again take the mid-point:

List being searched (part four)
Fig 7.18: List being searched (part four)

Now we know that the number 10 is less than the number we currently have - we only need the left hand side of the array now, so we discount the right-hand side.

List being searched (part five)
Fig 7.19: List being searched (part five)

And then again we take the mid-point:

List being searched (part five)
Fig 7.20: List being searched (part five)

And there we have our element, and it was found after only three comparisons. With a linear search, it would require six comparisons. For longer lists, the difference is even more marked... a binary search discounts half of the list left to search each time. Consider for example a list of one thousand and twenty-four elements. A linear search will require five-hundred and twelve comparisons on average. A binary search on the other hand

Binary search - comparisons required
Fig 7.21: Binary search - comparisons required

For 2048 elements, we require only 11 comparisons, and for 4096 we require only twelve. This is an algorithm that scales well.

However, in the best case situation (the element we are looking for is at the very start of the list), a linear search will find the search element in one comparison compared to the 10 required by the binary search. On the other hand, for the worst case scenario (the element we're looking for is at the very end of the list), the linear search will require 1024 comparisons.

A binary search is in general many orders of magnitude more efficient than a linear search... but it comes at a cost in that the list must be sorted. When we come to write our own objects, we'll also find another problem - the standard Java implementation of the binarySearch doesn't work properly with basic classes. That's because Java has no way of being able to tell how you want things to be sorted or compared against... how do we want our Student class to be sorted? Should it be sorted on ID? Or on name? Or on year of registration?

There are mechanisms for solving this problem, and we'll discuss them in due course. For now, it is enough to know how to use linear searching when we must, and how a binary search can find those elements we need in a much more efficient manner.

Assimilation of Concepts:

Sorting and Searching are available as standard algorithms as part of the Java class libraries - however, they are vitally important in terms of providing a sense of understanding as to how one can manipulate collections. The process of doing a binary search involves the close inter-relationship of many fundamental concepts, and it is worth spending some time considering how this may be accomplished using the code we have already covered.


7.12

Conclusion

The effective implementation of data structures is one of the hallmarks of a successful application. Arrays and ArrayLists provide a powerful way of modelling conceptually linked sets of data in an efficient way that is easy to manipulate with code.

There are a wide variety of other structures provided by Java - we will discuss some of these in the course of the book.

7.13

Reader exercises

Exercise one

  1. Create a new JApplet.
  2. Add a JTextField to this JApplet.
  3. Add a JButton to the JApplet.
  4. Register an action listener for the JButton.
  5. Create a class-wide ArrayList.
  6. When the user presses the button, add the contents of the JTextField onto the end of the ArrayList.
  7. Add another JButton to the JApplet.
  8. When this button is pressed, use an iterator to step over all the contents of the ArrayList and add each element to a string variable.
  9. Display this string variable in a message dialog.

Exercise two

  1. Create a new JApplet
  2. Create a class-wide Array of ints.
  3. Add a JButton to the JApplet.
  4. Register a listener object for the JButton.
  5. When the button is pressed, display an input dialog asking the user for a position in the array.
  6. Then display an input dialog asking the user for a number.
  7. Place the number at the requested index in the array.

Further Reading

The following table details further reading on the topic in this chapter, and also any external resources that you may find useful.

ResourceDescription
Example Programs from this chapterThis is a zip file of all the programs shown in this chapter.
ArrayList classJava API documentation on the ArrayList class.
Arrays ClassJava API Documentation on the Arrays class.
Collections ClassJava API Documentation on the Collections class.

PreviousTable of ContentsNext

© 2004-2006 Michael James Heron