
Searching

A Java method to search an array of numbers for a given number:

  /*
   * Search for num in array.  Return the index of the number, or
   * -1 if it is not found.
   */
  int search(int[] array, int num) {
    for (int index = 0; index < array.length; index++) {
      if (array[index] == num) {
        return index;
      }
    }
    return -1;
  }

The procedure here is a lot like the search for a Line in a Scribble. We have no way of knowing that we're done until we either find the number we're looking for or reach the end of the array. So again, if the array contains n numbers, we have to examine all n in an unsuccessful search and, on average, about n/2 in a successful search.
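To make those counts concrete, here is a small sketch that counts how many elements the linear search examines. The class and method names (`LinearSearchCost`, `comparisons`, `averageSuccessful`) are illustrative and not part of the original notes, and the average assumes the numbers are distinct and each is equally likely to be the target.

```java
// Hypothetical helper class, added only to illustrate the cost argument.
class LinearSearchCost {
  // Returns how many array elements a linear search examines before it
  // either finds num or runs off the end of the array.
  static int comparisons(int[] array, int num) {
    int count = 0;
    for (int index = 0; index < array.length; index++) {
      count++;
      if (array[index] == num) {
        return count;  // successful search stops here
      }
    }
    return count;  // unsuccessful search: every element was examined
  }

  // Average number of elements examined over one successful search per
  // element (assumes the elements are distinct and equally likely targets).
  static double averageSuccessful(int[] array) {
    int total = 0;
    for (int value : array) {
      total += comparisons(array, value);
    }
    return (double) total / array.length;
  }
}
```

For an array of n distinct numbers the average works out to (n + 1)/2, which is the "about n/2" figure quoted above.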

Alternatively, we could use recursion instead of a for loop for the search:

  /*
   * Search for num in array recursively. Return the index of the
   * number, or -1 if it is not found.
   */
  int recSearch(int[] array, int num, int start) {
    if (start >= array.length) {
      return -1;
    } else if (array[start] == num) {
      return start;
    } else {
      return recSearch(array, num, start + 1);
    }
  }
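Because the recursive version takes a starting index, a caller has to supply 0 to search the whole array. A minimal sketch of a wrapper that hides that detail follows; the class name `RecursiveSearchDemo` and the wrapper `search` are assumptions made for illustration.

```java
// Hypothetical wrapper class; the names here are illustrative only.
class RecursiveSearchDemo {
  // Same recursive search as above, made static so main can call it.
  static int recSearch(int[] array, int num, int start) {
    if (start >= array.length) {
      return -1;
    } else if (array[start] == num) {
      return start;
    } else {
      return recSearch(array, num, start + 1);
    }
  }

  // Convenience entry point: begin the search at index 0.
  static int search(int[] array, int num) {
    return recSearch(array, num, 0);
  }

  public static void main(String[] args) {
    int[] data = {12, 4, 9, 30};
    System.out.println(search(data, 9));  // prints 2
    System.out.println(search(data, 7));  // prints -1
  }
}
```

Each recursive call adds a stack frame, so on a very large array this version can run out of stack space where the loop would not.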

Now, suppose the array has been sorted in ascending order.

Class demo: search for a number in an ordered array of numbers.

  /*
   * Search for num in a sorted array recursively. Return the index
   * of the number, or -1 if it is not found.
   */
  int recSearch(int[] array, int num, int start) {
    if (start >= array.length) {
      return -1;
    } else if (array[start] == num) {
      return start;
    } else if (array[start] > num) {
      return -1;  // num will not appear in rest of array since it is sorted.
    } else {
      return recSearch(array, num, start + 1);
    }
  }

Well, we can do the same type of search: start at the beginning and keep looking for the number. In the case of a successful search, we still stop when we find it. But now we can also determine that a search is unsuccessful as soon as we encounter any number larger than our search number. Assuming that our search number is, on average, going to fall near the median value of the array, an unsuccessful search is now going to require that we examine, on average, about n/2 items instead of all n. This sounds great, but in fact it is not a very significant gain, as we will see. These are all examples of a linear search: we examine items one at a time in some linear order until we find the search item or until we can determine that we will not find it.
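As a rough illustration of the early exit, the sketch below counts how many elements are examined when the search gives up at the first value that is at least as large as the target. The class and method names are hypothetical, and the array is assumed to be in ascending order.

```java
// Hypothetical helper class, added only to illustrate the early exit.
class SortedLinearSearchCost {
  // Counts the elements examined by a linear search over an ascending
  // array that stops at the first value greater than or equal to num.
  static int examined(int[] sorted, int num) {
    int count = 0;
    for (int value : sorted) {
      count++;
      if (value >= num) {
        break;  // found num, or passed the place where it would have been
      }
    }
    return count;
  }
}
```

With the ten values 10, 20, ..., 100, an unsuccessful search for 55 examines 6 elements, while an unsuccessful search for 105 still examines all 10.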

But there is a better way. To get the intuition for the next way to search for a number, think back to your favorite number guessing game. I pick a number between 1 and 100 and you have to guess what it is. The game usually goes something like this:

Me: Guess my number.
You: 50.
Me: Too High.
You: 25.
Me: Too Low.
You: 37.
Me: Too High.
You: 31.
Me: That's right.
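The exchange above can be reproduced by a short sketch that always guesses the middle of the range still possible. The class `GuessingGame` and its `guesses` method are made up for this illustration; the secret number 31 and the range 1 to 100 come from the dialogue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class simulating the guesser's strategy in the game above.
class GuessingGame {
  // Returns the sequence of guesses made when always guessing the middle
  // of the range of numbers that are still possible.
  static List<Integer> guesses(int secret, int low, int high) {
    List<Integer> result = new ArrayList<>();
    while (low <= high) {
      int guess = low + (high - low) / 2;
      result.add(guess);
      if (guess == secret) {
        break;              // "That's right."
      } else if (guess > secret) {
        high = guess - 1;   // "Too High": discard the guess and everything above it
      } else {
        low = guess + 1;    // "Too Low": discard the guess and everything below it
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(guesses(31, 1, 100));  // prints [50, 25, 37, 31]
  }
}
```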

If you know that the numbers are in order, where do you start your search? In the middle, since then even if you don't find it, you can look at the value you found and see whether the search item is smaller or larger. From that, you can decide to look only in the bottom half of the array or only in the top half. You could then do a linear search on the appropriate half or, better yet, repeat the procedure and cut the half in half, and so on. This is a binary search. It is an example of a divide and conquer algorithm, because at each step it divides the problem in half.

A Java method to do this:

  /*
   * Binary Search for num in array.
   */
  int search(int[] array, int num) {
    return binarySearch(array, num, 0, array.length - 1);
  }


  /*
   * Binary Search for num in array.  Pass in the low and high 
   * indices of the array for the range in which the number may
   * still occur.
   */
  int binarySearch(int[] array, int num, int low, int high) {
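    if (low > high) {
      return -1;
    } else {
      // Computed this way rather than (low + high) / 2 so the sum
      // cannot overflow an int on very large arrays.
      int mid = low + (high - low) / 2;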
      if (array[mid] == num) {
        // num is same as middle number
        return mid;
      } else if (num < array[mid]) {
        // num is smaller than middle number
        return binarySearch(array, num, low, mid - 1);
      } else {
        // num is larger than middle number
        return binarySearch(array, num, mid + 1, high);
      }
    }
  }
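The same halving can also be written as a loop instead of a recursion. The following is a sketch of one way to do it, not the version used in the notes; the class name `IterativeBinarySearch` is an assumption, and the array must already be sorted in ascending order.

```java
// Hypothetical class showing a loop-based variant of the binary search above.
class IterativeBinarySearch {
  // Returns the index of num in the ascending array, or -1 if it is absent.
  static int search(int[] array, int num) {
    int low = 0;
    int high = array.length - 1;
    while (low <= high) {
      int mid = low + (high - low) / 2;  // middle of the remaining range
      if (array[mid] == num) {
        return mid;
      } else if (num < array[mid]) {
        high = mid - 1;  // keep only the lower half
      } else {
        low = mid + 1;   // keep only the upper half
      }
    }
    return -1;  // range is empty: num is not in the array
  }

  public static void main(String[] args) {
    int[] data = {2, 5, 8, 13, 21, 34, 55};
    System.out.println(search(data, 13));  // prints 3
    System.out.println(search(data, 4));   // prints -1
  }
}
```

The standard library's `java.util.Arrays.binarySearch(int[], int)` does a similar job, though for a missing value it returns a negative number encoding the insertion point rather than always -1.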

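How many steps are needed for this? Each comparison cuts the range still under consideration in half, and an array of n numbers can only be halved about log n (base 2) times before nothing is left. So a binary search needs roughly log n (base 2) steps, even when it is unsuccessful.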

So how much better is this, really? In the case of a small array, the difference is not really significant. But as the size grows...

  Search / num elts        10    100    1000    1,000,000
  n/2                       5     50     500      500,000
  log n (base 2)            4      7      10           20

That's a pretty huge difference as n increases.
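The figures above can be recomputed with a few lines of code. This sketch treats "log n (base 2)" as the smallest number of doublings needed to reach n, which is how the table's values are rounded; the class and method names are illustrative only.

```java
// Hypothetical class that recomputes the step counts compared above.
class SearchStepCounts {
  // Smallest k such that 2^k >= n, i.e. log n (base 2) rounded up.
  static int ceilLog2(long n) {
    int steps = 0;
    long size = 1;
    while (size < n) {
      size *= 2;
      steps++;
    }
    return steps;
  }

  public static void main(String[] args) {
    long[] sizes = {10, 100, 1000, 1000000};
    for (long n : sizes) {
      System.out.println(n + ": n/2 = " + (n / 2)
          + ", log n (base 2) = " + ceilLog2(n));
    }
  }
}
```

Doubling the size of the array adds n/2 more steps to an average linear search but only one more step to a binary search.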

Demo: Searching and Sorting Demo.

