CS51 - Spring 2010 - Lecture 26
course reviews
administrative
- put up practice final
- our final time slot is Monday, May 10 at 2:00pm
- TP2 is due on the last day of class at 5pm
"For me, great algorithms are the poetry of computation. Just like verse, they can be terse, allusive, dense and even mysterious. But once unlocked, they cast a brilliant new light on some aspect of computing.'' -- Francis Sullivan
What is an algorithm?
- a way of solving a problem
- a step-by-step method for accomplishing a task
Examples
- sort a list of numbers
- find a route from one place to another (cars, packet routing, phone routing, ...)
- find the longest common substring between two strings
- add two numbers
- microchip wiring/design (VLSI)
- solving sudoku
- cryptography
- compression (file, audio, video)
- spell checking
- pagerank
- classify a web page
- ...
Main parts of algorithm design and analysis
- developing algorithms that work
- making them faster
- analyzing/understanding the efficiency/run-time
Sorting
Input: An array of numbers nums
Output: The array of numbers in sorted order, i.e. nums[i] <= nums[j] for all i < j
- cards
- sort cards: all cards in view
- sort cards: only view one card at a time
- look at the code
Sort code
- Selection sort (a code sketch follows this analysis)
- How many operations does the algorithm take? How long will it take? How efficient is it?
- what counts as an operation?
- Different operations take different amounts of time, and even from run to run, effects such as caching complicate the picture
- the run-time will also depend on the input
- We want a tool to allow us to talk about and compare different algorithms while hiding the details that don't matter
- asymptotic analysis
- Key idea: how does the run-time grow as we increase the input size?
- in our case, as we sort more numbers, roughly how will the run-time increase?
- for example, if we double the number of numbers we're sorting, what will happen to the run-time? (worked example after this list)
- unchanged?
- double?
- triple?
- quadruple?
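To make the doubling question concrete, here is the arithmetic for two common cases:
- linear: c*(2n) = 2*(c*n), so doubling the input doubles the run-time
- quadratic: c*(2n)^2 = 4*(c*n^2), so doubling the input roughly quadruples the run-time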
- Compare different algorithms
- f1(n) takes n^2 steps
- f2(n) takes 2n + 100 steps
- f3(n) takes 4n + 1 steps
- Which algorithm is better? Is the difference between f2 and f3 important/significant?
- Big-O notation: an upper bound on the function/run-time
- Gives us the big picture, without worrying about details
- Given a function/method how will it grow? linearly? quadratically?
- Examples:
- n^2 is O(n^2)
- n^2 + n + 200 is O(n^2)
- 5n + 10 is O(n)
- ...
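As a concrete check of the second example, here is the upper-bound argument using the usual definition of big-O (f(n) is O(g(n)) if f(n) <= c*g(n) for all n >= n_0): for n >= 15 we have n + 200 <= n^2, so n^2 + n + 200 <= 2n^2. Taking c = 2 and n_0 = 15 shows n^2 + n + 200 is O(n^2).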
- runtimes table (a typical version is reproduced below)
- this gives us groups of methods/functions that behave similarly
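A typical version of that table (values approximate; the exact table shown in lecture may differ):

    n         log n    n log n    n^2          2^n
    10        3        33         100          ~10^3
    100       7        664        10,000       ~10^30
    1,000     10       9,966      1,000,000    ~10^301
    10,000    13       132,877    10^8         astronomically large

Even for modest n, the groups pull apart quickly, which is why constant factors rarely matter.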
- What is the running time of selection sort?
- We'll use the variable n to describe the length of the array/input
- How many times do we go through the for loop in selectionSort?
- n times
- Each time through the for loop in selectionSort, we call indexOfSmallest. How many times do we go through the for loop in indexOfSmallest?
- end_index - start_index + 1
- first time, n-1, second, n-2, third, n-3 ...
- O(n)
- what is the overall cost for selectionSort?
- we go through the for loop n times
- each time we go through the for loop we incur a cost of roughly n
- O(n^2)
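Since the analysis above refers to selectionSort and indexOfSmallest, here is a minimal Java sketch consistent with it; the exact signatures are assumptions, not necessarily the course's actual code:

    public class SelectionSortSketch {
        public static void selectionSort(int[] nums) {
            // n passes: on pass i, the smallest remaining element moves to position i
            for (int i = 0; i < nums.length; i++) {
                int smallest = indexOfSmallest(nums, i, nums.length - 1);
                int temp = nums[i];
                nums[i] = nums[smallest];
                nums[smallest] = temp;
            }
        }

        // returns the index of the smallest value in nums[startIndex..endIndex];
        // scans the rest of the range, i.e. O(n) work per call
        private static int indexOfSmallest(int[] nums, int startIndex, int endIndex) {
            int smallest = startIndex;
            for (int j = startIndex + 1; j <= endIndex; j++) {
                if (nums[j] < nums[smallest]) {
                    smallest = j;
                }
            }
            return smallest;
        }
    }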
- Insertion sort (a code sketch follows this analysis)
- what is the running time?
- How many times do we iterate through the while loop?
- in the best case: no times
- when does this happen?
- what is the running time? linear, O(n)
- in the worst case: j - 1 times
- when does this happen?
- what is the running time?
- \sum_{j=1}^{n-1} j = n(n-1)/2
- O(n^2)
- average case: (j-1)/2 times
- O(n^2)
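A minimal Java sketch of insertion sort matching the analysis above (details again assumed; the counts in the notes use a slightly different indexing, but the sum comes out the same):

    public class InsertionSortSketch {
        public static void insertionSort(int[] nums) {
            for (int j = 1; j < nums.length; j++) {
                int val = nums[j];
                int i = j - 1;
                // shift larger elements right until val's spot is found;
                // best case (already sorted): 0 iterations; worst case
                // (reverse sorted): once per element before position j
                while (i >= 0 && nums[i] > val) {
                    nums[i + 1] = nums[i];
                    i--;
                }
                nums[i + 1] = val;
            }
        }
    }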
- Merge sort
- first, look at the merge method (code sketch after this block)
- how can we use this method to sort numbers?
- how could we use this method to sort two numbers?
- can we repeat this idea?
- what is the runtime?
- look at the layers
- each layer processes n items
- how many layers are there?
- each level splits the data in half, so after i levels the pieces have size n/2^i
- the splitting stops when the pieces have size 1, i.e., when 2^i = n
- solving for i gives log(n) levels
- O( n log n )
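A minimal Java sketch of the merge method and the recursive sort built by repeating it (details assumed):

    import java.util.Arrays;

    public class MergeSortSketch {
        // merge two already-sorted arrays into one sorted array: O(n) work
        public static int[] merge(int[] a, int[] b) {
            int[] result = new int[a.length + b.length];
            int i = 0, j = 0;
            for (int k = 0; k < result.length; k++) {
                if (j >= b.length || (i < a.length && a[i] <= b[j])) {
                    result[k] = a[i++];
                } else {
                    result[k] = b[j++];
                }
            }
            return result;
        }

        // split in half, sort each half, merge; log(n) levels of splitting
        // with O(n) merging per level => O(n log n)
        public static int[] mergeSort(int[] nums) {
            if (nums.length <= 1) {
                return nums;
            }
            int mid = nums.length / 2;
            int[] left = mergeSort(Arrays.copyOfRange(nums, 0, mid));
            int[] right = mergeSort(Arrays.copyOfRange(nums, mid, nums.length));
            return merge(left, right);
        }
    }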
Built-in sorting
- Arrays.sort(Object[] a) -- uses merge sort (Arrays.sort on primitive arrays uses a tuned quicksort instead)
- Collections.sort(List list) -- copies the list to an array and uses merge sort
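A small usage sketch of the built-in sorts (class and variable names are just for illustration):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;

    public class BuiltInSortDemo {
        public static void main(String[] args) {
            Integer[] a = {3, 1, 2};
            Arrays.sort(a);  // object arrays: stable merge sort
            System.out.println(Arrays.toString(a));  // [1, 2, 3]

            List<Integer> list = new ArrayList<Integer>(Arrays.asList(3, 1, 2));
            Collections.sort(list);  // copies to an array, then merge sorts
            System.out.println(list);  // [1, 2, 3]
        }
    }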