CS51 - Spring 2010 - Lecture 26

  • course reviews

  • administrative
       - put up practice final
       - our final time slot is Monday, May 10 at 2:00pm
       - TP2 is due last day of class at 5pm

  • "For me, great algorithms are the poetry of computation. Just like verse, they can be terse, allusive, dense and even mysterious. But once unlocked, they cast a brilliant new light on some aspect of computing." -- Francis Sullivan

  • What is an algorithm?
       - a way of solving a problem
       - a sequence of steps to accomplish a task

  • Examples
       - sort a list of numbers
       - find a route from one place to another (cars, packet routing, phone routing, ...)
       - find the longest common substring between two strings
       - add two numbers
       - microchip wiring/design (VLSI)
       - solving sudoku
       - cryptography
       - compression (file, audio, video)
       - spell checking
       - PageRank
       - classify a web page
       - ...

  • Main parts to algorithm analysis
       - developing algorithms that work
       - making them faster
       - analyzing/understanding the efficiency/run-time

  • Sorting
       Input: An array of numbers nums
       Output: The array of numbers in sorted order, i.e. nums[i] <= nums[j] for all i < j

       - cards
          - sort cards: all cards in view
          - sort cards: only view one card at a time
       
       - look at Sort code
       - Selection sort
          - How many operations does the algorithm take? How long will it take? How efficient is it?
             - what counts as an operation?
                - Different operations take different amounts of time, and even from run to run, effects such as caching change the timing
             - will depend on the input
          - We want a tool to allow us to talk about and compare different algorithms while hiding the details that don't matter
          - asymptotic analysis
             - Key idea: how does the run-time grow as we increase the input size?
                - in our case, as we sort more numbers, roughly how will the run-time increase
                - for example, if we double the number of numbers we're sorting, what will happen to the run-time?
                   - unchanged?
                   - double?
                   - triple?
                   - quadruple?
             - Compare different algorithms
                - f1(n) takes n^2 steps
                - f2(n) takes 2n + 100 steps
                - f3(n) takes 4n + 1 steps
             - Which algorithm is better? Is the difference between f2 and f3 important/significant?
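             To get a feel for the comparison, here is a small sketch that tabulates the three step counts for growing n. The class name GrowthSketch and the chosen values of n are assumptions for illustration only; the formulas are the ones listed above.

```java
class GrowthSketch {
    // Step counts taken directly from the three formulas in the notes.
    static long f1(long n) { return n * n; }
    static long f2(long n) { return 2 * n + 100; }
    static long f3(long n) { return 4 * n + 1; }

    public static void main(String[] args) {
        // Hypothetical input sizes, each ten times the previous one.
        long[] sizes = {10, 100, 1000, 10000};
        System.out.println("n\tf1\tf2\tf3");
        for (long n : sizes) {
            System.out.println(n + "\t" + f1(n) + "\t" + f2(n) + "\t" + f3(n));
        }
    }
}
```

             For small n, f1 can take fewer steps than f2 (at n = 10, 100 versus 120), but it quickly falls behind. f2 and f3 always stay within a constant factor of each other.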

             - Big-O notation: an upper bound on the function/run-time
                - Gives us the big picture, without worrying about details
                - Given a function/method how will it grow? linearly? quadratically?
                - Examples:
                   - n^2 is O(n^2)
                   - n^2 + n + 200 is O(n^2)
                   - 5n + 10 is O(n)
                   - ...
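                One way to check the second example, reading "upper bound" as "at most a constant times n^2 once n is at least 1" (the usual definition, which these notes do not spell out; the constant 202 is just one convenient choice):

```latex
n^2 + n + 200 \;\le\; n^2 + n^2 + 200\,n^2 \;=\; 202\,n^2 \qquad \text{for all } n \ge 1
```

                The lower-order terms n and 200 are each bounded by a multiple of n^2, which is why they disappear in the O(n^2) summary.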

             - runtimes table
                - this gives us groups of methods/functions that behave similarly

          - What is the running time of selection sort?
             - We'll use the variable n to describe the length of the array/input
             - How many times do we go through the for loop in selectionSort?
                - n times
             - Each time through the for loop in selectionSort, we call indexOfSmallest. How many times do we go through the for loop in indexOfSmallest?
                
                - end_index - start_index + 1
                - first time, n-1, second, n-2, third, n-3 ...
                - O(n)
             - what is the overall cost for selectionSort?
                - we go through the for loop n times
                - each time we go through the for loop we incur a cost of roughly n
                - O(n^2)
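          The Sort code from class is not reproduced in these notes. The following is a minimal sketch of what selectionSort and indexOfSmallest might look like, assuming an int array and inclusive start/end indices; the parameter names and exact loop bounds are guesses and may differ by one from the course code.

```java
import java.util.Arrays;

class SelectionSortSketch {
    // Returns the index of the smallest value in nums[startIndex..endIndex], inclusive.
    // This loop is the O(n) inner cost discussed above.
    static int indexOfSmallest(int[] nums, int startIndex, int endIndex) {
        int smallest = startIndex;
        for (int i = startIndex + 1; i <= endIndex; i++) {
            if (nums[i] < nums[smallest]) {
                smallest = i;
            }
        }
        return smallest;
    }

    // Sorts nums in place: the outer loop runs n times, and each pass
    // swaps the smallest remaining value into position i.
    static void selectionSort(int[] nums) {
        for (int i = 0; i < nums.length; i++) {
            int smallest = indexOfSmallest(nums, i, nums.length - 1);
            int tmp = nums[i];
            nums[i] = nums[smallest];
            nums[smallest] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] nums = {5, 2, 9, 1, 7};
        selectionSort(nums);
        System.out.println(Arrays.toString(nums)); // prints [1, 2, 5, 7, 9]
    }
}
```

          The nested structure (a loop of n passes, each scanning up to n elements) is where the O(n^2) comes from.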
       
       - Insertion sort
          - what is the running time?
             - How many times do we iterate through the while loop?
                - in the best case: no times
                   - when does this happen?
                   - what is the running time? linear, O(n)
                - in the worst case: j - 1 times
                   - when does this happen?
                   - what is the running time?
                      - \sum_{j=1}^{n-1} j = \frac{(n-1)n}{2}
                      - O(n^2)
                - average case: (j-1)/2 times
                   - O(n^2)
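          A sketch of insertion sort that also counts how many times the while loop body runs, so the best and worst cases can be checked directly. The class name and the counter are additions for illustration and not part of the course code; the sketch uses 0-based indices, so the element at position j is shifted past at most j others.

```java
class InsertionSortSketch {
    // Sorts nums in place and returns the total number of while-loop iterations.
    static long insertionSort(int[] nums) {
        long iterations = 0;
        for (int j = 1; j < nums.length; j++) {
            int current = nums[j];
            int i = j - 1;
            // Shift larger elements one slot to the right until current fits.
            while (i >= 0 && nums[i] > current) {
                nums[i + 1] = nums[i];
                i--;
                iterations++;
            }
            nums[i + 1] = current;
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(insertionSort(new int[]{1, 2, 3, 4, 5})); // prints 0
        System.out.println(insertionSort(new int[]{5, 4, 3, 2, 1})); // prints 10
    }
}
```

          Already-sorted input never enters the while loop (the linear best case), while reverse-sorted input with n = 5 gives 1 + 2 + 3 + 4 = 10 iterations, matching (n-1)n/2.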

       - Merge sort
          - first, look at the merge method
             - how can we use this method to sort numbers?
             - how could we use this method to sort two numbers?
             - can we repeat this idea?
          - what is the runtime?
             - look at the layers
             - each layer processes n items
             - how many layers are there?
                - each time we split the data in half
                - 2^i = n
                - log(n) levels
             - O( n log n )
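          A minimal sketch of merge and a recursive mergeSort built on it. The merge method shown in class is not included in these notes, so the signatures here (returning a new array rather than sorting in place) are assumptions.

```java
import java.util.Arrays;

class MergeSortSketch {
    // Merges two already-sorted arrays into one sorted array in linear time.
    static int[] merge(int[] left, int[] right) {
        int[] result = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            result[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        // At most one of these loops runs: copy whatever is left over.
        while (i < left.length) result[k++] = left[i++];
        while (j < right.length) result[k++] = right[j++];
        return result;
    }

    // Splits the input in half, sorts each half recursively, then merges.
    // Arrays of length 0 or 1 are already sorted and are returned as-is.
    static int[] mergeSort(int[] nums) {
        if (nums.length <= 1) return nums;
        int mid = nums.length / 2;
        int[] left = mergeSort(Arrays.copyOfRange(nums, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(nums, mid, nums.length));
        return merge(left, right);
    }

    public static void main(String[] args) {
        int[] sorted = mergeSort(new int[]{5, 2, 9, 1, 7, 2});
        System.out.println(Arrays.toString(sorted)); // prints [1, 2, 2, 5, 7, 9]
    }
}
```

          Each level of recursion halves the array, giving about log(n) levels, and the merges on any one level touch n items in total.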
             

  • Built-in sorting
       - Arrays.sort(Object[] a) -- uses merge sort
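       - Collections.sort(List list) -- also uses merge sort (it sorts the list's elements as an object array)
       - Arrays.sort(int[] a) and the other primitive-array versions -- use quicksort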
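       A short usage sketch of the two library calls. The class and helper names are made up for illustration; only Arrays.sort and Collections.sort are the real library methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class BuiltInSortSketch {
    // Sorts an object array in place using the library sort and returns it.
    static Integer[] sortArray(Integer[] a) {
        Arrays.sort(a);
        return a;
    }

    // Sorts a list in place using the library sort and returns it.
    static List<Integer> sortList(List<Integer> list) {
        Collections.sort(list);
        return list;
    }

    public static void main(String[] args) {
        Integer[] a = {3, 1, 2};
        System.out.println(Arrays.toString(sortArray(a))); // prints [1, 2, 3]

        List<Integer> list = new ArrayList<Integer>(Arrays.asList(9, 4, 6));
        System.out.println(sortList(list)); // prints [4, 6, 9]
    }
}
```

       Both calls sort in place and rely on the elements' natural ordering (Comparable), so no comparison code is needed for Integer or String.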