CS51A - Fall 2019 - Class 5

Example code in this lecture

   while.py
   scores-lists.py
   more-lists.py

Lecture notes

prime numbers
   - what is a prime number?
      - a whole number greater than 1 that is only divisible by 1 and itself
   - what are the first 10 prime numbers?
      - the first 100?
      - the first 1000?
   - How could we write a program that figured this out?
   - To start with, how can we tell if a number is prime?
      - try to divide it by each of the numbers between 1 and the number (not including 1 or the number itself)
      - if none of them divide evenly, then it's prime, otherwise it's not
   - A few questions:
      - do we need to check all of the numbers up to that number?
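         - up to 1/2 the number is okay
         - really, just need to check up to sqrt(number) (inclusive)
         - why? what does it mean if the number has an integer square root?
            - if number = a * b, then at least one of a and b is no bigger than sqrt(number), so any divisor pair shows up by the time we reach sqrt(number)
            - if the number has an integer square root (e.g. 25 = 5 * 5), the square root itself is a divisor, which is why we include it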
      - how can we check to see if a number divides evenly?
         - use the remainder/modulo operator and see if it equals 0 (i.e. no remainder)
      - how can we check all of the numbers?
         - use a for loop

look at isprime function in while.py code
   - for loop starting at 2 up to the sqrt of the number
      - there are multiple versions of the range function
         - range with a single parameter starts counting at 0 and goes up to, but not including, the specified number
         - range with 2 parameters starts counting at the first number up to, but not including, the second number

            for i in range(10, 20):
               print(i)

            would print out the numbers from 10 through 19 (but not 20)

      - the if statement checks to see if the number is divisible by i
      - if we find a divisor we can stop early!
         - the minute we find one, we know the number is not prime so we can return False
      - what does "return True" do?
         - if we reach it, we've checked all of the numbers and none of them divided evenly (otherwise we would already have exited the function with return False), so the number is prime and we return True
   - we can use this to see if a number is prime

      >>> isprime(5)
      True
      >>> isprime(6)
      False
      >>> isprime(100)
      False
      >>> isprime(101)
      True
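
   A minimal sketch of what such an isprime function might look like. This is an assumption reconstructed from the description above, not the actual contents of while.py; in particular the check for numbers below 2 is added here so that 0 and 1 are not reported as prime.

```python
import math


def isprime(num):
    """Return True if num is prime, False otherwise."""
    if num < 2:
        return False  # 0, 1 and negative numbers are not prime

    # try every candidate divisor from 2 up to and including sqrt(num)
    for i in range(2, int(math.sqrt(num)) + 1):
        if num % i == 0:
            return False  # i divides evenly, so we can stop early

    return True  # no candidate divided evenly


print(isprime(101))
```

   The "+ 1" is what makes the square root inclusive, since range stops just before its second parameter.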

import math
   - A second way to import: import module_name
   - To reference a function within that module, you then say module_name.function_name
   - Why might we use this option, i.e. when would we use:
      from math import *

      vs.

      import math

      - Use the first if you're going to be using the functions a lot and it's clear that they come from that module
      - Use the second to be extra clear where the functions are coming from and to avoid naming conflicts
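
   A small illustration of the difference, including the kind of naming conflict mentioned above. The sketch imports only sqrt by name; "from math import *" would copy every public name from the module in the same way. The replacement sqrt function is hypothetical and exists only to show the conflict.

```python
import math              # everything stays behind the "math." prefix
from math import sqrt    # "from math import *" copies every public name like this

print(math.sqrt(16))     # 4.0
print(sqrt(16))          # 4.0, the same function without the prefix


# A naming conflict: defining our own sqrt replaces the imported name
def sqrt(x):
    return "my own sqrt"


print(sqrt(16))          # my own sqrt
print(math.sqrt(16))     # 4.0, the prefixed name is unaffected
```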

how could we use isprime to print out the first 10 (100, 1000, etc) prime numbers?
   - we'd like to do some sort of loop
   - will a for loop work?
      - we don't know when we're going to stop
      - we'd like to keep a count of how many we've seen and only stop when we've reached the number we want

while loop
   - another way to do repetition

   while <bool expression>:
      statement1
      statement2
      ...

   statement3

   as long as the <bool expression> evaluates to True, it continues to repeat the statements. When it becomes False, it continues on and executes statement3, etc.

   - specifically:
      evaluates the boolean expression
         - if it's False
            - it "exits" the loop and goes on to statement3 and continues there
         - if it's True
            - executes statement1, statement2, ... (all statements inside the "block" of the loop, just like a for loop)
         - go back to beginning and repeat

   - how could we use a while loop for our prime numbers problem?
      - keep a count of how many primes we've found (initially starts at 0)
      - start counting from 1 and work our way up
      - check each number
      - if it's prime
         - print it out
         - increment the counter of how many primes we've found
      - keep repeating this as long as (while) the number of primes we've printed is less than the number we want

can you emulate a for loop with a while loop?
   - yes!

   for i in range(10):
      ...

   is equivalent to writing:

   i = 0

   while i < 10:
      ...
      i = i + 1

look at firstprimes function in while.py code
   - current += 1: every time through the loop we increment the number we're examining
   - if the current number happens to be prime, we print it and increment count
   - the loop continues "while" count < num, that is, as long as the number of primes we've found is less than the number we're looking for
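
   Putting the pieces together, a sketch of how firstprimes might be written. The variable names count, current and num come from the notes above; everything else, including the isprime helper repeated here so the example runs on its own, is an assumption rather than the actual while.py code.

```python
import math


def isprime(num):
    """Return True if num is prime, False otherwise."""
    if num < 2:
        return False
    for i in range(2, int(math.sqrt(num)) + 1):
        if num % i == 0:
            return False
    return True


def firstprimes(num):
    """Print the first num prime numbers, one per line."""
    count = 0     # how many primes we've found so far
    current = 1   # the number we're currently examining

    while count < num:
        if isprime(current):
            print(current)
            count += 1
        current += 1  # move on to the next number every time through


firstprimes(10)
```

   A for loop over range does not fit here because we can't say in advance how far current has to go before count reaches num.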

infinite loops
   - what would the following code do?

   while True:
      print("hello")

   - will never stop
      - in this case you should see some output
      - sometimes, it will look like the program just froze if you're not actually printing anything out
   - you can stop this by selecting "reset shell"
   - be careful about these in your programs: a loop that never stops is called an infinite loop
   - if you think you might have an infinite loop
      - put in some print statements to debug
      - think about when the boolean expression will become False and make sure that is going to happen in your loop
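
   As an illustration of that debugging advice, here is a typical accidental infinite loop (left commented out so the example terminates) next to a fixed version. The loop itself is a made-up example, not code from while.py.

```python
# Buggy version (do not run): i never changes, so i < 3 is always True
#
# i = 0
# while i < 3:
#     print("hello")

# Fixed version: the body moves i toward the point where i < 3 is False
i = 0
while i < 3:
    print("i is", i)   # debugging print: lets us watch the value change
    i = i + 1

print("done, i is", i)
```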

run scores-lists.py code
   - First, prompts the user to enter a list of scores one at a time
      - how is this done?
         - while loop
         - what is the exit condition?
            - checks to see if the line is empty

               while line != "":

   - then, calculate various statistics based on what was entered
   - how are we calculating these statistics?
      - average?
         - could keep track of the sum and the number of things entered
         - divide at the end
      - max?
         - keep track of the largest seen so far
         - each time a new one is entered, see if it's larger, if so, update the largest
      - min?
         - same thing
      - median?
         - the challenge with median is that we can't calculate it until we have all of the scores
         - need to sort them and then find the middle score

   - why can't we do this using int/float variables?
      - we don't know how many scores are going to be entered
      - even if we did, if we had 100 students in the class, we'd need 100 variables!
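
   To make the point concrete, here is a sketch of how far plain int/float variables can take us. The function name running_stats is hypothetical, as is the use of None to mean "no score seen yet" and returning the three results together. Average, max and min each need only a value or two that we update as scores arrive; median has no such shortcut because it needs every score at once.

```python
def running_stats():
    """Read scores until a blank line; return (average, largest, smallest)."""
    total = 0.0
    count = 0
    largest = None    # None means we haven't seen any score yet
    smallest = None

    line = input("Enter score: ")
    while line != "":
        score = float(line)
        total += score
        count += 1
        if largest is None or score > largest:
            largest = score
        if smallest is None or score < smallest:
            smallest = score
        line = input("Enter score: ")

    if count == 0:
        return None, None, None   # nothing was entered, so there are no statistics
    return total / count, largest, smallest
```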

lists
   - lists are a data structure in Python
      - what is a data structure?
         - a way of storing and organizing data

   - lists allow us to store multiple values with a single variable

creating lists: we can create a new list using square brackets
   >>> [7, 4, 3, 6, 1, 2]
   [7, 4, 3, 6, 1, 2]
   >>> 10 # not a list
   10
   >>> [10]
   [10]
   >>> l = [7, 4, 3, 6, 1, 2]
   >>> l
   [7, 4, 3, 6, 1, 2]
   >>> type(l)
   <class 'list'>

   lists are a type and represent a value, just like floats, ints, bools and strings. We can assign them to variables, print them, etc.

   - what do you think [] represents?
      - empty list
      >>> []
      []

accessing lists
   - we can get at particular values in the list by using the [] to "index" into the list
      >>> l = [7, 4, 3, 6, 1, 2]
      >>> l[3]
      6

      notice that indexing starts counting at 0, not at 1!

      >>> l[0]
      7

   - What do you think l[20] will give us?
      >>> l[20]
      Traceback (most recent call last):
       File "<string>", line 1, in <fragment>
      IndexError: list index out of range

      we can only index from 0 up to the length of the list minus 1

   - What do you think l[-1] will give us?
      >>> l[-1]
      2

      if the index is negative it counts back from the end of the list

   - notice that the type of each thing in the list is as you'd expect:
      >>> type(l[3])
      <class 'int'>

storing other things in lists
   - draw the list representation
   - a list is a contiguous set of spaces in memory
      - [ _ , _ , _ , _ ]
   - we can store anything in each of these spaces

      >>> ["this", "is", "a", "list", "of", "strings"]
      ['this', 'is', 'a', 'list', 'of', 'strings']
      >>> list_of_strings = ["this", "is", "a", "list", "of", "strings"]
      >>> list_of_strings[0]
      'this'
      >>> [1, 5.0, "my string"]
      [1, 5.0, 'my string']
      >>> l = [1, 5.0, "my string"]
      >>> type(l[0])
      <class 'int'>
      >>> type(l[1])
      <class 'float'>
      >>> type(l[2])
      <class 'str'>

   - In general, it's a good idea to keep lists homogeneous, i.e. have all of the items be the same type

slicing
   - sometimes we want more than just one item from the list (this is called "slicing")
   - We can specify a range in the square brackets, [], using the colon (:)

      >>> l = ["this", "is", "a", "list", "of", "strings"]
      >>> l[0:3]
      ['this', 'is', 'a']
      >>> l[1:5]
      ['is', 'a', 'list', 'of']
      >>> l[1:1]
      []
      >>> l[-3:-1]
      ['list', 'of']

      - generates a *new* list
      - that includes the items from the list starting at the first number and up to, but not including, the second number
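
   A short check that a slice really is a new list: changing the slice leaves the original alone. The sketch uses item assignment, which is covered in the "lists are mutable" section; the variable name first_three is made up for the example.

```python
l = ["this", "is", "a", "list", "of", "strings"]
first_three = l[0:3]       # a new list with the items at indices 0, 1 and 2

first_three[0] = "THAT"    # change the new list only

print(first_three)         # ['THAT', 'is', 'a']
print(l)                   # ['this', 'is', 'a', 'list', 'of', 'strings']
```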

looping over lists
   - We can use the for loop to iterate over each item in the list:

   >>> my_list = [4, 1, 8, 10, 11]
   >>> for value in my_list:
   ...    print(value)
   ...
   4
   1
   8
   10
   11

   - This is often called a "foreach" loop, i.e. for each item in the list, do an iteration of the loop

write a function called sum that sums up the values in a list of numbers

   def sum(numbers):
      total = 0

      for val in numbers:
         total += val

      return total

back to our stats program... how could we write average given what we know so far, that is a function that takes a list as a parameter and calculates the average?
   - look at the inelegant_average function in scores-lists.py code
      - loop over each of the elements in the list
      - accumulate a sum
      - accumulate a count
      - divide the sum by the count
   - look at the average function in scores-lists.py code
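
   For reference, a sketch of what the two versions might look like. The function names come from the notes, but the bodies are assumptions and not the actual scores-lists.py code. The second version relies on the built-in sum and len functions. Both raise a ZeroDivisionError on an empty list.

```python
def inelegant_average(numbers):
    """Average computed by accumulating a sum and a count by hand."""
    total = 0
    count = 0

    for val in numbers:
        total += val
        count += 1

    return total / count


def average(numbers):
    """Same result, using built-in functions over lists."""
    return sum(numbers) / len(numbers)


print(inelegant_average([90, 70, 80]))
print(average([90, 70, 80]))
```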

built-in functions over lists: there are also some built-in functions that take a list as a parameter
   - we can get the length of a list
      >>> l = [1, 5.0, "my string"]
      >>> len(l)
      3
      >>> len([1, 2, 3, 4, 5])
      5
      >>> len([])
      0
   - max
      >>> l = [5, 3, 2, 1, 10]
      >>> max(l)
      10

   - min
      >>> min(l)
      1
   - sum
      >>> sum(l)
      21

lists are objects and therefore have methods. Any guesses?
   - append: add a value on to the end of the list
      >>> my_list = [15, 2, 1, 20, 5]
      >>> my_list.append(100)
      >>> my_list
      [15, 2, 1, 20, 5, 100]

      - notice that append does NOT return a new list, it modifies the existing list!

   - We can look at the documentation to see what is available
      >>> help([])
      >>> help(list)

      http://docs.python.org/tutorial/datastructures.html

      - pop: remove a value off of the end of the list and return it
         >>> my_list.pop()
         100
         >>> my_list
         [15, 2, 1, 20, 5]

         - notice that it both modifies the list and returns a value
         - if you want to use this value, you need to store it!
            >>> x = my_list.pop()
            >>> x
            5
         - pop also has another version where you can specify the index

            >>> my_list = [15, 2, 1, 20, 5]
            >>> my_list.pop(2)
            1
            >>> my_list
            [15, 2, 20, 5]
      - insert: inserts a value at a particular index
         >>> my_list = [15, 2, 1, 20, 5]
         >>> my_list.insert(2, 100)
         >>> my_list
         [15, 2, 100, 1, 20, 5]

         - again, lists are mutable, so insert does not return a new list, but modifies the underlying one
      - sort
         >>> my_list = [15, 2, 1, 20, 5]
         >>> my_list.sort()
         >>> my_list
         [1, 2, 5, 15, 20]
         >>> ["these", "are", "some", "words", "to", "sort"].sort()   # prints nothing: sort does not return the sorted list
         >>> my_list = ["these", "are", "some", "words", "to", "sort"]
         >>> my_list.sort()
         >>> my_list
         ['are', 'some', 'sort', 'these', 'to', 'words']
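
   A common mistake that follows from this is assigning the result of sort back to a variable. A small sketch (the variable names are made up for the example):

```python
words = ["these", "are", "some", "words", "to", "sort"]

result = words.sort()   # sort rearranges words in place and returns None

print(result)           # None
print(words)            # ['are', 'some', 'sort', 'these', 'to', 'words']
```

   Writing words = words.sort() would therefore throw the list away and leave words set to None.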

back to our grades program: look at scores-lists.py code
   - there is a function called get_scores that gets the scores and returns them as a list. How?
      - starts with an empty list
      - uses append to add them on to the end of the list
      - returns the list when the loop finishes
   - median function
      - sorts the values
         - notice again that sort does NOT return a value, but sorts the list that it is called on
      - returns the middle entry
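
   A sketch of how these two functions might be written, following the steps above. The bodies, the prompt text and the conversion with float are assumptions, not the actual scores-lists.py code. For a list with an even number of entries this version returns the upper of the two middle values.

```python
def get_scores():
    """Prompt for scores until a blank line is entered; return them as a list."""
    scores = []                        # start with an empty list

    line = input("Enter score: ")
    while line != "":
        scores.append(float(line))     # add each score on to the end
        line = input("Enter score: ")

    return scores


def median(numbers):
    """Return the middle entry of numbers (note: sorts the list it is given)."""
    numbers.sort()                     # sorts in place, returns nothing
    return numbers[len(numbers) // 2]


print(median([15, 2, 1, 20, 5]))
```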

lists are mutable
   - what does that mean?
      - we can change (or mutate) the values in a list

   - notice that many of the methods that we call on lists change the list itself

   - we can mutate lists with methods, but we can also change particular indices

      >>> my_list = [15, 2, 1, 20, 5]
      >>> my_list
      [15, 2, 1, 20, 5]
      >>> my_list[2] = 100
      >>> my_list
      [15, 2, 100, 20, 5]

sequences
   - lists are part of a general category of data structures called sequences that represent a sequence of things
   - *all* sequences support a number of shared behaviors
      - the ability to index using []
      - the ability to slice using [:]
      - a number of built-in functions:
         - len
         - max
         - min
      - the ability to iterate over them with a for loop
   - We've actually seen one other sequence. What is it?
      - strings!

strings as sequences
   - notice that we can do all the sequence-like things with strings
      >>> s = "banana"
      >>> s[4]
      'n'
      >>> s[2:5]
      'nan'
      >>> len(s)
      6
      >>> for letter in s:
      ...    print(letter)
      ...
      b
      a
      n
      a
      n
      a
   - strings, however, are immutable
      >>> s[4] = "c"
      Traceback (most recent call last):
       File "<string>", line 1, in <fragment>
      TypeError: 'str' object does not support item assignment

      - no matter how hard you try, you cannot mutate a string
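
   What we can do instead is build a *new* string out of pieces of the old one. A small sketch using slicing and concatenation (the variable name changed is made up for the example):

```python
s = "banana"

# take everything before index 4, add the new letter, then the rest
changed = s[0:4] + "c" + s[5:6]

print(changed)   # banaca
print(s)         # banana (the original string is untouched)
```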

What does the list_to_string function do in more-lists.py code?
   - takes as input a list
      - what is the type of the list?
         - a list of almost anything!
         - anything that we can call str() on (which turns out to be lots of things)
   - concatenates all the items in the list into a single string
      - results starts out as the empty string
      - it iterates through each item in the list and concatenates them on to the results
   - this is similar to our example before of summing up all the numbers in a list
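
   A sketch of a function matching that description. The body and the parameter name some_list are assumptions, not the actual more-lists.py code, which may differ (for example by adding separators between items).

```python
def list_to_string(some_list):
    """Concatenate the str() version of every item into a single string."""
    result = ""                 # start with the empty string

    for item in some_list:
        result += str(item)     # str() lets us handle ints, floats, etc.

    return result


print(list_to_string([1, 5.0, "my string"]))
```

   The shape is the same as the sum function earlier: start with an "empty" value (0 there, "" here) and accumulate on to it inside a for loop.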