Lecture 9: Dictionaries

Topics

Lunch with Prof. Osborn (or others)

Test 1 Monday 3/2

  • in class
  • paper-based
  • can bring in two pages of notes (either two pieces of paper, single-sided, or one piece, double-sided)
  • problems like practice problems
    • coding
    • what's wrong with this function
    • what would this function do
    • is this valid?
    • what would the output be
  • practice writing code on paper (it's different than on the computer)
  • I'll post practice problems
  • covers everything through today's lecture (not recursion)

Student Presentation

Dictionaries (aka "maps")

  • store keys and their associated values
    • each key is associated with a value
    • lookup can be done based on the key
    • this is a very common phenomenon in the real world. What are some examples?
      • social security number
        • key = social security number
        • value = name, address, etc
      • phone numbers in your phone (and phone directories in general)
        • key = name
        • value = phone number
      • websites
        • key = url
        • value = location of the computer that hosts this website
      • car license plates
        • key = license plate number
        • value = owner, type of car, …
      • flight information
        • key = flight number
        • value = departure city, destination city, time, …
  • creating new dictionaries
    • dictionaries can be created using curly braces

      >>> d = {}
      >>> d
      {}
      
    • dictionaries function similarly to lists, except we can put things in ANY index and can use non-numerical indices

      >>> d[15] = 1
      >>> d
      {15: 1}
      
      • notice when a dictionary is printed out, we get the key AND the associated value

        >>> d[100] = 10
        >>> d
        {100: 10, 15: 1}
        >>> my_list = []
        >>> my_list[15] = 1
        Traceback (most recent call last):
         File "<string>", line 1, in <fragment>
        IndexError: list assignment index out of range
        
      • dictionaries ARE very different from lists…
    • we can also update the values already in a dictionary

      >>> d[15] = 2
      >>> d
      {100: 10, 15: 2}
      >>> d[100] += 1
      >>> d
      {100: 11, 15: 2}
      
    • keys in the dictionary can be ANY immutable object

      >>> d2 = {}
      >>> d2["dave"] = 1
      >>> d2["anna"] = 1
      >>> d2["anna"] = 2
      >>> d2["seymore"] = 100
      >>> d2
      {'seymore': 100, 'dave': 1, 'anna': 2}
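
    • A minimal sketch of what "immutable" means for keys, assuming a made-up dictionary named grid (not from the lecture code): strings and tuples work as keys, while a list is rejected with a TypeError.

```python
# Strings and tuples are immutable, so both can be used as keys.
grid = {}  # hypothetical example dictionary
grid["origin"] = "start"
grid[(0, 1)] = "treasure"

# Lists are mutable, so Python refuses to use one as a key.
try:
    grid[[0, 1]] = "oops"
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(grid)
print(list_key_allowed)
```

    • If you need a key made of several parts, a tuple is usually the simplest choice.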
      
    • the values can be ANY object

      >>> d3 = {}
      >>> d3["dave"] = []
      >>> d3
      {'dave': []}
      >>> d3["dave"].append(1)
      >>> d3["dave"].append(40)
      >>> d3
      {'dave': [1, 40]}
      
    • be careful to put the key in the dictionary before trying to use it

      >>> d3["steve"]
      Traceback (most recent call last):
       File "<string>", line 1, in <fragment>
      KeyError: 'steve'
      >>> d3["steve"].append(1)
      Traceback (most recent call last):
       File "<string>", line 1, in <fragment>
      KeyError: 'steve'
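
    • Two common ways to avoid the KeyError above, sketched with the same d3 dictionary. The first checks for the key with in and adds it if it is missing. The second uses the built-in dict.get method, which returns a default instead of raising an error.

```python
d3 = {"dave": [1, 40]}

# Option 1: make sure the key exists before using it.
if "steve" not in d3:
    d3["steve"] = []
d3["steve"].append(1)

# Option 2: dict.get returns None (or a default you supply) for a missing key.
# Note that get does NOT add the key to the dictionary.
missing = d3.get("anna")
fallback = d3.get("anna", [])

print(d3)
print(missing, fallback)
```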
      
    • how do you think we can create non-empty dictionaries from scratch?

      >>> another_dict = {"dave": 1, "anna":100, "seymore": 21}
      >>> another_dict
      {'seymore': 21, 'dave': 1, 'anna': 100}
      
    • what are some other methods you might want for dictionaries (things you might want to ask about them)?
      • does it have a particular key?
      • how many key/value pairs are in the dictionary?
      • what are all of the values in the dictionary?
      • what are all of the keys in the dictionary?
      • remove all of the items in the dictionary?
    • dictionaries support most of the other things you'd expect them to that we've seen in other data structures

      >>> "seymore" in another_dict
      True
      >>> len(another_dict)
      3
      
    • dictionaries are a class of objects, just like everything else we've seen (called dict … short for dictionary)

      >>> help(dict)
      
    • some of the more relevant methods:

      >>> d2
      {'seymore': 100, 'dave': 1, 'anna': 2}
      >>> d2.values()
      dict_values([100, 1, 2])
      >>> d2.keys()
      dict_keys(['seymore', 'dave', 'anna'])
      >>> d2.pop('seymore')
      100
      >>> d2
      {'dave': 1, 'anna': 2}
      >>> d2.clear()
      >>> d2
      {}
      
TODO dict.items() example
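
One possible dict.items() example, as a sketch: items() gives back the key/value pairs as tuples, which can be unpacked directly in a for loop. The dictionary below re-creates the earlier d2 contents for illustration.

```python
d2 = {"dave": 1, "anna": 2, "seymore": 100}

# items() yields (key, value) tuples; list() makes them easy to inspect.
pairs = list(d2.items())
print(pairs)

# Each tuple can be unpacked into two loop variables.
for name, value in d2.items():
    print(name, value)
```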

Tracking frequencies

  • We're going to use dictionaries to store counts like we did on paper last lecture
  • Write a function called get_counts that takes a list of numbers and returns a dictionary containing the counts of each of the numbers
  • Key idea:

    def get_counts(numbers):
      d = {}
    
      for num in numbers:
        # do something here
    
      return d
    
  • There are two cases we need to contend with:
    1. if the number isn't in the dictionary
      • In this case we need to add it with the value 1: d[num] = 1
    2. if the number is in the dictionary
      • In this case, we just need to increment it: d[num] = d[num] + 1
      • This can also be written d[num] += 1
  • Look at the get_counts function in dictionaries.py
  • We now can generate the counts from our file

    >>> data = read_numbers('numbers.txt')
    >>> data
    [1, 2, 3, 2, 1, 1, 2, 6, 7, 8, 10, 1, 5, 5, 5, 3, 8, 6, 7, 6, 4, 1, 1, 2, 3, 1, 2, 3]
    >>> get_counts(data)
    {1: 7, 2: 5, 3: 4, 6: 3, 7: 2, 8: 2, 10: 1, 5: 3, 4: 1}
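
  • A minimal sketch of how get_counts could fill in the two cases above. The version in dictionaries.py may differ, and the list is typed in directly here because read_numbers and numbers.txt are not shown in these notes.

```python
def get_counts(numbers):
    """Return a dictionary mapping each number to how many times it occurs."""
    d = {}

    for num in numbers:
        if num in d:
            # Case 2: already seen, so increment its count.
            d[num] += 1
        else:
            # Case 1: first time we see this number.
            d[num] = 1

    return d


# Same numbers as the example data above, written out by hand.
data = [1, 2, 3, 2, 1, 1, 2, 6, 7, 8, 10, 1, 5, 5,
        5, 3, 8, 6, 7, 6, 4, 1, 1, 2, 3, 1, 2, 3]
counts = get_counts(data)
print(counts)
```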
    

Iterating over dictionaries

  • We're almost to the point where we can find the most frequent value.
  • Next, we need to go through all of the values in the dictionary to find the most frequent one.
  • there are many ways we could iterate over the things in a dictionary
    • iterate over the values
    • iterate over the keys
    • iterate over the key/value pairs
  • which one is most common?
    • since lookups are done based on the keys, iterating over the keys is the most common
  • by default, if you say:

    for key in dictionary:
      ...
    
    • key will get associated with each key in the dictionary in turn
  • once we have the key, we can use it to lookup the value associated with that key and do whatever we want with the pair

    for key in dictionary:
      value = dictionary[key]
      ...
    
  • look at the print_counts function
    • "\t" is the tab character

      >>> data = read_numbers('numbers.txt')
      >>> counts = get_counts(data)
      >>> print_counts(counts)
      1  7
      2  5
      3  4
      6  3
      7  2
      8  2
      10  1
      5  3
      4  1
      
    • Notice that the keys are not in numerical order. Dictionaries do not sort their keys: Python 3.7 and later iterate in the order the keys were first added, and older versions make no ordering guarantee at all. Either way, you will iterate over all of them.
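
    • A sketch of what print_counts might look like, assuming it simply loops over the keys and prints each key and its count separated by a tab. The actual function in dictionaries.py may be written differently.

```python
def print_counts(counts):
    """Print one line per key: the key, a tab, then its count."""
    for key in counts:
        print(str(key) + "\t" + str(counts[key]))


# A small hand-written dictionary of counts for illustration.
print_counts({1: 7, 2: 5, 3: 4})
```

    • If you want the keys in numerical order, looping over sorted(counts) instead of counts is one option.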
  • look at the get_most_frequent_value function
  • This looks very similar to the my_max function we wrote back in the lecture 6 notes (https://cs.pomona.edu/classes/cs51a/lectures/lec06.html)
    • We keep a variable (max_value) that stores the largest value we've seen so far
      • We'll initialize it to -1, assuming that the values are all positive (true here, since every count is at least 1)
      • See problem set 6 for a general solution
    • We then iterate through each of the key/value pairs in our dictionary
      • We compare the value (i.e. counts[key]) to the largest value we've seen so far
      • If it's larger, we update max_value
    • The only difference with my_max is that we want to return the key associated with the largest value
      • We need another variable (max_key) that stores this key
      • Whenever we update max_value, we also update max_key

        >>> data = read_numbers('numbers.txt')
        >>> get_most_frequent_value(data)
        1
        
  • It may also be useful to not only get the most frequent value, but also how frequent it is
  • Anytime you want to return more than one value from a function, a tuple is often a good option
  • Look at the get_most_frequent function
    • only difference is that we return a tuple and also include the max_value

      >>> data = read_numbers('numbers.txt')
      >>> get_most_frequent(data)
      (1, 7)
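
  • A sketch of the two functions described above, assuming (as the examples suggest) that each takes the list of numbers and builds the counts itself. Starting max_key at None is an assumption, and the real code in dictionaries.py may differ.

```python
def get_counts(numbers):
    """Return a dictionary mapping each number to how many times it occurs."""
    d = {}
    for num in numbers:
        if num in d:
            d[num] += 1
        else:
            d[num] = 1
    return d


def get_most_frequent(numbers):
    """Return a tuple (most frequent number, how many times it occurs)."""
    counts = get_counts(numbers)
    max_key = None   # assumption: no key seen yet
    max_value = -1   # every real count is at least 1

    for key in counts:
        if counts[key] > max_value:
            # New best so far: remember both the count and its key.
            max_value = counts[key]
            max_key = key

    return (max_key, max_value)


def get_most_frequent_value(numbers):
    """Return only the most frequent number."""
    return get_most_frequent(numbers)[0]


data = [1, 2, 3, 2, 1, 1, 2, 6, 7, 8, 10, 1, 5, 5,
        5, 3, 8, 6, 7, 6, 4, 1, 1, 2, 3, 1, 2, 3]
result = get_most_frequent(data)
print(result)
print(get_most_frequent_value(data))
```

  • Because the comparison uses >, a tie goes to whichever key the loop reaches first. An empty list gives back (None, -1) in this sketch.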
      

Author: Joseph C. Osborn

Created: 2020-04-21 Tue 10:44
