CS150 - Fall 2012 - Class 10
Pi video
http://www.youtube.com/watch?v=jG7vhMMXagQ
quiz problem 1c.
admin
- Lab on Friday can be done with a partner
- must both be there when you're working on it
- should only be working on one computer
- just submit one file with both your names on it
- Test project 1 out soon
- honor code
- must work alone
- may only use: book, your notes, class notes, python.org documentation
- may NOT: get help from other students, get help from the tutors (except for file issues, etc), look online for solutions
- 3 problems
- required to do some extra credit
- 63 points total, but only 60 for just doing what I stated
- More than a third of the points come from code style and commenting
- follow instructions carefully!
lists are objects and therefore have methods. What methods might we want?
-
http://docs.python.org/tutorial/datastructures.html
- append: add a value on to the end of the list
>>> my_list = [15, 2, 1, 20, 5]
>>> my_list.append(100)
>>> my_list
[15, 2, 1, 20, 5, 100]
- notice that append does NOT return a new list, it modifies the existing list!
- pop: remove a value off of the end of the list and return it
>>> my_list.pop()
100
>>> my_list
[15, 2, 1, 20, 5]
- notice that it both modifies the list and returns a value
- if you want to use this value, you need to store it!
>>> x = my_list.pop()
>>> x
5
- pop also has another version where you can specify the index
>>> my_list = [15, 2, 1, 20, 5]
>>> my_list.pop(2)
1
>>> my_list
[15, 2, 20, 5]
- insert: inserts a value at a particular index
>>> my_list = [15, 2, 1, 20, 5]
>>> my_list.insert(2, 100)
>>> my_list
[15, 2, 100, 1, 20, 5]
- again, lists are mutable, so insert does not return a new list, but modifies the underlying one
- sort
>>> my_list = [15, 2, 1, 20, 5]
>>> my_list.sort()
>>> my_list
[1, 2, 5, 15, 20]
>>> my_list = ["these", "are", "some", "words", "to", "sort"]
>>> ["these", "are", "some", "words", "to", "sort"].sort()
>>> my_list = ["these", "are", "some", "words", "to", "sort"]
>>> my_list.sort()
>>> my_list
['are', 'some', 'sort', 'these', 'to', 'words']
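A small sketch of the return-value point made above: append and sort change the list in place and return None, while pop returns the value it removed. The variable names (append_result, popped, sort_result, oops) are only for illustration.

```python
# Mutating list methods change the list in place.
my_list = [15, 2, 1, 20, 5]

append_result = my_list.append(100)  # list is now [15, 2, 1, 20, 5, 100]
popped = my_list.pop()               # removes and returns the last value
sort_result = my_list.sort()         # list is now sorted in place

print(append_result)  # None: append does not return the list
print(popped)         # 100
print(sort_result)    # None: sort does not return the list
print(my_list)        # [1, 2, 5, 15, 20]

# Common mistake: the sorted list is thrown away and only None is kept.
oops = [3, 1, 2].sort()
print(oops)           # None
```

If a method's job is to change the list, expect to look at the list afterwards rather than at the method's return value.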
lists are mutable
- what does that mean?
- we can change (or mutate) the values in a list
- we can mutate lists with methods, but we can also change particular indices
>>> my_list = [15, 2, 1, 20, 5]
>>> my_list
[15, 2, 1, 20, 5]
>>> my_list[2] = 100
>>> my_list
[15, 2, 100, 20, 5]
back to our grades program: look at
scores-lists.py code
- there is a function called get_scores. That gets the scores and returns them as a list. How?
- starts with an empty list
- uses append to add them on to the end of the list
- returns the list when the loop finishes
- average function
- has a single parameter, but this parameter will represent a list
- inelegant_average
- calculates the sum and divides by the number of entries
- uses a for loop to iterate over the values
- often, we'll use something besides "i" as a variable name that makes our program more readable
- is there a better way to do this?
- look at fancy_average
- uses the sum function over lists
- median function
- sorts the values
- notice again that sort does NOT return a value, but sorts the list that it is called on
- returns the middle entry
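The functions described above might look roughly like the sketch below. The actual scores-lists.py is not reproduced in these notes, so the bodies here are assumptions based only on the descriptions; get_scores is left out because it depends on how the scores are entered.

```python
def inelegant_average(scores):
    """Average computed by adding up the values with a for loop."""
    total = 0.0  # start with a float so the division is not truncated
    for score in scores:
        total = total + score
    return total / len(scores)


def fancy_average(scores):
    """Same average, using the built-in sum function."""
    return float(sum(scores)) / len(scores)


def median(scores):
    """Return the middle entry of the sorted scores.

    Note: sort() reorders the list that the caller passed in.
    """
    scores.sort()
    return scores[len(scores) // 2]


scores = [15, 2, 1, 20, 5]
print(inelegant_average(scores))  # 8.6
print(fancy_average(scores))      # 8.6
print(median(scores))             # 5
```

With an even number of entries this sketch returns the upper of the two middle values; averaging the two would be another reasonable choice.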
aliasing
- what will be the output of my_list after doing the following:
>>> my_list = [1, 2, 3, 4, 5]
>>> other_list = my_list
>>> other_list[2] = 100
>>> other_list
[1, 2, 100, 4, 5]
>>> my_list
- [1, 2, 100, 4, 5] ... why?
- my_list and other_list are just references to the SAME object
- this is called aliasing, since other_list is an alias (another name) for my_list
- saying other_list = my_list does not copy anything, that is, it does NOT create a new list that is a copy of the list
- draw a picture
- notice that if I make changes to either one, changes will be seen in the other
>>> my_list
[1, 2, 100, 4, 5]
>>> other_list
[1, 2, 100, 4, 5]
>>> my_list[0] = 0
>>> other_list[1] = 1000
>>> my_list
[0, 1000, 100, 4, 5]
>>> other_list
[0, 1000, 100, 4, 5]
- aliasing can also show up in other places
def mystery(x):
    x[0] = 1000
>>> my_list = [1, 2, 3, 4, 5]
>>> my_list
[1, 2, 3, 4, 5]
>>> mystery(my_list)
>>> my_list
[1000, 2, 3, 4, 5]
- parameters are passed as references to the same object (i.e. as aliases); no copy of the list is made
- "parameter passing" describes how the values that are input to the function (i.e. the arguments) are bound to the parameters inside the function
- be careful!
- why do you think this is done?
- a deep copy can be a lot of work
- also allows us to write functions that manipulate the parameter (which we may or may not do)
- notice that a function cannot change what the caller's variable (my_list below) references; it can only mutate the object
def mystery(alist):
    alist = [0]*10
    print alist
>>> my_list = [1, 2, 3, 4, 5]
>>> mystery(my_list)
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
>>> my_list
[1, 2, 3, 4, 5]
- slicing does create a new copy
>>> my_list = [1, 2, 3, 4, 5]
>>> other_list = my_list[2:4]
>>> other_list
[3, 4]
>>> other_list[0] = 100
>>> other_list
[100, 4]
>>> my_list
[1, 2, 3, 4, 5]
- given this, how could we create a full copy of my_list (a new list with the same entries)?
>>> my_list = [1, 2, 3, 4, 5]
>>> other_list = my_list[:]
>>> other_list[3] = 100
>>> other_list
[1, 2, 3, 100, 5]
>>> my_list
[1, 2, 3, 4, 5]
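One way to check which of these situations you are in is the "is" operator, which asks whether two names refer to the same object. The sketch below also shows where a slice copy stops being enough: with a list of lists, the slice copies only the outer list, and copy.deepcopy from the standard library copies everything. The names alias, sliced, grid, shallow and deep are made up for this example.

```python
import copy

my_list = [1, 2, 3, 4, 5]
alias = my_list      # second name for the same object
sliced = my_list[:]  # new list holding the same entries

print(alias is my_list)   # True: same object
print(sliced is my_list)  # False: different object
print(sliced == my_list)  # True: equal contents

# With nested lists, a slice copies only the outer list.
grid = [[1, 2], [3, 4]]
shallow = grid[:]           # new outer list, shared inner lists
deep = copy.deepcopy(grid)  # new outer list and new inner lists

shallow[0][0] = 100
print(grid)  # [[100, 2], [3, 4]]: the inner list was shared
print(deep)  # [[1, 2], [3, 4]]: the deep copy is unaffected
```

For a list of numbers the slice is all you need, since numbers cannot be mutated.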
run the sentence_stats function from
word-stats.py code
- similar idea to our scores functions except now we're doing it over strings instead of numbers
- the string class has a "split" method that splits up a sentence into a list by splitting on whitespace
>>> "this is a sentence".split()
['this', 'is', 'a', 'sentence']
- optionally, can specify what to split on (though this is much more rare)
>>> "this is a sentence".split("s")
['thi', ' i', ' a ', 'entence']
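A rough sketch of what a function like sentence_stats could compute with split. The real word-stats.py is not shown in these notes, so the body and the choice to return the results in a list are assumptions made for illustration.

```python
def sentence_stats(sentence):
    """Return [number of words, longest word, shortest word, average length].

    Assumes the sentence contains at least one word.
    """
    words = sentence.split()  # split on whitespace
    longest = words[0]
    shortest = words[0]
    total_length = 0
    for word in words:
        if len(word) > len(longest):
            longest = word
        if len(word) < len(shortest):
            shortest = word
        total_length = total_length + len(word)
    average = float(total_length) / len(words)
    return [len(words), longest, shortest, average]


stats = sentence_stats("this is a sentence")
print(stats)  # [4, 'sentence', 'a', 3.75]
```

The structure mirrors the scores functions: build a list, then loop over it while keeping track of a running result.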
files
- what is a file?
- a chunk of data stored on the hard disk
- why do we need files?
- hard drives persist state regardless of whether the power is on or not
- when a program is running, all the data it is generating/processing is in main memory (e.g. RAM)
- main memory is faster, but doesn't persist when the power goes off
reading files
- to read a file in Python we first need to open it
file = open("some_file_name", "r")
- open is another function that has two parameters
- the first parameter is a string identifying the filename
- be careful about the path/directory. A bare filename is looked up in the current working directory, which is typically the directory of the program (.py file) when you run it from there, unless you tell Python to look elsewhere
- the second parameter is another string telling Python what you want to do with the file
- "r" stands for "read", that is, we're going to read some data from the file
- open returns a "file" object that we can use later on for reading purposes
- above, I've saved that in a variable called "file", but I could have called it anything else
>>> open("english.txt", "r")
<open file 'english.txt', mode 'r' at 0x10120a030>
>>> type(open("english.txt", "r"))
<type 'file'>
- once we have a file open, we can read a line at a time from the file using a for loop:
for <variable> in <file_variable>:
    # do something
- for each line in the file, the loop will get run
- each time the variable will get assigned to the next line in the file
- the line will be of type string
- the line will also have an end-of-line character (newline) at the end of it which you'll often want to get rid of (the string's strip() method is often good for this)
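A minimal sketch of the read loop described above. So that it runs on its own, it first writes a small temporary file (the "w" mode and the tempfile/os calls are setup for this example only, not part of the lecture); the file name words.txt and its contents are made up.

```python
import os
import tempfile

# Setup only: create a small file so the example is self-contained.
directory = tempfile.mkdtemp()
filename = os.path.join(directory, "words.txt")
out = open(filename, "w")
out.write("apple\nbanana\ncherry\n")
out.close()

# Read the file one line at a time.
raw_lines = []
words = []
infile = open(filename, "r")
for line in infile:
    raw_lines.append(line)      # still has the newline at the end
    words.append(line.strip())  # newline removed
infile.close()

print(raw_lines[0] == "apple\n")  # True
print(words)                      # ['apple', 'banana', 'cherry']

# Clean up the temporary file and directory.
os.remove(filename)
os.rmdir(directory)
```

Closing the file when you are done with it is a good habit, even though small programs often get away without it.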
look at the file_stats function in
word-stats.py code
- what does it do?
- opens a file
- reads a line at a time
- appends each entry in the file to a list called words (stripping off the end of line)
- prints out the statistics of the word file
- in this same directory I have a file called "english.txt" that has a large list of English words
>>> file_stats("english.txt")
Number of words: 47158
Longest word: antidisestablishmentarianism
Shortest word: Hz
Avg. word length: 8.37891768099
- notice how quickly it can process through the file
- computers are fast!