CS62 - Spring 2010 - Lecture 24
destructor exercise
a few things from last time
- C++ does NOT have a root, Object class
- memory leaks in Java
- Most commonly, when an object whose lifespan is longer than that of another object still keeps a reference to it
- Circular references:
Node n1 = new Node();
Node n2 = new Node();
n1.setNext(n2);
n2.setNext(n1);
a few notes about the assignment
- make sure to pass file streams by reference:
readHelper(ifstream& in)
writeHelper(ofstream& out)
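Streams cannot be copied, so a helper that takes one by value will not compile. The following is a minimal sketch of the by-reference pattern. The one-int-per-token format and the extra vector parameter are assumptions made for illustration, not the assignment's actual file format. The helpers take istream& and ostream& so that both file streams and string streams can be passed in.

```cpp
#include <cstddef>
#include <iostream>
#include <sstream>
#include <vector>
using namespace std;

// Hypothetical helper: reads whitespace-separated ints until the stream ends.
// The stream is taken by reference because streams are not copyable.
vector<int> readHelper(istream& in) {
    vector<int> values;
    int x;
    while (in >> x) {
        values.push_back(x);
    }
    return values;
}

// Hypothetical helper: writes the ints separated by single spaces.
void writeHelper(ostream& out, const vector<int>& values) {
    for (size_t i = 0; i < values.size(); i++) {
        if (i > 0) {
            out << ' ';
        }
        out << values[i];
    }
}
```

An ifstream or ofstream can be passed to these helpers directly, since they are kinds of istream and ostream.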
- I've given you code for clear and the big 3, you just need to move it over and use it (modulo a change or two to accommodate the empty tree)
- if you're confused about height and depth, look in the notes for binary trees
- "empty" in the code I gave you refers to the empty tree. Ideally, you only have one empty tree for all required empty trees. You can just make this a public variable and use it whenever you need an empty tree (e.g. in the constructors).
- talk about BinaryTreeIO:: (and why NOT to use it :)
other announcements
- Sr. presentations today and tomorrow
- Pre-registration pizza
- TA reminder
random thing in C++ for the day :)
- arrays in C++
- arrays in C++ inherit most of their functionality (and weirdness) from C
- my advice... just don't use them... use the vector class and call it a day
- If for some reason you come across them...
- int myArray[50];
- allocates an array of ints with 50 elements in it
- the [] have to come AFTER the variable name (unlike in Java where either before or after the variable is fine)
- string myArray[50], etc.
- What is an array in Java?
- reference
- how could you check this?
- I've told you that Java always uses call-by-value
- make an array
- pass it to a method
- change an array entry in the method
- see if the change is seen outside the method
- Creating a new array does the following:
- allocates enough memory for the, say 50, objects/items
- the "sizeof" operator returns the size in bytes of an object or type, if you're curious
- an array is then shorthand for a pointer to the beginning of that chunk of memory
- when you access the array, say,
myArray[i]
this is just shorthand for:
*(myArray+i)
- add i "int"s worth to the pointer myArray (remember that adding 1 to an int pointer advances it by sizeof(int) bytes, typically 4!)
- if it were a different object, for example:
IntCell myIntCellArray[50]
then it would increment a different amount of memory, since an IntCell would be larger than an int
- give me the value referenced by this new pointer
- For example, the following are equivalent:
for( int i = 0; i < 50; i++ ){
    myArray[i] = 0;
}

int* ptr;
for( ptr = myArray; ptr < myArray+50; ptr++ ){
    *ptr = 0;
}
- other differences
- there is no .length member variable for arrays, and no way of telling how long an array is, so you need to pass along the length
- there is no bounds checking, so be careful with your array indices
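The points above about indexing, pointer arithmetic, and the missing length can be sketched as follows. The function names are made up for this example. Both functions compute the same sum, one with [] and one with an explicit pointer, and both need the length passed in because the array argument arrives as a plain pointer.

```cpp
#include <cstddef>

// Sums the first `length` ints starting at `arr`.
// An array argument decays to a pointer, so the length must be passed along.
int sumWithIndex(const int* arr, std::size_t length) {
    int total = 0;
    for (std::size_t i = 0; i < length; i++) {
        total += arr[i];   // same as *(arr + i)
    }
    return total;
}

// Same sum, written with explicit pointer arithmetic.
int sumWithPointer(const int* arr, std::size_t length) {
    int total = 0;
    for (const int* ptr = arr; ptr < arr + length; ptr++) {
        total += *ptr;     // no bounds checking: length had better be right
    }
    return total;
}
```

In the scope where the array is declared (for example int a[5]), sizeof(a) / sizeof(a[0]) still gives the element count; inside the functions above that information is gone.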
- iterators
- what methods does an Iterator have in Java?
- next()
- hasNext()
- remove() // optional
- how are they used?
- to traverse through a data set
- In C++, iterators are implemented in a fashion similar to pointers
- unlike Java, each class has its own iterator type
- vector<int>::iterator
- map<int, int>::iterator
- map<int, list<pair<int,int> > >::iterator
- Look at the vector_iterator() method in the iterator.cpp code
- we start out at the beginning of our elements with the begin() method
- returns an iterator at the beginning of the data to iterate through
- we can access the elements we're iterating through via the iterator variable
- notice that we're given a pointer to that object, so we can actually modify the object
- "it" is like a pointer, though, so you need to dereference it or use -> if you want to call methods
- incrementing the iterator is just like incrementing a pointer
- it++
- the end() method also returns an iterator that is past the end of the data
- we use this to see if we've iterated through all of the data
- just to make it clear, iterators are NOT pointers, but by using operator overloading, they're made to function like pointers
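The begin()/end()/dereference pattern described above can be sketched as follows. This is not the code from iterator.cpp; the function names are invented for illustration. The first function modifies elements through the iterator, and the second calls a method on each element with ->.

```cpp
#include <cstddef>
#include <string>
#include <vector>
using namespace std;

// Doubles every element in place by writing through the iterator.
void doubleAll(vector<int>& v) {
    for (vector<int>::iterator it = v.begin(); it != v.end(); it++) {
        *it = *it * 2;   // dereference, just like a pointer
    }
}

// Adds up the lengths of all the strings, calling size() through ->.
size_t totalLength(vector<string>& words) {
    size_t total = 0;
    for (vector<string>::iterator it = words.begin(); it != words.end(); it++) {
        total += it->size();
    }
    return total;
}
```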
- What does the map_iterator() method do in the iterator.cpp code?
- first, we create a map object
- the key is an int
- the value is a pair of ints
- next, we add things to that object
- the key is i
- the value is (i, i)
- finally, we make an iterator and traverse the data
- again, each different type has a different iterator type
- the iterator for a map returns a pointer to a key/value pair
- it->first is the key
- it->second is the value (which is itself a pair)
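A sketch of the three steps above (create the map, fill it with key i and value (i, i), then traverse it). This is a reconstruction from the description, not the actual map_iterator() code, and the function names are assumptions.

```cpp
#include <map>
#include <utility>
using namespace std;

// Builds a map whose key is i and whose value is the pair (i, i), for i in [0, n).
map<int, pair<int, int> > buildMap(int n) {
    map<int, pair<int, int> > m;
    for (int i = 0; i < n; i++) {
        m[i] = make_pair(i, i);
    }
    return m;
}

// Traverses the map and adds up each key plus both halves of its value.
int sumEverything(map<int, pair<int, int> >& m) {
    int total = 0;
    for (map<int, pair<int, int> >::iterator it = m.begin(); it != m.end(); it++) {
        total += it->first;           // the key
        total += it->second.first;    // first int of the value pair
        total += it->second.second;   // second int of the value pair
    }
    return total;
}
```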
- look at the map_iterator(const ...) method in the iterator.cpp code
- often we want to pass objects by constant references
- in this case, we can't use a normal iterator!
- a normal iterator would allow us to modify the values
- instead, we use a const_iterator
- almost all classes that have iterators also implement const_iterator
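A small sketch of iterating over a map passed by constant reference. The function name and what it computes are made up for this example. The point is the iterator type: on a const map, begin() and end() hand back const_iterators.

```cpp
#include <map>
using namespace std;

// Counts the entries whose value is larger than their key.
// m is a const reference, so we must use a const_iterator;
// declaring `it` as map<int, int>::iterator here would not compile.
int countValueAboveKey(const map<int, int>& m) {
    int matches = 0;
    for (map<int, int>::const_iterator it = m.begin(); it != m.end(); it++) {
        if (it->second > it->first) {
            matches++;
        }
        // it->second = 0;   // would be a compile error: the value is read-only here
    }
    return matches;
}
```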
graphs, a quick recap
- A graph is a set of vertices (or nodes) V and a set of edges (u,v) in E where u, v are in V
- a path is a list of vertices p_1, p_2, ..., p_k where there exists an edge (p_i, p_{i+1}) in E for each i from 1 to k-1
- a "simple" path is a path where all edges are unique
representing graphs
- so far, we've drawn them on the board fine, but how are we going to store them for processing?
- adjacency list
- each vertex u in V keeps a linked list of all the vertices v such that there is an edge (u, v) in E, i.e. an edge from u to v
A: B->D
B: A->D
C: D
D: A->B->C->E
E: D
- adjacency matrix
- a |V| by |V| matrix A, such that A_ij is 1 if edge (i, j) is in E, 0 otherwise
  A B C D E
A 0 1 0 1 0
B 1 0 0 1 0
C 0 0 0 1 0
D 1 1 1 0 1
E 0 0 0 1 0
- what will this matrix look like if the graph is undirected?
- it will be symmetric
- examples:
- draw the following graphs
--
A: B C
B: A C
C: A B
--
A: D
B: D E
C: D B
D: A B C D E
E: B C
--
  A B C
A 1 0 0
B 0 0 1
C 0 1 0
- how would we incorporate weights into both of these approaches?
- adjacency list: just keep that additional piece of information in the linked list
- adjacency matrix: store that value in the matrix (instead of just a 0 or a 1)
- What are the benefits/drawbacks of each approach and when might each be useful?
- adjacency list
- good for sparse graphs
- more space efficient (for sparse graphs)
- must traverse the adjacency list to discover if an edge exists
- adjacency matrix
- good for dense graphs
- constant time lookup to discover if an edge exists
- for non-weighted graphs, only requires a boolean matrix
- Can we get the best of both worlds (constant lookup, good sparse representation)?
- sparse adjacency matrix
- rather than storing adjacent vertices as a linked list, store as a hashtable
- benefits/drawbacks?
- constant time lookup
- fairly space efficient (though some overhead with keeping the table)
- not good for dense graphs
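The three representations can be compared side by side with a sketch like the following. It uses the five-vertex example graph from above, with A..E numbered 0..4 (that numbering, the type names, and the function names are assumptions of this sketch). unordered_set requires C++11.

```cpp
#include <list>
#include <unordered_set>
#include <vector>
using namespace std;

// Undirected edges of the example graph: A-B, A-D, B-D, C-D, D-E.
const int NUM_VERTICES = 5;
const int NUM_EDGES = 5;
const int EDGES[NUM_EDGES][2] = {{0, 1}, {0, 3}, {1, 3}, {2, 3}, {3, 4}};

typedef vector<list<int> > AdjList;
typedef vector<vector<bool> > AdjMatrix;
typedef vector<unordered_set<int> > AdjHash;

AdjList buildList() {
    AdjList g(NUM_VERTICES);
    for (int e = 0; e < NUM_EDGES; e++) {
        g[EDGES[e][0]].push_back(EDGES[e][1]);
        g[EDGES[e][1]].push_back(EDGES[e][0]);
    }
    return g;
}

AdjMatrix buildMatrix() {
    AdjMatrix g(NUM_VERTICES, vector<bool>(NUM_VERTICES, false));
    for (int e = 0; e < NUM_EDGES; e++) {
        g[EDGES[e][0]][EDGES[e][1]] = true;
        g[EDGES[e][1]][EDGES[e][0]] = true;
    }
    return g;
}

AdjHash buildHash() {
    AdjHash g(NUM_VERTICES);
    for (int e = 0; e < NUM_EDGES; e++) {
        g[EDGES[e][0]].insert(EDGES[e][1]);
        g[EDGES[e][1]].insert(EDGES[e][0]);
    }
    return g;
}

// Adjacency list: must walk u's list to find v.
bool hasEdge(const AdjList& g, int u, int v) {
    for (list<int>::const_iterator it = g[u].begin(); it != g[u].end(); it++) {
        if (*it == v) {
            return true;
        }
    }
    return false;
}

// Adjacency matrix: a single constant-time lookup.
bool hasEdge(const AdjMatrix& g, int u, int v) {
    return g[u][v];
}

// Hashed neighbor sets: expected constant-time lookup, space proportional to the edges.
bool hasEdge(const AdjHash& g, int u, int v) {
    return g[u].count(v) > 0;
}
```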
finding cycles
- given a connected graph, how can we determine if it has a cycle in it?
- or, given a connected graph, determine that it is not a tree
- what is the definition of a cycle?
- a simple path, where the endpoints are the same
- idea:
- start at a node, go down a path
- stop when either we find a vertex on the path that we've already seen
- or when we hit a dead-end
- if we hit a dead-end, backtrack and find another path
- if we visit all of the nodes, without finding a repeat vertex, it's acyclic
- does this sound like anything we've seen before?
- depth first search!
void dfs(vertex u, visited) {
    if(!visited(u)){
        visited.add(u);
        for (v: neighbors of u){
            if (!visited(v)){
                dfs(v, visited);
            }
        }
    }
}
- what modifications need to be made?
- if we visit a node that we've already visited, then we've found a cycle
- what about where we just came from?
- need to know where we came from so we can avoid calling that a cycle
- want to return true if we find a cycle, false otherwise
bool dfsCycle(vertex u, vertex parent, visited) {
    bool result = false;
    visited.add(u);
    for(v: neighbors of u){
        if(!visited(v)){
            result = result || dfsCycle(v, u, visited);
        } else if (v != parent){
            result = true;
        }
    }
    return result;
}
- observations:
- what does it do?
- runs depth first search
- if it finds a visited node that was not its parent (i.e. a cycle) returns true
- otherwise, false
- how is this different from DFS that we saw before?
- we have the additional else if to see if we've found a cycle
- why do we need the parent as a parameter?
- so we can distinguish finding a visited node in a cycle vs. a visited node where we just came from
- what does "result = result || dfsCycle(...)" do?
- keeps result true once any recursive call has found a cycle
- so the final result is true if we find a cycle anywhere
- walk through an example
- let's try and actually implement our boolean cycle detector
- how can we represent a vertex?
- simplest is just use a number, i.e. an int
- we'll use an adjacency list representation
- in C++ there is a "list" class in the STL
- we have a few options for declaring the graph type:
- what if we wanted to use a vector to store the vertices?
- vector<list<int> > graph
- what is one downside to this approach?
- assumes the vertices are sequential, that is 0, 1, 2, ...
- what is another option?
- map<int, list<int> >
- what if we wanted to add weights?
- map<int, list<pair<int, int> > >
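A sketch of how the weighted type above might be used. The typedef name, the helper functions, and the -1 "no edge" return value are all assumptions of this example (the sentinel presumes non-negative weights). Because the graph is a map, the vertex numbers do not need to be sequential.

```cpp
#include <list>
#include <map>
#include <utility>
using namespace std;

// Weighted adjacency list: vertex -> list of (neighbor, weight) pairs.
typedef map<int, list<pair<int, int> > > WeightedGraph;

// Adds an undirected edge u-v with the given weight.
void addEdge(WeightedGraph& g, int u, int v, int weight) {
    g[u].push_back(make_pair(v, weight));
    g[v].push_back(make_pair(u, weight));
}

// Returns the weight of edge u-v, or -1 if there is no such edge.
int edgeWeight(const WeightedGraph& g, int u, int v) {
    WeightedGraph::const_iterator entry = g.find(u);
    if (entry == g.end()) {
        return -1;   // u is not in the graph at all
    }
    const list<pair<int, int> >& nbrs = entry->second;
    for (list<pair<int, int> >::const_iterator it = nbrs.begin(); it != nbrs.end(); it++) {
        if (it->first == v) {
            return it->second;
        }
    }
    return -1;
}
```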
- look at dfs_hasCycles in the graph_algorithms.cpp code
- what does "list<int> nbrList = adjMap.find(v)->second" do?
- gets the adjacency list associated with vertex v
- why the "->second"?
- recall, the map iterator returns a pair
- "->first" would give us the key (in this case, just v)
- why can't we write "list<int> nbrList = adjMap[v]"?
- the operator[] is not a const method
- it can be used to change the map
adjMap[v] = ...
- use an iterator to iterate through the neighbor list
- what does "visited.find(v) != visited.end()" do?
- checks to see if v is in the visited set
- another option:
visited.count(v) > 0
- note the recursive call to dfs_hasCycles
- look at grop_hasCycles in the graph_algorithms.cpp code
- takes a graph
- why passed by reference?
- to avoid copying
- the "set" class is useful for keeping track of which nodes we've visited
- why do we have the for loop?
- graphs may not be connected!
- still could have a cycle though
- need to make sure we've visited all possible sections of the graph when looking for cycles
- notice the use of a const_iterator (and the type of that iterator)
- you will be implementing a version of this where you actually return the cycle
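Putting the pieces above together, a cycle detector along these lines might look like the following sketch. This is not the code from graph_algorithms.cpp; the function names, the Graph typedef, and the use of -1 as "no parent" (which assumes vertex ids are non-negative) are assumptions. It also assumes a simple undirected graph with no duplicate edges, where every edge appears in both endpoints' lists.

```cpp
#include <list>
#include <map>
#include <set>
using namespace std;

typedef map<int, list<int> > Graph;

// Adds an undirected edge by recording it in both adjacency lists.
void addUndirectedEdge(Graph& g, int u, int v) {
    g[u].push_back(v);
    g[v].push_back(u);
}

// DFS from u; returns true if we reach an already-visited vertex
// that is not the vertex we just came from.
bool dfsHasCycle(const Graph& adjMap, int u, int parent, set<int>& visited) {
    visited.insert(u);
    // find() instead of operator[], since adjMap is const
    Graph::const_iterator entry = adjMap.find(u);
    if (entry == adjMap.end()) {
        return false;   // vertex has no adjacency list
    }
    const list<int>& nbrList = entry->second;
    for (list<int>::const_iterator it = nbrList.begin(); it != nbrList.end(); it++) {
        int v = *it;
        if (visited.find(v) == visited.end()) {
            if (dfsHasCycle(adjMap, v, u, visited)) {
                return true;
            }
        } else if (v != parent) {
            return true;
        }
    }
    return false;
}

// Starts a DFS from every unvisited vertex, since the graph may not be connected.
bool graphHasCycle(const Graph& adjMap) {
    set<int> visited;
    for (Graph::const_iterator it = adjMap.begin(); it != adjMap.end(); it++) {
        if (visited.count(it->first) == 0) {
            if (dfsHasCycle(adjMap, it->first, -1, visited)) {
                return true;
            }
        }
    }
    return false;
}
```

Unlike the pseudocode, this version returns as soon as a cycle is found rather than accumulating a result; the answer is the same.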
- running time
- how many times do we call dfsCycle on each vertex?
- exactly once for a connected graph
- the first thing we do is set visited to true for that vertex
- and will never revisit a visited vertex
- what is the cost of each call to dfsCycle?
- depends on the representation
- adjacency matrix:
- we need to traverse all V entries to get the neighbors
- O(|V|^2) overall
- adjacency list:
- a little trickier
- how many times do we process each edge?
- once from each endpoint (so a constant number of times)
- O(|V| + |E|), which for a connected graph is O(|E|)
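The adjacency-list bound can be written out as a sum. This assumes an undirected graph stored so that each edge shows up in the neighbor lists of both of its endpoints, and writes deg(u) for the number of neighbors of u. Each call does constant work plus one step per neighbor:

```latex
\sum_{u \in V} \bigl(1 + \deg(u)\bigr)
  = |V| + \sum_{u \in V} \deg(u)
  = |V| + 2|E|
  = O(|V| + |E|)
```

For a connected graph |E| >= |V| - 1, which is why the bound simplifies to O(|E|).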