Day08_sets
Sets
- A set is a collection of zero or more distinct elements.
- forall x, x is not in the empty set
- ∀x, x ∉ ∅
- A is a subset of B precisely when forall x, if x is in A, then x is in B
- A ⊆ B iff ∀x, x ∈ A ⇒ x ∈ B
- The union of sets A and B is a set C such that forall x, x is an element of C if and only if x is an element f A or an element of B.
- A ∪ B = C iff ∀x, x ∈ C ⇔ x ∈ A ∨ x ∈ B
The elements of a set should be distinct. Lists have no
constraints on their elements, so we need to be careful! Just like
we did for BSTs, we'll need an invariant.
To start with, we'll say that a 'well formed' natset won't have
any duplicates. Here are some examples.
The empty set is just the empty list.
Here are two different ways to write down the set of even naturals
less than or equal to 10.
Notice that these two concrete representations are different,
but both of these definitions represent the same set.
Definition evens_to_ten : natset := [0; 2; 4; 6; 8; 10].
Definition evens_to_ten' : natset := [2; 4; 0; 10; 8; 6].
Definition evens_to_ten' : natset := [2; 4; 0; 10; 8; 6].
Here's a list that isn't a set: the number 2 appears twice,
breaking the "distinct" part.
Coq doesn't know anything about our invariant. Just as we could
write a bt nat that didn't have the search property, we can
write something of type natset that isn't 'setlike'!
It's quite common to have our problem: when you define a datatype,
the language will let you freely combine all of the
constructors. But you might not want to allow every such
combination! It can be very hard to rule out bad combinations in
advance, so you instead opt to be very careful.
What does it mean to 'be very careful' about a type's invariant?
The general idea is that the invariant should be guaranteed by
the functions that construct your type and preserved by all of
the functions that manipulate it.
Set operations
Fixpoint member (x : nat) (l : natset) : bool :=
match l with
| [] ⇒ false
| y::l' ⇒ eqb x y || member x l'
end.
Compute (member 1 empty_natset). (* ===> false *)
Compute (member 1 evens_to_ten). (* ===> false *)
Compute (member 2 evens_to_ten). (* ===> true *)
Compute (member 2 evens_to_ten'). (* ===> true *)
Compute (member 2 natset_yikes). (* ===> true *)
match l with
| [] ⇒ false
| y::l' ⇒ eqb x y || member x l'
end.
Compute (member 1 empty_natset). (* ===> false *)
Compute (member 1 evens_to_ten). (* ===> false *)
Compute (member 2 evens_to_ten). (* ===> true *)
Compute (member 2 evens_to_ten'). (* ===> true *)
Compute (member 2 natset_yikes). (* ===> true *)
Exercise: 2 stars, standard (is_setlike)
Write a function is_setlike that determines whether or not a list of natural numbers is 'setlike', i.e., has no duplicates.
Fixpoint is_setlike (l : natset) : bool (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
☐
☐
Here's union, which combines two sets so that elements of either
consituent set are members of the their union.
Fixpoint union (l1 : natset) (l2 : natset) : natset :=
match l1 with
| [] ⇒ l2
| (x::l1') ⇒ set_insert x (union l1' l2)
end.
match l1 with
| [] ⇒ l2
| (x::l1') ⇒ set_insert x (union l1' l2)
end.
Exercise: 3 stars, standard (intersection)
Define intersection. The intersection of sets A and B, written A ∩ B, is defined as containing those elements that are in both A and B.
Fixpoint intersection (l1 l2 : natset) : natset
(* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
☐
(* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
☐
Compute (intersection evens_to_ten [0;1;2;3;4]). (* ===> 0; 2; 4 *)
Compute (intersection evens_to_ten [47;29;13]). (* ===> *)
Exercise: 3 stars, standard (subset)
Now define the subset predicate. We say the set A is a subset of a set B when everything that's a member of A is a member of B.
Fixpoint subset (l1 l2 : natset) : bool
(* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
☐
(* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
☐
Exercise: 2 stars, standard (set_equal)
Define a function set_equal that takes two sets and returns true when two sets are equal. Note that their concrete representations may not be equal! You may assume that the inputs are setlike, i.e., satisfy the invariant.
(* FILL IN HERE *)
(* Do not modify the following line: *)
Definition manual_grade_for_set_equal : option (nat×string) := None.
☐
(* Do not modify the following line: *)
Definition manual_grade_for_set_equal : option (nat×string) := None.
☐
(*
Compute (set_equal ). (* ===> true *)
Compute (set_equal 0 1). (* ===> false *)
Compute (set_equal 1;0 0;1). (* ===> true *)
*)
Compute (set_equal ). (* ===> true *)
Compute (set_equal 0 1). (* ===> false *)
Compute (set_equal 1;0 0;1). (* ===> true *)
*)
The invariant we used above was "no duplicates". But we could use
a stronger invariant: keep the list sorted ascending.
Stronger invariants typically come with tradeoffs: they can be
harder to enforce and more expensive to check, but they often
allow for more efficient (or simpler) code.
To test our current invariant, we basically run is_sorted, but
we forbid duplicates.
Note that we're reusing the name `is_setlike`, but we're inside of
the `SortedSets` module, so there's no risk of Coq getting
confused.
Fixpoint is_setlike (l : list nat) : bool :=
match l with
| [] ⇒ true
| [_] ⇒ true
| x::(y::_) as l ⇒
match compare x y with
| LT ⇒ is_setlike l
| EQ | GT ⇒ false
end
end.
match l with
| [] ⇒ true
| [_] ⇒ true
| x::(y::_) as l ⇒
match compare x y with
| LT ⇒ is_setlike l
| EQ | GT ⇒ false
end
end.
Exercise: 2 stars, standard (member)
Define member for the 'sorted list' invariant, i.e., assume that your input set is sorted ascending. (You can return a wrong answer if it isn't.)
Fixpoint member (x : nat) (l: natset) : bool (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
☐
☐
Fail Fixpoint union (l1 l2 : natset) : natset :=
match (l1, l2) with
| ([], _) ⇒ l2
| (_, []) ⇒ l1
| (x1::l1', x2::l2') ⇒
match compare x1 x2 with
| LT ⇒ x1 :: union l1' (x2::l2')
| EQ ⇒ x1 :: union l1' l2'
| GT ⇒ x2 :: union (x1::l1') l2'
end
end.
(* ===> Error: Cannot guess decreasing argument of fix. *)
match (l1, l2) with
| ([], _) ⇒ l2
| (_, []) ⇒ l1
| (x1::l1', x2::l2') ⇒
match compare x1 x2 with
| LT ⇒ x1 :: union l1' (x2::l2')
| EQ ⇒ x1 :: union l1' l2'
| GT ⇒ x2 :: union (x1::l1') l2'
end
end.
(* ===> Error: Cannot guess decreasing argument of fix. *)
Huh? Why didn't Coq like our function? Coq only allows terminating
functions, and union definitely terminates: one of the two lists
shrinks every time and neither ever grows! What gives?
Coq's notion of termination is pretty simple: it wants to pick a
single argument that decreases every time. It's a little too
simple, but we have a trick up our sleeve: let's just define an
extra, inner function. We'll call union normally when l1
shrinks and we'll call inner when l1 stays the same and only
l2 shrinks.
Coq will see that union is decreasing in l1 and inner is
decreasing in l2. Perfect!
Fixpoint union (l1 l2 : natset) : natset :=
let fix inner l2 :=
match (l1, l2) with
| ([], _) ⇒ l2
| (_, []) ⇒ l1
| (x1::l1', x2::l2') ⇒
match compare x1 x2 with
| LT ⇒ x1 :: union l1' (x2::l2')
| EQ ⇒ x1 :: union l1' l2'
| GT ⇒ x2 :: inner l2'
end
end
in
inner l2.
More conventional programming languages don't make you jump
through these hoops, we know. It is too bad.
Make it efficient, walking through both lists at once.
If you're not sure what to do, think carefully about
sortedness. If you're walking through l1 and l2 and find
elements x1 and x2, what relationships are possible between
x1 and x2 to have l1 be a subset of l2? Can x1 be less
then to x2? Equal? Greater than?
Exercise: 4 stars, standard (subset)
Write subset for sorted-list natsets.
Fixpoint subset (l1 l2 : natset) : bool (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
☐
☐
BSTs (and trees in general) make very good implementations of
sets.
Exercise: 2 stars, standard (bst_set_definitions)
What would empty_natset be? You can reuse names from Day07_trees directly.
What about set_insert?
Definition set_insert : nat → natset → natset (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
(* Do not modify the following line: *)
Definition manual_grade_for_bst_set_definitions : option (nat×string) := None.
☐
(* Do not modify the following line: *)
Definition manual_grade_for_bst_set_definitions : option (nat×string) := None.
☐