Lists
Notation "x :: y" := (cons x y)
(at level 60, right associativity).
Notation "[ ]" := nil.
Notation "[ x ; .. ; y ]" := (cons x .. (cons y []) ..).
Notation "x ++ y" := (app x y)
(at level 60, right associativity).
(at level 60, right associativity).
Notation "[ ]" := nil.
Notation "[ x ; .. ; y ]" := (cons x .. (cons y []) ..).
Notation "x ++ y" := (app x y)
(at level 60, right associativity).
Now lists can be written concisely:
Definition list123 := @cons nat 1 (@cons nat 2 (@cons nat 3 (@nil nat))).
Definition list123' := @cons _ 1 (@cons _ 2 (@cons _ 3 (@nil _))).
Definition list123'' := cons 1 (cons 2 (cons 3 nil)).
Definition list123' := @cons _ 1 (@cons _ 2 (@cons _ 3 (@nil _))).
Definition list123'' := cons 1 (cons 2 (cons 3 nil)).
All that's fine. But check this out.
Such a notation is very common for lists and arrays to use this
"concrete, element-wise" notation---a wide variety of programming
languages use it.
The general idea is that:
[ v1; v2; ...; vn-1; vn ]
is the same as
cons v1 (cons v2 (cons ... (cons vn-1 (cons vn nil))))
It's worth noting: when using concrete notation, you only write
the values, never the tail of the list.
But it's easy to get confused! The following quizzes are meant to
hone your understanding.
[ v1; v2; ...; vn-1; vn ]
cons v1 (cons v2 (cons ... (cons vn-1 (cons vn nil))))
Case study: DNA strands and amino acids
We'll redefine eq_base here, to make sure we're all on the same
page. Note how we match on the pair to only specify the true
cases, using _ as a "catch-all" or "default" for the remaining
false cases.
Definition eq_base (b1 b2 : base) : bool :=
match (b1, b2) with
| (C, C) ⇒ true
| (G, G) ⇒ true
| (A, A) ⇒ true
| (T, T) ⇒ true
| (_, _) ⇒ false
end.
match (b1, b2) with
| (C, C) ⇒ true
| (G, G) ⇒ true
| (A, A) ⇒ true
| (T, T) ⇒ true
| (_, _) ⇒ false
end.
Now we can define predicates for strand equality (eq_strand) and
complementarity (complementary).
Fixpoint eq_strand (dna1 dna2 : strand) : bool :=
match (dna1, dna2) with
| ([], []) ⇒ true
| ([], _) ⇒ false
| (_, []) ⇒ false
| (b1 :: dna1', b2 :: dna2') ⇒
eq_base b1 b2 && eq_strand dna1' dna2'
end.
Fixpoint complementary (dna1 dna2 : strand) : bool :=
match (dna1, dna2) with
| ([], []) ⇒ true
| ([], _) ⇒ false
| (_, []) ⇒ false
| (b1 :: dna1', b2 :: dna2') ⇒
eq_base b1 (complement b2) && complementary dna1' dna2'
end.
match (dna1, dna2) with
| ([], []) ⇒ true
| ([], _) ⇒ false
| (_, []) ⇒ false
| (b1 :: dna1', b2 :: dna2') ⇒
eq_base b1 b2 && eq_strand dna1' dna2'
end.
Fixpoint complementary (dna1 dna2 : strand) : bool :=
match (dna1, dna2) with
| ([], []) ⇒ true
| ([], _) ⇒ false
| (_, []) ⇒ false
| (b1 :: dna1', b2 :: dna2') ⇒
eq_base b1 (complement b2) && complementary dna1' dna2'
end.
DNA (modeled as base) encodes, among other things, amino
acids. Three DNA nucleotides encode a single codon: either an
amino acid (like tryptophan (Trp) or glutamine (Glu)) or a
sequence marker like STOP. These sequence of amino acids are
formed into a protein by the ribosome.
Inductive codon : Type :=
| Ala
| Cys
| Asp
| Glu
| Phe
| Gly
| His
| Ile
| Lys
| Leu
| Met
| Asn
| Pro
| Gln
| Arg
| Ser
| Thr
| Val
| Trp
| Tyr
| STOP.
| Ala
| Cys
| Asp
| Glu
| Phe
| Gly
| His
| Ile
| Lys
| Leu
| Met
| Asn
| Pro
| Gln
| Arg
| Ser
| Thr
| Val
| Trp
| Tyr
| STOP.
The truth is that the ribosome doesn't work on DNA, it works on
RNA, which uses slightly different bases, replacing thymine (T)
with uracil (U). To simplify things, we'll just work with
DNA.
Using the DNA codon table from
Wikipedia https://en.wikipedia.org/wiki/DNA_codon_table),
let's implement a function that takes three nucleotides and returns the
corresponding codon.
Programming is often tedious and repetitive. It pays to be careful
and methodical when writing this long function. But... there are
definitely places you can save pattern matches!
Definition encode_one (b1 b2 b3 : base) : codon :=
match b1 with
| T ⇒
match b2 with
| T ⇒ match b3 with
| T ⇒ Phe
| C ⇒ Phe
| A ⇒ Leu
| G ⇒ Leu
end
| C ⇒ Ser
| A ⇒ match b3 with
| T ⇒ Tyr
| C ⇒ Tyr
| A ⇒ STOP (* Ochre *)
| G ⇒ STOP (* Amber *)
end
| G ⇒ match b3 with
| T ⇒ Cys
| C ⇒ Cys
| A ⇒ STOP (* Opal *)
| G ⇒ Trp
end
end
| C ⇒
match b2 with
| T ⇒ Leu
| C ⇒ Pro
| A ⇒ match b3 with
| T ⇒ His
| C ⇒ His
| A ⇒ Gln
| G ⇒ Gln
end
| G ⇒ Arg
end
| A ⇒
match b2 with
| T ⇒ match b3 with
| T ⇒ Ile
| C ⇒ Ile
| A ⇒ Ile
| G ⇒ Met
end
| C ⇒ Thr
| A ⇒ match b3 with
| T ⇒ Asn
| C ⇒ Asn
| A ⇒ Lys
| G ⇒ Lys
end
| G ⇒ match b3 with
| T ⇒ Ser
| C ⇒ Ser
| A ⇒ Arg
| G ⇒ Arg
end
end
| G ⇒
match b2 with
| T ⇒ Val
| C ⇒ Ala
| A ⇒ match b3 with
| T ⇒ Asp
| C ⇒ Asp
| A ⇒ Glu
| G ⇒ Glu
end
| G ⇒ Gly
end
end.
Compute (encode_one T A G). (* ===> STOP *)
Compute (encode_one G A T). (* ===> Asp *)
Compute (encode_one A T G). (* ===> Met *)
match b1 with
| T ⇒
match b2 with
| T ⇒ match b3 with
| T ⇒ Phe
| C ⇒ Phe
| A ⇒ Leu
| G ⇒ Leu
end
| C ⇒ Ser
| A ⇒ match b3 with
| T ⇒ Tyr
| C ⇒ Tyr
| A ⇒ STOP (* Ochre *)
| G ⇒ STOP (* Amber *)
end
| G ⇒ match b3 with
| T ⇒ Cys
| C ⇒ Cys
| A ⇒ STOP (* Opal *)
| G ⇒ Trp
end
end
| C ⇒
match b2 with
| T ⇒ Leu
| C ⇒ Pro
| A ⇒ match b3 with
| T ⇒ His
| C ⇒ His
| A ⇒ Gln
| G ⇒ Gln
end
| G ⇒ Arg
end
| A ⇒
match b2 with
| T ⇒ match b3 with
| T ⇒ Ile
| C ⇒ Ile
| A ⇒ Ile
| G ⇒ Met
end
| C ⇒ Thr
| A ⇒ match b3 with
| T ⇒ Asn
| C ⇒ Asn
| A ⇒ Lys
| G ⇒ Lys
end
| G ⇒ match b3 with
| T ⇒ Ser
| C ⇒ Ser
| A ⇒ Arg
| G ⇒ Arg
end
end
| G ⇒
match b2 with
| T ⇒ Val
| C ⇒ Ala
| A ⇒ match b3 with
| T ⇒ Asp
| C ⇒ Asp
| A ⇒ Glu
| G ⇒ Glu
end
| G ⇒ Gly
end
end.
Compute (encode_one T A G). (* ===> STOP *)
Compute (encode_one G A T). (* ===> Asp *)
Compute (encode_one A T G). (* ===> Met *)
Exercise: 2 stars, standard (encode)
Write a function that tries to translate a DNA strand into a list of codons. Since encode_one needs three bases to work, we'll run into a problem if we don't have the right number of nucleotides in the list. If you end up with a funny number of nucleotides (i.e., not a multiple of 3), you should return None.
Fixpoint encode (dna : strand) : option (list codon) (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
Compute (encode [A]). (* ==> None *)
Compute (encode [A; T; G; C; G; T; T; A; T; T; A; G]). (* ==> Some Met; Arg; Tyr; STOP *)
☐
Compute (encode [A]). (* ==> None *)
Compute (encode [A; T; G; C; G; T; T; A; T; T; A; G]). (* ==> Some Met; Arg; Tyr; STOP *)
☐
Combine and split
Fixpoint combine {X Y : Type} (lx : list X) (ly : list Y)
: list (X×Y) :=
match lx, ly with
| [], _ ⇒ []
| _, [] ⇒ []
| x :: tx, y :: ty ⇒ (x, y) :: (combine tx ty)
end.
: list (X×Y) :=
match lx, ly with
| [], _ ⇒ []
| _, [] ⇒ []
| x :: tx, y :: ty ⇒ (x, y) :: (combine tx ty)
end.
Try answering the following questions on paper and
checking your answers in Coq:
Fill in the definition of split below. Make sure it passes the
given unit test.
- What is the type of combine (i.e., what does Check @combine print?)
- What does
Compute (combine [1;2] [false;false;true;true]). print?
Exercise: 2 stars, standard (split)
The function split is the right inverse of combine: it takes a list of pairs and returns a pair of lists. In many functional languages, it is called unzip.
Fixpoint split {X Y : Type} (l : list (X×Y))
: (list X) × (list Y)
(* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
Compute split [(1,false);(2,false)]. (* ==> (1;2,false;false) *)
☐
: (list X) × (list Y)
(* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.
Compute split [(1,false);(2,false)]. (* ==> (1;2,false;false) *)
☐
Higher-order predicates: forallb and existsb
Exercise: 3 stars, standard (forall_exists)
Define two recursive Fixpoints, forallb and existsb. The first checks whether every element in a list satisfies a given predicate:forallb oddb [1;3;5;7;9] = true
forallb negb [false;false] = true
forallb evenb [0;2;4;5] = false
forallb (eqb 5) [] = true The second checks whether there exists an element in the list that satisfies a given predicate:
existsb (eqb 5) [0;2;3;6] = false
existsb (andb true) [true;true;false] = true
existsb oddb [1;0;0;0;0;3] = true
existsb evenb [] = false
(* FILL IN HERE *)
(* Do not modify the following line: *)
Definition manual_grade_for_forall_exists : option (nat×string) := None.
☐
(* Do not modify the following line: *)
Definition manual_grade_for_forall_exists : option (nat×string) := None.
☐
(* 2022-01-12 10:20 *)