Lecture 9 — 2020-02-18
Parsing with Applicative
This lecture is written in literate Haskell; you can download the raw source.
We wrote up a straightforward instance for Maybe and a more interesting instance for Either e:
instance Applicative (Either e) where
pure x = Right x -- because Left x would be ill typed!
(Right f) <*> (Right v) = Right $ f v
err@(Left e) <*> _ = err
_ <*> err@(Left e) = errThen we went over the Applicative definitions for lists. There were two possibilities: cartesian product…
instance Applicative [] where
pure x = [x]
[] <*> _ = []
_ <*> [] = []
(f:fs) <*> xs = map f xs ++ fs <*> xs…and zipping:
newtype ZipList a = ZipList { getZipList :: [a] }
deriving (Eq, Show, Functor)
instance Applicative ZipList where
pure = ZipList . repeat
ZipList fs <*> ZipList xs = ZipList (zipWith ($) fs xs)
Readers
We define an instance for readers, too… a sort of prelude to defining our parsers.
instance Applicative ((->) r) where
pure v r = v
frab <*> fra = \r -> frab r (fra r)
We were able to use this instance to construct functions quickly and easily, like:
eogth = (&&) <$> even <*> (>100)
aos = (||) <$> isAlpha <*> isSpace
The first function returns true on numbers that are even and greater than one hundred; the latter returns true for characters that are alphabetical or whitespace.
Obey the laws
Like Functor, the Applicative type class is governed by laws.
Identity: pure id <*> v = v Composition: pure (.) <*> u <*> v <*> w = u <*> (v <*> w) Homomorphism: pure f <*> pure x = pure (f x) Interchange: u <*> pure y = pure ($ y) <*> u
Note that identity is a generalization of id <$> v = v from Functor, sincef <$> x = pure f <*> x`.
We defined “classic style” parsers in terms of a lexer (String -> [Token]) and a parser ([Token] -> AST), but we spent most of class looking at an alternative model: Applicative parsing.
import Data.Char
import Control.Applicative
newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
letter :: Char -> Parser Char
letter c = Parser $ \s ->
case s of
c':s' | c == c' -> Just (c,s')
_ -> Nothing
letters :: String -> Parser String
letters str = Parser $ \s ->
if take (length str) s == str
then Just (str,drop (length str) s)
else NothingEG parse (letter 'c') "chocolate"
instance Functor Parser where
fmap f p = Parser $ \s ->
case parse p s of
Nothing -> Nothing
Just (v,s') -> Just (f v,s')Notice how, if you squint, you can see that this Functor instance of Parser is a combination of the instances for Maybe and for readers:
instance Functor Maybe where
fmap f Nothing = Nothing
fmap f (Just v) = Just (f v)
instance Functor ((->) r) where
-- (a -> b) -> (r -> a) -> (r -> b)
fmap f g x = f (g x)instance Applicative Parser where
pure a = Parser $ \s -> Just (a,s)
f <*> a = Parser $ \s -> -- f :: Parser (a -> b), a :: Parser a
case parse f s of
Nothing -> Nothing
Just (g,s') -> parse (fmap g a) s' -- g :: a -> b, fmap g a :: Parser b(Our Applicative instance is also a combination of the instances for Maybe and reader. You can take my word for it… or verify for yourself.)
EG parse ((\x -> [x,x,x]) <$> letter 'c') "chocolate"
EG parse strCH "chocolate"
string :: String -> Parser String
string [] = pure ""
string (c:s) = (:) <$> (letter c) <*> string sEG parse (string "choco") "chocolate"
EG parse (string "vanilla") "chocolate"
eof :: Parser ()
eof = Parser $ \s -> if null s then Just ((),"") else Nothing
strCH' = (\c h _ -> [c,h]) <$> letter 'c' <*> letter 'h' <*> eofEG ‘parse strCH’ “ch”isJust (“ch”,[])`
EG parse strCH' "chocolate" yields Nothing.
Notice how we ignored a value for eof. We can use (<*) and (*>) to save ourselves some trouble, writing, e.g.,
strCH'' = (\c h -> [c,h]) <$> letter 'c' <*> letter 'h' <* eof
ensure :: (a -> Bool) -> Parser a -> Parser a
ensure pred p = Parser $ \s ->
case parse p s of
Nothing -> Nothing
Just (a,s') -> if pred a then Just (a,s') else Nothing
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser f
where f [] = Nothing
f (x:xs) = if p x then Just (x,xs) else NothingObserve that letter c is equivalent to satisfy (==c).
lookahead :: Parser (Maybe Char)
lookahead = Parser f
where f [] = Just (Nothing,[])
f (c:s) = Just (Just c,c:s)We could manually define integer parsing:
integer :: Parser Int
integer = Parser $ \s ->
let (digits,rest) = span isDigit s in
if null digits then Nothing else Just (read digits,rest)But it’s nicer if we define a notion of choice:
class Applicative f => Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
many, some :: Alternative f => f a -> f [a]
some p = (:) <$> p <*> many p
many p = some p <|> pure []
instance Alternative Maybe where
-- empty :: f a
empty = Nothing
-- (<|>) :: f a -> f a -> f a
Just x <|> _ = Just x
Nothing <|> r = r
-- empty <|> f == f
-- f <|> empty == finstance Alternative Parser where
empty = Parser $ \s -> Nothing
p1 <|> p2 = Parser $ \s ->
case parse p1 s of
Just (a,s') -> Just (a,s')
Nothing -> parse p2 s
integer' :: Parser String
integer' = read <$> someDigits
where someDigits = (:) <$> satisfy isDigit <*> moreDigits
moreDigits = someDigits <|> pure []
int :: Parser Int
int = read <$> some (satisfy isDigit)EG parse int "8675309"
EG parse int "5551212zoop"
EG parse int "KL51212"
threeInts = (\n1 n2 n3 -> [n1,n2,n3]) <$> (int <* char ',') <*> (int <* char ',') <*> (ensure (>0) int)EG parse threeInts "1,2,3"
EG parse threeInts "1,2,0"
Let’s build a parser for arithmetic expressions. We’ll keep it as an invariant that we parse spaces up before each actual phrase, so “2 + 2” and “2+2” and " 2 +2" all yield Plus (Num 2) (Num 2).
spaces :: Parser ()
spaces = many (satisfy isSpace) *> pure ()
char :: Char -> Parser Char
char c = spaces *> satisfy (==c)
plus, times :: Parser Char
plus = char '+'
times = char '*'
num :: Parser Int
num = spaces *> int
data Arith =
Num Int
| Plus Arith Arith
| Times Arith Arith
deriving Show
term, factor, atom :: Parser Arith
term = Plus <$> factor <* plus <*> term
<|> factor
factor = Times <$> atom <* times <*> factor
<|> atom
atom = Num <$> num
<|> (char '(' *> term <* char ')')Compare this with the CFG:
Term ::= Factor + Term | Factor
Factor ::= Atom * Factor | Atom
Atom ::= n | ( Term )
Note that this parse has a (benign, for now) bug: its arithmetic is right associative.