Homework 3.0

The “While” programming language

This homework is written in literate Haskell; you can download the raw source to fill in yourself. You’re welcome to submit literate Haskell yourself, or to start fresh in a new file, literate or not.

Please submit homeworks via the new submission page.

{-# OPTIONS_GHC -W #-}
module Hw03 where

import qualified Data.Map as Map
import Data.Map (Map)
import qualified Data.Set as Set
import Data.Set (Set)

Throughout this homework, we’ll be experimenting with our first interpreter for what really is a programming language. We’ll need two concepts throughout: variable names (which will just be strings) and stores (a/k/a heaps, where we keep the contents of the variables of our language). All of our variables will store integers.

type VarName = String

type Store = Map VarName Int

Problem 1: Interpreting While

We’ll define an interpreter for a language that goes beyond the simple WhileNZ language we saw in class.

data AExp =
    Var VarName
  | Num Int
  | Plus AExp AExp
  | Times AExp AExp
  | Neg AExp
  deriving (Show, Eq, Ord)

Write an interpreter for these arithmetic expressions. When evaluating variables, you should return 0 if they’re not in the store (such variables are called unbound or undefined).

evalA :: Store -> AExp -> Int
evalA _ _ = undefined
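One way to get the default-to-zero behavior for unbound variables is Map.findWithDefault from Data.Map. The sketch below covers only the lookup, not the whole interpreter; the names lookupOrZero and exampleStore are hypothetical and not part of the assignment.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

type VarName = String
type Store = Map VarName Int

-- Hypothetical helper: read a variable, treating unbound variables as 0.
lookupOrZero :: Store -> VarName -> Int
lookupOrZero st x = Map.findWithDefault 0 x st

-- Hypothetical store with a single binding, x = 5.
exampleStore :: Store
exampleStore = Map.fromList [("x", 5)]
```

Here lookupOrZero exampleStore "x" evaluates to 5, while lookupOrZero exampleStore "y" evaluates to 0 because y is unbound.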

We can define boolean expressions similarly. Rather than concretely specifying which arithmetic expressions they’re defined over, we just take in a parameter.

data BExp a =
    Bool Bool
  | Equal a a
  | Lt a a
  | Not (BExp a)
  | Or (BExp a) (BExp a)
  | And (BExp a) (BExp a)
  deriving (Show, Eq, Ord)

Write an interpreter for boolean expressions over our prior arithmetic expressions.

evalB :: Store -> BExp AExp -> Bool
evalB _ _ = undefined

Finally, we’ll define a simple programming language. Its abstract syntax tree (AST) takes two type parameters: one identifying the arithmetic expressions we’ll use, one identifying the boolean expressions we’ll use.

data Stmt a b =
    Skip
  | Assign VarName a
  | Seq (Stmt a b) (Stmt a b)
  | If (b a) (Stmt a b) (Stmt a b)
  | While (b a) (Stmt a b)
  deriving (Show, Eq, Ord)
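To see how the two type parameters get filled in, here is a small program written as a value of type Stmt AExp BExp. The data declarations are repeated (in condensed form) from above so the block stands alone; the program countdown itself is a hypothetical example, not something the assignment requires.

```haskell
type VarName = String

data AExp = Var VarName | Num Int | Plus AExp AExp | Times AExp AExp | Neg AExp
  deriving (Show, Eq, Ord)

data BExp a = Bool Bool | Equal a a | Lt a a | Not (BExp a)
            | Or (BExp a) (BExp a) | And (BExp a) (BExp a)
  deriving (Show, Eq, Ord)

data Stmt a b = Skip | Assign VarName a | Seq (Stmt a b) (Stmt a b)
              | If (b a) (Stmt a b) (Stmt a b) | While (b a) (Stmt a b)
  deriving (Show, Eq, Ord)

-- Hypothetical example program. In an informal concrete syntax:
--   x := 3; while 0 < x do x := x + (-1)
countdown :: Stmt AExp BExp
countdown =
  Seq (Assign "x" (Num 3))
      (While (Lt (Num 0) (Var "x"))
             (Assign "x" (Plus (Var "x") (Neg (Num 1)))))
```

Note that the language has no subtraction, so decrementing is spelled as adding a negated constant.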

Write an interpreter for this language.

eval :: Store -> Stmt AExp BExp -> Store
eval _ _ = undefined

Problem 2: While, with failures

Here’s a new definition for arithmetic expressions, adding division.

data AExp' =
    Var' VarName
  | Num' Int
  | Plus' AExp' AExp'
  | Times' AExp' AExp'
  | Neg' AExp'
  | Div' AExp' AExp'
  deriving (Show, Eq)

Note that division is an operation that can fail. Write another interpreter (defining whatever functions you need). Do not use the error function.

In the interpreter above, variables not in the store were given the default value of 0. In this version of the interpreter, make it so that unbound variables in arithmetic expressions cause errors, just like division. Here are the two errors that can happen:

data Error = NoSuchVariable VarName | DivideByZero AExp'

When you encounter an unbound variable, the error has a slot for identifying the culpable variable. Similarly, when you try to divide by zero, you should record the entire division expression responsible, not just the divisor. (In a more serious AST, we might keep track of the source file and line number each expression came from, in order to better indicate the source of the problem.)

eval' :: Store -> Stmt AExp' BExp -> Either Error Store
eval' _ _ = undefined
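If you haven’t used Either for error handling before, the following minimal sketch may help. It works on plain Ints rather than on AExp', and DemoError, safeDiv, and divTwice are hypothetical names introduced only for illustration. The point is that do-notation in Either stops at the first Left, so no call to error is needed.

```haskell
-- Hypothetical error type for this sketch only (not the assignment's Error).
data DemoError = DemoDivideByZero Int
  deriving (Show, Eq)

-- Divide, reporting the dividend in a Left when the divisor is zero.
safeDiv :: Int -> Int -> Either DemoError Int
safeDiv n 0 = Left (DemoDivideByZero n)
safeDiv n d = Right (n `div` d)

-- Chain two divisions; the first Left short-circuits the rest.
divTwice :: Int -> Int -> Int -> Either DemoError Int
divTwice n d1 d2 = do
  q <- safeDiv n d1
  safeDiv q d2
```

For instance, divTwice 100 5 2 is Right 10, divTwice 100 0 2 is Left (DemoDivideByZero 100), and divTwice 100 5 0 is Left (DemoDivideByZero 20).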

Problem 3: Static analysis

Can we determine in advance whether a given program will try to use an unbound variable if it’s run in an initially empty store? This kind of analysis is called “def/use analysis”, and it’s a common early step in compilation. More generally, this is “static analysis”, because we inspect our programs before we run them. (Static and dynamic are opposites; you can read them as “at compile time” and “at run time”, respectively.)

In some programs, it’s easy:

unboundY = Assign "x" (Var' "y")

The program unboundY will always fail in an empty store, because y is unbound. It can be more ambiguous, though, as in:

ambiguous b = Seq (If b (Assign "y" (Num' 0)) Skip) unboundY

Depending on what we know about b, we may or may not have a problem on our hands. Absent any information about b, it could happen that ambiguous b will try to read from y before it’s defined.

In PL, we tend to stay on the safe side: the general philosophy is that it’s better to have a false positive (saying a program is unsafe when it’s actually fine) than to have a false negative (saying a program is safe when it isn’t!). That is, PL prioritizes soundness (if we say X, then X is really true) over completeness (if X is really true, then we say X). As a side note, observe that it’s as easy to write a trivial sound analysis (everything’s unsafe, please wear a helmet) as it is to write a trivial complete analysis (everything’s safe, take it easy).

To get started, write functions that collect all of the variables that appear in given arithmetic and boolean expressions.

varsA :: AExp' -> Set VarName
varsA _ = undefined

For example, varsA (Times' (Plus' (Var' "x") (Var' "y")) (Num' 3)) == Set.fromList ["x", "y"].

varsB :: BExp AExp' -> Set VarName
varsB _ = undefined

For example, varsB (Or (Not (Equal (Var' "foo") (Var' "bar"))) (Bool True)) == Set.fromList ["bar", "foo"].

Now let’s write our analysis: we’ll take in a set of variables that we know to be defined, a statement in our language, and we’ll return a pair of sets: the set of variables that have been defined and the set of variables that have been used but not defined.

useBeforeDef :: Set VarName -> Stmt AExp' BExp -> (Set VarName, Set VarName)
useBeforeDef defs Skip = (defs, Set.empty)
useBeforeDef defs (Assign x a) = (Set.insert x defs, varsA a `Set.difference` defs)

What should the other cases do? Remember, you have to be sound: the variables in the first part of the pair (the defined variables) must always be defined; if it’s at all possible for a variable to be undefined, it must not appear in the first part. Similarly, if it’s at all possible for a variable to ever be used before it’s defined, it must appear in the second part.

With these guiding principles, what should we do for Seq s1 s2? Everything s1 defines will be defined for s2. The final set of definitions will also include what s2 defines. What about the variables that are used before they’re defined? If x is used in s1 before it’s defined, it doesn’t matter if it’s later defined in s2: it’s too late.

What about If b s1 s2? It’s too hard to know anything about the condition b. But if we can be certain that both branches define a variable, then we can be certain that it’ll be defined at the end. Conversely, if either branch could use a given variable before it’s defined, then that variable could potentially be used before being defined.
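In terms of Data.Set, “both branches define it” is an intersection and “either branch could use it” is a union. The sketch below shows just that merging step on two already-computed result pairs; mergeBranches is a hypothetical helper name, and the sketch deliberately ignores any variables read by the condition b, which a full analysis would also have to consider.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- Hypothetical helper: combine the (defined, used-before-defined) results of
-- two branches when we cannot tell which branch will run.
mergeBranches :: Ord a => (Set a, Set a) -> (Set a, Set a) -> (Set a, Set a)
mergeBranches (defs1, uses1) (defs2, uses2) =
  ( defs1 `Set.intersection` defs2  -- certainly defined: defined on both paths
  , uses1 `Set.union` uses2         -- possibly misused: misused on either path
  )
```

For example, merging (Set.fromList ["x", "y"], Set.fromList ["a"]) with (Set.fromList ["y", "z"], Set.fromList ["b"]) yields (Set.fromList ["y"], Set.fromList ["a", "b"]).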

Once you know how If and Seq work, you should have the general principle for While. Sketch it out on the board!

useBeforeDef _ _ = undefined

Be very careful testing your function. Strive for soundness. The tests below show the results for my useBeforeDef—don’t feel obligated to do better, but don’t do worse. You can modify or delete these tests—my grader ignores them.

testUnbound, testAmbiguous :: Bool
testUnbound = useBeforeDef Set.empty unboundY ==
              (Set.singleton "x", Set.singleton "y")

testAmbiguous = useBeforeDef Set.empty (ambiguous (Bool True)) ==
                (Set.singleton "x", Set.singleton "y")

Problem 4: Mission Impossible

Your final task is to solve the halting problem. We’ll start by writing a function that runs a program a little bit—just one “step”. Then we’ll look at the trace of steps the program takes. If we ever end up in a state we’ve seen before, then the program diverges. This is a dynamic analysis, since we’ll be running our programs.

First, fill in the step function below.

type Config = (Store, Stmt AExp BExp)

step :: Config -> Maybe Config
step (_,Skip) = Nothing
step (st,Assign x a) = Just (Map.insert x (evalA st a) st,Skip)
step (st,Seq Skip s2) = Just (st,s2)
step (st,Seq s1 s2) = undefined
step (st,If b s1 s2) = undefined
step (st,While b s) = undefined

Given a step function, we can compute a trace, i.e., the possibly infinite list of Configs that the program will step through. Such a program is safe to write in Haskell because Haskell is lazy, i.e., it will only compute things on demand.

trace :: (a -> Maybe a) -> a -> [a]
trace f v =
  case f v of
    Nothing -> [v]
    Just v' -> v:trace f v'
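To get a feel for trace and for why laziness matters, here it is applied to two toy step functions on Ints. The trace definition is repeated from above so the block stands alone; countDown and countUp are hypothetical examples, not part of the assignment.

```haskell
-- The trace function from above, repeated so this block stands alone.
trace :: (a -> Maybe a) -> a -> [a]
trace f v =
  case f v of
    Nothing -> [v]
    Just v' -> v : trace f v'

-- Hypothetical step function: count down to zero, then stop.
countDown :: Int -> Maybe Int
countDown n
  | n <= 0    = Nothing
  | otherwise = Just (n - 1)

-- Hypothetical step function that never stops.
countUp :: Int -> Maybe Int
countUp n = Just (n + 1)
```

Here trace countDown 3 is the finite list [3,2,1,0]. The list trace countUp 0 is infinite, but take 5 (trace countUp 0) still returns [0,1,2,3,4], because only the demanded prefix is ever computed.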

I may have gotten excited earlier when I said we’d “solve” the halting problem. We can try to solve it, but sometimes we’ll have to throw up our hands and say “Who knows?”. To facilitate that, we’ll use three-valued logic, which extends the booleans with a notion of “don’t know”.

data TVL = No | Maybe | Yes deriving (Show, Eq, Ord)

Write a function diverges that checks for loops in a list of configurations. (Note that I’ve written a much more general type.) The integer parameter should serve as a timeout: a limit as to how far we’re willing to look.

What counts as a loop? Each element in the list will represent a Config, i.e., a pair of a store and a statement currently being executed. If we ever see the same pair twice, we know the program diverges because our programs are deterministic, i.e., they do the same thing every time. So your job is to check for duplicate configurations, i.e., elements that appear more than once in the list. A wise choice of data structure here will make your life easier (and speed up your program).
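As an illustration of the data-structure point, here is a minimal sketch of duplicate detection that remembers seen elements in a Data.Set, assuming only an Ord instance. The name firstDuplicate is hypothetical, and the sketch is not diverges: it has no timeout and no TVL result, so it will run forever on an infinite list with no repeats.

```haskell
import qualified Data.Set as Set

-- Hypothetical helper: return the first element that repeats, if any.
-- There is no timeout, so this does not terminate on an infinite list
-- that never repeats an element.
firstDuplicate :: Ord a => [a] -> Maybe a
firstDuplicate = go Set.empty
  where
    go _ [] = Nothing
    go seen (x:xs)
      | x `Set.member` seen = Just x
      | otherwise           = go (Set.insert x seen) xs
```

For example, firstDuplicate [1,2,3,2,1] is Just 2 and firstDuplicate "abc" is Nothing. Each membership test and insertion takes logarithmic time in the number of elements seen so far, compared with a linear scan if the seen elements were kept in a list.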

diverges :: Ord a => Int -> [a] -> TVL
diverges limit = undefined

Write a function haltsIn that takes a program and a limit and tries to determine whether that program, run from the empty store, ever halts (within the specified limit).

haltsIn :: Stmt AExp BExp -> Int -> TVL
haltsIn s limit = undefined

Now we have our analysis… let’s see what it can do. Write a While program loop that diverges and:

loop `haltsIn` 1000 == No
loop :: Stmt AExp BExp
loop = undefined

Write a While program long that converges and:

long `haltsIn` 1000 == Maybe
long `haltsIn` 5000 == Yes
long :: Stmt AExp BExp
long = undefined

Write a While program tricky that diverges but for all n:

tricky `haltsIn` n == Maybe
tricky :: Stmt AExp BExp
tricky = undefined

Explain why your haltsIn gives an imprecise answer.

Do you think you can write a program where haltsIn gives a wrong answer? If so, explain your idea—or write it! If not, explain (or prove!) why not.