CS 334
Programming Languages
Spring 2002

Lecture 11

Summary of types so far:

postpone ADT's until later

Modern tendency to strengthen static typing and avoid implicit holes in the type system.

- usually explicit (dangerous) means for bypassing the type system, if desired

Try to push as many errors to compile time as possible by:

Requiring overspecification through typing
Distinguishing between different uses of the same type (name equivalence)
Mandating constructs designed to eliminate typing holes
Minimizing or eliminating use of explicit pointers (esp. user-controlled deallocation of ptrs).

Problem: loss of the flexibility obtainable from dynamic typing or from the lack of any typing.

Important direction of current research in computer science:

Provide type safety, but increase flexibility.

Important progress over last 20 years:

Polymorphism, ADT's, Subtyping & other aspects of object-oriented languages.

COMMANDS OR STATEMENTS:

Change "state" of machine.

State of computer corresponds to contents of memory and any external devices (I/O)

State sometimes called "store"

Note distinction between "state" and "environment". Environment is mapping between identifiers and values (including locations). State includes mapping between locations and values.

Values in store or memory are "storable" versus "denotable" (or "bindable")

Symbol table depends on declarations and scope - static

Environment tells where to find values - dynamic

State depends on previous computation - dynamic

If have compiler, use symbol table when generating code to determine meaning of all identifiers. At run-time, symbol table no longer needed (hard coded into compiled code), but state and environment change dynamically.

In interpreter, may have to keep track of symbol table, environment, and state at run-time. (In fact could avoid using state if there is no "aliasing" in the language.)

Assignment:

   variable := expression

Order of evaluation can be important, especially if there are side-effects. Usually left-side evaluated first, then right-side.

		A[f(j)] := j * f(j) + j

The value is difficult to predict if f has the side effect of changing j.
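
The hazard is easy to reproduce. A minimal Python sketch, where the function `f` and the starting values are made up for illustration. Note that Python evaluates the right-hand side before the subscript on the left, which is the opposite of the "usual" order mentioned above, so the same statement can give different answers in different languages:

```python
# Hypothetical illustration: f has the side effect of incrementing the global j.
j = 1
A = [0] * 10

def f(x):
    """Return x + 1 and, as a side effect, increment the global j."""
    global j
    j += 1
    return x + 1

# Python evaluates the right-hand side first (left to right), then the target.
# RHS:    j (= 1) * f(1) (= 2, j becomes 2) + j (= 2)   gives 4
# Target: A[f(2)], where f(2) = 3 and j becomes 3       so A[3] = 4
A[f(j)] = j * f(j) + j
```

With left-side-first evaluation the index would instead be f(1) = 2 and the stored value would differ as well.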

Two kinds of assignments:

assignment by copying and
assignment by sharing (often handy w/dynamic typing or OOL's)

Most statements are actually control structures for combining other expressions and statements:

Selection: If .. then ... else ...
Repetition: while ... do ...

FORTRAN started with very primitive control structures:

GO TO n
GO TO (17, 43, 12, 99), I (also other variants)
IF(arith exp) 17, 43, 12 means go to statement number 17 if arith exp is negative, 43 if zero, and 12 if positive
DO label ivble = 1, 20, 2

Very close to machine instructions

Why do we need repetition constructs if everything can be done with goto's?

"The static structure of a program should correspond in a simple way with the dynamic structure of the corresponding computation." Dijkstra letter to editor.

ALGOL 60 more elaborate:

GO TO 99
IF ... THEN ... ELSE .... (hierarchical)
for i := 3, 7, 11 step 1 until 16, i/2 while i >= 1, 2 step i until 32 do ..
Baroque: all expressions are re-evaluated each time through the loop. The values taken by i are:
```
	3, 7, 11, 12, 13, 14, 15, 16, 8, 4, 2, 1, 2, 4, 8, 16, 32.
```
switch - like in C/C++/Java.
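
The trace of the ALGOL 60 for loop above can be checked with a small simulation. This is only a sketch in Python, not ALGOL: the function name is made up, only positive steps are handled, and it assumes (as the trace in the notes implies) that i still holds 16 when the "i/2 while" element starts and that i/2 is integer division.

```python
def algol_for_values():
    """Yield the values i takes in
    for i := 3, 7, 11 step 1 until 16, i/2 while i >= 1, 2 step i until 32 do ...
    """
    # Plain expression elements: 3 and 7.
    for i in (3, 7):
        yield i
    # "11 step 1 until 16": step and limit are re-evaluated on every pass.
    i = 11
    while i <= 16:
        yield i
        i = i + 1
    i = 16  # assumption: i keeps the last value actually used by the body
    # "i/2 while i >= 1": i/2 is recomputed from the current i each time.
    while True:
        i = i // 2
        if not i >= 1:
            break
        yield i
    # "2 step i until 32": the step is i itself, so i doubles on each pass.
    i = 2
    while i <= 32:
        yield i
        i = i + i
```

Calling `list(algol_for_values())` reproduces the seventeen values listed in the notes.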

Pascal expanded but simplified:

go to
if .. then .. else
for, while, repeat (confusion w/positive vs. negative exit)
labelled case - Hoare's most important invention: clear and efficient (compiler can construct a jump table and optimize depending on size), and self-documenting.
Modula 2 improved by adding otherwise clause
ML's pattern matching is compiled into a case statement:
```
   fun reverse l = case l of  
                       nil => nil |
                       (h::rest) => (reverse rest)@[h];
		       
```
If not all possible cases are handled, the compiler reports a non-exhaustive match (a warning in SML). If no pattern matches at run-time, the Match exception is raised. Interestingly, if-then-else expressions also translate to case expressions:
```
   case cond of
          true => e1 |
          false => e2;
   
```

Ada is like Pascal, but with a more uniform loop construct with exit:

		iteration specification loop 
			loop body
		end loop;

where iteration specification can be:

while condition,
for vbl in discrete range (e.g. for i in 1..10 loop .. end loop)
(Note: loop vble implicitly declared - restricted scope)

Also can have vanilla loop which can be left w/ exit statement.

Also provide exit when ...., syntactic sugar for if .. then exit

Can also exit from several depths of loops

Interesting theoretical result of Böhm and Jacopini (1966): every flowchart can be programmed entirely in terms of sequencing, if, and while commands.
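
One way to see why this is plausible is to number the boxes of a flowchart and drive them from a single while loop with a "program counter" variable. A minimal Python sketch of that idea follows. The flowchart (Euclid's gcd by subtraction) and the names `pc` and `gcd_flowchart` are made up for illustration, and this is the folklore construction rather than necessarily the one in the original paper.

```python
def gcd_flowchart(a, b):
    """Euclid's algorithm (by subtraction) for positive integers a and b,
    written as a flowchart with numbered boxes and jumps between them."""
    pc = 1                      # which box of the flowchart runs next
    while pc != 0:              # 0 means "halt"
        if pc == 1:             # box 1: test a = b
            pc = 0 if a == b else 2
        elif pc == 2:           # box 2: test a > b
            pc = 3 if a > b else 4
        elif pc == 3:           # box 3: a := a - b; go to 1
            a = a - b
            pc = 1
        elif pc == 4:           # box 4: b := b - a; go to 1
            b = b - a
            pc = 1
    return a
```

Every "go to n" has become an assignment to `pc`, so only sequencing, if, and one while remain. The price is an extra variable and a program whose static structure no longer mirrors the computation.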

Natural Semantics for commands

Can write natural semantics for various commands:

With commands must keep track of store: locations -> storable values.

If expressions can have side-effects then must update rules to keep track of effect on store. Rewriting rules now have conclusions of form (e, ev, s) >> (v, s') where v is a storable value, ev is an environment (mapping from identifiers to denotable values - including locations), s is initial state (or store), and s' is state after evaluation of e.

    (b, ev, s) >> (true, s')    (e1, ev, s') >> (v, s'')
    ------------------------------------------------------
          (if b then e1 else e2, ev, s) >> (v, s'')

Thus if the evaluations of b and e1 have side-effects on memory, they show up in the "answer".

Axioms - no hypotheses!

    (id, ev, s) >> (s(loc), s)        where  loc = ev(id)
    (id++, ev, s) >> (v, s[loc:=v+1]) where  loc = ev(id), v = s(loc)

Note s[loc:=v+1] is state, s', identical to s except s'(loc) = v+1.

    (e1, ev, s) >> (v1, s')    (e2, ev, s') >> (v2, s'')
    ------------------------------------------------------
            (e1 + e2, ev, s) >> (v1 + v2, s'')
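
The expression rules above translate almost line for line into an evaluator that returns a (value, store) pair. A minimal Python sketch, under some assumptions: expressions are encoded as tuples, the tags ("lit", "id", "postinc", "add", "if") are invented for this example, and the "lit" case is added only so that constants can be written.

```python
def eval_expr(e, ev, s):
    """Return (v, s') for the judgement (e, ev, s) >> (v, s').
    ev maps identifiers to locations, s maps locations to values.
    The store is never mutated; updates build a new dict."""
    tag = e[0]
    if tag == "lit":                      # literal constant (not in the notes)
        return e[1], s
    if tag == "id":                       # (id, ev, s) >> (s(loc), s)
        return s[ev[e[1]]], s
    if tag == "postinc":                  # (id++, ev, s) >> (v, s[loc:=v+1])
        loc = ev[e[1]]
        v = s[loc]
        return v, {**s, loc: v + 1}
    if tag == "add":                      # evaluate e1 in s, then e2 in s'
        v1, s1 = eval_expr(e[1], ev, s)
        v2, s2 = eval_expr(e[2], ev, s1)
        return v1 + v2, s2
    if tag == "if":                       # only the chosen branch is evaluated
        b, s1 = eval_expr(e[1], ev, s)
        return eval_expr(e[2] if b else e[3], ev, s1)
    raise ValueError(f"unknown expression: {e!r}")

ev = {"x": 0}          # identifier x lives at location 0
s0 = {0: 5}            # location 0 initially holds 5
# x++ + x : the left operand returns 5 and updates the store,
# so the right operand is evaluated in the new store and sees 6.
result = eval_expr(("add", ("postinc", "x"), ("id", "x")), ev, s0)
```

Here `result` is `(11, {0: 6})`, and `s0` is unchanged because each rule produces a new store rather than overwriting the old one.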

When evaluating a command, the "result" is a state only.

E.g.,

        (e, ev, s) >> (v, s')
    ------------------------------   where ev(x) = loc
    (x := e, ev, s) >> s'[loc:=v]

    (C1, ev, s) >> s'    (C2, ev, s') >> s''
    ------------------------------------------
             (C1; C2, ev, s) >> s''

    (b, ev, s) >> (true, s')   (C1, ev, s') >> s''
    ------------------------------------------------
          (if b then C1 else C2, ev, s) >> s''

+ similar rule if b false


     (b, ev, s) >> (false, s')
    ---------------------------
    (while b do C, ev, s) >> s'

    (b, ev, s) >> (true, s')    (C, ev, s') >> s''   
             (while b do C, ev, s'') >> s'''
    ------------------------------------------------
              (while b do C, ev, s) >> s'''

Notice how similar definition of semantics for

    while E do C

is to

    if E then begin 
        C; 
        while E do C 
    end
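
The command rules can be turned into an interpreter in the same style, with each command mapping a store to a store. A minimal Python sketch, assuming a made-up tuple encoding of commands and a deliberately tiny expression language (the tags and the sample program are illustrative only). The while case follows the two rules literally, which is exactly the "if E then begin C; while E do C end" reading.

```python
def eval_expr(e, ev, s):
    """(e, ev, s) >> (v, s'). These expressions have no side effects,
    but the store is still threaded through to match the shape of the rules."""
    tag = e[0]
    if tag == "lit":
        return e[1], s
    if tag == "id":
        return s[ev[e[1]]], s
    if tag in ("add", "less"):
        v1, s1 = eval_expr(e[1], ev, s)
        v2, s2 = eval_expr(e[2], ev, s1)
        return (v1 + v2 if tag == "add" else v1 < v2), s2
    raise ValueError(f"unknown expression: {e!r}")

def exec_cmd(c, ev, s):
    """(C, ev, s) >> s'. The result of a command is a store only."""
    tag = c[0]
    if tag == "assign":                       # x := e
        v, s1 = eval_expr(c[2], ev, s)
        return {**s1, ev[c[1]]: v}
    if tag == "seq":                          # C1; C2
        return exec_cmd(c[2], ev, exec_cmd(c[1], ev, s))
    if tag == "if":                           # if b then C1 else C2
        b, s1 = eval_expr(c[1], ev, s)
        return exec_cmd(c[2] if b else c[3], ev, s1)
    if tag == "while":                        # while b do C
        b, s1 = eval_expr(c[1], ev, s)
        if not b:
            return s1                         # first while rule
        s2 = exec_cmd(c[2], ev, s1)           # second rule: run the body once,
        return exec_cmd(c, ev, s2)            # then the whole loop again
    raise ValueError(f"unknown command: {c!r}")

# i := 0; total := 0; while i < 4 do (i := i + 1; total := total + i)
ev = {"i": 0, "total": 1}
prog = ("seq", ("assign", "i", ("lit", 0)),
        ("seq", ("assign", "total", ("lit", 0)),
         ("while", ("less", ("id", "i"), ("lit", 4)),
          ("seq", ("assign", "i", ("add", ("id", "i"), ("lit", 1))),
                  ("assign", "total", ("add", ("id", "total"), ("id", "i")))))))
final = exec_cmd(prog, ev, {})
```

Because the while case recurses once per iteration, long-running loops would hit Python's recursion limit. The sketch favours fidelity to the rules over efficiency.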

Iterators

Clu allows definition of user-defined iterators (abstract over control structures):

        for c : char in string_chars(s) do ...

where have defined:

        string_chars = iter (s : string) yields (char);
            index : int := 1;
            limit : int := string$size (s);
            while index <= limit do
                yield (string$fetch(s, index));
                index := index + 1;
            end;
        end string_chars;

Behave like restricted type of co-routine.

Each time at top of loop continue executing iterator code from where last left off.
When hit "yield" statement then return the associated value.
When hit end of iterator, quit loop.

Can be implemented on stack similarly to procedure call.

Now available in Java and C++ using object-oriented features to retain state of traversal.
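
Python generators behave much like the Clu iterator above, so the same example can be sketched directly. The function name is carried over from the Clu code; the test string is made up.

```python
def string_chars(s):
    """Yield the characters of s one at a time, like the Clu iterator."""
    index = 0                  # Python strings are indexed from 0 (Clu code starts at 1)
    limit = len(s)
    while index < limit:
        yield s[index]         # suspend here; resume on the next trip round the loop
        index = index + 1

collected = []
for c in string_chars("abc"):  # each iteration resumes the generator where it left off
    collected.append(c)
```

The local variables `index` and `limit` survive between yields, which is the "restricted co-routine" behaviour described above. When the generator body ends, the for loop quits.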

Exceptions

Need mechanism to handle exceptional conditions.

Example: using a stack and trying to pop an element off an empty stack.

Clearly corresponds to mistake of some sort, but stack module doesn't know how to respond.

In older languages main way to handle is to print error message and halt or include boolean flag in every procedure telling if succeeded. Then must remember to check!

Another option is to pass in a procedure parameter which handles exceptions.

Exception mechanisms in programming languages:

Can raise an exception and send back to caller who is responsible for handling exception.

Call program robust if recovers from exceptional conditions, rather than just halting (or crashing).

Typical exceptions:

Arithmetic or I/O faults (e.g., divide by 0, read int and get char, array or subrange bounds, etc.),
failure of precondition,
unpredictable conditions (read past end of file, end of printer page, etc.),
tracing program flow during debugging.

When exception is raised, it must be handled or program will fail!

Exception handling in Ada:

Raise exception via: raise excp_name

Attach exception handlers to subprogram body, package body, or block.

Ex:

    begin
        C
    exception
        when excp_name1 => C'
        when excp_name2 => C''
        when others => C'''
    end

When raise an exception, where do you look for handler? In most languages, start with current block (or subprogram). If not there, force return from unit and raise same exception to routine which called current one, etc., up the call chain until find handler or get to outer level and fail. (Clu starts at calling routine.)

Semantics of raising and handling exceptions is dynamic rather than static!
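
The dynamic nature of handler lookup can be seen in a small Python sketch (all names here are hypothetical). The same `raise` inside `pop` is handled by different handlers depending on who called it, which cannot be determined from the text of `pop` or `helper` alone.

```python
class EmptyStackError(Exception):
    """Hypothetical exception raised when popping an empty stack."""

def pop(stack):
    if not stack:
        raise EmptyStackError("pop from empty stack")
    return stack.pop()

def helper(stack):
    # No handler here, so an exception raised in pop passes straight through.
    return pop(stack)

def caller_a(stack):
    try:
        return helper(stack)
    except EmptyStackError:
        return "handled in caller_a"

def caller_b(stack):
    try:
        return helper(stack)
    except EmptyStackError:
        return "handled in caller_b"
```

`caller_a([])` and `caller_b([])` both trigger the same raise, but the search for a handler walks up the run-time call chain, so each call finds its own handler.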

Handler can attempt to handle exception, but give up and raise another exception.

Resuming after exceptions

What happens after an exception handler has been found and successfully executed (i.e., no further exceptions raised)?

In Ada and Java, return from the procedure (or block) containing the handler - called termination model.

PL/I has resumption model - go back to re-execute statement where failure occurred (makes sense for read errors, for example) unless GOTO in handler code.

Eiffel (an OOL) uses variant of resumption model.
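
Python follows the termination model, so it can illustrate the difference only partially. In the sketch below (all names hypothetical), `termination` shows control leaving the protected block after the handler runs, while `retry_until_ok` merely simulates "go back and re-execute the failing statement" with an explicit loop. It is not a true resumption mechanism.

```python
log = []

def read_int(inputs):
    """Hypothetical reader: take the next item from inputs and convert it to int."""
    return int(inputs.pop(0))      # int("x") raises ValueError

def termination(inputs):
    try:
        n = read_int(inputs)
        log.append("after read")   # skipped if read_int raises
    except ValueError:
        n = 0                      # handler runs, then control leaves the try block
    return n

def retry_until_ok(inputs):
    """Simulate re-executing the failing statement by looping."""
    while True:
        try:
            return read_int(inputs)
        except ValueError:
            continue               # bad item already consumed; try the read again
```

In `termination(["x", "7"])` the statement after the failed read is never executed. In `retry_until_ok(["x", "y", "7"])` the read is repeated until it succeeds.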

Exceptions in ML can carry parameters to exception handlers (exceptions are declared much like datatype constructors). Otherwise very similar to Ada.

Example:

datatype 'a stack = EmptyStack | Push of 'a * ('a stack);
exception empty;

fun pop EmptyStack = raise empty
  | pop(Push(n,rest)) = rest;

fun top EmptyStack = raise empty
  | top (Push(n,rest)) = n;

fun IsEmpty EmptyStack = true
  | IsEmpty (Push(n,rest)) = false;
  
exception nomatch;
 
(* explode yields a char list in SML '97, hence the #"(" character patterns *)
fun buildstack nil initstack = initstack
  | buildstack (#"("::rest) initstack = buildstack rest (Push(#"(",initstack))
  | buildstack (#")"::rest) (Push(#"(",bottom)) = buildstack rest bottom
  | buildstack (#")"::rest) initstack = raise nomatch
  | buildstack (fst::rest) initstack = buildstack rest initstack;
        
fun balanced string = (buildstack (explode string) EmptyStack = EmptyStack) 
                                                     handle nomatch => false;

Notice awkwardness in syntax. Need to put parentheses around the expression to which the handler is associated!

Some would argue that one shouldn't use the exception nomatch here, since unbalanced input is not really an unexpected situation. It is just a way of introducing goto's into the code!

ABSTRACTION

Distinction between what something does and how it does it.

Interested in supporting abstraction (separation between what and how).

Originally, designers attempted to create languages w/ all types and statements that were necessary.

Realized quickly that needed extensible languages.

First abstractions for statements and expressions - Procedures and Functions

Arrays and records, then pointers introduced to build new types and operations on them.

Built-in types have associated operations - representation is hidden (for most part)

Support of ADT's is most important innovation of 1970's.

Simula 67 - package op's w/ data types - representation not hidden

Clu, Mesa, Modula-2, Ada, Smalltalk

Come back to them in Chapter 9.

Iterators correspond to abstraction over control structure
- higher-order functions in ML even more so!

The more support for abstraction, generally the more expressive the language.

Use of parameters supports abstraction -
Creates more flexible program phrases.

Accessing non-local information:

COMMON blocks (FORTRAN), global variables (in block-structured languages),

Parameters - data, subprograms, types

Data Parameters

1. Call by Reference (FORTRAN, Pascal):

Pass address of actual parameter.

Access via indirection.

What if the actual parameter is an expression or a constant, e.g. CHGTO4(2)?
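
Call by reference can be modelled by passing an explicit location. A minimal Python sketch, where the `Ref` class is a hypothetical stand-in for a memory cell and `chgto4` mirrors the CHGTO4 example. One common answer to the question above is to evaluate the expression or constant into a fresh temporary location and pass that. Some early FORTRAN implementations reportedly passed the address of the stored constant itself, so the constant could be changed.

```python
class Ref:
    """A mutable cell standing in for a memory location (hypothetical helper)."""
    def __init__(self, value):
        self.value = value

def chgto4(p):
    """Formal parameter p is a location; every access goes through one indirection."""
    p.value = 4

x = Ref(2)
chgto4(x)            # the caller's variable changes: x.value is now 4

# Passing a constant via a fresh temporary location:
# the callee's assignment lands in the temporary and is simply lost.
tmp = Ref(2)
chgto4(tmp)
two = 2              # the constant itself is untouched
```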

2. Call by Copy (Algol 60, Pascal, C, etc.):

Actual parameter copies value to formal parameter (and/or vice-versa).

value (in), result (out), value-result (in-out)

result and value-result parameters must be variables; a value parameter can be any storable value.

Can be expensive for large parameters.

3. Call by Name (Algol-60)

Actual parameter provides expression to formal parameter - re-evaluated whenever accessed.

Ex.

        Procedure  swap(a, b : integer);
            var temp : integer;
            begin
                temp := a;
                a := b;
                b := temp
            end;

Won't always work, e.g.

swap(i, a[i]) with i = 1, a[1] = 3, a[3] = 17.

No way to define a correct swap in Algol-60!
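
The failure of swap(i, a[i]) can be traced by modelling each by-name parameter as a pair of closures that re-evaluate the actual parameter on every access. A minimal Python sketch. The getter/setter encoding and the `env` dictionary are assumptions made for this illustration.

```python
def swap(a_get, a_set, b_get, b_set):
    """swap(a, b) with both parameters passed by name.
    Each parameter is a (getter, setter) pair that re-evaluates the actual
    parameter expression on every access."""
    temp = a_get()     # temp := a
    a_set(b_get())     # a := b
    b_set(temp)        # b := temp   (b is re-evaluated here!)

env = {"i": 1, "a": {1: 3, 3: 17}}

def get_i(): return env["i"]
def set_i(v): env["i"] = v
def get_ai(): return env["a"][env["i"]]      # a[i], using the current i
def set_ai(v): env["a"][env["i"]] = v

swap(get_i, set_i, get_ai, set_ai)           # swap(i, a[i])
```

After the call, i = 3 as intended, but the final assignment went to a[3] rather than a[1], because i had already changed. So a[1] is still 3 and a[3] has been overwritten with 1.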

Expressive power - Jensen's device:

To compute: x = Sum for i=1 to n of V_i

    real procedure SUM (k, lower, upper, ak);
        value lower, upper;     
        integer k, lower, upper;
        real ak;
        begin
            real s;
            s := 0;
            for k := lower step 1 until upper do
                s := s + ak;
            SUM := s
        end;

What is the result of SUM(i, 1, m, A[i])?

What about SUM(i, 1, m, SUM(j, 1, n, B[i,j]))?

If evaluating parameters has side-effects (e.g., read), then must know how and how many times parameter is evaluated to predict what will happen.

Therefore try to avoid call-by-name with expressions with side-effects.
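
Jensen's device can be imitated with closures. In this Python sketch the by-name parameter k is represented by a setter and ak by a thunk. That encoding, the `env` dictionary, and the sample data are assumptions made for illustration.

```python
def SUM(set_k, lower, upper, ak):
    """Jensen's device. set_k assigns to the by-name parameter k;
    ak is a thunk that re-evaluates the by-name expression on each call.
    lower and upper are passed by value."""
    s = 0
    for value in range(lower, upper + 1):
        set_k(value)       # k := value, which changes the caller's variable
        s = s + ak()       # ak is re-evaluated and sees the new k
    return s

env = {"i": 0, "j": 0}
A = {1: 10, 2: 20, 3: 30}
B = {(1, 1): 1, (1, 2): 2, (2, 1): 3, (2, 2): 4}

def set_i(v): env["i"] = v
def set_j(v): env["j"] = v

# SUM(i, 1, 3, A[i]) adds up A[1] + A[2] + A[3]
total_A = SUM(set_i, 1, 3, lambda: A[env["i"]])

# SUM(i, 1, 2, SUM(j, 1, 2, B[i,j])) adds up every entry of B
total_B = SUM(set_i, 1, 2,
              lambda: SUM(set_j, 1, 2, lambda: B[(env["i"], env["j"])]))
```

The first call sums the elements of A. The nested call sums all entries of B, because the inner SUM is itself re-evaluated for each value of i.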

Lazy evaluation is an efficient implementation of call-by-name in which each parameter is evaluated at most once. Requires that there be no side-effects, since otherwise would get different results.

Implement call-by-name using thunks - procedures which evaluate expressions - difficult and slow. Must pass around code for evaluating expression (including environment defined in). Can use the same THUNK's as show up in environment based interpreter.
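
The difference between a plain thunk (call-by-name) and a cached one (lazy evaluation) can be made visible by counting evaluations. A minimal Python sketch with hypothetical names. Python closures capture their defining environment, which is what a thunk needs.

```python
counter = {"evals": 0}

def expensive():
    """Stand-in for an actual-parameter expression; counts how often it runs."""
    counter["evals"] += 1
    return 21

def twice_by_name(x):
    """x is a thunk: every use of the formal parameter re-runs it."""
    return x() + x()

def make_lazy(thunk):
    """Wrap a thunk so it is evaluated at most once and the value is cached."""
    cache = []
    def force():
        if not cache:
            cache.append(thunk())
        return cache[0]
    return force

by_name_result = twice_by_name(expensive)             # evaluates expensive twice
evals_by_name = counter["evals"]

counter["evals"] = 0
by_need_result = twice_by_name(make_lazy(expensive))  # evaluates expensive once
evals_by_need = counter["evals"]
```

Both calls return the same value here only because `expensive` always returns the same result. If it read input or modified state, the two strategies could give different answers.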

Note that this differs from call-by-text (which would allow capture of free variables).


kim@cs.williams.edu
