CS52 - Fall 2015 - Class 23

Lecture notes

Admin
   - Assignment 9 & 10
   - Office hours might change on Thursday
   - Mentors for next semester (applications due Wednesday!)

deterministic finite automata (DFA) review
   - basic idea
      - we have a set of states (indicated by circles)
      - we have a start state where computation begins (indicated by an arrow)
      - we have a collection of final states (indicated by states with an inner circle)
      - for each state and each letter in our alphabet, we have a transition to another state

   - computing
      - we have a string as input on a tape
      - we start at the beginning of the string
      - repeatedly read a symbol from the tape and follow the transition for the current state and that symbol
      - when we get to the end of the string/tape:
         - if we're in a final state, we accept the string
         - otherwise (we're in a non-final state), we reject
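
The computation described above can be sketched in a few lines of Python. This is only an illustration: the function name run_dfa, the dictionary encoding of the transitions, and the EVEN_A machine are all made up for this example and are not part of the course materials.

```python
def run_dfa(transitions, start, finals, tape):
    """Return True if the DFA accepts the string on the tape."""
    state = start
    for symbol in tape:
        # A DFA has exactly one transition per (state, symbol) pair.
        state = transitions[(state, symbol)]
    # Accept only if we are in a final state once the tape is used up.
    return state in finals


# Hypothetical two-state DFA over {a, b}: accepts strings with an even number of a's.
EVEN_A = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}

print(run_dfa(EVEN_A, "even", {"even"}, "abba"))  # two a's
print(run_dfa(EVEN_A, "even", {"even"}, "ab"))    # one a
```

The first call should report acceptance and the second rejection, since only the first string has an even number of a's.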

DFAs over numbers
   - we can use any alphabet we want
   - if we use 1's and 0's we can interpret them as binary numbers!

   - greater_5 (6): determines if the input string, when interpreted as a binary number, is greater than 5

   - write a DFA that determines if a number is odd
      - look at odd_number
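
The diagrams for greater_5 and odd_number aren't reproduced in these notes, so here is one possible way to write both machines in Python. The state names and the "cap the value at 6" design for greater_5 are assumptions made for this sketch and may differ from the course's diagrams. Both machines read the most significant bit first.

```python
def run_dfa(transitions, start, finals, tape):
    """Follow one transition per symbol; accept if we end in a final state."""
    state = start
    for symbol in tape:
        state = transitions[(state, symbol)]
    return state in finals


# Odd numbers: only the last bit matters, so the state just remembers it.
ODD = {
    (state, bit): ("odd" if bit == "1" else "even")
    for state in ("even", "odd")
    for bit in "01"
}

# Greater than 5: the state is the value read so far, capped at 6.
# Reading a bit doubles the value and adds the bit; once we hit 6 we stay there.
CAP = 6
GREATER_5 = {
    (value, bit): min(2 * value + int(bit), CAP)
    for value in range(CAP + 1)
    for bit in "01"
}

for n in (0, 5, 6, 13):
    bits = format(n, "b")
    print(n, bits,
          run_dfa(ODD, "even", {"odd"}, bits),
          run_dfa(GREATER_5, 0, {CAP}, bits))
```

The cap is what keeps the number of states finite: every value of 6 or more behaves the same way from then on, so they can all share one state.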

non-deterministic finite automata (NFA)
   - almost identical definition to DFA except:
      - for a given state and input, can go to zero, one or *more* states (rather than just a single one for DFAs)

      - can have epsilon (sometimes called lambda) transitions from one state to another
         - doesn't read anything from the input, just transitions


      - do not require that there is a transition for every alphabet letter for every state
         - if a path reaches a state without a transition for the next letter, that path does *not* accept (the string can still be accepted along a different path)

   - tend to be a bit easier to create than DFAs

some NFA examples (found in NFA_examples)
   - start_end_a (1): start and end with a (as written, the expression needs at least two letters)
      - a(a|b)*a

   - 2_or_3_a (2): strings of a's that have lengths divisible by 2 OR 3
      - (aa)*|(aaa)*

   - end_aa (3): ends in two a's
      - (a|b)*aa

   - aa_bb (4): has either aa or bb as a substring
      - (a|b)*(aa|bb)(a|b)*

   - no_bb (5): any string of a's and b's that doesn't have two adjacent b's
      - a*|a*b(aa*b)*a*
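
One way to gain confidence that an expression matches its English description is to compare the two on every short string. The sketch below does this for the first four examples using Python's re module. The helper names and the plain-Python predicates are assumptions made for this illustration, and a brute-force check up to a fixed length is evidence, not a proof.

```python
import re
from itertools import product


def all_strings(alphabet, max_len):
    """Yield every string over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


# name -> (regular expression, alphabet, plain-Python version of the description)
EXAMPLES = {
    # Assumption: at least two letters, since the expression needs two a's.
    "start_end_a": ("a(a|b)*a", "ab",
                    lambda s: len(s) >= 2 and s[0] == "a" and s[-1] == "a"),
    "2_or_3_a": ("(aa)*|(aaa)*", "a",
                 lambda s: len(s) % 2 == 0 or len(s) % 3 == 0),
    "end_aa": ("(a|b)*aa", "ab", lambda s: s.endswith("aa")),
    "aa_bb": ("(a|b)*(aa|bb)(a|b)*", "ab", lambda s: "aa" in s or "bb" in s),
}


def mismatches(name, max_len=8):
    """Strings where the expression and the description disagree."""
    pattern, alphabet, described = EXAMPLES[name]
    return [s for s in all_strings(alphabet, max_len)
            if bool(re.fullmatch(pattern, s)) != described(s)]


for name in EXAMPLES:
    print(name, mismatches(name))
```

An empty list for an example means the expression and the description agree on every string up to the chosen length.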

Do NFAs give us more power, i.e. are there some languages that we can recognize with NFAs that we cannot recognize with DFAs?
   - how would we show that NFAs are more powerful?
      - find a language that can be represented by an NFA, but cannot be represented by a DFA

   - how would we show that they're not more powerful?
      - if we can show that for any NFA there is an equivalent DFA and vice versa, then we can show that they are equivalent, i.e. have the same representative power

   - Given a DFA, how can we create an equivalent NFA?
      - Easy... don't do anything! Every DFA already satisfies the definition of an NFA

   - Given an NFA, how can we create an equivalent DFA?

   - Consider the end_aa NFA (found in NFA_examples)
      - Is aaaaa in the language?
         - what states could we be in after reading the first a?
            - q_0 or q_1
         - what states would we be in after reading the second a?
            - we could start in either q_0 *or* q_1
               - if we were in q_0, we'd end up in q_0 or q_1
               - if we were in q_1, we'd end up in q_2
            - therefore, after reading two a's we could end up in any of: q_0 or q_1 or q_2
         - the third a?
            - we could be in q_0 or q_1 or q_2
               - if we were in q_0 -> q_0 or q_1
               - if we were in q_1 -> q_2
               - if we were in q_2 -> reject
                  - does this matter?
                  - No. We only need to find *one* path through the state transitions that ends in an accepting state
         - the fourth a?
            - q_0 or q_1 or q_2
         - the fifth a?
            - q_0 or q_1 or q_2
            - since q_2 is *an* option, then there's a set of transitions between the states that gets us to an accepting state

      - Is ababaa in the language?
         - what states would we be in after reading the first a?
            - q_0 or q_1
         - second letter, b?
            - q_0
         - third letter, a?
            - q_0 or q_1
         - fourth letter, b?
            - q_0
         - fifth letter, a?
            - q_0 or q_1
         - sixth letter, a?
            - q_0 or q_1 or q_2
            - since q_2 is *an* option, then there's a set of transitions between the states that gets us to an accepting state

      - is ababa in the language?
         - a:
            - q_0 or q_1
         - b:
            - q_0
         - a:
            - q_0 or q_1
         - b:
            - q_0
         - a:
            - q_0 or q_1
            - reject: no way to get to an accepting state
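
The three walkthroughs above all follow the same recipe: keep track of the *set* of states we could be in, and update that set one letter at a time. A minimal sketch of that bookkeeping follows. The transition table is reconstructed from the walkthroughs (the diagram itself is in NFA_examples), and the function names are made up for this example.

```python
# end_aa NFA as implied by the traces above; q_2 is the accepting state.
# A missing (state, letter) entry means "no transition", so that path dies.
END_AA = {
    ("q_0", "a"): {"q_0", "q_1"},
    ("q_0", "b"): {"q_0"},
    ("q_1", "a"): {"q_2"},
}
ACCEPTING = {"q_2"}


def trace(delta, start, word):
    """Return the list of possible-state sets, one per prefix of the word."""
    current = {start}
    history = [current]
    for letter in word:
        # Union of everything reachable on this letter from any current state.
        current = {q_j for q_i in current for q_j in delta.get((q_i, letter), ())}
        history.append(current)
    return history


def accepts(delta, start, accepting, word):
    """Accept if at least one possible final state is an accepting state."""
    return bool(trace(delta, start, word)[-1] & accepting)


for word in ("aaaaa", "ababaa", "ababa"):
    steps = [sorted(states) for states in trace(END_AA, "q_0", word)]
    print(word, accepts(END_AA, "q_0", ACCEPTING, word), steps)
```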

Constructing a DFA from an NFA
   - the basic idea is that we're going to create DFA states that each represent one or more of the NFA states
      - DFA state [Q] (where Q is a set of 1 or more NFA states) will transition to DFA state [Q'] on letter l, where Q' is the set of every NFA state q' that can be reached on letter l from *some* q \in Q
      - start state is [q_0]
      - accepting states are any [Q] where at least one q \in Q is an accepting state in the NFA

   - how many DFA states can we have at most?
      - 2^k - 1 where k is the number of NFA states
         - think of each DFA state like a k bit number
            - 1 if it represents the original NFA state
            - 0 otherwise
         - can't have all zeros, so 2^k - 1
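
The k-bit-number view can be made concrete with a short sketch: each nonzero bit pattern picks out one candidate DFA state. The function name is made up for this example, and the state names are only placeholders.

```python
def subsets_by_bitmask(nfa_states):
    """List every non-empty subset of the NFA states, one per nonzero k-bit number."""
    k = len(nfa_states)
    result = []
    for mask in range(1, 2 ** k):  # start at 1 to skip the all-zeros pattern
        # Bit i of the mask says whether NFA state i is part of this DFA state.
        subset = frozenset(q for i, q in enumerate(nfa_states) if (mask >> i) & 1)
        result.append(subset)
    return result


candidates = subsets_by_bitmask(["q_0", "q_1", "q_2"])
print(len(candidates))  # 2^3 - 1
for subset in candidates:
    print(sorted(subset))
```

This is an upper bound: a construction that only builds the states it can actually reach usually ends up with far fewer.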

   - one algorithm
      - create the start state [q_0]
      - add [q_0] to the process queue
      - as long as the process queue isn't empty:
         - remove state s from the process queue
         - for each letter l in the alphabet
            - new_s = []
            - for each "old state", q_i, in s that has a transition q_i: l -> q_j
               - add q_j to new_s
            - if new_s is empty, move on to the next letter (the sink state below covers this case)
            - if DFA state new_s doesn't exist already
               - create state new_s
               - add new_s to the process queue
            - add the DFA transition s: l -> new_s
      - if any states don't have transitions for all letters in the alphabet, create a "sink" state that transitions to itself on all letters and have states transition to here for any remaining alphabet letters
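
A minimal Python sketch of this queue-based construction, assuming an NFA with no epsilon transitions. DFA states are represented as frozensets of NFA states, and the empty frozenset plays the role of the sink state, so no separate clean-up pass is needed. The function names, the dictionary encoding, and the end_aa table (reconstructed from the earlier walkthrough) are assumptions for this example.

```python
from collections import deque


def nfa_to_dfa(delta, start, accepting, alphabet):
    """Build an equivalent DFA from an NFA that has no epsilon transitions.

    delta maps (nfa_state, letter) -> set of nfa_states; a missing key means
    "no transition". Each DFA state is a frozenset of NFA states.
    """
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        s = queue.popleft()
        for letter in alphabet:
            # Every NFA state reachable on this letter from some q_i in s.
            new_s = frozenset(
                q_j for q_i in s for q_j in delta.get((q_i, letter), ())
            )
            dfa_delta[(s, letter)] = new_s
            if new_s not in seen:
                seen.add(new_s)
                queue.append(new_s)
    # A DFA state accepts if it contains at least one accepting NFA state.
    dfa_accepting = {s for s in seen if s & accepting}
    return dfa_delta, start_set, dfa_accepting


def dfa_accepts(dfa_delta, start_set, dfa_accepting, word):
    state = start_set
    for letter in word:
        state = dfa_delta[(state, letter)]
    return state in dfa_accepting


# end_aa NFA, reconstructed from the walkthrough; q_2 is accepting.
END_AA = {
    ("q_0", "a"): {"q_0", "q_1"},
    ("q_0", "b"): {"q_0"},
    ("q_1", "a"): {"q_2"},
}

dfa_delta, dfa_start, dfa_final = nfa_to_dfa(END_AA, "q_0", {"q_2"}, "ab")
for (state, letter), target in dfa_delta.items():
    print(sorted(state), letter, "->", sorted(target))
```

For end_aa the queue only ever discovers three DFA states, well below the 2^3 - 1 upper bound, and the sink is never needed.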

A few examples (in NFA_examples):
- We can construct a DFA from the end_aa NFA: end_aa_DFA

- We can construct a DFA from the start_end_a NFA: start_end_a_DFA

Does this show that DFAs and NFAs are equivalent?
- Yes, given either one (a DFA or NFA) we can construct, through a deterministic process, an equivalent machine of the other type
- Therefore, they recognize exactly the same set of languages

We can handle lambda/epsilon transitions in a similar way
- I'll let you figure out/investigate for those that are curious :)
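
For the curious, one common approach is sketched below: whenever we compute a set of possible states, we also add every state reachable from it using only epsilon transitions (the "epsilon closure"). Everything here is an assumption made for illustration: the empty string is used as the epsilon label, and the NFA for (aa)*|(aaa)* is a hypothetical one, not necessarily the 2_or_3_a machine from NFA_examples.

```python
EPS = ""  # label used in this sketch for an epsilon transition


def epsilon_closure(delta, states):
    """All states reachable from the given states using only epsilon moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for nxt in delta.get((q, EPS), ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return frozenset(closure)


def accepts(delta, start, accepting, word):
    current = epsilon_closure(delta, {start})
    for letter in word:
        moved = {q_j for q_i in current for q_j in delta.get((q_i, letter), ())}
        # After reading a letter, follow any free epsilon moves as well.
        current = epsilon_closure(delta, moved)
    return bool(current & accepting)


# Hypothetical NFA for (aa)*|(aaa)*: an epsilon move from s into each loop.
TWO_OR_THREE = {
    ("s", EPS): {"e0", "t0"},
    ("e0", "a"): {"e1"}, ("e1", "a"): {"e0"},                       # length % 2
    ("t0", "a"): {"t1"}, ("t1", "a"): {"t2"}, ("t2", "a"): {"t0"},  # length % 3
}
ACCEPTING = {"e0", "t0"}

for n in range(8):
    print(n, accepts(TWO_OR_THREE, "s", ACCEPTING, "a" * n))
```

The same closure step can be applied to each set of NFA states while building the DFA, which is how the construction extends to epsilon transitions.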

regular language
   - any language that can be described by a DFA (or an NFA, remember, they're equivalent)
   - any language that can be described by a regular expression!
      - how would you prove this?

What languages are *not* regular?
   - 0^n 1^n for any n
      - i.e. the language of some number of zeros followed by the *same* number of 1s

   - why not?
      - can you come up with a regular expression for this language?
         - seems hard, since there's no tool for us to count
      - can you come up with a DFA or NFA for this language?
         - would need a number of states that grows with n (about 2(n+1) to handle everything up to 0^n 1^n)
            - states are the only way we can count
         - only problem is that n isn't bounded!
            - consider any DFA that recognizes strings of 0^n 1^n up to some fixed n
            - it won't recognize the string 0^(n+1) 1^(n+1)
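
The counting argument can be illustrated in code. Any DFA has finitely many states, so if we keep feeding it 0s it must eventually revisit a state: two different counts of zeros, i < j, land in the same place. From then on the machine cannot tell them apart, so it gives the same answer for 0^i 1^i (in the language) and 0^j 1^i (not in the language), and one of those answers is wrong. The example machine below is a hypothetical DFA that handles 0^n 1^n only for n up to 2. It and the function names are assumptions for this sketch.

```python
def run_dfa(transitions, start, finals, tape):
    state = start
    for symbol in tape:
        state = transitions[(state, symbol)]
    return state in finals


def repeated_zero_counts(transitions, start):
    """Return (i, j) with i < j such that 0^i and 0^j reach the same state."""
    first_seen = {}  # state -> number of 0s read when we first got there
    state, count = start, 0
    while state not in first_seen:
        first_seen[state] = count
        state = transitions[(state, "0")]
        count += 1
    return first_seen[state], count


# Hypothetical DFA accepting exactly the empty string, 01 and 0011.
UP_TO_2 = {
    ("z0", "0"): "z1", ("z0", "1"): "sink",
    ("z1", "0"): "z2", ("z1", "1"): "done",
    ("z2", "0"): "sink", ("z2", "1"): "o1",
    ("o1", "0"): "sink", ("o1", "1"): "done",
    ("done", "0"): "sink", ("done", "1"): "sink",
    ("sink", "0"): "sink", ("sink", "1"): "sink",
}
FINALS = {"z0", "done"}

i, j = repeated_zero_counts(UP_TO_2, "z0")
balanced = "0" * i + "1" * i    # in the language
unbalanced = "0" * j + "1" * i  # not in the language
print(i, j)
print(run_dfa(UP_TO_2, "z0", FINALS, balanced),
      run_dfa(UP_TO_2, "z0", FINALS, unbalanced))
```

The two answers on the last line always agree, whatever DFA is plugged in, which is why no single DFA can be right about every string.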

   - This is a bit of a "hand-wavy" proof
      - see the pumping lemma (or take CS81) for a more rigorous proof