Determinism costs! A NFA with exponentially bigger DFA

Two well known mathematical models of computation are finite automaton, either Non-deterministic or deterministic: NFA and DFA, respectively. Intuitively, whereas NFAs can be "smaller", it is "simpler" to match strings against DFAs. In this post, we investigate a NFA that has an exponentially larger DFA, showing off the "cost of determinism".

First, lets define DFAs and NFAs:

DFAs

A finite automaton is:

  • A (finite) set of states with finitely-many transitions between states,
  • Each transition has a single-character label,
  • Each state has an outgoing transition with each label,
  • Some states are "initial", and some states are "accepting".

Here's an example of an automaton:

NFA with hash 0fffc4b1005038f00af6f4ed39df5383

Here, 0 is initial, and 2 is accepting. We can "consume" a string of characters (also referred to as "matching" the string by the DFA) by taking appropriately-labelled transitions between states: in each state we consider the next character of the string, taking a transition labelled with that character, and then considering the next character, and so on. We say a word (sequence of characters) is "accepted" by the automaton if we can consume the string by starting in an initial state, and ending (i.e. when the whole string is consumed) in an accepting state.

For example, the word "xyyx" is accepted by the previous automaton, but "yyy" is not (we can't reach 2, starting at 0).

The "language" of an automaton is the collection of all strings that it accepts. If an initial state is also accepting, then the automaton can accept the empty string, written as ε.

The previous example of an automaton is "deterministic": for each state, and a particular label choice, there is exactly one outgoing transition with that label.

I'll now give some more examples, along with their languages.

A DFA, but has an empty language (no accepting states):

NFA with hash d11a39d0592ccdf4fdac0896b1a0b1e9

Not a DFA - multiple transitions from state 0 with label 'a':

NFA with hash 50ba84e91c215fa557439ad79e6eb6a1

A DFA that has the language of all even-length binary strings:

ε, 00, 01, 10, 11, 0000, 0001, ...

NFA with hash 8d5cc97fb3877cfb0929d233cac28596

NFAs

A non-deterministic finite automaton (NFA), lifts the restriction on transitions: we can have multiple (or zero!) transitions from a single state with the same label. For example:

NFA with hash a0c4c207e61ab1ada4c1ce751987d05c

which has the language of all words beginning with 'a', and finishing with zero or more 'b's, or those beginning with 'a' and ending with zero or more 'c's.

Now, why would we want to use a NFA instead of a DFA? The answer is "compactness": we can use fewer states to recognise the same language. Indeed, NFAs and DFAs can recognise exactly the same languages. The drawback is that the NFA is "harder" to match a string against: since we can have a choice of transitions to take, we must try each, accepting a string if any of our choices succeed (finish in an accepting state)1.

So, to summarise, a DFA is more efficient to "run" on a string, but a NFA is more efficient to represent, a classic time-vs-space tradeoff!

Now the question is, how much more compact is an NFA vs. a DFA. Or, put differently, how much bigger is a DFA compared to an NFA recognising the same language?

A Tricky Case

Consider the language of binary strings that have a '1' at a fixed position within the string (say two characters from the end of the string, obviously forcing the length of the words to be >= 3):

    (0|1)*1(0|1)2

for example, the following strings are in the language:

100, 00111, 10101, 111111

and the following are not:

ε, 000, 11011, 00001

Now, in general, this language is, for any natural number n (which is the minimum length of the words):

    (0|1)*1(0|1)(n - 1)

but we'll stick with the n=3 case, for now. We can easily design an NFA for this language:

NFA with hash 4d306b466f4a91d26cb0dec57205c5d8

Now, what does the DFA that accepts this language look like? To find out, we can perform the powerset construction to "determinise" a NFA. We can easily (but tediously) perform the powerset construction, creating a table to record the possible set of states the NFA can be in (after starting in the set containing the initial state), and check how this set changes as we take transitions with either label:

NFA State Set 0 Label 1 Label DFA State
{0} {0} {0,1} 0
{0,1} {0,2} {0,1,2} 1
{0,2} {0,3} {0,1,3} 2
{0,1,2} {0,2,3} {0,1,2,3} 3
{0,3} {0} {0,1} 4
{0,1,3} {0,2} {0,1,2} 5
{0,2,3} {0,3} {0,1,3} 6
{0,1,2,3} {0,2,3} {0,1,2,3} 7

Now, we draw this as a DFA: the start state is the set containing the initial states and the accepting states are those states whose set contains a accepting state of the original NFA:

NFA with hash d493b6ff5dda16e41c02867cb036f8ff

We've gone from a 4 state NFA to a 8 state DFA, and in general, from a n + 1 state NFA to a 2n state DFA - this is the "exponential blowup" of converting a NFA to DFA.

Now, let's see if we can see why we've had this blowup.

Explaining the blowup

As a first step to explaining the structure of the DFA, let's create a DFA to match a subset of the final language:

NFA with hash 91c56ff2fda8af18fd2533bc2db133b2

The language of this DFA is 0*1(0|1)2 in other words, we can match the desired strings, as long as the first '1' appears 2 characters before the end of the string.

Now, consider what happens if we have more characters to match, when we reach state 4,5,6 or 7; in other words, the '1' we matched when moving between state 0 and 1 wasn't the '1' that was 2 characters before the end of the string.

Call the 1 that seperates the two sections of the word the switch; the initial part is the arbitrary-length leader and the final part is the fixed-length trailer.

The NFA is able to choose a particular '1' to be the switch, by non-deterministically (or indeed, trying one choice, and if it fails, backtracking to try the other choice) treating the '1' as part of the leader, or as the switch. Only once the NFA has made this choice does it match the fixed-length trailer.

A DFA can't have this non-deterministic choice, so every time the DFA sees a '1', it must allow it to be either the switch or part of the leader. This insight shows us the way to understanding the DFA: once we've seen two more characters, we can decide if the '1' was part of the leader, or the switch, depending on whether there are any further characters to consume. Indeed, after seeing these two extra characters, we have 8 choices for what to do next: 2 choices for each 2-character string (one for '0' and one for '1'). These choices are:

2 additional characters 0 action 1 action
00 Go back to the initial state (no '1' in the last 3 chars) Go back to the state where we've just seen the potential switch
01 Go the state reached after seeing "0" after the switch Go to the state reached after seeing "1" after the switch
10 Go to the state reached after seeing "00" after the switch Go to the state reached after seeing "01" after the switch
11 Go the state reached after seeing "10" after the switch Stay in the same state

In other words, each of the 4 choices leads to a different pair of states, thus we must have a different state for each of these 4 choices. If we add in transitions as desribed in the previous table, we obtain the same DFA as was generated by the subset construction.

Now, lets prove that we must have exponentially more states for this particular language.

Why must we have exponentially more states?

Assume that we have a DFA with fewer than 23 states and has language (0|1)*1(0|1)2.

Now, by the pidgeonhole principle, there must be 2 distinct strings in (0|1)3, that lead to the same state of the DFA. Let us call these two strings x and y; since they are different, there must be a position, i (with 0 < i < |x|) in x and y, at which x[i] /= y[i], let us assume2 that x[i] = 1 and y[i] = 0.

Now, let z = 11, clearly, we have z ∈ (0|1)2, so it is a valid trailer. Now, we have that "xz" is in the langauge, since it has the switch and the required two characters and that "yz" is not in the language, since it has a '0' in the switch position, which should be '1'. However, both composite words lead to the same state (we have added a common suffix to two strings that lead to the same state), which is a contradiction. Our only assumption was that we had fewer than 23 states, which must therefore be a bad assumption. Put differently, we must have 23 states in any DFA reconising this tricky language! Indeed, we can easily generalise the argument to any n, not just 3.

And we're done! To sum up:

Summary

  • NFAs and DFAs recognise the same set of languages, the "regular languages",
  • NFAs can be exponentially smaller than DFAs recognising the same language,
  • DFAs, however, are simpler to compute with,
  • The pidgeon-hole principle can be used to prove that we must have this blowup.

  1. An alternative approach is to use backtracking, where note the point at which we made a choice, and if we fail (consume the string without ending in an accepting state), we reverse backwards in the string and retrace our steps to the noted state, and try again, taking the next choice of transition instead, and so on. 

  2. A symmetric argument follows if the values are swapped.