Determinism costs! A NFA with exponentially bigger DFA
Two well known mathematical models of computation are finite automaton, either Nondeterministic or deterministic: NFA and DFA, respectively. Intuitively, whereas NFAs can be "smaller", it is "simpler" to match strings against DFAs. In this post, we investigate a NFA that has an exponentially larger DFA, showing off the "cost of determinism".
First, lets define DFAs and NFAs:
DFAs
A finite automaton is:
 A (finite) set of states with finitelymany transitions between states,
 Each transition has a singlecharacter label,
 Each state has an outgoing transition with each label,
 Some states are "initial", and some states are "accepting".
Here's an example of an automaton:
Here, 0 is initial, and 2 is accepting. We can "consume" a string of characters (also referred to as "matching" the string by the DFA) by taking appropriatelylabelled transitions between states: in each state we consider the next character of the string, taking a transition labelled with that character, and then considering the next character, and so on. We say a word (sequence of characters) is "accepted" by the automaton if we can consume the string by starting in an initial state, and ending (i.e. when the whole string is consumed) in an accepting state.
For example, the word "xyyx" is accepted by the previous automaton, but "yyy" is not (we can't reach 2, starting at 0).
The "language" of an automaton is the collection of all strings that it accepts. If an initial state is also accepting, then the automaton can accept the empty string, written as ε.
The previous example of an automaton is "deterministic": for each state, and a particular label choice, there is exactly one outgoing transition with that label.
I'll now give some more examples, along with their languages.
A DFA, but has an empty language (no accepting states):
Not a DFA  multiple transitions from state 0 with label 'a':
A DFA that has the language of all evenlength binary strings:
ε, 00, 01, 10, 11, 0000, 0001, ...
NFAs
A nondeterministic finite automaton (NFA), lifts the restriction on transitions: we can have multiple (or zero!) transitions from a single state with the same label. For example:
which has the language of all words beginning with 'a', and finishing with zero or more 'b's, or those beginning with 'a' and ending with zero or more 'c's.
Now, why would we want to use a NFA instead of a DFA? The answer is "compactness": we can use fewer states to recognise the same language. Indeed, NFAs and DFAs can recognise exactly the same languages. The drawback is that the NFA is "harder" to match a string against: since we can have a choice of transitions to take, we must try each, accepting a string if any of our choices succeed (finish in an accepting state)^{1}.
So, to summarise, a DFA is more efficient to "run" on a string, but a NFA is more efficient to represent, a classic timevsspace tradeoff!
Now the question is, how much more compact is an NFA vs. a DFA. Or, put differently, how much bigger is a DFA compared to an NFA recognising the same language?
A Tricky Case
Consider the language of binary strings that have a '1' at a fixed position within the string (say two characters from the end of the string, obviously forcing the length of the words to be >= 3):
(01)^{*}1(01)^{2}
for example, the following strings are in the language:
100, 00111, 10101, 111111
and the following are not:
ε, 000, 11011, 00001
Now, in general, this language is, for any natural number n (which is the minimum length of the words):
(01)^{*}1(01)^{(n  1)}
but we'll stick with the n=3 case, for now. We can easily design an NFA for this language:
Now, what does the DFA that accepts this language look like? To find out, we can perform the powerset construction to "determinise" a NFA. We can easily (but tediously) perform the powerset construction, creating a table to record the possible set of states the NFA can be in (after starting in the set containing the initial state), and check how this set changes as we take transitions with either label:
NFA State Set  0 Label  1 Label  DFA State 

{0}  {0}  {0,1}  0 
{0,1}  {0,2}  {0,1,2}  1 
{0,2}  {0,3}  {0,1,3}  2 
{0,1,2}  {0,2,3}  {0,1,2,3}  3 
{0,3}  {0}  {0,1}  4 
{0,1,3}  {0,2}  {0,1,2}  5 
{0,2,3}  {0,3}  {0,1,3}  6 
{0,1,2,3}  {0,2,3}  {0,1,2,3}  7 
Now, we draw this as a DFA: the start state is the set containing the initial states and the accepting states are those states whose set contains a accepting state of the original NFA:
We've gone from a 4 state NFA to a 8 state DFA, and in general, from a n + 1 state NFA to a 2^{n} state DFA  this is the "exponential blowup" of converting a NFA to DFA.
Now, let's see if we can see why we've had this blowup.
Explaining the blowup
As a first step to explaining the structure of the DFA, let's create a DFA to match a subset of the final language:
The language of this DFA is 0^{*}1(01)^{2}
in
other words, we can match the desired strings, as long as the first '1' appears
2 characters before the end of the string.
Now, consider what happens if we have more characters to match, when we reach state 4,5,6 or 7; in other words, the '1' we matched when moving between state 0 and 1 wasn't the '1' that was 2 characters before the end of the string.
Call the 1 that seperates the two sections of the word the switch; the initial part is the arbitrarylength leader and the final part is the fixedlength trailer.
The NFA is able to choose a particular '1' to be the switch, by nondeterministically (or indeed, trying one choice, and if it fails, backtracking to try the other choice) treating the '1' as part of the leader, or as the switch. Only once the NFA has made this choice does it match the fixedlength trailer.
A DFA can't have this nondeterministic choice, so every time the DFA sees a '1', it must allow it to be either the switch or part of the leader. This insight shows us the way to understanding the DFA: once we've seen two more characters, we can decide if the '1' was part of the leader, or the switch, depending on whether there are any further characters to consume. Indeed, after seeing these two extra characters, we have 8 choices for what to do next: 2 choices for each 2character string (one for '0' and one for '1'). These choices are:
2 additional characters  0 action  1 action 

00  Go back to the initial state (no '1' in the last 3 chars)  Go back to the state where we've just seen the potential switch 
01  Go the state reached after seeing "0" after the switch  Go to the state reached after seeing "1" after the switch 
10  Go to the state reached after seeing "00" after the switch  Go to the state reached after seeing "01" after the switch 
11  Go the state reached after seeing "10" after the switch  Stay in the same state 
In other words, each of the 4 choices leads to a different pair of states, thus we must have a different state for each of these 4 choices. If we add in transitions as desribed in the previous table, we obtain the same DFA as was generated by the subset construction.
Now, lets prove that we must have exponentially more states for this particular language.
Why must we have exponentially more states?
Assume that we have a DFA with fewer than 2^{3}
states
and has language (01)^{*}1(01)^{2}
.
Now, by the pidgeonhole principle, there must be 2
distinct strings in (01)^{3}
, that lead to the same
state of the DFA. Let us call these two strings x
and y
; since they are
different, there must be a position, i
(with 0 < i < x
) in x
and
y
, at which x[i] /= y[i]
, let us assume^{2} that x[i] = 1
and y[i] =
0
.
Now, let z = 11
, clearly, we have z ∈
(01)^{2}
, so it is a valid trailer. Now, we have that "xz" is
in the langauge, since it has the switch and the required two characters and
that "yz" is not in the language, since it has a '0' in the switch position,
which should be '1'. However, both composite words lead to the same state (we
have added a common suffix to two strings that lead to the same state), which
is a contradiction. Our only assumption was that we had fewer than
2^{3}
states, which must therefore be a bad assumption.
Put differently, we must have 2^{3}
states in any DFA
reconising this tricky language! Indeed, we can easily generalise the argument
to any n
, not just 3.
And we're done! To sum up:
Summary
 NFAs and DFAs recognise the same set of languages, the "regular languages",
 NFAs can be exponentially smaller than DFAs recognising the same language,
 DFAs, however, are simpler to compute with,
 The pidgeonhole principle can be used to prove that we must have this blowup.

An alternative approach is to use backtracking, where note the point at which we made a choice, and if we fail (consume the string without ending in an accepting state), we reverse backwards in the string and retrace our steps to the noted state, and try again, taking the next choice of transition instead, and so on. ↩

A symmetric argument follows if the values are swapped. ↩