
All of the algorithms we have studied thus far have been **polynomial-time algorithms**: on inputs of size *n*, their worst-case running time is *O*(*n^{k}*) for some constant *k*. It is natural to wonder whether all problems can be solved in polynomial time. The answer is no. For example, there are problems, such as Turing's famous Halting Problem, that cannot be solved by any computer, no matter how much time is provided. There are also problems that can be solved, but not in time *O*(*n^{k}*) for any constant *k*. Generally, we think of problems that are solvable by polynomial-time algorithms as being tractable and of problems that require superpolynomial time as being intractable.

The subject of this chapter, however, is an interesting class of problems, called the 'NP-complete' problems, whose status is unknown. No polynomial-time algorithm has yet been discovered for an NP-complete problem, nor has anyone yet been able to prove a superpolynomial-time lower bound for any of them. This so-called P ≠ NP question has been one of the deepest, most perplexing open research problems in theoretical computer science since it was posed in 1971.

Most theoretical computer
scientists believe that the NP-complete problems are intractable. The reason is
that if any single NP-complete problem can be solved in polynomial time, then *every*
NP-complete problem has a polynomial-time algorithm. Given the wide range of
NP-complete problems that have been studied to date, without any progress
toward a polynomial-time solution, it would be truly astounding if all of them
could be solved in polynomial time.

To become a good algorithm designer, you must understand the rudiments of the theory of NP-completeness. If you can establish a problem as NP-complete, you provide good evidence for its intractability. As an engineer, you would then do better spending your time developing an approximation algorithm (see Chapter 37) rather than searching for a fast algorithm that solves the problem exactly. Moreover, many natural and interesting problems that on the surface seem no harder than sorting, graph searching, or network flow are in fact NP-complete. Thus, it is important to become familiar with this remarkable class of problems.

This chapter studies the aspects of NP-completeness that bear most directly on the analysis of algorithms. In Section 1, we formalize our notion of 'problem' and define the complexity class P of polynomial-time solvable decision problems. We also see how these notions fit into the framework of formal-language theory. Section 2 defines the class NP of decision problems whose solutions can be verified in polynomial time. It also formally poses the P ≠ NP question.

Section 3 shows how relationships between problems can be studied via polynomial-time 'reductions.' It defines NP-completeness and sketches a proof that one problem, called 'circuit satisfiability,' is NP-complete. Having found one NP-complete problem, we show in Section 4 how other problems can be proven to be NP-complete much more simply by the methodology of reductions. The methodology is illustrated by showing that two formula-satisfiability problems are NP-complete. A variety of other problems are shown to be NP-complete in Section 5.

We begin our study of NP-completeness by formalizing our notion of polynomial-time solvable problems. These problems are generally regarded as tractable. The reason why is a philosophical, not a mathematical, issue. We can offer three supporting arguments.

First, although it is
reasonable to regard a problem that requires time Θ(*n*^{100})
as intractable, there are very few practical problems that require time on the
order of such a high-degree polynomial. The polynomial-time computable problems
encountered in practice typically require much less time.

Second, for many
reasonable models of computation, a problem that can be solved in polynomial
time in one model can be solved in polynomial time in another. For example, the
class of problems solvable in polynomial time by the serial random-access
machine used throughout most of this book is the same as the class of problems
solvable in polynomial time on abstract Turing machines.^{1 }It is also
the same as the class of problems solvable in polynomial time on a parallel
computer, even if the number of processors grows polynomially with the input
size.

See Hopcroft and Ullman [104] or Lewis and Papadimitriou [139] for a thorough treatment of the Turing-machine model.

Third, the class of polynomial-time solvable problems has nice closure properties, since polynomials are closed under addition, multiplication, and composition. For example, if the output of one polynomial-time algorithm is fed into the input of another, the composite algorithm is polynomial. If an otherwise polynomial-time algorithm makes a constant number of calls to polynomial-time subroutines, the running time of the composite algorithm is polynomial.

To understand the class of
polynomial-time solvable problems, we must first have a formal notion of what a
'problem' is. We define an **abstract problem** *Q* to
be a binary relation on a set *I* of problem **instances** and a
set *S* of problem **solutions**.

This formulation of an abstract problem is more
general than is required for our purposes. For simplicity, the theory of
NP-completeness restricts attention to **decision problems**: those
having a yes/no solution. In this case, we can view an abstract decision
problem as a function that maps the instance set *I* to the solution set {0, 1}.

Many abstract problems are not decision problems, but
rather **optimization problems**, in which some value must be
minimized or maximized. In order to apply the theory of NP-completeness to
optimization problems, we must recast them as decision problems. Typically, an
optimization problem can be recast by imposing a bound on the value to be
optimized. As an example, in recasting the shortest-path problem as the
decision problem PATH, we added a bound *k* to the problem instance.

Although the theory of NP-completeness compels us to recast optimization problems as decision problems, this requirement does not diminish the impact of the theory. In general, if we can solve an optimization problem quickly, we can also solve its related decision problem quickly. We simply compare the value obtained from the solution of the optimization problem with the bound provided as input to the decision problem. If an optimization problem is easy, therefore, its related decision problem is easy as well. Stated in a way that has more relevance to NP-completeness, if we can provide evidence that a decision problem is hard, we also provide evidence that its related optimization problem is hard. Thus, even though it restricts attention to decision problems, the theory of NP-completeness applies much more widely.

If a computer program is to solve an
abstract problem, problem instances must be represented in a way that the
program understands. An **encoding** of a set *S* of abstract objects is a mapping *e* from *S* to the set of binary strings.

The codomain of *e* need not be *binary* strings; any set of
strings over a finite alphabet having at least 2 symbols will do.

Thus, a computer algorithm that 'solves' some abstract
decision problem actually takes an encoding of a problem instance as input. We
call a problem whose instance set is the set of binary strings a **concrete
problem**. We say that an algorithm **solves** a concrete problem in time *O*(*T*(*n*)) if, when it is provided a problem instance *i* of length *n* = |*i*|, the algorithm can produce the solution in at most *O*(*T*(*n*)) time. A concrete problem is **polynomial-time solvable**, therefore, if there exists an algorithm to solve it in time *O*(*n^{k}*) for some constant *k*.

We can now formally define the **complexity class
P** as the set of concrete decision problems that are solvable in polynomial
time.

We can use encodings to
map abstract problems to concrete problems. Given an abstract decision problem *Q*
mapping an instance set *I* to {0, 1}, an encoding *e* : *I* → {0, 1}^{*}
can be used to induce a related concrete decision problem, which we denote by *e*(*Q*).
If the solution to an abstract-problem instance *i* ∈ *I* is *Q*(*i*) ∈ {0, 1},
then the solution to the concrete-problem instance *e*(*i*) ∈ {0, 1}^{*} is
also *Q*(*i*). As a technicality, there may be some binary strings
that represent no meaningful abstract-problem instance. For convenience, we
shall assume that any such string is mapped arbitrarily to 0. Thus, the
concrete problem produces the same solutions as the abstract problem on
binary-string instances that represent the encodings of abstract-problem
instances.

We would like to extend the definition of
polynomial-time solvability from concrete problems to abstract problems using
encodings as the bridge, but we would like the definition to be independent of
any particular encoding. That is, the efficiency of solving a problem should
not depend on how the problem is encoded. Unfortunately, it depends quite
heavily. For example, suppose that an integer *k* is to be provided as the
sole input to an algorithm, and suppose that the running time of the algorithm
is (*k*). If
the integer *k* is provided in ** unary**--a string of

The encoding of an abstract problem is therefore quite important to our understanding of polynomial time. We cannot really talk about solving an abstract problem without first specifying an encoding. Nevertheless, in practice, if we rule out 'expensive' encodings such as unary ones, the actual encoding of a problem makes little difference to whether the problem can be solved in polynomial time. For example, representing integers in base 3 instead of binary has no effect on whether a problem is solvable in polynomial time, since an integer represented in base 3 can be converted to an integer represented in base 2 in polynomial time.
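The base-conversion claim can be made concrete with a short sketch (ours, not from the text): converting a base-3 numeral to a binary numeral takes only a polynomial number of steps in the length of the input string, so the two encodings are polynomially related.

```python
# Sketch: convert a base-3 representation to binary in time polynomial
# in the length of the input string, so the choice between the two
# encodings cannot affect polynomial-time solvability.
def base3_to_binary(s):
    """Convert a base-3 numeral string (e.g. '210') to its binary numeral."""
    value = 0
    for digit in s:          # Horner's rule: O(len(s)) arithmetic steps
        value = value * 3 + int(digit)
    return bin(value)[2:]    # strip Python's '0b' prefix
```

For example, the base-3 string '210' denotes 2·9 + 1·3 + 0 = 21, whose binary numeral is '10101'.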

We say that a function *f* : {0, 1}^{*} → {0, 1}^{*}
is **polynomial-time computable** if there exists a polynomial-time
algorithm *A* that, given any input *x* ∈ {0, 1}^{*}, produces as output *f*(*x*). For some set *I* of problem instances, we say that two encodings *e*_{1} and *e*_{2} are **polynomially related** if there exist two polynomial-time computable functions *f*_{12} and *f*_{21} such that for any *i* ∈ *I*, we have *f*_{12}(*e*_{1}(*i*)) = *e*_{2}(*i*) and *f*_{21}(*e*_{2}(*i*)) = *e*_{1}(*i*). That is, either encoding can be computed from the other in polynomial time.

Lemma 1

Let *Q* be an
abstract decision problem on an instance set *I*, and let *e*_{1}
and *e*_{2} be polynomially related encodings on *I*. Then, *e*_{1}(*Q*)
∈ P if and only if *e*_{2}(*Q*) ∈ P.

**Proof** We need only prove the forward direction,
since the backward direction is symmetric. Suppose, therefore, that *e*_{1}(*Q*)
can be solved in time *O*(*n^{k}*) for some constant *k*, and that for any problem instance *i*, the encoding *e*_{1}(*i*) can be computed from the encoding *e*_{2}(*i*) in time *O*(*n^{c}*) for some constant *c*, where *n* = |*e*_{2}(*i*)|. To solve problem *e*_{2}(*Q*) on input *e*_{2}(*i*), we first compute *e*_{1}(*i*) and then run the algorithm for *e*_{1}(*Q*) on *e*_{1}(*i*). Each of these two steps takes polynomial time, and |*e*_{1}(*i*)| = *O*(*n^{c}*), so the composed computation takes polynomial time as well, since a polynomial of a polynomial is a polynomial. Thus, *e*_{2}(*Q*) ∈ P.

Thus, whether an abstract
problem has its instances encoded in binary or base 3 does not affect its
'complexity,' that is, whether it is polynomial-time solvable or not,
but if instances are encoded in unary, its complexity may change. In order to
be able to converse in an encoding-independent fashion, we shall generally
assume that problem instances are encoded in any reasonable, concise fashion,
unless we specifically say otherwise. To be precise, we shall assume that the
encoding of an integer is polynomially related to its binary representation,
and that the encoding of a finite set is polynomially related to its encoding
as a list of its elements, enclosed in braces and separated by commas. (ASCII
is one such encoding scheme.) With such a 'standard' encoding in
hand, we can derive reasonable encodings of other mathematical objects, such as
tuples, graphs, and formulas. To denote the standard encoding of an object, we
shall enclose the object in angle braces. Thus, ⟨*G*⟩ denotes the
standard encoding of a graph *G*.

As long as we implicitly use an encoding that is polynomially related to this standard encoding, we can talk directly about abstract problems without reference to any particular encoding, knowing that the choice of encoding has no effect on whether the abstract problem is polynomial-time solvable. Henceforth, we shall generally assume that all problem instances are binary strings encoded using the standard encoding, unless we explicitly specify the contrary. We shall also typically neglect the distinction between abstract and concrete problems. The reader should watch out for problems that arise in practice, however, in which a standard encoding is not obvious and the encoding does make a difference.

One of
the convenient aspects of focusing on decision problems is that they make it
easy to use the machinery of formal-language theory. It is worthwhile at this
point to review some definitions from that theory. An **alphabet** Σ is a finite
set of symbols. A **language** *L* over Σ is any set of strings made up of symbols from Σ. We denote the empty string by ε, the empty language by Ø, and the language of all strings over Σ by Σ^{*}.

There
are a variety of operations on languages. Set-theoretic operations, such as **union**
and **intersection**, follow directly from the set-theoretic definitions. We define the **complement** of *L* by Σ^{*} − *L*. The **concatenation** of two languages *L*_{1} and *L*_{2} is the language *L* = {*x*_{1}*x*_{2} : *x*_{1} ∈ *L*_{1} and *x*_{2} ∈ *L*_{2}}.

The **closure** or **Kleene star** of a language *L* is the language

*L*^{*} = {ε} ∪ *L* ∪ *L*^{2} ∪ *L*^{3} ∪ ···,

where *L^{k}* is the language obtained by concatenating *L* to itself *k* times.

From the point of view of
language theory, the set of instances for any decision problem *Q* is
simply the set Σ^{*}, where Σ = {0, 1}.
Since *Q* is entirely characterized by those problem instances that
produce a 1 (yes) answer, we can view *Q* as a language *L* over Σ = {0, 1},
where *L* = {*x* ∈ Σ^{*} : *Q*(*x*) = 1}.

For example, the decision problem PATH has the corresponding language

PATH = {⟨*G*, *u*, *v*, *k*⟩ : *G* = (*V*, *E*) is an undirected graph, *u*, *v* ∈ *V*, *k* ≥ 0 is an integer, and there exists a path from *u* to *v* in *G* whose length is at most *k*}.

(Where convenient, we shall sometimes use the same name--PATH in this case--to refer to both a decision problem and its corresponding language.)

Even if language *L* is accepted by an algorithm *A*,
the algorithm will not necessarily reject a string *x* ∉ *L*
provided as input to it. For example, the algorithm may loop forever. A
language *L* is **decided** by an algorithm *A* if every binary string in *L* is accepted by *A* and every binary string not in *L* is rejected by *A*. A language *L* is **accepted in polynomial time** by an algorithm *A* if for any length-*n* string *x* ∈ *L*, the algorithm accepts *x* in time *O*(*n^{k}*) for some constant *k*. A language *L* is **decided in polynomial time** by an algorithm *A* if for any length-*n* string *x* ∈ {0, 1}^{*}, the algorithm correctly decides whether *x* ∈ *L* in time *O*(*n^{k}*) for some constant *k*.

As an example, the
language PATH can be accepted in polynomial time. One polynomial-time accepting
algorithm computes the shortest path from *u* to *v* in *G*,
using breadth-first search, and then compares the distance obtained with *k*.
If the distance is at most *k*, the algorithm outputs 1 and halts.
Otherwise, the algorithm runs forever. This algorithm does not decide PATH,
however, since it does not explicitly output 0 for instances in which the
shortest path has length greater than *k*. A decision algorithm for PATH
must explicitly reject binary strings that do not belong to PATH. For a
decision problem such as PATH, such a decision algorithm is easy to design. For
other problems, such as Turing's Halting Problem, there exists an accepting
algorithm, but no decision algorithm exists.

We can
informally define a **complexity class** as a set of languages,
membership in which is determined by a **complexity measure**, such as running time, on an algorithm that determines whether a given string *x* belongs to language *L*.

Using this language-theoretic framework, we can provide an alternative definition of the complexity class P:

P = {*L* ⊆ {0, 1}^{*} : there exists an algorithm *A* that decides *L* in polynomial time}.

In fact, P is also the class of languages that can be accepted in polynomial time.

Theorem 2

P = {*L* : *L* is accepted by a polynomial-time algorithm}.

**Proof** Since the class of languages decided by polynomial-time algorithms is
a subset of the class of languages accepted by polynomial-time algorithms, we
need only show that if *L* is accepted by a polynomial-time algorithm, then *L* is decided by a polynomial-time algorithm. Let *L* be accepted by an algorithm *A* that accepts each *x* ∈ *L* in at most *cn^{k}* steps for some constants *c* and *k*. A deciding algorithm *A*' simulates *A* for *cn^{k}* steps; if *A* has accepted by then, *A*' accepts, and otherwise *A*' rejects.

Note that the proof of
Theorem 2 is nonconstructive. For a given language *L* ∈ P, we may not
actually know a bound on the running time for the algorithm *A* that
accepts *L*. Nevertheless, we know that such a bound exists, and
therefore, that an algorithm *A*' exists that can check the bound, even
though we may not be able to find the algorithm *A*' easily.

Define the optimization problem LONGEST-PATH-LENGTH as the relation that associates each instance of an undirected graph and two vertices with the length of the longest simple path between the two vertices. Define the decision problem LONGEST-PATH = {⟨*G*, *u*, *v*, *k*⟩ : *G* = (*V*, *E*) is an undirected graph, *u*, *v* ∈ *V*, *k* ≥ 0 is an integer, and there exists a simple path from *u* to *v* in *G* consisting of at least *k* edges}. Show that the optimization problem LONGEST-PATH-LENGTH can be solved in polynomial time if and only if LONGEST-PATH ∈ P.

Give a formal definition for the problem of finding the longest simple cycle in an undirected graph. Give a related decision problem. Give the language corresponding to the decision problem.

Give a formal encoding of directed graphs as binary strings using an adjacency-matrix representation. Do the same using an adjacency-list representation. Argue that the two representations are polynomially related.

Is the dynamic-programming algorithm for the 0-1 knapsack problem that is asked for in Exercise 17.2-2 a polynomial-time algorithm? Explain your answer.

Suppose that an algorithm accepts each string *x* ∈ *L* in
polynomial time, but that the algorithm runs in superpolynomial
time if *x* ∉ *L*. Argue
that *L* can be decided in polynomial time.

Show that an algorithm that makes at most a constant number of calls to polynomial-time subroutines runs in polynomial time, but that a polynomial number of calls to polynomial-time subroutines may result in an exponential-time algorithm.

Show that the class P,
viewed as a set of languages, is closed under union, intersection,
concatenation, complement, and Kleene star. That is, if *L*_{1}, *L*_{2}
∈ P, then *L*_{1} ∪ *L*_{2} ∈ P, etc.

We now look at algorithms that 'verify'
membership in languages. For example, suppose that for a given instance ⟨*G*, *u*, *v*, *k*⟩ of the
decision problem PATH, we are also given a path *p* from *u* to *v*.
We can easily check whether the length of *p* is at most *k*, and if
so, we can view *p* as a 'certificate' that the instance indeed
belongs to PATH. For the decision problem PATH, this certificate doesn't seem
to buy us much. After all, PATH belongs to P--in fact, PATH can be solved in
linear time--and so verifying membership from a given certificate takes as long
as solving the problem from scratch. We shall now examine a problem for which
we know of no polynomial-time decision algorithm but for which, given a certificate,
verification is easy.

The problem of finding a
hamiltonian cycle in an undirected graph has been studied for over a hundred
years. Formally, a **hamiltonian cycle** of an undirected graph *G* = (*V*, *E*) is a simple cycle that contains each vertex in *V*. A graph that contains a hamiltonian cycle is said to be **hamiltonian**; otherwise, it is **nonhamiltonian**.

We can define the **hamiltonian-cycle
problem**, 'Does a graph *G* have a hamiltonian cycle?' as a formal language: HAM-CYCLE = {⟨*G*⟩ : *G* is a hamiltonian graph}.

How might an algorithm
decide the language HAM-CYCLE? Given a problem instance ⟨*G*⟩, one possible
decision algorithm lists all permutations of the vertices of *G* and then
checks each permutation to see if it is a hamiltonian cycle. What is the running
time of this algorithm? If we use the 'reasonable' encoding of a
graph as its adjacency matrix, the number *m* of vertices in the graph is Ω(√*n*), where *n*
= |⟨*G*⟩| is the
length of the encoding of *G*. There are *m*! possible permutations
of the vertices, and therefore the running time is Ω(*m*!) = Ω(2^{√*n*}), which is not
*O*(*n^{k}*) for any constant *k*. Thus, this naive algorithm does not run in polynomial time.

Consider a slightly
easier problem, however. Suppose that a friend tells you that a given graph *G*
is hamiltonian, and then offers to prove it by giving you the vertices in order
along the hamiltonian cycle. It would certainly be easy enough to verify the
proof: simply verify that the provided cycle is hamiltonian by checking whether
it is a permutation of the vertices of *V* and whether each of the
consecutive edges along the cycle actually exists in the graph. This
verification algorithm can certainly be implemented to run in *O*(*n*^{2})
time, where *n* is the length of the encoding of *G*.

We define a **verification algorithm** as a two-argument algorithm *A*, where one argument is an ordinary input string *x* and the other is a binary string *y* called a **certificate**. A two-argument algorithm *A* **verifies** an input string *x* if there exists a certificate *y* such that *A*(*x*, *y*) = 1. The **language verified** by a verification algorithm *A* is *L* = {*x* ∈ {0, 1}^{*} : there exists *y* ∈ {0, 1}^{*} such that *A*(*x*, *y*) = 1}.
Intuitively, an algorithm
*A* verifies a language *L* if for any string *x* ∈ *L*,
there is a certificate *y* that *A* can use to prove that *x* ∈ *L*.
Moreover, for any string *x* ∉ *L*, there
must be no certificate proving that *x* ∈ *L*. For
example, in the hamiltonian-cycle problem, the certificate is the list of
vertices in the hamiltonian cycle. If a graph is hamiltonian, the hamiltonian
cycle itself offers enough information to verify this fact. Conversely, if a
graph is not hamiltonian, there is no list of vertices that can fool the
verification algorithm into believing that the graph is hamiltonian, since the
verification algorithm carefully checks the proposed 'cycle' to be
sure.

The **complexity class NP** is the class of
languages that can be verified by a polynomial-time algorithm. More precisely, a language *L* belongs to NP if and only if there exist a two-input polynomial-time algorithm *A* and a constant *c* such that

*L* = {*x* ∈ {0, 1}^{*} : there exists a certificate *y* with |*y*| = *O*(|*x*|^{*c*}) such that *A*(*x*, *y*) = 1}.

We say that algorithm *A*
**verifies** language *L* in polynomial time.

The name
'NP' stands for 'nondeterministic polynomial time.' The
class NP was originally studied in the context of nondeterminism, but this book
uses the somewhat simpler yet equivalent notion of verification. Hopcroft and
Ullman [104] give a good presentation of NP-completeness in terms of
nondeterministic models of computation.

From our earlier
discussion on the hamiltonian-cycle problem, it follows that HAM-CYCLE ∈ NP. (It is
always nice to know that an important set is nonempty.) Moreover, if *L* ∈ P, then *L*
∈ NP, since if
there is a polynomial-time algorithm to decide *L*, the algorithm can be
easily converted to a two-argument verification algorithm that simply ignores
any certificate and accepts exactly those input strings it determines to be in *L*.
Thus, P ⊆ NP.

It is unknown whether P = NP, but most researchers believe that P and NP are not the same class. Intuitively, the class P consists of problems that can be solved quickly. The class NP consists of problems for which a solution can be verified quickly. You may have learned from experience that it is often more difficult to solve a problem from scratch than to verify a clearly presented solution, especially when working under time constraints. Theoretical computer scientists generally believe that this analogy extends to the classes P and NP, and thus that NP includes languages that are not in P.

There is more compelling evidence that P ≠ NP--the existence of 'NP-complete' languages. We shall study this class in Section 3.

Many other fundamental questions beyond the P ≠ NP question
remain unresolved. Despite much work by many researchers, no one even knows if
the class NP is closed under complement. That is, does *L* ∈ NP imply that the complement of *L* is in NP? We can define
the **complexity class co-NP** as the set of languages *L* whose complement is in NP. The question of whether NP is closed under complement can then be rephrased as whether NP = co-NP.
Thus, our understanding of the precise relationship between P and NP is woefully incomplete. Nevertheless, by exploring the theory of NP-completeness, we shall find that our disadvantage in proving problems to be intractable is, from a practical point of view, not nearly so great as we might suppose.

Consider the language GRAPH-ISOMORPHISM = {⟨*G*_{1}, *G*_{2}⟩ : *G*_{1} and *G*_{2} are isomorphic graphs}. Prove that GRAPH-ISOMORPHISM ∈ NP by describing a polynomial-time algorithm to verify the language.

Prove that if *G* is
an undirected bipartite graph with an odd number of vertices, then *G* is
nonhamiltonian.

Show that if HAM-CYCLE ∈ P, then the problem of listing the vertices of a hamiltonian cycle, in order, is polynomial-time solvable.

Prove that the class NP of languages is closed under union, intersection, concatenation, and Kleene star. Discuss the closure of NP under complement.

Show that any language in
NP can be decided by an algorithm running in time 2^{*O*(*n^{k}*)}
for some constant *k*.

A **hamiltonian path** in a graph is a simple path that
visits every vertex exactly once. Show that the language HAM-PATH = {⟨*G*, *u*, *v*⟩ : there is a hamiltonian path from *u* to *v* in graph *G*} belongs to NP.

Show that the hamiltonian-path problem can be solved in polynomial time on directed acyclic graphs. Give an efficient algorithm for the problem.

Let φ be a boolean
formula constructed from the boolean input variables *x*_{1}, *x*_{2},
. . . , *x _{k}*, negations (¬), ANDs (∧), ORs (∨), and
parentheses. The formula φ is a **tautology** if it evaluates to 1 for every assignment of 1's and 0's to the input variables. Define TAUTOLOGY as the language of boolean formulas that are tautologies. Show that TAUTOLOGY ∈ co-NP.

Prove that P ⊆ co-NP.

Prove that if NP ≠ co-NP, then P ≠ NP.

Let *G* be a
connected, undirected graph with at least 3 vertices, and let *G*^{3 }be
the graph obtained by connecting all pairs of vertices that are connected by a
path in *G* of length at most 3. Prove that *G*^{3} is
hamiltonian. (*Hint*: Construct a spanning tree for *G*, and use an
inductive argument.)

Perhaps the most
compelling reason why theoretical computer scientists believe that P ≠ NP is the
existence of the class of 'NP-complete' problems. This class has the
surprising property that if any *one* NP-complete problem can be solved in
polynomial time, then *every* problem in NP has a polynomial-time
solution, that is, P = NP. Despite years of study, though, no polynomial-time
algorithm has ever been discovered for any NP-complete problem.

The language HAM-CYCLE is one NP-complete problem. If we could decide HAM-CYCLE in polynomial time, then we could solve every problem in NP in polynomial time. In fact, if NP − P should turn out to be nonempty, we could say with certainty that HAM-CYCLE ∈ NP − P.

The NP-complete languages are, in a sense, the 'hardest' languages in NP. In this section, we shall show how to compare the relative 'hardness' of languages using a precise notion called 'polynomial-time reducibility.' First, we formally define the NP-complete languages, and then we sketch a proof that one such language, called CIRCUIT-SAT, is NP-complete. In Section 5, we shall use the notion of reducibility to show that many other problems are NP-complete.

Intuitively, a problem *Q* can be
reduced to another problem *Q*' if any instance of *Q* can be
'easily rephrased' as an instance of *Q*', the solution to which
provides a solution to the instance of *Q*. For example, the problem of
solving linear equations in an indeterminate *x* reduces to the problem of
solving quadratic equations. Given an instance *ax* + *b* = 0, we
transform it to 0*x*^{2} + *ax* + *b* = 0, whose
solution provides a solution to *ax* + *b* = 0. Thus, if a problem *Q*
reduces to another problem *Q*', then *Q* is, in a sense, 'no
harder to solve' than *Q*'.

Returning to our formal-language framework for
decision problems, we say that a language *L*_{1} is **polynomial-time
reducible** to a language *L*_{2}, written *L*_{1} ≤_{p} *L*_{2}, if there exists a polynomial-time computable function *f* : {0, 1}^{*} → {0, 1}^{*} such that for all *x* ∈ {0, 1}^{*},

*x* ∈ *L*_{1} if and only if *f*(*x*) ∈ *L*_{2}.      (1)

We call the function *f* the **reduction
function**, and a polynomial-time algorithm *F* that computes *f* is called a **reduction algorithm**.

Figure 3 illustrates the
idea of a polynomial-time reduction from a language *L*_{1} to
another language *L*_{2}. Each language is a subset of {0, 1}^{*}.
The reduction function *f* provides a polynomial-time mapping such that if
*x* ∈ *L*_{1},
then *f*(*x*) ∈ *L*_{2}.
Moreover, if *x* ∉ *L*_{1},
then *f*(*x*) ∉ *L*_{2}.
Thus, the reduction function maps any instance *x* of the decision problem
represented by the language *L*_{1} to an instance *f*(*x*)
of the problem represented by *L*_{2}. Providing an answer to
whether *f*(*x*) ∈ *L*_{2}
directly provides the answer to whether *x* ∈ *L*_{1}.

Polynomial-time reductions give us a powerful tool for proving that various languages belong to P.

Lemma 3

If *L*_{1}, *L*_{2} ⊆ {0, 1}^{*}
are languages such that *L*_{1} ≤_{p} *L*_{2},
then *L*_{2} ∈ P implies *L*_{1} ∈ P.

**Proof** Let *A*_{2} be a polynomial-time algorithm that decides *L*_{2}, and let *F* be a polynomial-time reduction algorithm that computes the reduction function *f*. We shall construct a polynomial-time algorithm *A*_{1} that decides *L*_{1}.
Figure 4 illustrates the
construction of *A*_{1}. For a given input *x* ∈ {0, 1}^{*}, the
algorithm *A*_{1} uses *F* to transform *x* into *f*(*x*),
and then it uses *A*_{2} to test whether *f*(*x*) ∈ *L*_{2}.
The output of *A*_{2} is the value provided as the output from *A*_{1}.

The correctness of *A*_{1}
follows from condition (1). The algorithm runs in polynomial time, since both *F*
and *A*_{2} run in polynomial time (see Exercise 1-6).

Polynomial-time reductions provide a
formal means for showing that one problem is at least as hard as another, to
within a polynomial-time factor. That is, if *L*_{1} ≤_{p} *L*_{2},
then *L*_{1} is not more than a polynomial factor harder than *L*_{2},
which is why the 'less than or equal to' notation for reduction is
mnemonic. We can now define the set of NP-complete languages, which are the
hardest problems in NP.

A language *L* ⊆ {0, 1}^{*}
is **NP-complete** if

1. *L* ∈ NP, and

2. *L*' ≤_{p} *L* for every *L*' ∈ NP.

If a language *L* satisfies
property 2, but not necessarily property 1, we say that *L* is ** NP-hard**.
We also define NPC to be the class of NP-complete languages.

As the following theorem shows, NP-completeness is at the crux of deciding whether P is in fact equal to NP.

Theorem 4

If any NP-complete problem is polynomial-time solvable, then P = NP. If any problem in NP is not polynomial-time solvable, then all NP-complete problems are not polynomial-time solvable.

**Proof** Suppose that *L* ∈ P and also that *L* ∈ NPC. For any *L*' ∈ NP, we then have *L*' ≤_{p} *L* by property 2 of the definition of NP-completeness. Thus, by Lemma 3, we also have *L*' ∈ P, which proves the first statement of the theorem.
To prove the second
statement, suppose that there exists an *L* ∈ NP such that *L* ∉ P. Let *L*'
∈ NPC be any
NP-complete language, and for the purpose of contradiction, assume that *L*'
∈ P. But then,
by Lemma 3, we have *L* ≤_{p} *L*',
and thus *L* ∈ P, which contradicts the assumption that *L* ∉ P.

It is for this reason that research into the P ≠ NP question centers around the NP-complete problems. Most theoretical computer scientists believe that P ≠ NP, which leads to the relationships among P, NP, and NPC shown in Figure 5. But for all we know, someone may come up with a polynomial-time algorithm for an NP-complete problem, thus proving that P = NP. Nevertheless, since no polynomial-time algorithm for any NP-complete problem has yet been discovered, a proof that a problem is NP-complete provides excellent evidence for its intractability.

We have defined the notion of an NP-complete problem, but up to this point, we have not actually proved that any problem is NP-complete. Once we prove that at least one problem is NP-complete, we can use polynomial-time reducibility as a tool to prove the NP-completeness of other problems. Thus, we now focus on demonstrating the existence of an NP-complete problem: the circuit-satisfiability problem.

Unfortunately, the formal proof that the circuit-satisfiability problem is NP-complete requires technical detail beyond the scope of this text. Instead, we shall informally describe a proof that relies on a basic understanding of boolean combinational circuits. This material is reviewed at the beginning of Chapter 29.

Figure 6 shows two boolean combinational circuits, each with three
inputs and a single output. A **truth assignment** for a boolean
combinational circuit is a set of boolean input values. We say that a
one-output boolean combinational circuit is **satisfiable** if it has a **satisfying assignment**: a truth assignment that causes the output of the circuit to be 1.

The **circuit-satisfiability
problem** is, 'Given a boolean combinational circuit composed of
AND, OR, and NOT gates, is it satisfiable?' In order to pose this question
formally, however, we must agree on a standard encoding for circuits. One can
devise a graphlike encoding that maps any given circuit *C* into a binary string ⟨*C*⟩ whose length is polynomial in the size of the circuit itself. As a formal language, we can therefore define

CIRCUIT-SAT = {⟨*C*⟩ : *C* is a satisfiable boolean combinational circuit}.

The circuit-satisfiability problem has great importance in the area of computer-aided hardware optimization. If a circuit always produces 0, it can be replaced by a simpler circuit that omits all logic gates and provides the constant 0 value as its output. A polynomial-time algorithm for the problem would have considerable practical application.

Given a circuit *C*,
we might attempt to determine whether it is satisfiable by simply checking all
possible assignments to the inputs. Unfortunately, if there are *k*
inputs, there are 2^{*k*} possible assignments. When the size of *C* is polynomial in *k*, checking each one leads to a superpolynomial-time algorithm.
Lemma 5

The circuit-satisfiability problem belongs to the class NP.

**Proof** We shall provide a two-input, polynomial-time algorithm *A* that can verify CIRCUIT-SAT. One of the inputs to *A* is (a standard encoding of) a boolean combinational circuit *C*. The other is a certificate corresponding to an assignment of boolean values to the wires in *C*.

The algorithm *A* is
constructed as follows. For each logic gate in the circuit, it checks that the
value provided by the certificate on the output wire is correctly computed as a
function of the values on the input wires. Then, if the output of the entire
circuit is 1, the algorithm outputs 1, since the values assigned to the inputs
of *C* provide a satisfying assignment. Otherwise, *A* outputs 0.

Whenever a satisfiable
circuit *C* is input to algorithm *A*, there is a certificate whose
length is polynomial in the size of *C* and that causes *A* to output
a 1. Whenever an unsatisfiable circuit is input, no certificate can fool *A *into
believing that the circuit is satisfiable. Algorithm *A* runs in
polynomial time: with a good implementation, linear time suffices. Thus,
CIRCUIT-SAT can be verified in polynomial time, and CIRCUIT-SAT ∈ NP.

The second part of proving that CIRCUIT-SAT is NP-complete is to show that the language is NP-hard. That is, we must show that every language in NP is polynomial-time reducible to CIRCUIT-SAT. The actual proof of this fact is full of technical intricacies, and so we shall settle for a sketch of the proof based on some understanding of the workings of computer hardware.

A computer program is
stored in the computer memory as a sequence of instructions. A typical
instruction encodes an operation to be performed, addresses of operands in
memory, and an address where the result is to be stored. A special memory
location, called the ** program counter**, keeps track of which
instruction is to be executed next. The program counter is automatically
incremented whenever an instruction is fetched, thereby causing the computer to
execute instructions sequentially. The execution of an instruction can cause a
value to be written to the program counter, however, and then the normal
sequential execution can be altered, allowing the computer to loop and perform
conditional branches.

At any point during the
execution of a program, the entire state of the computation is represented in
the computer's memory. (We take the memory to include the program itself, the
program counter, working storage, and any of the various bits of state that a
computer maintains for bookkeeping.) We call any particular state of computer
memory a ** configuration**. The execution of an instruction can be
viewed as mapping one configuration to another. Importantly, the computer
hardware that accomplishes this mapping can be implemented as a boolean
combinational circuit, which we denote by *M*.

Lemma 6

The circuit-satisfiability problem is NP-hard.

** Proof** Let *L* be any language in NP. We shall describe a polynomial-time algorithm *F* computing a reduction function *f* that maps every binary string *x* to a circuit *C* = *f*(*x*) such that *x* ∈ *L* if and only if *C* ∈ CIRCUIT-SAT.
Since *L* ∈ NP, there must exist an algorithm *A* that verifies *L* in polynomial time. The algorithm *F* that we shall construct will use the two-input algorithm *A* to compute the reduction function *f*.

Let *T*(*n*) denote the worst-case running time of algorithm *A* on length-*n* input strings, and let *k* ≥ 1 be a constant such that *T*(*n*) = *O*(*n^{k}*) and the length of the certificate is *O*(*n^{k}*).

The basic idea of the proof is to represent the computation of *A* as a sequence of configurations. As shown in Figure 7, each configuration can be broken into parts consisting of the program for *A*, the program counter and auxiliary machine state, the input *x*, the certificate *y*, and working storage. Starting with an initial configuration *c*_{0}, each configuration *c_{i}* is mapped to the subsequent configuration *c*_{*i*+1} by the combinational circuit *M* implementing the computer hardware.
The reduction algorithm *F*
constructs a single combinational circuit that computes all configurations
produced by a given initial configuration. The idea is to paste together *T*(*n*)
copies of the circuit *M*. The output of the *i*th circuit, which produces configuration *c_{i}*, is fed directly into the input of the (*i* + 1)st circuit.

Recall what the
polynomial-time reduction algorithm *F* must do. Given an input *x*,
it must compute a circuit *C = a*(*x*) that is satisfiable if and
only if there exists a certificate *y* such that *A*(*x*,* y*)
= 1. When *F *obtains an input *x*, it first computes *n = |x|*
and constructs a combinational circuit *C'* consisting of *T*(*n*)
copies of *M*. The input to *C'* is an initial configuration
corresponding to a computation on *A*(*x, y*), and the output is the configuration *c*_{*T*(*n*)} reached at the end of the computation.

The circuit *C* = *f*(*x*) that *F* computes is obtained by modifying *C*' slightly. First, the inputs to *C*' corresponding to the program for *A*, the initial program counter, the input *x*, and the initial state of memory are wired directly to these known values. Thus, the only remaining inputs to the circuit correspond to the certificate *y*. Second, all outputs of the circuit are ignored, except the one bit of *c*_{*T*(*n*)} corresponding to the output of *A*.

Two properties remain to be proved. First, we must show that *F* correctly computes a reduction function *f*. That is, we must show that *C* is satisfiable if and only if there exists a certificate *y* such that *A*(*x, y*) = 1. Second, we must show that *F* runs in polynomial time.

To show that *F* correctly computes a reduction function, let us suppose that there exists a certificate *y* of length *O*(*n^{k}*) such that *A*(*x, y*) = 1. Then, applying the bits of *y* to the certificate inputs of *C* causes the circuit to simulate the computation of *A* on (*x, y*), and so *C* outputs 1: the circuit is satisfiable. Conversely, if some assignment to the certificate inputs satisfies *C*, then the circuit simulates a computation of *A* that outputs 1, and so there exists a certificate *y* such that *A*(*x, y*) = 1.
To complete the proof, we need only show that *F* runs in time polynomial in *n* = |*x*|. The first observation we make is that the number of bits required to represent a configuration is polynomial in *n*. The program for *A* itself has constant size, independent of the length of its input *x*. The length of the input *x* is *n*, and the length of the certificate *y* is *O*(*n^{k}*). Since the algorithm runs for at most *T*(*n*) = *O*(*n^{k}*) steps, the amount of working storage required by *A* is polynomial in *n* as well. Altogether, a configuration occupies a polynomial number of bits.

The combinational circuit *M* implementing the computer hardware has size polynomial in the length of a configuration, which is *O*(*n^{k}*), and hence the size of *M* is polynomial in *n*. The circuit *C* consists of at most *T*(*n*) = *O*(*n^{k}*) copies of *M*, and so it too has size polynomial in *n*. The reduction algorithm *F* can construct *C* from *x* in polynomial time, since each step of the construction takes polynomial time.

The language CIRCUIT-SAT is therefore at least as hard as any language in NP, and since it belongs to NP, it is NP-complete.

Theorem 7

The circuit-satisfiability problem is NP-complete.

** Proof** Immediate from Lemmas 5 and 6 and the definition of NP-completeness.

Show that the ≤_{p} relation is a transitive relation on languages. That is, show that if *L*_{1} ≤_{p} *L*_{2} and *L*_{2} ≤_{p} *L*_{3}, then *L*_{1} ≤_{p} *L*_{3}.

Prove that *L* ≤_{p} *L̄* if and only if *L̄* ≤_{p} *L*, where *L̄* denotes the complement of *L*.

Show that a satisfying assignment can be used as a certificate in an alternative proof of Lemma 5. Which certificate makes for an easier proof?

The proof of Lemma 6 assumes that the working storage for algorithm *A* occupies a contiguous region of polynomial size. Where in the proof is this assumption exploited? Argue that this assumption does not involve any loss of generality.

A language *L* is ** complete** for a language class *C* with respect to polynomial-time reductions if *L* ∈ *C* and *L*' ≤_{p} *L* for all *L*' ∈ *C*.

Show that *L* is complete for NP if and only if *L̄* is complete for co-NP.

The reduction algorithm *F*
in the proof of Lemma 6 constructs the circuit *C* = *f*(*x*)
based on knowledge of *x*, *A*, and *k*. Professor Sartre
observes that the string *x* is input to *F*, but only the existence
of *A* and *k* is known to *F* (since the language *L*
belongs to NP), not their actual values. Thus, the professor concludes that *F*
can't possibly construct the circuit *C* and that the language CIRCUIT-SAT
is not necessarily NP-hard. Explain the flaw in the professor's reasoning.

The NP-completeness of the circuit-satisfiability problem relies on a direct proof that *L* ≤_{p} CIRCUIT-SAT for every language *L* ∈ NP. In this
section, we shall show how to prove that languages are NP-complete without
directly reducing *every* language in NP to the given language. We shall
illustrate this methodology by proving that various formula-satisfiability
problems are NP-complete. Section 5 provides many more examples of the
methodology.

The following lemma is the basis of our method for showing that a language is NP-complete.

Lemma 8

If *L* is a language such that *L*' ≤_{p} *L* for some *L*' ∈ NPC, then *L* is NP-hard. Moreover, if *L* ∈ NP, then *L* ∈ NPC.

** Proof** Since *L*' is NP-complete, every language *L*'' ∈ NP satisfies *L*'' ≤_{p} *L*'. By supposition, *L*' ≤_{p} *L*, and so by the transitivity of ≤_{p}, we have *L*'' ≤_{p} *L*, which shows that *L* is NP-hard. If *L* ∈ NP, we also have *L* ∈ NPC.

In other words, by
reducing a known NP-complete language *L*' to *L*, we implicitly
reduce every language in NP to *L*. Thus, Lemma 8 gives us a method for
proving that a language *L* is NP-complete:

1. Prove *L* ∈ NP.

2. Select a known
NP-complete language *L*'.

3. Describe an algorithm
that computes a function *f* mapping every instance of *L*' to an
instance of *L*.

4. Prove that the function *f* satisfies *x* ∈ *L*' if and only if *f*(*x*) ∈ *L* for all *x* ∈ {0, 1}*.

5. Prove that the
algorithm computing *f* runs in polynomial time.

This methodology of reducing from a single known NP-complete language is far simpler than the more complicated process of providing reductions from every language in NP. Proving CIRCUIT-SAT ∈ NPC has given us a 'foot in the door.' Knowing that the circuit-satisfiability problem is NP-complete now allows us to prove much more easily that other problems are NP-complete. Moreover, as we develop a catalog of known NP-complete problems, applying the methodology will become that much easier.

We illustrate the reduction methodology by giving an NP-completeness proof for the problem of determining whether a boolean formula, not a circuit, is satisfiable. This problem has the historical honor of being the first problem ever shown to be NP-complete.

We formulate the *(formula)*** satisfiability** problem in terms of the language SAT as follows. An instance of SAT is a boolean formula φ composed of

1. boolean variables: *x*_{1},
*x*_{2}, . . . . ;

2. boolean connectives: any boolean function with one or two inputs and one output, such as ∧ (AND), ∨ (OR), ¬ (NOT), → (implication), ↔ (if and only if); and

3. parentheses.

As in boolean combinational circuits, a ** truth assignment** for a boolean formula φ is a set of values for the variables of φ, and a ** satisfying assignment** is a truth assignment that causes the formula to evaluate to 1. A formula with a satisfying assignment is a ** satisfiable** formula. The satisfiability problem asks whether a given boolean formula is satisfiable; in formal-language terms, SAT = {φ : φ is a satisfiable boolean formula}.

As an example, the formula

φ = ((*x*_{1} → *x*_{2}) ∨ ¬((¬*x*_{1} ↔ *x*_{3}) ∨ *x*_{4})) ∧ ¬*x*_{2}

has the satisfying assignment *x*_{1} = 0, *x*_{2} = 0, *x*_{3} = 1, *x*_{4} = 1, since

φ = ((0 → 0) ∨ ¬((¬0 ↔ 1) ∨ 1)) ∧ ¬0     (2)

= (1 ∨ ¬(1 ∨ 1)) ∧ 1

= (1 ∨ 0) ∧ 1

= 1,

and thus this formula belongs to SAT.
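A SAT verifier does exactly this substitute-and-evaluate step. As an illustration (hard-coding, for concreteness, the standard example formula ((*x*_{1} → *x*_{2}) ∨ ¬((¬*x*_{1} ↔ *x*_{3}) ∨ *x*_{4})) ∧ ¬*x*_{2} as a Python function; the encoding is ours):

```python
# Verify a SAT certificate: plug the assignment into the formula and evaluate.
def phi(x1, x2, x3, x4):
    implies = lambda a, b: (not a) or b   # a -> b
    iff = lambda a, b: bool(a) == bool(b) # a <-> b
    return int((implies(x1, x2) or not (iff(not x1, x3) or x4)) and not x2)

print(phi(0, 0, 1, 1))  # -> 1: the assignment x1=0, x2=0, x3=1, x4=1 satisfies it
```

The evaluation touches each connective once, so verification takes time linear in the length of the formula.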

The naive algorithm to determine whether an arbitrary boolean formula is satisfiable does not run in polynomial time. There are 2*^{n}* possible assignments in a formula φ with *n* variables; if the length of φ is polynomial in *n*, checking every assignment requires superpolynomial time.

Theorem 9

Satisfiability of boolean formulas is NP-complete.

** Proof** We shall argue first that SAT ∈ NP. Then, we shall show that CIRCUIT-SAT ≤_{p} SAT; by Lemma 8, this will prove the theorem.

To show that SAT belongs to NP, we show that a certificate consisting of a satisfying assignment for an input formula can be verified in polynomial time. The verifying algorithm simply replaces each variable in the formula with its corresponding value and then evaluates the expression, much as we did in equation (2) above. This task is easily doable in polynomial time. If the expression evaluates to 1, the formula is satisfiable. Thus, the first condition of Lemma 8 for NP-completeness holds.

To prove that SAT is NP-hard, we show that CIRCUIT-SAT ≤_{p} SAT. In other words, any instance of circuit satisfiability can be reduced in
polynomial time to an instance of formula satisfiability. Induction can be used
to express any boolean combinational circuit as a boolean formula. We simply
look at the gate that produces the circuit output and inductively express each
of the gate's inputs as formulas. The formula for the circuit is then obtained
by writing an expression that applies the gate's function to its inputs'
formulas.

Unfortunately, this straightforward method does not constitute a polynomial-time reduction. Shared subformulas can cause the size of the generated formula to grow exponentially (see Exercise 4-1). Thus, the reduction algorithm must be somewhat more clever.

Figure 8 illustrates the basic idea of the reduction from CIRCUIT-SAT to SAT on the circuit from Figure 6(a). For each wire *x_{i}* in the circuit *C*, the formula φ has a variable *x_{i}*. The proper operation of a gate can now be expressed as a formula involving the variables of its incident wires; for instance, the operation of an AND gate with output wire *x_{o}* and input wires *x_{i}* and *x_{j}* is expressed by the clause *x_{o}* ↔ (*x_{i}* ∧ *x_{j}*).
The formula produced by the reduction algorithm is the AND of the circuit-output variable with the conjunction of clauses describing the operation of each gate. For the circuit in the figure, the formula is

φ = *x*_{10} ∧ (*x*_{4} ↔ ¬*x*_{3})

∧ (*x*_{5} ↔ (*x*_{1} ∨ *x*_{2}))

∧ (*x*_{6} ↔ ¬*x*_{4})

∧ (*x*_{7} ↔ (*x*_{1} ∧ *x*_{2} ∧ *x*_{4}))

∧ (*x*_{8} ↔ (*x*_{5} ∨ *x*_{6}))

∧ (*x*_{9} ↔ (*x*_{6} ∨ *x*_{7}))

∧ (*x*_{10} ↔ (*x*_{7} ∧ *x*_{8} ∧ *x*_{9})).

Given a circuit *C*,
it is straightforward to produce such a formula in polynomial
time.
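This gate-by-gate translation is direct enough to sketch in code. Assuming the same hypothetical gate-list encoding used earlier in our sketches (a list of (output, op, inputs) triples, which is our own convention), the reduction emits one "output ↔ gate function" clause per gate:

```python
# Sketch of the CIRCUIT-SAT -> SAT reduction: each wire becomes a variable,
# each gate becomes a clause "output <-> function of inputs", and the formula
# is the AND of the circuit-output variable with all the gate clauses.
def circuit_to_formula(gates, output_wire):
    sym = {"AND": " AND ", "OR": " OR "}
    clauses = [output_wire]                      # the circuit-output variable
    for out, op, args in gates:
        if op == "NOT":
            rhs = "NOT " + args[0]
        else:
            rhs = "(" + sym[op].join(args) + ")"
        clauses.append("(%s <-> %s)" % (out, rhs))
    return " AND ".join(clauses)

gates = [("x4", "NOT", ["x3"]), ("x5", "OR", ["x1", "x2"])]
print(circuit_to_formula(gates, "x5"))
# -> x5 AND (x4 <-> NOT x3) AND (x5 <-> (x1 OR x2))
```

One clause of constant size per gate means the output formula has length linear in the size of the circuit, which is why this version of the reduction (unlike the naive inductive expansion) runs in polynomial time.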

Why is the circuit satisfiable exactly when the formula is satisfiable? If *C* has a satisfying assignment, each wire of the circuit has a well-defined value, and the output of the circuit is 1. Therefore, the assignment of wire values to variables in φ makes each clause of φ evaluate to 1, and thus the conjunction of all the clauses evaluates to 1. Conversely, if there is an assignment that causes φ to evaluate to 1, the circuit *C* is satisfiable by an analogous argument. Thus, we have shown that CIRCUIT-SAT ≤_{p} SAT, which completes the proof.

Many problems can be proved NP-complete by reduction from formula satisfiability. The reduction algorithm must handle any input formula, though, and this can lead to a huge number of cases that must be considered. It is often desirable, therefore, to reduce from a restricted language of boolean formulas, so that fewer cases need be considered. Of course, we must not restrict the language so much that it becomes polynomial-time solvable. One convenient language is 3-CNF satisfiability, or 3-CNF-SAT.

We define 3-CNF satisfiability using the following terms. A ** literal** in a boolean formula is an occurrence of a variable or its negation. A boolean formula is in ** conjunctive normal form**, or ** CNF**, if it is expressed as an AND of clauses, each of which is the OR of one or more literals. A boolean formula is in ** 3-conjunctive normal form**, or ** 3-CNF**, if each clause has exactly three distinct literals.

For example, the boolean formula

(*x*_{1} ∨ ¬*x*_{1} ∨ ¬*x*_{2}) ∧ (*x*_{3} ∨ *x*_{2} ∨ *x*_{4}) ∧ (¬*x*_{1} ∨ ¬*x*_{3} ∨ ¬*x*_{4})

is in 3-CNF. The first of its three clauses is (*x*_{1} ∨ ¬*x*_{1} ∨ ¬*x*_{2}), which contains the three literals *x*_{1}, ¬*x*_{1}, and ¬*x*_{2}.

In 3-CNF-SAT, we are asked whether a given boolean formula in 3-CNF is satisfiable. The following theorem shows that a polynomial-time algorithm that can determine the satisfiability of boolean formulas is unlikely to exist, even when they are expressed in this simple normal form.

Theorem 10

Satisfiability of boolean formulas in 3-conjunctive normal form is NP-complete.

** Proof** The argument we used in the proof of Theorem 9 to show that SAT ∈ NP applies equally well here to show that 3-CNF-SAT ∈ NP. Thus, we need only show that 3-CNF-SAT is NP-hard. We prove this by showing that SAT ≤_{p} 3-CNF-SAT.
The reduction algorithm can be broken into three basic steps. Each step progressively transforms the input formula closer to the desired 3-conjunctive normal form.

The first step is similar to the one used to prove CIRCUIT-SAT ≤_{p} SAT in Theorem 9. First, we construct a binary 'parse' tree for the input formula φ, with literals as leaves and connectives as internal nodes. Figure 9 shows such a parse tree for the formula

φ = ((*x*_{1} → *x*_{2}) ∨ ¬((¬*x*_{1} ↔ *x*_{3}) ∨ *x*_{4})) ∧ ¬*x*_{2}.     (3)

Should the input formula contain a clause such as the OR of several literals, associativity can be used to parenthesize the expression fully so that every internal node in the resulting tree has 1 or 2 children. The binary parse tree can now be viewed as a circuit for computing the function.

Mimicking the reduction in the proof of Theorem 9, we introduce a variable *y_{i}* for the output of each internal node. Then, we rewrite the original formula φ as the AND of the root variable and a conjunction of clauses describing the operation of each node. For the formula (3), the resulting expression is

φ' = *y*_{1} ∧ (*y*_{1} ↔ (*y*_{2} ∧ ¬*x*_{2}))

∧ (*y*_{2} ↔ (*y*_{3} ∨ *y*_{4}))

∧ (*y*_{3} ↔ (*x*_{1} → *x*_{2}))

∧ (*y*_{4} ↔ ¬*y*_{5})

∧ (*y*_{5} ↔ (*y*_{6} ∨ *x*_{4}))

∧ (*y*_{6} ↔ (¬*x*_{1} ↔ *x*_{3})).

Observe that the formula φ' thus obtained is a conjunction of clauses φ'_{*i*}, each of which has at most 3 literals. The only additional requirement is that each clause be an OR of literals.

The truth table for the clause φ'_{1} = (*y*_{1} ↔ (*y*_{2} ∧ ¬*x*_{2})) (Figure 10):

*y*_{1} *y*_{2} *x*_{2} | (*y*_{1} ↔ (*y*_{2} ∧ ¬*x*_{2}))

1 1 1 | 0

1 1 0 | 1

1 0 1 | 0

1 0 0 | 0

0 1 1 | 1

0 1 0 | 0

0 0 1 | 1

0 0 0 | 1

The second step of the reduction converts each clause φ'_{*i*} into conjunctive normal form. We construct a truth table for φ'_{*i*} by evaluating all possible assignments to its variables. Each row of the truth table consists of a possible assignment of the variables of the clause, together with the value of the clause under that assignment. Using the truth-table entries that evaluate to 0, we build a formula in ** disjunctive normal form** (or ** DNF**) that is equivalent to ¬φ'_{*i*}. We then convert this formula into a CNF formula φ''_{*i*} by applying DeMorgan's laws.

In our example, we convert the clause φ'_{1} = (*y*_{1} ↔ (*y*_{2} ∧ ¬*x*_{2})) into CNF as follows. The truth table for φ'_{1} is given in Figure 10. The DNF formula equivalent to ¬φ'_{1} is

(*y*_{1} ∧ *y*_{2} ∧ *x*_{2}) ∨ (*y*_{1} ∧ ¬*y*_{2} ∧ *x*_{2}) ∨ (*y*_{1} ∧ ¬*y*_{2} ∧ ¬*x*_{2}) ∨ (¬*y*_{1} ∧ *y*_{2} ∧ ¬*x*_{2}).

Applying DeMorgan's laws, we get the CNF formula

φ''_{1} = (¬*y*_{1} ∨ ¬*y*_{2} ∨ ¬*x*_{2}) ∧ (¬*y*_{1} ∨ *y*_{2} ∨ ¬*x*_{2}) ∧ (¬*y*_{1} ∨ *y*_{2} ∨ *x*_{2}) ∧ (*y*_{1} ∨ ¬*y*_{2} ∨ *x*_{2}),

which is equivalent to the original clause φ'_{1}.

Each clause φ'_{*i*} of the formula φ' has now been converted into a CNF formula φ''_{*i*}, and thus φ' is equivalent to the CNF formula φ'' consisting of the conjunction of the φ''_{*i*}. Moreover, each clause of φ'' has at most 3 literals.

The third and final step of the reduction further transforms the formula so that each clause has *exactly* 3 distinct literals. The final 3-CNF formula φ''' is constructed from the clauses of the CNF formula φ''. It also uses two auxiliary variables that we shall call *p* and *q*. For each clause *C_{i}* of φ'', we include clauses in φ''' as follows.

If *C_{i}* has 3 distinct literals, then simply include *C_{i}* as a clause of φ'''.

If *C_{i}* has 2 distinct literals, that is, if *C_{i}* = (*l*_{1} ∨ *l*_{2}), where *l*_{1} and *l*_{2} are literals, then include (*l*_{1} ∨ *l*_{2} ∨ *p*) ∧ (*l*_{1} ∨ *l*_{2} ∨ ¬*p*) as clauses of φ'''. This pair of clauses is equivalent to (*l*_{1} ∨ *l*_{2}) whether *p* = 0 or *p* = 1.

If *C_{i}* has just 1 distinct literal *l*, then include (*l* ∨ *p* ∨ *q*) ∧ (*l* ∨ *p* ∨ ¬*q*) ∧ (*l* ∨ ¬*p* ∨ *q*) ∧ (*l* ∨ ¬*p* ∨ ¬*q*) as clauses of φ'''. No matter the values of *p* and *q*, these four clauses together are equivalent to *l*.
We can see that the 3-CNF formula φ''' is satisfiable if and only if φ is satisfiable by inspecting each of the three steps. Like the reduction from CIRCUIT-SAT to SAT, the construction of φ' from φ in the first step preserves satisfiability. The second step produces a CNF formula φ'' that is algebraically equivalent to φ'. The third step produces a 3-CNF formula φ''' that is effectively equivalent to φ'', since any assignment to the variables *p* and *q* produces a formula that is algebraically equivalent to φ''.

We must also show that the reduction can be computed in polynomial time. Constructing φ' from φ introduces at most 1 variable and 1 clause per connective in φ. Constructing φ'' from φ' can introduce at most 8 clauses into φ'' for each clause from φ', since each clause of φ' has at most 3 variables, and the truth table for each clause has at most 2^{3} = 8 rows. The construction of φ''' from φ'' introduces at most 4 clauses into φ''' for each clause of φ''. Thus, the size of the resulting formula φ''' is polynomial in the length of the original formula. Each of the constructions can easily be accomplished in polynomial time.
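The second and third steps of this conversion are concrete enough to sketch in code. The snippet below (our own encoding: literals are strings, with "~" marking negation; the dummy variables *p* and *q* follow the text) builds the truth table of a small clause, keeps the rows that evaluate to 0, and applies DeMorgan's laws, then pads short clauses to exactly 3 literals:

```python
from itertools import product

# Step 2: truth table -> DNF of the negation -> CNF via DeMorgan's laws.
def clause_to_cnf(f, names):
    cnf = []
    for row in product([0, 1], repeat=len(names)):
        if not f(*row):
            # DeMorgan: negating the AND of the row's literals gives an OR
            # clause; a variable set to 1 appears negated, and vice versa.
            cnf.append(tuple(n if v == 0 else "~" + n
                             for n, v in zip(names, row)))
    return cnf

# Step 3: pad clauses with fewer than 3 literals using dummy variables p, q.
def pad_to_3(clause):
    if len(clause) == 3:
        return [clause]
    if len(clause) == 2:
        l1, l2 = clause
        return [(l1, l2, "p"), (l1, l2, "~p")]
    (l,) = clause
    return [(l, "p", "q"), (l, "p", "~q"), (l, "~p", "q"), (l, "~p", "~q")]

# Example: the clause (y1 <-> (y2 AND NOT x2)) yields four 3-literal clauses.
cnf = clause_to_cnf(lambda y1, y2, x2: y1 == (1 if (y2 and not x2) else 0),
                    ["y1", "y2", "x2"])
print(len(cnf))  # -> 4
```

Each input clause has at most 3 variables, so its truth table has at most 8 rows, and padding multiplies the clause count by at most 4, matching the size bounds in the proof.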

Consider the
straightforward (nonpolynomial-time) reduction in the proof of Theorem 9.
Describe a circuit of size *n* that, when converted to a formula by this
method, yields a formula whose size is exponential in *n*.

Show the 3-CNF formula that results when we use the method of Theorem 10 on the formula (3).

Show that the problem of determining whether a boolean
formula is a tautology is complete for co-NP. (*Hint:* See Exercise 3-6.)

Show that the problem of determining the satisfiability of boolean formulas in disjunctive normal form is polynomial-time solvable.

Suppose that someone gives you a polynomial-time algorithm to decide formula satisfiability. Describe how to use this algorithm to find satisfying assignments in polynomial time.

Let 2-CNF-SAT be the set of satisfiable boolean formulas in CNF with exactly 2 literals per clause. Show that 2-CNF-SAT ∈ P. Make your algorithm as efficient as possible. (*Hint:* Observe that *x* ∨ *y* is equivalent to ¬*x* → *y*. Reduce 2-CNF-SAT to a problem on a directed graph that is efficiently solvable.)

NP-complete problems arise in diverse domains: boolean logic, graphs, arithmetic, network design, sets and partitions, storage and retrieval, sequencing and scheduling, mathematical programming, algebra and number theory, games and puzzles, automata and language theory, program optimization, and more. In this section, we shall use the reduction methodology to provide NP-completeness proofs for a variety of problems drawn from graph theory and set partitioning.

Figure 11 outlines the structure of the NP-completeness proofs in this section and Section 4. Each language in the figure is proved NP-complete by reduction from the language that points to it. At the root is CIRCUIT-SAT, which we proved NP-complete in Theorem 7.

A ** clique** in an undirected graph *G* = (*V*, *E*) is a subset *V*' ⊆ *V* of vertices, each pair of which is connected by an edge in *E*. In other words, a clique is a complete subgraph of *G*. The ** size** of a clique is the number of vertices it contains. The ** clique problem** is the optimization problem of finding a clique of maximum size in a graph. As a decision problem, we ask simply whether a clique of a given size *k* exists in the graph: CLIQUE = {⟨*G*, *k*⟩ : *G* is a graph with a clique of size *k*}.

A naive algorithm for determining whether a graph *G* = (*V*, *E*) with |*V*| vertices has a clique of size *k* is to list all *k*-subsets of *V*, and check each one to see whether it forms a clique. The running time of this algorithm is Ω(*k*^{2} · C(|*V*|, *k*)), where C(|*V*|, *k*) is the number of *k*-subsets of *V*; this is polynomial if *k* is a constant. In general, however, *k* could be proportional to |*V*|, in which case the algorithm runs in superpolynomial time. As one might suspect, an efficient algorithm for the clique problem is unlikely to exist.
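The naive algorithm translates almost line for line into code (the adjacency-list encoding below is our own choice):

```python
from itertools import combinations

# The naive algorithm: try every k-subset of vertices and check whether
# every pair within the subset is joined by an edge.
def has_clique(vertices, edges, k):
    E = {frozenset(e) for e in edges}   # undirected: ignore edge orientation
    return any(all(frozenset(pair) in E for pair in combinations(sub, 2))
               for sub in combinations(vertices, k))

# A 4-cycle has cliques of size 2 (its edges) but none of size 3.
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(has_clique([1, 2, 3, 4], cycle, 2),
      has_clique([1, 2, 3, 4], cycle, 3))  # -> True False
```

The outer loop runs C(|V|, k) times and each check costs about k² edge lookups, exactly the superpolynomial bound quoted above when k grows with |V|.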

Theorem 11

The clique problem is NP-complete.

** Proof **To show that CLIQUE ∈ NP, for a given graph *G* = (*V*, *E*), we use the clique *V*' ⊆ *V* itself as a certificate. Checking whether *V*' is a clique can be accomplished in polynomial time by checking whether, for each pair *u*, *v* ∈ *V*', the edge (*u*, *v*) belongs to *E*.

We next show that the clique problem is NP-hard by proving that 3-CNF-SAT ≤_{p} CLIQUE. That we should be able to prove this result is somewhat surprising, since on the surface logical formulas seem to have little to do with graphs.

The reduction algorithm begins with an instance of 3-CNF-SAT. Let φ = *C*_{1} ∧ *C*_{2} ∧ ··· ∧ *C_{k}* be a boolean formula in 3-CNF with *k* clauses. For *r* = 1, 2, . . . , *k*, each clause *C_{r}* has exactly three distinct literals *l*^{*r*}_{1}, *l*^{*r*}_{2}, and *l*^{*r*}_{3}. We shall construct a graph *G* such that φ is satisfiable if and only if *G* has a clique of size *k*.

The graph *G* = (*V*, *E*) is constructed as follows. For each clause *C_{r}* = (*l*^{*r*}_{1} ∨ *l*^{*r*}_{2} ∨ *l*^{*r*}_{3}) in φ, we place a triple of vertices *v*^{*r*}_{1}, *v*^{*r*}_{2}, *v*^{*r*}_{3} in *V*. We put an edge between two vertices *v*^{*r*}_{*i*} and *v*^{*s*}_{*j*} if both of the following hold:

*v*^{*r*}_{*i*} and *v*^{*s*}_{*j*} are in different triples, that is, *r* ≠ *s*, and

their corresponding literals are ** consistent**, that is, *l*^{*r*}_{*i*} is not the negation of *l*^{*s*}_{*j*}.

This graph can easily be computed from φ in polynomial time. As an example of this construction, if we have

φ = (*x*_{1} ∨ ¬*x*_{2} ∨ ¬*x*_{3}) ∧ (¬*x*_{1} ∨ *x*_{2} ∨ *x*_{3}) ∧ (*x*_{1} ∨ *x*_{2} ∨ *x*_{3}),

then *G* is the graph shown in Figure 12.

We must show that this transformation of φ into *G* is a reduction. First, suppose that φ has a satisfying assignment. Then, each clause *C_{r}* contains at least one literal *l*^{*r*}_{*i*} that is assigned 1, and each such literal corresponds to a vertex *v*^{*r*}_{*i*}. Picking one such 'true' literal from each clause yields a set *V*' of *k* vertices. We claim that *V*' is a clique: any two of its vertices lie in different triples, and their literals are both assigned 1, so the literals cannot be complements; by the construction of *G*, the two vertices are therefore joined by an edge.

Conversely, suppose that *G* has a clique *V*' of size *k*. No edges in *G* connect vertices in the same triple, and so *V*' contains exactly one vertex per triple. We can assign 1 to each literal *l*^{*r*}_{*i*} such that *v*^{*r*}_{*i*} ∈ *V*' without fear of assigning 1 to both a literal and its complement, since *G* contains no edges between inconsistent literals. Each clause is satisfied, and so φ is satisfied. (Any variables that correspond to no vertex in the clique may be set arbitrarily.)

In the example of Figure 12, a satisfying assignment of φ is *x*_{1} = 0, *x*_{2} = 0, *x*_{3} = 1. A corresponding clique of size *k* = 3 consists of the vertices corresponding to ¬*x*_{2} from the first clause, *x*_{3} from the second clause, and *x*_{3} from the third clause.
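The construction itself is only a few lines of code. A sketch (our own encoding: literals are strings with "~" marking negation, and vertices are (clause index, literal) pairs):

```python
from itertools import combinations

# 3-CNF -> CLIQUE: one vertex per literal occurrence; edges join vertices in
# different triples whose literals are consistent (not complements).
def cnf_to_clique_graph(clauses):
    vertices = [(r, lit) for r, triple in enumerate(clauses) for lit in triple]
    def consistent(a, b):
        return a != "~" + b and b != "~" + a
    edges = [(u, v) for u, v in combinations(vertices, 2)
             if u[0] != v[0] and consistent(u[1], v[1])]
    return vertices, edges

# A 3-clause formula produces 9 vertices, one triple per clause.
phi = [("x1", "~x2", "~x3"), ("~x1", "x2", "x3"), ("x1", "x2", "x3")]
V, E = cnf_to_clique_graph(phi)
print(len(V))  # -> 9
```

Since every pair of vertices is examined once, the construction runs in time quadratic in the number of literal occurrences, and hence in polynomial time.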

A ** vertex cover** of an undirected graph *G* = (*V*, *E*) is a subset *V*' ⊆ *V* such that if (*u*, *v*) ∈ *E*, then *u* ∈ *V*' or *v* ∈ *V*' (or both). That is, each vertex 'covers' its incident edges, and a vertex cover for *G* is a set of vertices that covers all the edges in *E*. The ** size** of a vertex cover is the number of vertices in it.

The ** vertex-cover problem** is to find a vertex cover of minimum size in a given graph. Restating this optimization problem as a decision problem, we wish to determine whether a graph has a vertex cover of a given size *k*: VERTEX-COVER = {⟨*G*, *k*⟩ : graph *G* has a vertex cover of size *k*}.

The following theorem shows that this problem is NP-complete.

Theorem 12

The vertex-cover problem is NP-complete.

** Proof** We first show that VERTEX-COVER ∈ NP. Suppose we are given a graph *G* = (*V*, *E*) and an integer *k*. The certificate we choose is the vertex cover *V*' ⊆ *V* itself. The verification algorithm affirms that |*V*'| = *k*, and then it checks, for each edge (*u*, *v*) ∈ *E*, that *u* ∈ *V*' or *v* ∈ *V*'. This verification can be performed straightforwardly in polynomial time.

We prove that the vertex-cover problem is NP-hard by showing that CLIQUE ≤_{p} VERTEX-COVER. This reduction is based on the notion of the 'complement' of a graph. Given an undirected graph *G* = (*V*, *E*), we define the ** complement** of *G* as *Ḡ* = (*V*, *Ē*), where *Ē* = {(*u*, *v*) : *u*, *v* ∈ *V*, *u* ≠ *v*, and (*u*, *v*) ∉ *E*}. In other words, *Ḡ* contains exactly those edges that are not in *G*.

The reduction algorithm takes as input an instance ⟨*G*, *k*⟩ of the clique problem. It computes the complement *Ḡ*, which is easily doable in polynomial time. The output of the reduction algorithm is the instance ⟨*Ḡ*, |*V*| - *k*⟩ of the vertex-cover problem. To complete the proof, we show that this transformation is indeed a reduction: the graph *G* has a clique of size *k* if and only if the graph *Ḡ* has a vertex cover of size |*V*| - *k*.

Suppose that *G* has a clique *V*' ⊆ *V* with |*V*'| = *k*. We claim that *V* - *V*' is a vertex cover in *Ḡ*. Let (*u*, *v*) be any edge in *Ē*. Then, (*u*, *v*) ∉ *E*, which implies that at least one of *u* or *v* does not belong to *V*', since every pair of vertices in *V*' is connected by an edge of *E*. Equivalently, at least one of *u* or *v* is in *V* - *V*', which means that edge (*u*, *v*) is covered by *V* - *V*'. Since (*u*, *v*) was chosen arbitrarily from *Ē*, every edge of *Ē* is covered by a vertex in *V* - *V*'. Hence, the set *V* - *V*', which has size |*V*| - *k*, forms a vertex cover for *Ḡ*.

Conversely, suppose that *Ḡ* has a vertex cover *V*' ⊆ *V*, where |*V*'| = |*V*| - *k*. Then, for all *u*, *v* ∈ *V*, if (*u*, *v*) ∈ *Ē*, then *u* ∈ *V*' or *v* ∈ *V*' or both. The contrapositive of this implication is that for all *u*, *v* ∈ *V*, if *u* ∉ *V*' and *v* ∉ *V*', then (*u*, *v*) ∈ *E*. In other words, *V* - *V*' is a clique, and it has size |*V*| - |*V*'| = *k*.
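The reduction and the correspondence it exploits can be sketched in a few lines (the edge-list encoding is our own):

```python
from itertools import combinations

# CLIQUE -> VERTEX-COVER: complement the graph and ask for a cover of
# size |V| - k.
def complement(vertices, edges):
    E = {frozenset(e) for e in edges}
    return [tuple(p) for p in combinations(vertices, 2)
            if frozenset(p) not in E]

def is_vertex_cover(cover, edges):
    return all(u in cover or v in cover for u, v in edges)

V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (1, 3), (3, 4)]   # {1, 2, 3} is a clique of size k = 3
Ebar = complement(V, E)                # complement edges: (1, 4) and (2, 4)
print(is_vertex_cover(set(V) - {1, 2, 3}, Ebar))  # {4} covers Ebar -> True
```

The example mirrors the proof: the complement of the size-3 clique, the single vertex {4}, covers every edge of the complement graph.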

Since VERTEX-COVER is NP-complete, we don't expect to find a polynomial-time algorithm for finding a minimum-size vertex cover. Section 37.1 presents a polynomial-time 'approximation algorithm,' however, which produces 'approximate' solutions for the vertex-cover problem. The size of a vertex cover produced by the algorithm is at most twice the minimum size of a vertex cover.

Thus, we shouldn't give up hope just because a problem is NP-complete. There may be a polynomial-time approximation algorithm that obtains near-optimal solutions, even though finding an optimal solution is NP-complete. Chapter 37 gives several approximation algorithms for NP-complete problems.

The next NP-complete problem we consider is an arithmetic one. In the ** subset-sum problem**, we are given a finite set *S* of natural numbers and a ** target** *t*, and we ask whether there is a subset *S*' ⊆ *S* whose elements sum to *t*.

As usual, we define the problem as a language:

SUBSET-SUM = {⟨*S*, *t*⟩ : there exists a subset *S*' ⊆ *S* such that *t* = Σ_{*s* ∈ *S*'} *s*}.

As with any arithmetic problem, it is important to recall that our standard encoding assumes that the input integers are coded in binary. With this assumption in mind, we can show that the subset-sum problem is unlikely to have a fast algorithm.
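Verifying a claimed solution is trivial, which is the easy half of the NP-completeness proof below. A sketch (treating *S* as a set of distinct numbers, as the definition does):

```python
# A SUBSET-SUM verifier: check that the claimed subset really is a subset of S
# and that it sums to the target t. Deciding the problem, by contrast, may
# require examining as many as 2^|S| subsets.
def verify_subset_sum(S, t, subset):
    return set(subset) <= set(S) and sum(subset) == t

print(verify_subset_sum([1, 2, 7, 14, 49], 63, [14, 49]))  # -> True
```

The verifier runs in time linear in the size of the instance, so SUBSET-SUM ∈ NP; the binary encoding of the inputs is what makes the brute-force 2^{|S|} search superpolynomial.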

Theorem 13

The subset-sum problem is NP-complete.

** Proof** To show that SUBSET-SUM is in NP, for an instance ⟨*S*, *t*⟩ of the problem, we let the subset *S*' be the certificate. Checking whether *t* = Σ_{*s* ∈ *S*'} *s* can be accomplished by a verification algorithm in polynomial time.
We now show that VERTEX-COVER ≤_{p} SUBSET-SUM. Given an instance ⟨*G*, *k*⟩ of the vertex-cover problem, the reduction algorithm constructs an instance ⟨*S*, *t*⟩ of the subset-sum problem such that *G* has a vertex cover of size *k* if and only if there is a subset of *S* whose sum is exactly *t*.

At the heart of the reduction is an incidence-matrix representation of *G*. Let *G* = (*V*, *E*) be an undirected graph with *V* = {*v*_{0}, *v*_{1}, . . . , *v*_{|*V*|-1}} and *E* = {*e*_{0}, *e*_{1}, . . . , *e*_{|*E*|-1}}. The ** incidence matrix** of *G* is a |*V*| × |*E*| matrix *B* = (*b_{ij}*) such that *b_{ij}* = 1 if edge *e_{j}* is incident on vertex *v_{i}*, and *b_{ij}* = 0 otherwise.

For example, Figure 14(b)
shows the incidence matrix for the undirected graph of Figure 14(a). The
incidence matrix is shown with lower-indexed edges on the right, rather than on
the left as is conventional, in order to simplify the formulas for the numbers
in *S*.

Given a graph *G*
and an integer *k*, the reduction algorithm computes a set *S* of
numbers and an integer *t*. To understand how the reduction algorithm
works, let us represent numbers in a 'modified base-4' fashion. The |*E|*
low-order digits of a number will be in base-4 but the high-order digit can be
as large as *k*. The set of numbers is constructed in such a way that no
carries can be propagated from lower digits to higher digits.

The set *S* consists of two types of numbers, corresponding to vertices and edges respectively. For each vertex *v_{i}* ∈ *V*, the set *S* contains an integer *x_{i}* whose leading digit is 1 and whose |*E*| low-order digits record row *i* of the incidence matrix:

*x_{i}* = 4^{|*E*|} + Σ_{*j*} *b_{ij}* · 4^{*j*}.

For each edge *e_{j}* ∈ *E*, the set *S* contains the integer *y_{j}* = 4^{*j*}.

The first digit of the target sum *t* is *k*, and all |*E*| lower-order digits are 2's. Formally,

*t* = *k* · 4^{|*E*|} + Σ_{*j*} 2 · 4^{*j*}.

All of these numbers have polynomial size when we represent them in binary. The reduction can be performed in polynomial time by manipulating the bits of the incidence matrix.

We must now show that graph *G* has a vertex cover of size *k* if and only if there is a subset *S*' ⊆ *S* whose sum is *t*. First, suppose that *G* has a vertex cover *V*' ⊆ *V* of size *k*. Let *S*' consist of the integer *x_{i}* for each vertex *v_{i}* ∈ *V*' and the integer *y_{j}* for each edge *e_{j}* that is incident on exactly one vertex of *V*'.

To see that Σ_{*s* ∈ *S*'} *s* = *t*, observe that the *k* chosen integers *x_{i}* contribute *k* to the leading digit. In each low-order digit position *j*, the chosen *x_{i}* contribute either 1 or 2, according to whether edge *e_{j}* is covered by one or by both of its endpoints in *V*'; in the former case, the included *y_{j}* brings the digit up to 2. Since no carries can occur, the sum is exactly *t*.

Now, suppose that there is a subset *S*' ⊆ *S* that sums to *t*. Let *V*' = {*v_{i}* : *x_{i}* ∈ *S*'}. We claim that |*V*'| = *k* and that *V*' is a vertex cover for *G*. To prove this claim, we start by observing that for each edge *e_{j}*, the number *y_{j}* contributes at most 1 to digit *j* of the sum, and so at least one *x_{i}* with *b_{ij}* = 1 must belong to *S*' for that digit to reach 2; hence every edge is covered by *V*'. Since no carries can occur, the leading digit *k* of *t* counts the numbers *x_{i}* in *S*', and so |*V*'| = *k*.

In Figure 14, the vertex
cover *V'* =
corresponds to the subset *S*' = . All of the *y _{j}* are included in

We now return to the hamiltonian-cycle problem defined in Section 2.

Theorem 14

The hamiltonian-cycle problem is NP-complete.

** Proof** We first show that HAM-CYCLE belongs to NP. Given a graph *G* = (*V*, *E*), our certificate is the sequence of |*V*| vertices that make up the hamiltonian cycle. The verification algorithm checks that this sequence contains each vertex in *V* exactly once and that, with the first vertex repeated at the end, it forms a cycle in *G*. This verification can be performed in polynomial time. We now prove that HAM-CYCLE is NP-hard by showing that 3-CNF-SAT ≤_{p} HAM-CYCLE. Given a 3-CNF formula φ over variables *x*_{1}, . . . , *x_{n}* with clauses *C*_{1}, . . . , *C_{k}*, each containing exactly 3 distinct literals, we construct a graph *G* that has a hamiltonian cycle if and only if φ is satisfiable. Our construction is based on ** widgets**, which are pieces of graphs that enforce certain properties.
Our first widget is the subgraph *A* shown in Figure 15(a). Suppose that *A* is a subgraph of some graph *G* and that the only connections between *A* and the remainder of *G* are through the vertices *a*, *a*', *b*, and *b*'.
Furthermore, suppose that graph *G* has a hamiltonian cycle. Since any
hamiltonian cycle of *G* must pass through vertices *z*_{1}, *z*_{2},
*z*_{3}, and *z*_{4} in one of the ways shown in
Figures 15(b) and (c), we may treat subgraph *A* as if it were simply a
pair of edges (*a, a*') and (*b, b*') with the restriction that any
hamiltonian cycle of *G* must include exactly one of these edges. We shall
represent widget *A* as shown in Figure 15(d).

The subgraph *B* in
Figure 16 is our second widget. Suppose that *B* is a subgraph of some
graph *G* and that the only connections from *B* to the remainder of *G*
are through vertices *b*_{1}, *b*_{2}, *b*_{3},
and *b*_{4}. A hamiltonian cycle of graph *G* cannot traverse
*all* of the edges (*b*_{1}, *b*_{2}), (*b*_{2},
*b*_{3}), and (*b*_{3}, *b*_{4}), since
then all vertices in the widget other than *b*_{1}, *b*_{2},
*b*_{3}, and *b*_{4} would be missed. A hamiltonian
cycle of *G* may, however, traverse any proper subset of these edges.
Figures 16(a)-(e) show five such subsets; the remaining two subsets can be
obtained by performing a top-to-bottom flip of parts (b) and (e). We represent
this widget as in Figure 16(f), the idea being that at least one of the paths
pointed to by the arrows must be taken by a hamiltonian cycle.

The graph *G* that we shall construct consists mostly of copies of these two widgets. The construction is illustrated in Figure 17. For each of the *k* clauses in φ, we include a copy of widget *B*, and we join these widgets together in series as follows. Letting *b_{ij}* be the copy of vertex *b_{j}* in the *i*th copy of widget *B*, we connect *b*_{*i*,4} to *b*_{*i*+1,1} for *i* = 1, 2, . . . , *k* - 1.

Then, for each variable *x_{m}* in φ, we include two vertices *x*'_{*m*} and *x*''_{*m*}. We connect these two vertices by means of two copies of the edge (*x*'_{*m*}, *x*''_{*m*}), which we denote by *e_{m}* and *ē_{m}* to distinguish them. The idea is that if a hamiltonian cycle takes edge *e_{m}*, it corresponds to assigning variable *x_{m}* the value 1, and if it takes edge *ē_{m}*, the variable is assigned 0.

We are not yet finished with the construction of graph *G*, since we have yet to relate the variables to the clauses. If the *j*th literal of clause *C_{i}* is the variable *x_{m}*, then we use an *A* widget to connect edge (*b*_{*i*,*j*}, *b*_{*i*,*j*+1}) with edge *e_{m}*; if the *j*th literal is ¬*x_{m}*, we instead connect (*b*_{*i*,*j*}, *b*_{*i*,*j*+1}) with *ē_{m}*. In Figure 17, for example, *A* widgets connect the following pairs of edges:

between (*b*_{2,1}, *b*_{2,2}) and *e*_{1},

between (*b*_{2,2}, *b*_{2,3}) and *ē*_{2}, and

between (*b*_{2,3}, *b*_{2,4}) and *e*_{3}.

Note that connecting two edges by means of *A* widgets actually entails replacing each edge by the five edges in the top or bottom of Figure 15(a) and, of course, adding the connections that pass through the *z* vertices as well. A given literal may appear in several clauses, in which case its edge *e_{m}* or *ē_{m}* is connected by an *A* widget for each appearance.

We claim that formula φ is satisfiable if and only if graph *G* contains a hamiltonian cycle. We first suppose that *G* has a hamiltonian cycle *h* and show that φ is satisfiable. Cycle *h* must take a particular form:

First, it traverses edge to go from the top left to the top right.

It then follows all of the variable vertices from top to bottom, choosing either edge *e_{m}* or edge *ē_{m}*, but not both.

It next traverses edge to get back to the left side.

Finally, it traverses the B widgets from bottom to top on the left.

(It actually traverses
edges within the *A* widgets as well, but we use these subgraphs to
enforce the either/or nature of the edges it connects.)

Given the hamiltonian cycle *h*, we define a truth assignment for φ as follows. If edge *e_{m}* belongs to *h*, we set *x_{m}* = 1; otherwise, edge *ē_{m}* belongs to *h*, and we set *x_{m}* = 0.

We claim that this assignment satisfies φ. Consider a clause *C_{i}* and the corresponding widget *B* in *G*. A hamiltonian cycle cannot traverse all three of the edges (*b*_{*i*,1}, *b*_{*i*,2}), (*b*_{*i*,2}, *b*_{*i*,3}), and (*b*_{*i*,3}, *b*_{*i*,4}), and so at least one of them is skipped. Whenever one of these edges is skipped, the *A* widget attached to it forces *h* to traverse the corresponding literal edge instead, which means that the corresponding literal of *C_{i}* is assigned the value 1. Thus, each clause *C_{i}* is satisfied.
Conversely, let us suppose that formula φ is satisfied by some truth assignment. By following the rules from above, we can construct a hamiltonian cycle for graph *G*: traverse edge *e_{m}* if *x_{m}* = 1, traverse edge *ē_{m}* if *x_{m}* = 0, and within each widget *B*, skip an edge (*b*_{*i*,*j*}, *b*_{*i*,*j*+1}) exactly when its corresponding literal is true. Since each clause has at least one true literal, each widget *B* has at least one edge skipped, which is all that the widget requires.
Finally, we note that graph *G* can be constructed in polynomial time. It contains one *B* widget for each of the *k* clauses in φ, and one *A* widget for each instance of each literal in φ, and so there are 3*k* *A* widgets. Since the *A* and *B* widgets are of fixed size, the graph *G* has *O*(*k*) vertices and edges and is easily constructed in polynomial time. Thus, we have provided a polynomial-time reduction from 3-CNF-SAT to HAM-CYCLE.

In the ** traveling-salesman problem**, which is closely related to the hamiltonian-cycle problem, a salesman must visit *n* cities. Modeling the problem as a complete graph with *n* vertices, we can say that the salesman wishes to make a ** tour**, or hamiltonian cycle, visiting each city exactly once and finishing at the city he starts from. There is an integer cost *c*(*i*, *j*) to travel from city *i* to city *j*, and the salesman wishes to make the tour whose total cost is minimum. The formal language for the corresponding decision problem is TSP = {⟨*G*, *c*, *k*⟩ : *G* = (*V*, *E*) is a complete graph, *c* is a cost function on *V* × *V*, *k* is an integer, and *G* has a traveling-salesman tour with cost at most *k*}.

The following theorem shows that a fast algorithm for the traveling-salesman problem is unlikely to exist.

Theorem 15

The traveling-salesman problem is NP-complete.

** Proof **We first show that TSP belongs to NP. Given an instance of the problem, we use as a certificate the sequence of *n* cities in the tour. The verification algorithm checks that this sequence contains each city exactly once, sums up the edge costs, and checks whether the sum is at most *k*. This process can certainly be done in polynomial time.
To prove that TSP is NP-hard, we show that HAM-CYCLE ≤_{p} TSP. Let *G* = (*V*, *E*) be an instance of HAM-CYCLE. We construct an instance of TSP as follows. We form the complete graph *G*' = (*V*, *E*'), where *E*' = {(*i*, *j*) : *i*, *j* ∈ *V* and *i* ≠ *j*}, and we define the cost function *c* by

*c*(*i*, *j*) = 0 if (*i*, *j*) ∈ *E*, and *c*(*i*, *j*) = 1 if (*i*, *j*) ∉ *E*.

The instance of TSP is
then (*G'*, *c*, 0), which is easily formed in polynomial time.

We now show that graph *G*
has a hamiltonian cycle if and only if graph *G'* has a tour of cost at
most 0. Suppose that graph *G* has a hamiltonian cycle *h*. Each edge
in *h* belongs to *E* and thus has cost 0 in *G'.* Thus, *h*
is a tour in *G' *with cost 0. Conversely, suppose that graph *G' *has
a tour *h'* of cost at most 0. Since the costs of the edges in *E'*
are 0 and 1, the cost of tour *h' *is exactly 0. Therefore, *h' *contains
only edges in *E*. We conclude that *h*' is a hamiltonian cycle in graph *G*.
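The cost function in this reduction is simple enough to sketch directly (the edge-list encoding is our own):

```python
from itertools import combinations

# HAM-CYCLE -> TSP: complete the graph, give cost 0 to original edges and
# cost 1 to the added ones, and ask for a tour of cost at most 0.
def ham_cycle_to_tsp(vertices, edges):
    E = {frozenset(e) for e in edges}
    cost = {frozenset(p): (0 if frozenset(p) in E else 1)
            for p in combinations(vertices, 2)}
    return cost, 0   # the TSP instance is (complete graph, cost, bound 0)

def tour_cost(tour, cost):
    return sum(cost[frozenset((tour[i], tour[(i + 1) % len(tour)]))]
               for i in range(len(tour)))

V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (3, 4), (4, 1)]   # a 4-cycle is itself hamiltonian
cost, bound = ham_cycle_to_tsp(V, E)
print(tour_cost([1, 2, 3, 4], cost) <= bound)  # -> True
```

A tour meets the bound 0 exactly when it uses only cost-0 edges, that is, only edges of the original graph, which is the equivalence the proof establishes.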

The ** subgraph-isomorphism problem** takes two graphs *G*_{1} and *G*_{2} and asks whether *G*_{1} is isomorphic to a subgraph of *G*_{2}. Show that the subgraph-isomorphism problem is NP-complete.

Given an integer *m*-by-*n* matrix *A* and an integer *m*-vector *b*, the ** 0-1 integer-programming problem** asks whether there is an integer *n*-vector *x* with elements in the set {0, 1} such that *Ax* ≤ *b*. Prove that the 0-1 integer-programming problem is NP-complete. (*Hint:* Reduce from 3-CNF-SAT.)

Show that the subset-sum problem is
solvable in polynomial time if the target value *t* is expressed in unary.

The ** set-partition problem** takes as input a set *S* of numbers. The question is whether the numbers can be partitioned into two sets *A* and *Ā* = *S* - *A* such that the numbers in *A* sum to the same value as the numbers in *Ā*. Show that the set-partition problem is NP-complete.

Show that the hamiltonian-path problem is NP-complete.

The ** longest-simple-cycle problem** is the
problem of determining a simple cycle (no repeated vertices) of maximum length
in a graph. Show that this problem is NP-complete.

Professor Marconi
proclaims that the subgraph used as widget *A* in the proof of Theorem 14
is more complicated than necessary: vertices *z*_{3} and *z*_{4}
of Figure 15(a) and the vertices above and below them are not needed. Is the
professor correct? That is, does the reduction work with this smaller version
of the widget, or does the 'either/or' property of the widget
disappear?

36-1 Independent set

An ** independent set** of a graph *G* = (*V*, *E*) is a subset *V*' ⊆ *V* of vertices such that each edge in *E* is incident on at most one vertex in *V*'. The ** independent-set problem** is to find a maximum-size independent set in *G*.

** a. **Formulate a related decision problem for the independent-set problem, and prove that it is NP-complete. (*Hint:* Reduce from the clique problem.)

** b. **Suppose that you are given a subroutine to solve the decision problem you defined in part (a). Give an algorithm to find an independent set of maximum size. The running time of your algorithm should be polynomial in |*V*| and |*E*|, where queries to the subroutine are counted as single steps.

Although the independent-set decision problem is NP-complete, certain special cases are polynomial-time solvable.

** c. **Give an efficient algorithm to solve the independent-set problem when each vertex in *G* has degree 2.

**d.** Give an efficient algorithm to solve the independent-set problem when
*G* is bipartite.

36-2 Graph coloring

A ***k*-coloring**
of an undirected graph *G* = (*V*, *E*) is a function *c* : *V* → {1, 2, . . . , *k*}
such that *c*(*u*) ≠ *c*(*v*) for every edge (*u*, *v*) ∈ *E*; that is,
adjacent vertices must be assigned different colors. The **graph-coloring
problem** is to determine the minimum number of colors needed to color a
given graph.

**a.** Give an efficient algorithm to determine a 2-coloring of a graph if one
exists.
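Determining a 2-coloring amounts to testing whether the graph is bipartite, which breadth-first search does in O(*V* + *E*) time. One possible sketch (our own code, using an adjacency-list dictionary, not a representation from the text):

```python
from collections import deque

def two_coloring(adj):
    """Return a 2-coloring {vertex: 0 or 1} of the undirected graph given
    as an adjacency-list dict, or None if no 2-coloring exists.

    Runs breadth-first search from each uncolored vertex, alternating
    colors level by level; a same-colored edge witnesses an odd cycle.
    """
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return None  # odd cycle: the graph is not 2-colorable
    return color
```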

**b.** Cast the graph-coloring problem as a decision problem. Show that your
decision problem is solvable in polynomial time if and only if the
graph-coloring problem is solvable in polynomial time.

**c.** Let
the language 3-COLOR be the set of graphs that can be 3-colored. Show that if
3-COLOR is NP-complete, then your decision problem from part (b) is
NP-complete.

To prove that 3-COLOR is
NP-complete, we use a reduction from 3-CNF-SAT. Given a formula of *m*
clauses on *n* variables *x*_{1}, *x*_{2}, . . .
, *x _{n}*, we construct a graph *G* = (*V*, *E*) as follows. The set *V*
consists of a vertex for each variable, a vertex for the negation of each
variable, 5 vertices for each clause, and 3 special vertices: TRUE, FALSE,
and RED. The edges of the graph are of two types: 'literal' edges that are
independent of the clauses and 'clause' edges that depend on the clauses.
The literal edges form a triangle on the three special vertices and also form
a triangle on *x _{i}*, ¬*x _{i}*, and RED for *i* = 1, 2, . . . , *n*.

**d.** Argue that in any 3-coloring *c* of a graph containing the literal
edges, exactly one of a variable and its negation is colored *c*(TRUE) and
the other is colored *c*(FALSE). Argue that for any truth assignment of the
formula, there is a 3-coloring of the graph containing just the literal edges.

The widget shown in
Figure 20 is used to enforce the condition corresponding to a clause (*x*
∨ *y* ∨ *z*). Each clause requires a unique copy of the 5 vertices
that are heavily shaded in the figure; they connect as shown to the literals of
the clause and the special vertex TRUE.

**e.** Argue that if each of *x*, *y*, and *z* is colored *c*(TRUE) or
*c*(FALSE), then the widget is 3-colorable if and only if at least one of
*x*, *y*, or *z* is colored *c*(TRUE).

**f.** Complete the proof that 3-COLOR is NP-complete.

Garey and Johnson [79] provide a wonderful guide to NP-completeness, discussing the theory at length and providing a catalogue of many problems that were known to be NP-complete in 1979. (The list of NP-complete problem domains at the beginning of Section 5 is drawn from their table of contents.) Hopcroft and Ullman [104] and Lewis and Papadimitriou [139] have good treatments of NP-completeness in the context of complexity theory. Aho, Hopcroft, and Ullman [4] also cover NP-completeness and give several reductions, including a reduction for the vertex-cover problem from the hamiltonian-cycle problem.

The class P was introduced in 1964 by Cobham [44] and, independently, in 1965 by Edmonds [61], who also introduced the class NP and conjectured that P ≠ NP. The notion of NP-completeness was proposed in 1971 by Cook [49], who gave the first NP-completeness proofs for formula satisfiability and 3-CNF satisfiability. Levin [138] independently discovered the notion, giving an NP-completeness proof for a tiling problem. Karp [116] introduced the methodology of reductions in 1972 and demonstrated the rich variety of NP-complete problems. Karp's paper included the original NP-completeness proofs of the clique, vertex-cover, and hamiltonian-cycle problems. Since then, hundreds of problems have been proven to be NP-complete by many researchers.

The proof of Theorem 14 was adapted from Papadimitriou and Steiglitz [154].
