| DNA Computing | ||
| Member: Juliet Idemudia Andreas Heigl Levent Saribas gongpu luo Hassan Syed | ||
| Please click here to see the presentation slides | ||
|
DNA
– Computing Attribution For Dr. Leonard Adleman. He
is the author of Molecular Computation of Solutions To Combinatorial
Problems which has led to the continuously growing interest on “DNA
based computers”. He
writes with exceptional clarity and has time to give answers to questions
posted in his web site. His
DNA computing experiment has been heralded as the “first example of true
nanotechnology” and even the “start of a new era”, forging an
unprecedented link between computational science and life science. Abstract
Molecular biologists are beginning to unravel the information-processing tools such as enzymes that evolution has spent billions of years refining. These tools are now been taken in large numbers of DNA molecules and using them as biological computer processors. Dr. Leonard Adleman, a well-known scientist, found a way to exploit the speed and efficiency of the biological reactions to solve the “Hamiltonian path problem”, also known as the “traveling salesman problem”. Based on Dr. Adleman’s experiment, we will explain DNA
computing, its algorithms, how to manage DNA based computing and the
advantages and disadvantages of DNA computing. Introduction
Molecular biology
could be used to solve a difficult computational problem. Prof. L. Adleman reported the first actual
experiment on 1994. Adleman demonstrated how standard methods of molecular biology could be used to solve a difficult computational problem. Since then the interest on DNA based computers has been continuously growing. Definitions DNA means DeoxyriboNucleic
Acid. The
complete set of instructions for making an organism is called its GENOME.
It contains the master blueprint for all cellular structures and
activities for the lifetime of the cell or organism. Found in every
nucleus of a persons many trillions of cells, the human genomes consists
of tightly coiled threads of deoxyribonucleic acid (DNA) and associated
protein molecules, organized into structures called chromosomes. If unwound and
tied together, the strands of DNA would stretch more than 5feet but would
be only 50 trillionths of an inch wide. For each organism, the components
of these slender threads encode all the information necessary for building
and maintaining life, from simple bacteria to remarkably complex human
beings. Understanding how DNA performs this function requires some
knowledge of its structure and organization. DNA In humans, as in
other higher organisms, a DNA molecule consists of two strands that wrap
around each other to resemble a twisted ladder whose sides, made of sugar
and phosphate molecules, are connected by rings of nitrogen-containing
chemicals called bases. Each
strand is a linear arrangement of repeating similar units called nucleotides. A
nucleotide is composed of one sugar, one phosphate, and a nitrogenous
base.
Four
different bases are present in DNA. They are adenine (A), thymine (T),
cytosine(C), and guanine(G). The particular order of the bases arranged
along the sugar-phosphate backbone is called DNA
sequence; the sequence specifies the exact genetic instructions
required to create a particular organism with its own unique traits. The
two DNA strands are held together by weak bonds between the bases on each
strand, forming base pairs. Each time a cell
divides into two daughter cells, its full genome is duplicated; for humans
and other complex organisms, this duplication occurs in the nucleus.
During cell division, the DNA molecule unwinds and the weak bonds between
the base pairs break, allowing the strands to separate. Each strand
directs the synthesis of a complementary new strand, with free nucleotides
matching up with their complementary bases on each of the separated
strands. Strict base-pairing rules are adhered to. Adenine will pair only
with thymine (an A-T pair) and cytosine with guanine (a C-G pair). Each
daughter cell rules ensures that the new strand is an exact copy of the
old one. This minimizes the incidences of errors (mutations) that may
affect the resulting organism or its offspring.
Genes Each DNA molecule
contains many genes, the basic physical and functional units of
hereditary. A gene is a specific sequence of nucleotide bases, whose
sequences carry the information required for constructing proteins, which
provides the structural components of cells and tissues as well as enzymes
for essential biochemical reactions. The human genome is estimated to
comprise at least 100,000genes.
Chromosomes The 3 billion
base pairs in the human genome are organized into 24 distinct, physically
separate microscopic units called chromosomes.
All genes are arranged linearly along the chromosomes. The nucleus of most
human cells contains 2 sets of chromosomes, 1 set given by each parent.
Chromosomes contain roughly equal parts of protein and DNA. DNA molecules
are among the largest molecules now known.
Sequences of Base Pairs Mapping ·
Genetic
maps are
representations of disease traits, physiological traits, or random genes
assigned to particular chromosomes and mapped relative to one another. ·
·
Physical
maps are
not representations but overlapping collections of DNA fragments. DNA is
snipped into fragments by the action of restriction enzymes, then cloned
and stored in a variety of forms. This tiny fragments (measured in
kilobases, kb) may then be analyzed by various means to discover the
base-by-base sequence of DNA. DNA Computing Molecular
biologists are beginning to unravel the information processing tools-such
as enzymes, copying tools, and so on-that evolution has spent billions of
years refining. Now we are taking those tools in large numbers of DNA
molecules and using them as biological computer processors. This
is how it works: ·
Information
specifying a computational problem too complex for even a supercomputer is
encoded in DNA. ·
Then
various molecular-biological tools are used to process this information. Algorithm
Turing
machine: A Turing machine is, as described in theoretical
informatics, a machine which may easily be described by a band, able to
store information and a few operations on it. Each rule is stated in the
form: "In state n, if the head is reading symbol x, write symbol y,
then move left or right one cell on the tape, and change the state to
m." It was shown by Dr. Adleman that the Turing machine has more
computing power than every other deterministic machine especially the
computer itself. A Turing machine is an all-purpose machine, which may be
used for every computation.
One example of a Turing machine. The goal of the computer, which works with DNA, is to
develop an all-purpose computer like the Turing machine. As a matter of
fact this is rather difficult. Every command should be possible in every
state of the machine and it should be readable (extractable) at every
point of time. Great efforts in research are necessary to implement this. Recent experiments on DNA computing have shown that
it is more efficient to use the power of DNA computing in specific
algorithms where the mapping in DNA is easy and the parallel computing
power of DNA’s are useable. Specifically, problems which are hard to
solve with Turing machines may work better (in
terms of speed and efficiency) with DNA computing. These problems
referred to, as NP-complete problems are not solvable in polynomial time
with Turing machines. This means that the number of steps to solve the
problems increase exponentially with the number of the input data. NP-complete problems are common in the real world.
Examples are the Hamiltonian path or the Shortest-Path in a grap. The Hamiltonian Path
Problem
Consider
a directed graph. This graph is said to have a Hamiltonian path if and
only if there exists a sequence of compatible ‘one way’ edges e1,
e2, …, ez (that is a path) which begins at vin,
ends at vout and enters every other vertex exactly once. Figure
1 shows a graph with an Hamiltonian path.
A
Hamiltonian Graph There are well known
algorithms for deciding whether an arbitrary directed graph with
designated vertices has a Hamiltonian path or not. However, all known
algorithms for this problem have exponential worst-case complexity and
hence there are instances of modest size for which these algorithms
require an impractical amount of computer time to render a decision. Since
the directed Hamiltonian path problem has been proven to be NP-complete,
it seem likely that no efficient (that is, polynomial time) algorithm
exists for solving it. Adleman has shown that
it is possible to solve the Hamilton path problem with DNA. He has used
the following (non deterministic) algorithm to solve the directed
Hamiltonian path problem: -
Step 1: Generate random paths through the graph. -
Step 2: Keep only those paths, which begin with vin and
end with vout. -
Step 3: If the graph has n
vertices, then keep only those paths, which enter exactly n vertices -
Step 4: Keep only those paths, which enter all of the vertices of
the graph at least once. -
Step
5: If any paths remain, say “Yes”, otherwise say “No”. How this algorithm works by implement it with DNA’s
will be explained later. The
real interesting thing on this DNA solution for the Hamiltonian path
problems is that most input data grow just linear with the growth of the
number of edges. Just the quantity of DNA’s should grow exponentially
but this is not a problem because the quantity of DNA’s for all known
problems is small enough. The computation itself is done in one step (step
1) because the DNA is trying all possibilities by itself. This Adleman
found could be done under an hour in parallel. This computing power is
impossible in a normal computer. Using DNA Computing to solve the “Hamiltonian path
problem”. Dr. Adleman found
a way to exploit the speed and efficiency of the biological reactions to
solve the “Hamiltonian path problem”. His DNA computing experiment is
known as the “start of a new era”, forging an unprecedented link
between computational science and life science. All computers in
existence today make use of binary code -1’s and 0’s, or on’s and
off’s on the circuits of a computer chip, forming the basis for every
calculation a computer performs, from simple addition to the most complex
differential equations. Since the DNA molecule is also a code, but instead
made up of a sequence of four bases, which pair up in a predictable
manner, Adleman saw the possibility of using it as a molecular computer. Rather than
relying on the position of electronic switches on a microchip, Adleman
relied on the much faster reactions of DNA nucleotides binding with their
complements, a brute force method that indeed worked. Dr. Adleman used
DNA to solve a system of six vertices, which is not difficult for modern
computers, but as the number of cities grows, so does the number of paths
between them, making a 1,000 city path impossible to solve for even the
best supercomputers. He accomplished this through the following steps: 1. He synthesize the DNA strands of known sequences, each
strand 20 nucleotides long. 2.
He represented each of the six vertices of the path by a separate
strand. Further he represented each edge between two consecutive vertices,
such as 1 to 2, by a DNA strand which consisted of the last ten
nucleotides of the strand-representing vertex 1 plus the first 10
nucleotides of the vertex 2 strand. Then, through the sheer amount of DNA
molecules joining together in all possible combinations, many random paths
were generated. Adelman used
well-established techniques of molecular for paths much larger than six
vertices, the power of molecular computers is still worth comparing to
today’s systems. In speed, the DNA clearly wins the race, performing
1,000 operations per second more than the fastest (which execute abrupt
1012 operations per second). To get a better idea of the speed, consider
that a typical desktop computer performs a thousand million times slower
than the DNA, a measly 106 operations per second! In energy
efficiently, a biological system such as a cell can perform 2x1019
power operations using one joule of energy (the amount of energy needed to
burn a 100-watt light bulb for a second), while a supercomputer only
manages 1010 operations, making it 1010 less energy
efficient! Just as the cell pushes the limit of the second law of
thermodynamics, which predicts that one joule can fuel a maximum of 34x1019
irreversible power operations, the DNA computer’s energy consumption
from DNA strand synthesis and PCR should also be small compared to that
used up by a supercomputer. The potential
for information storage in molecular computers follows the same trend as
speed and efficiency. While storage media of today, such as videotapes,
store information at a density of one bit per 1012 cubic
nanometer, the molecules of DNA make this figure seem ridiculous, with an
information storage density of 1 bit per cubic nanometer - a trillion
times less space! Even though
Adleman’s molecular computer would have a hard time multiplying two
100-digit integers, an easy task for one of today’s electronic
computers, its capability to solve complex to solve complex problems is
unparalleled. However, as work continues in this exciting area, molecular
computers may impress us once again and challenge the dominance of
electronic systems in solving even more problems. After all, the DNA based
system of computation has had millions of years to evolve and perfect
itself, while the man-made systems have only existed for a small fraction
of this span. It is an impressive computer indeed that can spend eons
producing new and varied organisms through trial and error until it
finally finds a solution - the intelligent species we call human beings.
The computers of tomorrow are on the threshold. DNA computing is
where silicon computing was the year after the transistor was invented. It
does not do anything useful yet, but who knows what might happen if you
play with it. DNA computing is
not a here-and-now-practical technology; it’s a pie-in-the-sky research
project. It has astounding possibilities, but it’s going to take a lot
of good ideas, hard work and luck to realize its potential. Other Algorithms where
DNA Computing would be preferred
A paper
by Dr. Adleman showed how the same principles can be applied to breaking
the Data Encryption Standard (DES). He described in detail a library of
operations, which are useful when working with a molecular computer. He
estimated that given one arbitrary (plain-text, cipher-text) pair, one can
recover the DES key in about 4 months of work. Furthermore he showed that
under chosen plain-text attack, it is possible to recover the DES key in
one day using some preprocessing. This method can be generalized to break
any cryptosystem, which uses keys of length less then 64 bits. Another
hard solvable problem is the dividing of a big number to its prime
numbers. This may be used to break the public key encryption. There are
lots of problems which come from the graph theory or from operations
research that are also hard solvable problems and are worth solving with
DNA computing. Conclusion The idea of DNA Based computing is to subvert the
mechanisms produced by evolution and use them to do data processing we
want to do. Advantages
Disadvantages
of DNA Computing:
References
1.
Molecular Computation of Solutions to Combinatorial Problems Nov. 11th 1994
…………. By Leonard Adleman 2.
Breaking DES Using a Molecular Computer
Priceton CS Tech Report 1996
………… By C. Dunworth et al 3.
On
the Computational Power of DNA ………….
By D. Boneh and R. Lipton 4.
Web
Sites: http://www.englib.cornell.edu/scitech/w96/DNA.html http://www.bis.med.jhmi.edu/Dan/DOE/prim1.html http://www.utu.fi/ml/mat/tucs/dna.html http://crypto.stanford.edu/~dabo/biocomp.html http://www.accessexcellence.org/AB/GG/human.html http://www.msci.memphis.edu/~garzonm/bmc.html |
||