DNA Computing  
 Member: Juliet Idemudia    Andreas Heigl    Levent Saribas    gongpu luo     Hassan Syed  
   
Please click here to see the presentation slides  

 

DNA – Computing  

Attribution

 

For Dr. Leonard Adleman.

 

He is the author of Molecular Computation of Solutions To Combinatorial Problems which has led to the continuously growing interest on “DNA based computers”.

 

He writes with exceptional clarity and has time to give answers to questions posted in his web site.

 

His DNA computing experiment has been heralded as the “first example of true nanotechnology” and even the “start of a new era”, forging an unprecedented link between computational science and life science.

 

Abstract

 

Molecular biologists are beginning to unravel the information-processing tools such as enzymes that evolution has spent billions of years refining. These tools are now been taken in large numbers of DNA molecules and using them as biological computer processors.

 

Dr. Leonard Adleman, a well-known scientist, found a way to exploit the speed and efficiency of the biological reactions to solve the “Hamiltonian path problem”, also known as the “traveling salesman problem”.  

 

Based on Dr. Adleman’s experiment, we will explain DNA computing, its algorithms, how to manage DNA based computing and the advantages and disadvantages of DNA computing.

 

 

Introduction

 

Molecular biology could be used to solve a difficult computational problem.

Prof. L. Adleman reported the first actual experiment on 1994.

Adleman demonstrated how standard methods of molecular biology could be used to solve a difficult computational problem. Since then the interest on DNA based computers has been continuously growing.       

Definitions

DNA means DeoxyriboNucleic Acid.

The complete set of instructions for making an organism is called its GENOME. It contains the master blueprint for all cellular structures and activities for the lifetime of the cell or organism. Found in every nucleus of a persons many trillions of cells, the human genomes consists of tightly coiled threads of deoxyribonucleic acid (DNA) and associated protein molecules, organized into structures called chromosomes.

If unwound and tied together, the strands of DNA would stretch more than 5feet but would be only 50 trillionths of an inch wide. For each organism, the components of these slender threads encode all the information necessary for building and maintaining life, from simple bacteria to remarkably complex human beings. Understanding how DNA performs this function requires some knowledge of its structure and organization.

 

DNA

In humans, as in other higher organisms, a DNA molecule consists of two strands that wrap around each other to resemble a twisted ladder whose sides, made of sugar and phosphate molecules, are connected by rings of nitrogen-containing chemicals called bases. Each strand is a linear arrangement of repeating similar units called nucleotides.  A nucleotide is composed of one sugar, one phosphate, and a nitrogenous base.

 

 

 

 

Four different bases are present in DNA. They are adenine (A), thymine (T), cytosine(C), and guanine(G). The particular order of the bases arranged along the sugar-phosphate backbone is called DNA sequence; the sequence specifies the exact genetic instructions required to create a particular organism with its own unique traits. The two DNA strands are held together by weak bonds between the bases on each strand, forming base pairs.

 

Each time a cell divides into two daughter cells, its full genome is duplicated; for humans and other complex organisms, this duplication occurs in the nucleus. During cell division, the DNA molecule unwinds and the weak bonds between the base pairs break, allowing the strands to separate. Each strand directs the synthesis of a complementary new strand, with free nucleotides matching up with their complementary bases on each of the separated strands. Strict base-pairing rules are adhered to. Adenine will pair only with thymine (an A-T pair) and cytosine with guanine (a C-G pair). Each daughter cell rules ensures that the new strand is an exact copy of the old one. This minimizes the incidences of errors (mutations) that may affect the resulting organism or its offspring.

 

 

 

 

Genes

Each DNA molecule contains many genes, the basic physical and functional units of hereditary. A gene is a specific sequence of nucleotide bases, whose sequences carry the information required for constructing proteins, which provides the structural components of cells and tissues as well as enzymes for essential biochemical reactions. The human genome is estimated to comprise at least 100,000genes.

 

 

Chromosomes

The 3 billion base pairs in the human genome are organized into 24 distinct, physically separate microscopic units called chromosomes. All genes are arranged linearly along the chromosomes. The nucleus of most human cells contains 2 sets of chromosomes, 1 set given by each parent. Chromosomes contain roughly equal parts of protein and DNA. DNA molecules are among the largest molecules now known.

 

 

 

Sequences of Base Pairs Mapping

·        Genetic maps are representations of disease traits, physiological traits, or random genes assigned to particular chromosomes and mapped relative to one another.

·       

 

·        Physical maps are not representations but overlapping collections of DNA fragments. DNA is snipped into fragments by the action of restriction enzymes, then cloned and stored in a variety of forms. This tiny fragments (measured in kilobases, kb) may then be analyzed by various means to discover the base-by-base sequence of DNA.

 

DNA Computing

Molecular biologists are beginning to unravel the information processing tools-such as enzymes, copying tools, and so on-that evolution has spent billions of years refining. Now we are taking those tools in large numbers of DNA molecules and using them as biological computer processors.

 

This is how it works:

·        Information specifying a computational problem too complex for even a supercomputer is encoded in DNA.

·        Then various molecular-biological tools are used to process this information.

Algorithm

Turing machine:

A Turing machine is, as described in theoretical informatics, a machine which may easily be described by a band, able to store information and a few operations on it. Each rule is stated in the form: "In state n, if the head is reading symbol x, write symbol y, then move left or right one cell on the tape, and change the state to m." It was shown by Dr. Adleman that the Turing machine has more computing power than every other deterministic machine especially the computer itself. A Turing machine is an all-purpose machine, which may be used for every computation.

One example of a Turing machine.

 

The goal of the computer, which works with DNA, is to develop an all-purpose computer like the Turing machine. As a matter of fact this is rather difficult. Every command should be possible in every state of the machine and it should be readable (extractable) at every point of time. Great efforts in research are necessary to implement this.

Recent experiments on DNA computing have shown that it is more efficient to use the power of DNA computing in specific algorithms where the mapping in DNA is easy and the parallel computing power of DNA’s are useable. Specifically, problems which are hard to solve with Turing machines may work better (in  terms of speed and efficiency) with DNA computing. These problems referred to, as NP-complete problems are not solvable in polynomial time with Turing machines. This means that the number of steps to solve the problems increase exponentially with the number of the input data.

NP-complete problems are common in the real world. Examples are the Hamiltonian path or the Shortest-Path in a grap.

 

The Hamiltonian Path Problem
Consider a directed graph. This graph is said to have a Hamiltonian path if and only if there exists a sequence of compatible ‘one way’ edges e1, e2, …, ez (that is a path) which begins at vin, ends at vout and enters every other vertex exactly once. Figure 1 shows a graph with an Hamiltonian path.

 

A Hamiltonian Graph

 

There are well known algorithms for deciding whether an arbitrary directed graph with designated vertices has a Hamiltonian path or not. However, all known algorithms for this problem have exponential worst-case complexity and hence there are instances of modest size for which these algorithms require an impractical amount of computer time to render a decision. Since the directed Hamiltonian path problem has been proven to be NP-complete, it seem likely that no efficient (that is, polynomial time) algorithm exists for solving it.

Adleman has shown that it is possible to solve the Hamilton path problem with DNA. He has used the following (non deterministic) algorithm to solve the directed Hamiltonian path problem:

-         Step 1: Generate random paths through the graph.

-         Step 2: Keep only those paths, which begin with vin and end with vout.

-         Step 3: If the graph has n vertices, then keep only those paths, which enter exactly n vertices

-         Step 4: Keep only those paths, which enter all of the vertices of the graph at least once.

-         Step 5: If any paths remain, say “Yes”, otherwise say “No”.

How this algorithm works by implement it with DNA’s will be explained later.

The real interesting thing on this DNA solution for the Hamiltonian path problems is that most input data grow just linear with the growth of the number of edges. Just the quantity of DNA’s should grow exponentially but this is not a problem because the quantity of DNA’s for all known problems is small enough. The computation itself is done in one step (step 1) because the DNA is trying all possibilities by itself. This Adleman found could be done under an hour in parallel. This computing power is impossible in a normal computer.

Using DNA Computing to solve the “Hamiltonian path problem”.

 

Dr. Adleman found a way to exploit the speed and efficiency of the biological reactions to solve the “Hamiltonian path problem”. His DNA computing experiment is known as the “start of a new era”, forging an unprecedented link between computational science and life science.

All computers in existence today make use of binary code -1’s and 0’s, or on’s and off’s on the circuits of a computer chip, forming the basis for every calculation a computer performs, from simple addition to the most complex differential equations. Since the DNA molecule is also a code, but instead made up of a sequence of four bases, which pair up in a predictable manner, Adleman saw the possibility of using it as a molecular computer.

Rather than relying on the position of electronic switches on a microchip, Adleman relied on the much faster reactions of DNA nucleotides binding with their complements, a brute force method that indeed worked.

 

Dr. Adleman used DNA to solve a system of six vertices, which is not difficult for modern computers, but as the number of cities grows, so does the number of paths between them, making a 1,000 city path impossible to solve for even the best supercomputers. He accomplished this through the following steps:

 

1.   He synthesize the DNA strands of known sequences, each strand 20 nucleotides long.

2.      He represented each of the six vertices of the path by a separate strand. Further he represented each edge between two consecutive vertices, such as 1 to 2, by a DNA strand which consisted of the last ten nucleotides of the strand-representing vertex 1 plus the first 10 nucleotides of the vertex 2 strand. Then, through the sheer amount of DNA molecules joining together in all possible combinations, many random paths were generated.

 

Adelman used well-established techniques of molecular for paths much larger than six vertices, the power of molecular computers is still worth comparing to today’s systems. In speed, the DNA clearly wins the race, performing 1,000 operations per second more than the fastest (which execute abrupt 1012 operations per second). To get a better idea of the speed, consider that a typical desktop computer performs a thousand million times slower than the DNA, a measly 106 operations per second!

In energy efficiently, a biological system such as a cell can perform 2x1019 power operations using one joule of energy (the amount of energy needed to burn a 100-watt light bulb for a second), while a supercomputer only manages 1010 operations, making it 1010 less energy efficient! Just as the cell pushes the limit of the second law of thermodynamics, which predicts that one joule can fuel a maximum of 34x1019 irreversible power operations, the DNA computer’s energy consumption from DNA strand synthesis and PCR should also be small compared to that used up by a supercomputer.

 

The potential for information storage in molecular computers follows the same trend as speed and efficiency. While storage media of today, such as videotapes, store information at a density of one bit per 1012 cubic nanometer, the molecules of DNA make this figure seem ridiculous, with an information storage density of 1 bit per cubic nanometer - a trillion times less space!

 

Even though Adleman’s molecular computer would have a hard time multiplying two 100-digit integers, an easy task for one of today’s electronic computers, its capability to solve complex to solve complex problems is unparalleled. However, as work continues in this exciting area, molecular computers may impress us once again and challenge the dominance of electronic systems in solving even more problems. After all, the DNA based system of computation has had millions of years to evolve and perfect itself, while the man-made systems have only existed for a small fraction of this span. It is an impressive computer indeed that can spend eons producing new and varied organisms through trial and error until it finally finds a solution - the intelligent species we call human beings.  The computers of tomorrow are on the threshold.

 

DNA computing is where silicon computing was the year after the transistor was invented. It does not do anything useful yet, but who knows what might happen if you play with it.

 

DNA computing is not a here-and-now-practical technology; it’s a pie-in-the-sky research project. It has astounding possibilities, but it’s going to take a lot of good ideas, hard work and luck to realize its potential.

Other Algorithms where DNA Computing would be preferred

 

A paper by Dr. Adleman showed how the same principles can be applied to breaking the Data Encryption Standard (DES). He described in detail a library of operations, which are useful when working with a molecular computer. He estimated that given one arbitrary (plain-text, cipher-text) pair, one can recover the DES key in about 4 months of work. Furthermore he showed that under chosen plain-text attack, it is possible to recover the DES key in one day using some preprocessing. This method can be generalized to break any cryptosystem, which uses keys of length less then 64 bits.

 

Another hard solvable problem is the dividing of a big number to its prime numbers. This may be used to break the public key encryption. There are lots of problems which come from the graph theory or from operations research that are also hard solvable problems and are worth solving with DNA computing.

Conclusion

The idea of DNA Based computing is to subvert the mechanisms produced by evolution and use them to do data processing we want to do.

 

Advantages

 

  1. By DNA computing, people can get and analyze the information of materials. Through DNA computing, we can find all the genes in the DNA sequence and to develop tools for using this information in the study of some fields, such as biology, medicine biology, physics, and so on. The team from HP and U.C.L.A. has found a way to build circuits using chemical processes, making the switches as small as a molecule. Tim Gardner, a graduate student at Boston University, recently made a genetic system that can store a single bit of information—either a 1 or a 0.
  2. More parallel: for some problem too big to fit or run in a silicon machine, DNA computer, which be with pure parallel power or massive memory, will be able to do a computation quickly than a powerful supercomputer.
  3. By creating DNA computing, some fields are combined together to reach a desirable goal, by the way, this is improving those fields and some new fields come out.

 

 

Disadvantages of DNA Computing:

  1. Slow: algorithms proposed so far use really slow molecular-biological operations. Each primitive operation takes hours when you run them with a small test tube of DNA. Scale up to the vast amounts of DNA we're talking about, and they may slow down dramatically.

 

  1. Hydrolysis: the DNA molecules can fracture. Over the six months you're computing, your DNA system is gradually turning to water. DNA molecules can break – meaning a DNA molecule, which was part of your computer, is fracture by time.

 

  1. Unreliable: every operation you want to do in your DNA computer is random. The components in the DNA computer are probabilistic. Because there are some noisy components, the computing sometimes is not reliable. If a tiny subcircuit is supposed to give the answer "1," it may yield that answer 90 percent of the time and "0" the rest of the time. To make DNA computing work, we have to figure out how to build a reliable computer out of noisy components.

 

  1. Not transmittable: the model of the DNA computer is concerned as a highly parallel computer, with each DNA molecule acting as a separate process or. In a standard multiprocessor a Connection-buses transmit information from one processor to the next. But the problem of transmitting information from one molecule to another in a DNA computer has yet to be solved. Current DNA algorithms compute successfully without passing any information, but this limits their flexibility.

 

  1. Not practical: DNA computing is not a here and now practical technology. It just is a pie-in-the-sky research project.

 

  1. No generality: Some concrete algorithms are just for solving some concrete problems. Every algorithm has some constraints on it.

 

 

References

 

1.         Molecular Computation of Solutions to Combinatorial Problems

Nov. 11th 1994

                                                                …………. By Leonard Adleman

 

 

2.         Breaking DES Using a Molecular Computer

            Priceton CS Tech Report 1996

                                                                ………… By C. Dunworth et al

 

3.            On the Computational Power of DNA

    …………. By D. Boneh and R. Lipton

 

 

4.            Web Sites:

http://www.englib.cornell.edu/scitech/w96/DNA.html

http://www.bis.med.jhmi.edu/Dan/DOE/prim1.html

http://www.utu.fi/ml/mat/tucs/dna.html

http://crypto.stanford.edu/~dabo/biocomp.html

http://www.accessexcellence.org/AB/GG/human.html

http://www.msci.memphis.edu/~garzonm/bmc.html