DNA- and RNA-Based Computing Systems. Группа авторовЧитать онлайн книгу.

be noted that the given procedure eliminates a large number of unsatisfying literal strings, which makes it easier to deduce the correct solution from the analysis of the remaining satisfying literal strings. The procedure is useful for large size problems. However, it also has a risk of missing some literal strings due to experimental errors that may lead to an erroneous solution to the given SAT problem.

For the given illustrative SAT problem [(x₁∨x₂) ∧ (¬x₂∨x₃)], a single‐stranded sequence for all literals is ligated to form the literal strings (x₁, ¬x₂), (x₁, x₃), (x₂, ¬x₂), and (x₂, x₃). The ligated strings are shown in Figure 2.5. These ligated strings will be subjected to restriction enzyme digestion, which eliminates the string (x₂, ¬x₂), and finally, the remaining literal strings are (x₁, ¬x₂), (x₁, x₃), and (x₂, x₃). From these remaining three literal strings, a solution to the given SAT problem is deduced by mathematical analysis. It is to be mentioned that only one literal string is eliminated for the given problem as it has only (2)² [= (number of literals in each clause)^{number of clauses}] equal to four literal strings. Though for the given illustrative example the search space is reduced only by 25% (as it removes only one literal string), for bigger problems the reduction in search space is significant. Sakamoto et al. [5] illustrated the benefit of the method by solving 6‐variable, 10‐clause problem [= (x₁ ∨x₂ ∨¬ x₃) ∧(x₁ ∨x₃ ∨ x₄) ∧(x₁ ∨¬ x₃ ∨¬ x₄) ∧(¬ x₁ ∨¬ x₃ ∨x₄) ∧(x₁ ∨¬ x₃ ∨x₅) ∧(x₁ ∨x₄ ∨¬ x₆) ∧(¬ x₁ ∨x₃ ∨ x₄) ∧(x₁ ∨x₃ ∨¬ x₄) ∧(¬ x₁ ∨¬ x₃ ∨¬ x₄) ∧(¬ x₁ ∨ x₃ ∨¬ x₄)] having three literals in each clause. This example has 3¹⁰ = 59 049 literal strings out of which only 24 literal strings were found to be satisfying the condition. Thus, ∼99.95% of the search space is eliminated using the above methodology to finally obtain the correct solution just by analyzing 24 literal strings. One of the advantages of this methodology is the use of just one step, unlike sequential elimination steps used in earlier models.

Illustration of four literal strings ligated for the DNA hairpin formation-based computation.

Figure 2.5 Illustration of four literal strings for the DNA hairpin formation‐based computation.

2.2.5 Ouyang's Model

Ouyang et al. [11] solved a maximal clique problem using DNA computing method. A maximal clique is the largest subset in the graph in which each vertex is connected to every other vertex in the subset. In maximal clique problem, the maximum size of the clique in terms of the vertices has to be evaluated. For example (Figure 2.6a), the graph has five vertices, and eight edges where the vertices (5, 4, 2, 1) is the largest clique; thus the maximum size of the clique is four. Ouyang et al. [11] solved the maximal clique problem using DNA computing as follows.

Illustration of the (left) five-node super graph has five vertices, and eight edges where the vertices (5, 4, 2, 1) is the largest clique and (right) its complementary graph that is used to solve the maximal clique problem.

Figure 2.6 (a) The five‐node graph and (b) its complementary graph used to solve the maximal clique problem.

First, all possible cliques in the graph of N vertices are represented by an N‐digit binary string. In the clique, if the vertex is present, then it is represented by “1,” and if the vertex is absent, then it is represented by “0.” For the case of 5‐vertex graph (shown in Figure 2.6a), a clique involving {5, 4, 2} vertices is represented by a binary string as {11010}. Initially, all possible combinations of this N‐digit binary number are generated. Some of the vertices in the graph are not connected by the edges. A graph of such unconnected vertices is referred to as the complementary graph (see Figure 2.6b). In the next step, the combinations comprising the edges present in the complementary graph are removed. For the given illustrative example (Figure 2.6b), the combinations with {cc11c} and {c11cc} are removed from the data pool (c ɛ {0, 1}). Lastly, find out the binary number having the largest number of “1,” which represents the size of the maximal clique. This procedure is performed using the DNA sequences as follows.

Each bit in the above binary string corresponds to a vertex and is represented as a DNA sequence having three parts. In this, the second part corresponds to the vertex number, whereas the first and the third part correspond to the position number. Further, these vertices have to be connected in sequential order (e.g. V₁–V_N) as represented in the binary string. In order to have such connection, the value “0” for the first vertex is represented as P₁V₁⁰P₂, whereas the value “1” is represented by P₁V₁¹P₂. Since the next vertex V₂ has to be connected to V₁, its “0” value is represented by the complementary sequence images , whereas its value “1” is represented by the sequence images . Similarly, the next vertex V₃ will be connected to V₂ by P₃V₃⁰P₄ and P₃V₃¹P₄ for “0” and “1” values. Thus, all odd‐numbered vertices are represented by P_iV_i⁰P_i+1 and P_iV_i¹P_i+1 for “0” and “1” values, whereas those for even‐numbered vertices are represented by images and images , respectively. Further, all sequences with images have 10 nucleotides representing the second part, whereas these are absent in images . The ssDNA sequences corresponding to each bit combine in a sequential manner to form a dsDNA representing all combination of 0 or 1 for each vertex. Initially, all the sequences starting with P₁ and ending with P_N are amplified using PCR. The complementary graph is removed from the data pool by treating the DNA solutions using the restriction enzyme repeatedly for each edge present in the corresponding graph. For each edge, the DNA solution is divided into two equal parts in two tubes, and the restriction enzymes corresponding to the constituting vertices are added in the respective tubes. After restriction enzyme digestion, solutions of both tubes are added, and the process is repeated for all the edges of the corresponding graph. After this, a PCR is performed with starting P₁ and ending P_N. The amplified solutions are then analyzed using gel electrophoresis to obtain the maximal clique for the given supergraph. On the gel electrophoresis, the largest clique is corresponding to the shortest DNA strands.

Скачать книгу