-
Notifications
You must be signed in to change notification settings - Fork 0
/
cpu_tour.html
366 lines (338 loc) · 17.7 KB
/
cpu_tour.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
<div style="width:700px; height:400px">
<!-- from http://devolab.cse.msu.edu/software/avida/doc/cpu_tour.html -->
<div class="revision" style="text-align:right">
Revised 2006-09-05 DMB
</div>
<div align="center">
<h1>A Guided Tour of an Ancestor and its Hardware</h1>
</div>
<p>
This document describes the structure of the classic virtual CPU and an
example organism running on it.
</p>
<h2>The Virtual CPU Structure</h2>
<p>
The virtual CPU, which is the default "body" or "hardware" of the organisms,
contains the following set of components, (as further illustrated in the
figure below).
</p>
<ul>
<li>A <strong style="color: #006666">memory</strong> that consists of a sequence
of instructions, each associated with a set of flags to denote if the
instruction has been executed, copied, mutated, etc.</li>
<li>An <strong style="color: #880088">instruction pointer</strong> (IP) that
indicates the next site in the memory to be executed.</li>
<li>Three <strong style="color: #FF8888">registers</strong> that can be used
by the organism to hold data currently being manipulated. These are often
operated upon by the various instructions, and can contain arbitrary
32-bit integers.</li>
<li>Two <strong style="color: #008800">stacks</strong> that are used for storage.
The organism can theoretical store an arbitrary amount of data in the
stacks, but for practical purposes we currently limit the maximum stack
depth to ten.</li>
<li>An <strong style="color: #CCCC00">input buffer</strong> and an
<strong style="color: #CCCC00">output buffer</strong> that the organism
uses to receive information, and return the processed results.</li>
<li>A <strong style="color: #8888FF">Read-Head</strong>, a
<strong style="color: #8888FF">Write-Head</strong>, and a
<strong style="color: #8888FF">Flow-Head</strong> which are used to
specify positions in the CPU's memory. A copy command reads from the
Read-Head and writes to the Write-Head. Jump-type statements move the IP
to the Flow-Head.</li>
</ul>
<div align="center">
<img src="images/cpu2.gif" />
</div>
<p> </p>
<h2>Instruction Set Configuration</h2>
<p>
The instruction set in Avida is loaded on startup from a configuration file
specified in the <kbd>avida.cfg</kbd> file. This allows selection of different
instruction sets without recompiling the source code, as well as allowing different
sized instruction sets to be specified. It is not possible to alter the behavior
of individual instructions or add new instructions without recompiling Avida;
such activities have to be done directly in the source code.
<p>
The available instructions are listed in the <kbd>inst_set.*</kbd> files with
a <code>1</code> or a <code>0</code> next to an instruction to indicate if it
should or should not be included. Changing the instruction set to be used
simply involves adjusting these flags.
<p>
The instructions were created with three things in mind:
<ul>
<li>
To be as complete as possible (both in a "Turing complete" sense -- that
is, it can compute any computable function -- and, more practically, to ensure that
simple operations only require a few instructions).
</li>
<li>
For each instruction to be as robust and versatile as possible; all instructions
should take an "appropriate" action in any situation where they can be
executed.
</li>
<li>
To have as little redundancy as possible between instructions. (Several instructions
have been implemented that are redundant, but such combinations will typically not
be turned on simultaneously for a run.)
</li>
</ul>
<p>
One major concept that differentiates this virtual assembly language from its
real-world counterparts is in the additional uses of <code>nop</code> instructions
(no-operation commands).
These have no direct effect on the virtual CPU when executed, but often
modify the effect of any instruction that precedes them. In a sense, you
can think of them as purely regulatory genes. The default instruction set
has three such <code>nop</code> instructions: <code>nop-A</code>, <code>nop-B</code>,
and <code>nop-C</code>.
</p>
<p>
The remaining instructions can be seperated into three classes. The first
class is those few instructions that are unaffected by nops. Most of these
are the "biological" instructions involved directly in the replication
process. The second class of instructions is those for which a <code>nop</code>
changes the head or register affected by the previous command. For example,
an <code>inc</code> command followed by the instruction <code>nop-A</code> would
cause the contents of the <code>AX</code> register to be incremented, while an
<tt>inc</tt> command followed by a <tt>nop-B</tt> would increment <code>BX</code>.
</p>
<p>
The notation we use in instruction definitions to describe that a default
component (that is, a register or head) can be replaced due to a nop command
is by surrounding the component name with <code>?</code>'s. The component listed is the
default one to be used, but if a <code>nop</code> follows the command, the
component it represents in this context will replace this default.
If the component between the question marks is a register than a
subsequent <code>nop-A</code> represents the <code>AX</code> register,
<code>nop-B</code> is <code>BX</code>, and <code>nop-C</code> is <code>CX</code>.
If the component listed is a head (including the instruction pointer) then a
<code>nop-A</code> represents the Instruction Pointer, <code>nop-B</code> represents
the Read-Head, and <code>nop-C</code> is the Write-Head. Currently
the Flow-Head has no <code>nop</code> associated with it.
</p>
<p>
The third class of instructions are those that use a series of <tt>nop</tt>
instructions as a template (label) for a command that needs to reference
another position in the code, such as
<code>h-search</code>. If <code>nop-A</code> follows a search command, it scans for
the first complementary template (<code>nop-B</code>) and moves the Flow-Head
there. Templates may be composed of more than a single <code>nop</code>
instruction. A series of nops is typically abbreviated to the associated
letter and separated by colons. This the sequence "<code>nop-A nop-A nop-C</code>"
would be displayed as "<code>A:A:C</code>".
</p>
<p>
The label system used in Avida allows for an arbitrary number of nops. By
default, we have three: <code>nop-A</code>'s complement is <code>nop-B</code>,
<code>nop-B</code>'s is <code>nop-C</code>, and <code>nop-C</code>'s is <code>nop-A</code>.
Likewise, some instructions talk about the complement of a register or head
-- the same pattern is used in those cases. So if an instruction tests if
<code>?BX?</code> is equal to its complement, it will test if
<code>BX == CX</code> by default, but if it is followed by a
<code>nop-C</code> it will test if <code>CX == AX</code>.
</p>
<p> </p>
<h2>Instruction Set Reference</h2>
<p>
An abbreviated description of the 26
default instructions is below.
<table>
<tr><th valign=top>(a-c) <td><code>nop-A</code>, <code>nop-B</code>, <br> and <code>nop-C</code>
<td valign=top>No-operation instructions; these modify other instructions.</tr>
<tr><th>(d) <td><code>if-n-equ</code>
<td>Execute next instruction only-if ?BX? does not equal its complement</tr>
<tr><th>(e) <td><code>if-less</code>
<td>Execute next instruction only if ?BX? is less than its complement</tr>
<tr><th>(f) <td><code>if-label</code>
<td>Execute the next instruction only if the given template complement was just copied</tr>
<tr><th>(g) <td><code>mov-head</code>
<td>Move the ?IP? to the same position as the Flow-Head</tr>
<tr><th>(h) <td><code>jmp-head</code>
<td>Move the ?IP? by a fixed amount found in CX</tr>
<tr><th>(i) <td><code>get-head</code>
<td>Write the position of the ?IP? into CX</tr>
<tr><th>(j) <td><code>set-flow</code>
<td>Move the Flow-Head to the memory position specified by ?CX?</tr>
<tr><th>(k) <td><code>shift-r</code>
<td>Shift all the bits in ?BX? one to the right</tr>
<tr><th>(l) <td><code>shift-l</code>
<td>Shift all the bits in ?BX? one to the left</tr>
<tr><th>(m) <td><code>inc</code>
<td>Increment ?BX?</tr>
<tr><th>(n) <td><code>dec</code>
<td>Decrement ?BX?</tr>
<tr><th>(o) <td><code>push</code>
<td>Copy the value of ?BX? onto the top of the current stack</tr>
<tr><th>(p) <td><code>pop</code>
<td>Remove a number from the current stack and place it in ?BX?</tr>
<tr><th>(q) <td><code>swap-stk</code>
<td>Toggle the active stack</tr>
<tr><th>(r) <td><code>swap</code>
<td>Swap the contents of ?BX? with its complement.</tr>
<tr><th>(s) <td><code>add</code>
<td>Calculate the sum of BX and CX; put the result in ?BX?</tr>
<tr><th>(t) <td><code>sub</code>
<td>Calculate the BX minus CX; put the result in ?BX?</tr>
<tr><th>(u) <td><code>nand</code>
<td>Perform a bitwise NAND on BX and CX; put the result in ?BX?</tr>
<tr><th>(v) <td><code>h-copy</code>
<td>Copy an instruction from the Read-Head to the Write-Head and advance both.</tr>
<tr><th>(w) <td><code>h-alloc</code>
<td>Allocate memory for an offspring</tr>
<tr><th>(x) <td><code>h-divide</code>
<td>Divide off an offspring located between the Read-Head and Write-Head.</tr>
<tr><th>(y) <td><code>IO</code>
<td>Output the value ?BX? and replace it with a new input</tr>
<tr><th>(z) <td><code>h-search</code>
<td>Find a complement template and place the Flow-Head after it.</tr>
</table>
<h3>An Example Ancestor</h3>
The following organism is stored in the file <kbd>organism.heads.15</kbd>,
which you should find in the <kbd>support/config/misc/</kbd> directory. This is a
simplified version of <kbd>organism.default</kbd> and
<kbd>organism.heads.100</kbd>, of lengths 50 and 100 respectively (each has
additional instructions placed before the copy loop)
<p>
<table>
<tr><td colspan=2><font color="#886600"># --- Setup ---</font><br>
<tr><td><b><code>h-alloc</code></b> <td><font color="#886600"># Allocate extra space at the end of the genome to copy the offspring into.</font><br>
<tr><td><b><tt>h-search </tt></b> <td><font color="#886600"># Locate an
<tt>A:B</tt> template (at the end of the organism) and
place the Flow-Head after it</font><br>
<tr><td><b><tt>nop-C</tt></b> <td><font color="#886600">#</font><br>
<tr><td><b><tt>nop-A</tt></b> <td><font color="#886600">#</font><br>
<tr><td><b><tt>mov-head</tt></b> <td><font color="#886600"># Place the
Write-Head at the Flow-Head (which is at beginning of
offspring-to-be).</font><br>
<tr><td><b><tt>nop-C</tt></b> <td><font color="#886600"># [ Extra
<tt>nop-C</tt> commands can be placed here w/o harming the
organism! ]</font><br>
<tr><td>
<tr><td colspan=2><font color="#886600"># --- Copy Loop ---</font><br>
<tr><td><b><tt>h-search</tt></b> <td><font color="#886600"># No template,
so place the Flow-Head on the next line code</font><br>
<tr><td><b><code>h-copy</code></b> <td><font color="#886600"># Copy a single
instruction from the read head to the write head (and advance both
heads!)</font><br>
<tr><td><b><code>if-label</code></b> <td><font color="#886600"># Execute the
line following this template <i>only if</i> we have just copied an
<code>A:B</code> template.</font><br>
<tr><td><b><code>nop-C</code></b> <td><font color="#886600">#</font><br>
<tr><td><b><code>nop-A</code></b> <td><font color="#886600">#</font><br>
<tr><td><b><code>h-divide</code></b> <td><font color="#886600"># ...Divide off offspring! (note if-statement above!)</font><br>
<tr><td><b><code>mov-head</code></b> <td><font color="#886600"># Otherwise,
move the IP back to the Flow-Head at the beginning of the copy
loop.</font><br>
<tr><td><b><code>nop-A</code></b> <td><font color="#886600"># End label.</font><br>
<tr><td><b><code>nop-B</code></b> <td><font color="#886600"># End label.</font><br>
</table>
<p>
This program begins by allocating extra space for its offspring. The
exact amount of space does not need to be specified -- it will allocate as
much as it is allowed to. The organism will then do a search for the end
of its genome (where this new space was just placed) so that it will know
where to start copying. First the Flow-Head is placed there, and then the
Write-Head is moved to the same point.
</p>
<p>
It is after this initial setup and before the actual copying process
commences that extra <code>nop</code> instructions can be included. The only
caveat is that you need to make sure that you don't duplicate any templates
that the program will be searching for, or else it will no longer function
properly. The easiest thing to do is insert a long sequence of <code>nop-C</code>
instructions.
</p>
<p>
Next we have the beginning of the "copy loop". This segement of code starts
off with an <code>h-search</code> command with no template following it. In
such as case, the Flow-Head is placed on the line immediately following the
search. This head will be used to designate the place that the IP keeps
returning to with each cycle of the loop.
</p>
<p>
The <code>h-copy</code> command will copy a single instruction from the Read-Head
(still at the very start of the genome, where it begins) to the Write-Head
(which we placed at the beginning of the offspring). With any copy command
there is a user-specified chance of a copy mutation. If one occurs, the
Write-Head will place a random instruction rather than the one that it
gathered from the Read-Head. After the copy occurs (for better or worse),
both the Read-Head and the Write-Head are advanced to the next instruction
in the genome. It is for this reason that a common mutation we see happening
will place a long string of h-copy instruction one after another.
</p>
<p>
The next command, <code>if-label</code> (followed by a <code>nop-C</code> and a
<code>nop-A</code>) tests to see if the complement of <code>C:A</code> is the most
thing copied. That is, if the two most
recent instructions copied were a <code>nop-A</code> followed by a <code>nop-B</code>
as is found at the end of the organism. If so, we are done! Execute the
next instruction which is <code>h-divide</code> (when this occurs, the read and
write heads will surround the portion of memory to be split off as the
offspring's genome). If not, then we need to keep
going. Skip the next instruction and move on to the <code>mov-head</code> which
will move the head specified by the <code>nop</code> that follows (in this case
<code>nop-A</code> which is the Instruction Pointer) to the Flow-Head at the
beginning of the copy loop.
</p>
<p>
This process will continue until all of the lines of code have been copied,
and an offspring is born.
</p>
<h3>An Example Logic Gene</h3>
<p>
Here is a short example program to demonstrate one way for an organism to
perform the "OR" logic operation. This time I'm only going to show the
contents of the registers after each command because the functionality of
the individual instructions should be clear, and the logic itself won't
be helped much by a line-by-line explanation in English.
<p>
<center>
<table width=75%>
<tr><th bgcolor="#CCCCFF" width=11%>Line #
<th bgcolor="#CCCCFF" width=34%>Instruction
<td bgcolor="#CCCCFF" width=11%><b>AX</b>
<td bgcolor="#CCCCFF" width=11%><b>BX</b>
<td bgcolor="#CCCCFF" width=11%><b>CX</b>
<td bgcolor="#CCCCFF" width=11%><b>Stack</b>
<td bgcolor="#CCCCFF" width=11%><b>Output</b>
<tr><th bgcolor="#FF88FF"> 1 <th bgcolor="#88FFFF">IO
<td bgcolor="#FFBBBB">? <td bgcolor="#FFBBBB">X <td bgcolor="#FFBBBB">?
<td bgcolor="#AAFFAA">? <td bgcolor="#FFFF88"> ?
<tr><th bgcolor="#FF88FF"> 2 <th bgcolor="#88FFFF">push
<td bgcolor="#FFBBBB">? <td bgcolor="#FFBBBB">X <td bgcolor="#FFBBBB">?
<td bgcolor="#AAFFAA">X, ? <td bgcolor="#FFFF88">
<tr><th bgcolor="#FF88FF"> 3 <th bgcolor="#88FFFF">pop
<td bgcolor="#FFBBBB">? <td bgcolor="#FFBBBB">X <td bgcolor="#FFBBBB">X
<td bgcolor="#AAFFAA">? <td bgcolor="#FFFF88">
<tr><th bgcolor="#FF88FF"> 4<th bgcolor="#88FFFF">nop-C <td colspan=5>
<tr><th bgcolor="#FF88FF"> 5 <th bgcolor="#88FFFF">nand
<td bgcolor="#FFBBBB">~X <td bgcolor="#FFBBBB">X <td bgcolor="#FFBBBB">X
<td bgcolor="#AAFFAA">? <td bgcolor="#FFFF88">
<tr><th bgcolor="#FF88FF"> 6<th bgcolor="#88FFFF">nop-A <td colspan=5>
<tr><th bgcolor="#FF88FF"> 7 <th bgcolor="#88FFFF">IO
<td bgcolor="#FFBBBB">~X <td bgcolor="#FFBBBB">Y <td bgcolor="#FFBBBB">X
<td bgcolor="#AAFFAA">? <td bgcolor="#FFFF88"> X
<tr><th bgcolor="#FF88FF"> 8 <th bgcolor="#88FFFF">push
<td bgcolor="#FFBBBB">~X <td bgcolor="#FFBBBB">Y <td bgcolor="#FFBBBB">X
<td bgcolor="#AAFFAA">Y, ? <td bgcolor="#FFFF88">
<tr><th bgcolor="#FF88FF"> 9 <th bgcolor="#88FFFF">pop
<td bgcolor="#FFBBBB">~X <td bgcolor="#FFBBBB">Y <td bgcolor="#FFBBBB">Y
<td bgcolor="#AAFFAA">? <td bgcolor="#FFFF88">
<tr><th bgcolor="#FF88FF">10<th bgcolor="#88FFFF">nop-C <td colspan=5>
<tr><th bgcolor="#FF88FF">11 <th bgcolor="#88FFFF">nand
<td bgcolor="#FFBBBB">~X <td bgcolor="#FFBBBB">~Y <td bgcolor="#FFBBBB">Y
<td bgcolor="#AAFFAA">? <td bgcolor="#FFFF88">
<tr><th bgcolor="#FF88FF">12 <th bgcolor="#88FFFF">swap
<td bgcolor="#FFBBBB">Y <td bgcolor="#FFBBBB">~Y <td bgcolor="#FFBBBB">~X
<td bgcolor="#AAFFAA">? <td bgcolor="#FFFF88">
<tr><th bgcolor="#FF88FF">13<th bgcolor="#88FFFF">nop-C <td colspan=5>
<tr><th bgcolor="#FF88FF">14 <th bgcolor="#88FFFF">nand
<td bgcolor="#FFBBBB">Y <td bgcolor="#FFBBBB">X or Y <td bgcolor="#FFBBBB">~X
<td bgcolor="#AAFFAA">? <td bgcolor="#FFFF88">
<tr><th bgcolor="#FF88FF">15 <th bgcolor="#88FFFF">IO
<td bgcolor="#FFBBBB">Y <td bgcolor="#FFBBBB">Z <td bgcolor="#FFBBBB">~X
<td bgcolor="#AAFFAA">? <td bgcolor="#FFFF88"> X or Y
</table>
</center>
<hr />
</div>