forked from orangeduck/BuildYourOwnLisp
-
Notifications
You must be signed in to change notification settings - Fork 1
/
chapter9_s_expressions.html
523 lines (369 loc) · 33.8 KB
/
chapter9_s_expressions.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
<h1>S-Expressions <small>• Chapter 9</small></h1>
<h2 id='lists_and_lisps'>Lists and Lisps</h2> <hr/>
<div class='pull-right alert alert-warning' style="margin: 15px; text-align: center;">
<img src="/static/img/lisp.png" alt="lisp" class="img-responsive" width="253px" height="300px"/>
<p><small>ALL CAPS • SO RIGHT YET SO WRONG.</small></p>
</div>
<p>Lisps are famous for having little distinction between data and code. They use the same structures to represent both. This allows them to do many powerful things which other languages cannot do. If we want this power for our programming language we're going to have to separate out the process of <em>reading</em> in input, and <em>evaluating</em> the input we have stored.</p>
<p>The final result of this chapter will only differ slightly in behaviour from the previous chapter. This is because we are going to spend time changing how things work internally. This is called <em>re-factoring</em> and it will make our life a lot easier later on. Like preparation for a meal, just because we're not putting food onto plates it doesn't mean we're wasting time. Sometimes the anticipation is even better than eating!</p>
<p>To store the program we will need to create an internal list structure that is built up recursively of numbers, symbols, and other lists. In Lisp, this structure is commonly called an S-Expression standing for <em>Symbolic Expression</em>. We will extend our <code>lval</code> structure to be able to represent it. The evaluation behaviour of S-Expressions is the behaviour typical of Lisps, that we are used to so far. To evaluate an S-Expression we look at the first item in the list, and take this to be the operator. We then look at all the other items in the list, and take these as operands to get the result.</p>
<p>By introducing S-Expressions we'll finally be entering the world of Lisp.</p>
<h2 id='pointers'>Pointers</h2> <hr/>
<p>In C no concept of lists can be explored without dealing properly with pointers. Pointers are a famously misunderstood aspect of C. They are difficult to teach because while being conceptually very simple, they come with a lot of new terminology, and often no clear use-case. This makes them appear far more monstrous than they are. Luckily for us, we have a couple of ideal use-cases, both of which are extremely typical in C, and will likely end up being how you use pointers 90% of the time.</p>
<p>The reason we need pointers in C is because of how function calling works. When you call a function in C the arguments are always passed <em>by value</em>. This means <em>a copy</em> of them is passed to the function you call. This is true for <code>int</code>, <code>long</code>, <code>char</code>, and user-defined <code>struct</code> types such as <code>lval</code>. Most of the time this is great but occasionally it can cause issues.</p>
<p>A common problem occurs when we have a large struct containing many other sub structs we wish to pass around. Every time we call a function we must create another copy of it. Suddenly the amount of data that needs to be copied around just to call a function can become huge!</p>
<p>A second problem is this. When we define a <code>struct</code>, it is always a fixed size. It has a limited number of fields, and each field is itself a fixed size. If I want to call a function with just <em>a list of things</em>, where the number of <em>things</em> varies from call to call, clearly I can't use a <code>struct</code> to do this.</p>
<p>To get around these issues the developers of C (or y'know...someone) came up with a clever idea. They imagined computer memory as a single huge list of bytes. In this list each byte can be given a global index, or position. A bit like a house number. The first byte is numbered <code>0</code>, the second is <code>1</code>, etc.</p>
<p>In this case, all the data in the computer, including the structs and variables used in the currently running program, start at some index in this huge list. If, rather than copying the data itself to a function, we instead copy a number representing the <em>index</em> at where this data starts, the function being called can look up any amount of data it wants.</p>
<p>By using <em>addresses</em> instead of the actual data, we can allow a function to access and modify some location in memory without having to copy any data. Functions can also use pointers to do other stuff, like output data to some address given as input.</p>
<p>Because the total size of computer memory is fixed, the number of bytes needed to represent an address is always the same. But if we keep track of it, the number of bytes the address points to can grow and shrink. This means we can create a variable sized data-structure and still <em>pass</em> it to a function, which can inspect and modify it.</p>
<p>So a pointer is just a number. A number representing the starting index of some data in memory. The type of the pointer hints to us, and the compiler, what type of data might be accessible at this location.</p>
<p>We can declare pointer types by suffixing existing ones with the <code>*</code> character. We've seen some examples of this already with <code>mpc_parser_t*</code>, <code>mpc_ast_t*</code>, or <code>char*</code>.</p>
<p>To create a pointer to some data, we need to get its index, or <em>address</em>. To get the address of some data we use the <em>address of</em> operator <code>&</code>. Again you've seen this before when we passed in a pointer to <code>mpc_parse</code> so it would output into our <code>mpc_result_t</code>.</p>
<p>Finally to get the data at an address, called <em>dereferencing</em>, we use the <code>*</code> operator on the left-hand side of a variable. To get the data at the field of a pointer to a struct we use the arrow <code>-></code>. This you saw in chapter 7.</p>
<h2 id='the_stack_and_the_heap'>The Stack & The Heap</h2> <hr/>
<p>I said that memory can be visualised of as one long list of bytes. Actually it is better to imagine it split into two sections. These sections are called <em>The Stack</em> and <em>The Heap</em>.</p>
<p>Some of you may have heard tales of these mysterious locations, such as <em>"the stack grows down but the heap grows up"</em>, or <em>"there can be many stacks, but only one heap"</em>. These sorts of things don't matter much. Dealing with the stack and the heap in C can be complex, but it doesn't have to be a mystery. In essence they are just two sections of memory used for two different tasks.</p>
<h3>The Stack</h3>
<div class='pull-right alert alert-warning' style="margin: 15px; text-align: center;">
<img src="/static/img/building.png" alt="building" class="img-responsive" width="397px" height="318px"/>
<p><small>The Stack • Like what you do with bricks.</small></p>
</div>
<p>The Stack is the memory where your program lives. It is where all of your temporary variables and data structures exist as you manipulate and edit them. Every time you call a function a new area of the stack is put aside for it to use. Into this area are put local variables, copies of any arguments passed to the function, as well as some bookkeeping data such as who the caller was, and what to do when finished. When the function is done the area it used is unallocated, ready for use again by someone else.</p>
<p>I like to think of the stack as a building site. Each time we need to do something new we corner off a section of space, enough for our tools and materials, and set to work. We can still go to other parts of the site, or go off-site, if we need certain things, but all our work is done in this section. Once we are done with some task, we take what we've constructed to a new place and clean up that section of the space we've been using to make it.</p>
<h3>The Heap</h3>
<div class='pull-right alert alert-warning' style="margin: 15px; text-align: center;">
<img src="/static/img/storage.png" alt="storage" class="img-responsive" width="272px" height="300px"/>
<p><small>The Heap • U LOCK. KEEP KEY.</small></p>
</div>
<p>The Heap is a section of memory put aside for storage of objects with a longer lifespan. Memory in this area has to be manually allocated and deallocated. To allocate new memory the <code>malloc</code> function is used. This function takes as input the number of bytes required, and returns back a pointer to a new block of memory with that many bytes set aside.</p>
<p>When done with the memory at that location it must be released again. To do this the pointer received from <code>malloc</code> should be passed to the <code>free</code> function.</p>
<p>Using the Heap is trickier than the Stack because it requires the programmer to remember to call <code>free</code> and to call it correctly. If he or she doesn't, the program may continuously allocate more and more memory. This is called a <em>memory leak</em>. An easy rule to avoid this is to ensure for each <code>malloc</code> there is a corresponding (and only one corresponding) <code>free</code>. If this can always be ensured the program should be handling The Heap correctly.</p>
<p>I imagine the Heap like a huge U-Store-It. We can call up the reception with <code>malloc</code> and request a number of boxes. With these boxes we can do what we want, and we know they will persist no matter how messy the building site gets. We can take things to and from the U-Store-It and the building site. It is useful to store materials and large objects which we only need to retrieve once in a while. The only problem is we need to remember to call the receptionist again with <code>free</code> when we are done. Otherwise soon we'll have requested all the boxes, have no space, and run up a huge bill.</p>
<h2 id='parsing_expressions'>Parsing Expressions</h2> <hr/>
<p>Because we're now thinking in S-Expressions, and not Polish Notation we need to update our parser. The syntax for S-Expressions is simple. It is just a number of other Expressions between parentheses, where an Expression can be a Number, Operator, or other S-Expression . We can modify our existing parse rules to reflect this. We also are going to rename our <code>operator</code> rule to <code>symbol</code>. This is in anticipation of adding more operators, variables and functions later.</p>
<pre><code data-language='c'>mpc_parser_t* Number = mpc_new("number");
mpc_parser_t* Symbol = mpc_new("symbol");
mpc_parser_t* Sexpr = mpc_new("sexpr");
mpc_parser_t* Expr = mpc_new("expr");
mpc_parser_t* Lispy = mpc_new("lispy");
mpca_lang(MPCA_LANG_DEFAULT,
" \
number : /-?[0-9]+/ ; \
symbol : '+' | '-' | '*' | '/' ; \
sexpr : '(' <expr>* ')' ; \
expr : <number> | <symbol> | <sexpr> ; \
lispy : /^/ <expr>* /$/ ; \
",
Number, Symbol, Sexpr, Expr, Lispy);</code></pre>
<p>We should also remember to cleanup these rules on exit.</p>
<pre><code data-language='c'>mpc_cleanup(5, Number, Symbol, Sexpr, Expr, Lispy);</code></pre>
<h2 id='expression_structure'>Expression Structure</h2> <hr/>
<p>We need a way to store S-Expressions as <code>lval</code>. This means we'll also need to store <em>Symbols</em> and <em>Numbers</em>. We're going to add two new <code>lval</code> types to the <code>enum</code>. The first is <code>LVAL_SYM</code>, which we're going to use to represent operators such as <code>+</code>. The second new type is <code>LVAL_SEXPR</code> which we're going to use to represent S-Expressions.</p>
<pre><code data-language='c'>enum { LVAL_ERR, LVAL_NUM, LVAL_SYM, LVAL_SEXPR };</code></pre>
<p>S-Expressions are variable length <em>lists</em> of other values. As we learnt at the beginning of this chapter we can't create variable length structs, so we are going to need to use a pointer. We are going to create a pointer field <code>cell</code> which points to a location where we store a list of <code>lval*</code>. More specifically pointers to the other individual <code>lval</code>. Our field should therefore be a double pointer type <code>lval**</code>. A <em>pointer to <code>lval</code> pointers</em>. We will also need to keep track of how many <code>lval*</code> are in this list, so we add an extra field <code>count</code> to record this.</p>
<p>To represent symbols we're going to use a string. We're also going to change the representation of errors to a string. This means we can store a unique error message rather than just an error code. This will make our error reporting better and more flexible, and we can get rid of the original error <code>enum</code>. Our updated <code>lval</code> struct looks like this.</p>
<pre><code data-language='c'>typedef struct lval {
int type;
long num;
/* Error and Symbol types have some string data */
char* err;
char* sym;
/* Count and Pointer to a list of "lval*" */
int count;
struct lval** cell;
} lval;</code></pre>
<div class="alert alert-warning">
<p><strong>Are there ever pointers to pointers to pointers?</strong></p>
<p>There is an old programming joke which says you can rate C programmers by how many stars are on their pointers.</p>
<p>Beginner's programs might only use <code>char*</code> or the odd <code>int*</code>, so they were called <em>one star programmers</em>. Most intermediate programs contain double pointer types such as <code>lval**</code>. These programmers are therefore called <em>two star programmers</em>. To spot a triple pointer is something special. You would be viewing the work of someone grand and terrible, writing code not meant to be read with mortal eyes. As such being called a <em>three star programmer</em> is rarely a compliment.</p>
<p>As far as I know, a quadruple pointer has never been seen in the wild.</p>
</div>
<div class="alert alert-warning">
<p><strong>What is that <code>struct</code> keyword doing there?</strong></p>
<p>Our new definition of <code>lval</code> needs to contain a reference to itself. This means we have to slightly change how it is defined. Before we open the curly brackets we can put the name of the struct, and then refer to this inside the definition using <code>struct lval</code>. Even though a struct can refer to its own type, it must only contain pointers to its own type, not its own type directly. Otherwise the size of the struct would refer to itself, and grow infinite in size when you tried to calculate it!</p>
</div>
<h2 id="constructors_and_destructors">Constructors & Destructors</h2> <hr/>
<p>We can change our <code>lval</code> construction functions to return pointers to an <code>lval</code>, rather than one directly. This will make keeping track of <code>lval</code> variables easier. For this we need to use <code>malloc</code> with the <code>sizeof</code> function to allocate enough space for the <code>lval</code> struct, and then to fill in the fields with the relevant information using the arrow operator <code>-></code>.</p>
<p>When we construct an <code>lval</code> its fields may contain pointers to other things that have been allocated on the heap. This means we need to be careful. Whenever we are finished with an <code>lval</code> we also need to delete the things it points to on the heap. We will have to make a rule for ourselves. Whenever we free the memory allocated for an <code>lval</code>, we also free all the things it points to.</p>
<pre><code data-language='c'>/* Construct a pointer to a new Number lval */
lval* lval_num(long x) {
lval* v = malloc(sizeof(lval));
v->type = LVAL_NUM;
v->num = x;
return v;
}
</code></pre>
<pre><code data-language='c'>/* Construct a pointer to a new Error lval */
lval* lval_err(char* m) {
lval* v = malloc(sizeof(lval));
v->type = LVAL_ERR;
v->err = malloc(strlen(m) + 1);
strcpy(v->err, m);
return v;
}
</code></pre>
<pre><code data-language='c'>/* Construct a pointer to a new Symbol lval */
lval* lval_sym(char* s) {
lval* v = malloc(sizeof(lval));
v->type = LVAL_SYM;
v->sym = malloc(strlen(s) + 1);
strcpy(v->sym, s);
return v;
}
</code></pre>
<pre><code data-language='c'>/* A pointer to a new empty Sexpr lval */
lval* lval_sexpr(void) {
lval* v = malloc(sizeof(lval));
v->type = LVAL_SEXPR;
v->count = 0;
v->cell = NULL;
return v;
}
</code></pre>
<div class="alert alert-warning">
<p><strong>What is <code>NULL</code>?</strong></p>
<p><code>NULL</code> is a special constant that points to memory location <code>0</code>. In many places it is used as a convention to signify some non-value or empty data. Above we use it to specify that we have a data pointer, but that it doesn't point to any number of data items. You will see <code>NULL</code> crop up a lot more later on so always remember to look out for it.</p>
</div>
<div class="alert alert-warning">
<p><strong>Why are you using <code>strlen(s) + 1</code>?</strong></p>
<p>In C strings are <em>null terminated</em>. This means that the final character of them is always the zero character <code>\0</code>. This is a convention in C to signal the end of a string. It is important that all strings are stored this way otherwise programs will break in nasty ways.</p>
<p>The <code>strlen</code> function only returns the number of bytes in a string <em>excluding</em> the null terminator. This is why we need to add one, to ensure there is enough allocated space for it all!</p>
</div>
<p>We now need a special function to delete <code>lval*</code>. This should call <code>free</code> on the pointer itself to release the memory acquired from <code>malloc</code>, but more importantly it should inspect the type of the <code>lval</code>, and release any memory pointed to by its fields. If we match every call to one of the above construction functions with <code>lval_del</code> we can ensure we will get no memory leaks.</p>
<pre><code data-language="c">void lval_del(lval* v) {
switch (v->type) {
/* Do nothing special for number type */
case LVAL_NUM: break;
/* For Err or Sym free the string data */
case LVAL_ERR: free(v->err); break;
case LVAL_SYM: free(v->sym); break;
/* If Sexpr then delete all elements inside */
case LVAL_SEXPR:
for (int i = 0; i < v->count; i++) {
lval_del(v->cell[i]);
}
/* Also free the memory allocated to contain the pointers */
free(v->cell);
break;
}
/* Free the memory allocated for the "lval" struct itself */
free(v);
}</code></pre>
<h2 id='reading_expressions'>Reading Expressions</h2> <hr/>
<p>First we are going to <em>read</em> in the program and construct an <code>lval*</code> that represents it all. Then we are going to <em>evaluate</em> this <code>lval*</code> to get the result of our program. This first stage should convert the <em>abstract syntax tree</em> into an S-Expression, and the second stage should evaluate this S-Expression using our normal Lisp rules.</p>
<p>To complete the first stage we can recursively look at each node of the tree, and construct different <code>lval*</code> types depending on the <code>tag</code> and <code>contents</code> fields of the node.</p>
<p>If the given node is tagged as a <code>number</code> or <code>symbol</code>, then we use our constructors to return an <code>lval*</code> directly for those types. If the given node is the <code>root</code>, or an <code>sexpr</code>, then we create an empty S-Expression <code>lval</code> and slowly add each valid sub-expression contained in the tree.</p>
<p>To add an element to an S-Expression we can create a function <code>lval_add</code>. This function increases the count of the Expression list by one, and then uses <code>realloc</code> to reallocate the amount of space required by <code>v->cell</code>. This new space can be used to store the extra <code>lval*</code> required. Using this new space it sets the final value of the list with <code>v->cell[v->count-1]</code> to the value <code>lval* x</code> passed in.</p>
<div class="alert alert-warning">
<p><strong>Don't Lisps use <a href="http://en.wikipedia.org/wiki/Cons">Cons cells</a>?</strong></p>
<p>Other Lisps have a slightly different definition of what an S-Expression is. In most other Lisps S-Expressions are defined inductively as either an <em>atom</em> such as a symbol of number, or two other S-Expressions joined, or <em>cons</em>, together.</p>
<p>This naturally leads to an implementation using <em>linked lists</em>, a different data structure to the one we are using. I choose to represent S-Expressions as a variable sized array in this book for the purposes of simplicity, but it is important to be aware that the official definition, and typical implementation are both subtly different.</p>
</div>
<pre><code data-language='c'>lval* lval_read_num(mpc_ast_t* t) {
errno = 0;
long x = strtol(t->contents, NULL, 10);
return errno != ERANGE ?
lval_num(x) : lval_err("invalid number");
}
</code></pre>
<pre><code data-language='c'>lval* lval_read(mpc_ast_t* t) {
/* If Symbol or Number return conversion to that type */
if (strstr(t->tag, "number")) { return lval_read_num(t); }
if (strstr(t->tag, "symbol")) { return lval_sym(t->contents); }
/* If root (>) or sexpr then create empty list */
lval* x = NULL;
if (strcmp(t->tag, ">") == 0) { x = lval_sexpr(); }
if (strstr(t->tag, "sexpr")) { x = lval_sexpr(); }
/* Fill this list with any valid expression contained within */
for (int i = 0; i < t->children_num; i++) {
if (strcmp(t->children[i]->contents, "(") == 0) { continue; }
if (strcmp(t->children[i]->contents, ")") == 0) { continue; }
if (strcmp(t->children[i]->tag, "regex") == 0) { continue; }
x = lval_add(x, lval_read(t->children[i]));
}
return x;
}</code></pre>
<pre><code data-language="c">lval* lval_add(lval* v, lval* x) {
v->count++;
v->cell = realloc(v->cell, sizeof(lval*) * v->count);
v->cell[v->count-1] = x;
return v;
}
</code></pre>
<h2 id='printing_expressions'>Printing Expressions</h2> <hr/>
<p>We are now so close to trying out all of our new changes. We need to modify our print function to print out S-Expressions types. Using this we can double check that the <em>reading</em> phase is working correctly by printing out the S-Expressions we read in and verifying they match those we input.</p>
<p>To print out S-Expressions we can create another function that loops over all the sub-expressions of an expression and prints these individually separated by spaces, in the same way they are input.</p>
<pre><code data-language="c">void lval_expr_print(lval* v, char open, char close) {
putchar(open);
for (int i = 0; i < v->count; i++) {
/* Print Value contained within */
lval_print(v->cell[i]);
/* Don't print trailing space if last element */
if (i != (v->count-1)) {
putchar(' ');
}
}
putchar(close);
}
</code></pre>
<pre><code data-language='c'>void lval_print(lval* v) {
switch (v->type) {
case LVAL_NUM: printf("%li", v->num); break;
case LVAL_ERR: printf("Error: %s", v->err); break;
case LVAL_SYM: printf("%s", v->sym); break;
case LVAL_SEXPR: lval_expr_print(v, '(', ')'); break;
}
}
void lval_println(lval* v) { lval_print(v); putchar('\n'); }</code></pre>
<div class="alert alert-warning">
<p><strong>I can't declare these functions because they call each other.</strong></p>
<p>The <code>lval_expr_print</code> function calls the <code>lval_print</code> function and vice-versa. There is no way we can order them in the source file to resolve this dependency. Instead we need to <em>forward declare</em> one of them. This is declaring a function without giving it a body. It lets other functions call it, while allowing you to define it properly later on. To write a forward declaration, write the function definition but instead of the body put a semicolon <code>;</code>. In this example we should put <code>void lval_print(lval* v);</code> somewhere in the source file before <code>lval_expr_print</code>.</p>
<p>You'll definitely run into this later, and I won't always alert you to it, so try to remember how to fix it!</p>
</div>
<p>In our main loop, we can remove the evaluation for now, and instead try reading in the result and printing out what we have read.</p>
<pre><code data-language='c'>lval* x = lval_read(r.output);
lval_println(x);
lval_del(x);</code></pre>
<p>If this is successful you should see something like the following when entering input to your program.</p>
<pre><code data-language='lispy'>lispy> + 2 2
(+ 2 2)
lispy> + 2 (* 7 6) (* 2 5)
(+ 2 (* 7 6) (* 2 5))
lispy> * 55 101 (+ 0 0 0)
(* 55 101 (+ 0 0 0))
lispy></code></pre>
<h2 id='evaluating_expressions'>Evaluating Expressions</h2> <hr/>
<p>The behaviour of our evaluation function is largely the same as before. We need to adapt it to deal with <code>lval*</code> and our more relaxed definition of what constitutes an expression. We can think of our evaluation function as a kind of transformer. It takes in some <code>lval*</code> and transforms it in some way to some new <code>lval*</code>. In some cases it can just return exactly the same thing. In other cases it may modify the input <code>lval*</code> and return it. In many cases it will delete the input, and return something completely different. If we are going to return something new we must always remember to delete the <code>lval*</code> we get as input.</p>
<p>For S-Expressions we first evaluate all the children of the S-Expression. If any of these children are errors we return the first error we encounter using a function we'll define later called <code>lval_take</code>.</p>
<p>If the S-Expression has no children we just return it directly. This corresponds to the empty expression, denoted by <code>()</code>. We also check for single expressions. These are expressions with only one child such as <code>(5)</code>. In this case we return the single expression contained within the parenthesis.</p>
<p>If neither of these are the case we know we have a valid expression with more than one child. In this case we separate the first element of the expression using a function we'll define later called <code>lval_pop</code>. We then check this is a <em>symbol</em> and not anything else. If it is a symbol we check what symbol it is, and pass it, and the arguments, to a function <code>builtin_op</code> which does our calculations. If the first element is not a symbol we delete it, and the values passed into the evaluation function, returning an error.</p>
<p>To evaluate all other types we just return them directly back.</p>
<pre><code data-language="c">lval* lval_eval_sexpr(lval* v) {
/* Evaluate Children */
for (int i = 0; i < v->count; i++) {
v->cell[i] = lval_eval(v->cell[i]);
}
/* Error Checking */
for (int i = 0; i < v->count; i++) {
if (v->cell[i]->type == LVAL_ERR) { return lval_take(v, i); }
}
/* Empty Expression */
if (v->count == 0) { return v; }
/* Single Expression */
if (v->count == 1) { return lval_take(v, 0); }
/* Ensure First Element is Symbol */
lval* f = lval_pop(v, 0);
if (f->type != LVAL_SYM) {
lval_del(f); lval_del(v);
return lval_err("S-expression Does not start with symbol!");
}
/* Call builtin with operator */
lval* result = builtin_op(v, f->sym);
lval_del(f);
return result;
}
</code></pre>
<pre><code data-language="c">lval* lval_eval(lval* v) {
/* Evaluate Sexpressions */
if (v->type == LVAL_SEXPR) { return lval_eval_sexpr(v); }
/* All other lval types remain the same */
return v;
}</code></pre>
<p>There are two functions we've used and not defined in the above code. These are <code>lval_pop</code> and <code>lval_take</code>. These are general purpose functions for manipulating S-Expression <code>lval</code> types which we'll make use of here and in the future.</p>
<p>The <code>lval_pop</code> function extracts a single element from an S-Expression at index <code>i</code> and shifts the rest of the list backward so that it no longer contains that <code>lval*</code>. It then returns the extracted value. Notice that it doesn't delete the input list. It is like taking an element from a list and popping it out, leaving what remains. This means both the element popped and the old list need to be deleted at some point with <code>lval_del</code>.</p>
<p>The <code>lval_take</code> function is similar to <code>lval_pop</code> but it deletes the list it has extracted the element from. This is like taking an element from the list and deleting the rest. It is a slight variation on <code>lval_pop</code> but it makes our code easier to read in some places. Unlike <code>lval_pop</code>, only the expression you take from the list needs to be deleted by <code>lval_del</code>.</p>
<pre><code data-language="c">lval* lval_pop(lval* v, int i) {
/* Find the item at "i" */
lval* x = v->cell[i];
/* Shift memory after the item at "i" over the top */
memmove(&v->cell[i], &v->cell[i+1],
sizeof(lval*) * (v->count-i-1));
/* Decrease the count of items in the list */
v->count--;
/* Reallocate the memory used */
v->cell = realloc(v->cell, sizeof(lval*) * v->count);
return x;
}
</code></pre>
<pre><code data-language="c">lval* lval_take(lval* v, int i) {
lval* x = lval_pop(v, i);
lval_del(v);
return x;
}</code></pre>
<p>We also need to define the evaluation function <code>builtin_op</code>. This is like the <code>eval_op</code> function used in our previous chapter but modified to take a single <code>lval*</code> representing a list of all the arguments to operate on. It needs to do some more rigorous error checking. If any of the inputs are a non-number <code>lval*</code> we need to return an error.</p>
<p>First it checks that all the arguments input are numbers. It then pops the first argument to start. If there are no more sub-expressions and the operator is subtraction it performs unary negation on this first number. This makes expressions such as <code>(- 5)</code> evaluate correctly.</p>
<p>If there are more arguments it constantly pops the next one from the list and performs arithmetic depending on which operator we're meant to be using. If a zero is encountered on division it deletes the temporary <code>x</code>, <code>y</code>, and the argument list <code>a</code>, and returns an error.</p>
<p>If there have been no errors the input arguments are deleted and the new expression returned.</p>
<pre><code data-language="c">lval* builtin_op(lval* a, char* op) {
/* Ensure all arguments are numbers */
for (int i = 0; i < a->count; i++) {
if (a->cell[i]->type != LVAL_NUM) {
lval_del(a);
return lval_err("Cannot operate on non-number!");
}
}
/* Pop the first element */
lval* x = lval_pop(a, 0);
/* If no arguments and sub then perform unary negation */
if ((strcmp(op, "-") == 0) && a->count == 0) {
x->num = -x->num;
}
/* While there are still elements remaining */
while (a->count > 0) {
/* Pop the next element */
lval* y = lval_pop(a, 0);
if (strcmp(op, "+") == 0) { x->num += y->num; }
if (strcmp(op, "-") == 0) { x->num -= y->num; }
if (strcmp(op, "*") == 0) { x->num *= y->num; }
if (strcmp(op, "/") == 0) {
if (y->num == 0) {
lval_del(x); lval_del(y);
x = lval_err("Division By Zero!"); break;
}
x->num /= y->num;
}
lval_del(y);
}
lval_del(a); return x;
}</code></pre>
<p>This completes our evaluation functions. We just need to change <code>main</code> again so it passes the input through this evaluation before printing it.</p>
<pre><code data-language='c'>lval* x = lval_eval(lval_read(r.output));
lval_println(x);
lval_del(x);
</code></pre>
<p>Now you should now be able to evaluate expressions correctly in the same way as in the previous chapter. Take a little breather and have a play around with the new evaluation. Make sure everything is working correctly, and the behaviour is as expected. In the next chapter we're going to make great use of these changes to implement some cool new features.</p>
<pre><code data-language='lispy'>lispy> + 1 (* 7 5) 3
39
lispy> (- 100)
-100
lispy>
()
lispy> /
/
lispy> (/ ())
Error: Cannot operate on non-number!
lispy></code></pre>
<h2>Reference</h2> <hr/>
<references />
<h2>Bonus Marks</h2> <hr/>
<div class="alert alert-warning">
<ul class="list-group">
<li class="list-group-item">› Give an example of a variable in our program that lives on <em>The Stack</em>.</li>
<li class="list-group-item">› Give an example of a variable in our program that points to <em>The Heap</em>.</li>
<li class="list-group-item">› What does the <code>strcpy</code> function do?</li>
<li class="list-group-item">› What does the <code>realloc</code> function do?</li>
<li class="list-group-item">› What does the <code>memmove</code> function do?</li>
<li class="list-group-item">› How does <code>memmove</code> differ from <code>memcpy</code>?</li>
<li class="list-group-item">› Extend parsing and evaluation to support the remainder operator <code>%</code>.</li>
<li class="list-group-item">› Extend parsing and evaluation to support decimal types using a <code>double</code> field.</li>
</ul>
</div>
<h2>Navigation</h2>
<table class="table" style='table-layout: fixed;'>
<tr>
<td class="text-left"><a href="chapter8_error_handling"><h4>‹ Error Handling</h4></a></td>
<td class="text-center"><a href="contents"><h4>• Contents •</h4></a></td>
<td class="text-right"><a href="chapter10_q_expressions"><h4>Q-Expressions ›</h4></a></td>
</tr>
</table>