<!doctype html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>COCO: Comparing Continuous Optimizers — COCO: COmparing Continuous Optimizers</title>
<link rel="stylesheet" href="_static/bizstyle.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: './',
VERSION: '0.9',
COLLAPSE_INDEX: false,
FILE_SUFFIX: '.html',
HAS_SOURCE: true
};
</script>
<script type="text/javascript" src="_static/jquery.js"></script>
<script type="text/javascript" src="_static/underscore.js"></script>
<script type="text/javascript" src="_static/doctools.js"></script>
<script type="text/javascript" src="_static/bizstyle.js"></script>
<link rel="top" title="COCO: COmparing Continuous Optimizers" href="#" />
<meta name="viewport" content="width=device-width,initial-scale=1.0">
<!--[if lt IE 9]>
<script type="text/javascript" src="_static/css3-mediaqueries.js"></script>
<![endif]-->
</head>
<body role="document">
<div class="related" role="navigation" aria-label="related navigation">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="genindex.html" title="General Index"
accesskey="I">index</a></li>
<li class="nav-item nav-item-0"><a href="#">COCO: COmparing Continuous Optimizers</a> »</li>
</ul>
</div>
<div class="sphinxsidebar" role="navigation" aria-label="main navigation">
<div class="sphinxsidebarwrapper">
<h3><a href="#">Table Of Contents</a></h3>
<ul>
<li><a class="reference internal" href="#">COCO: A platform for Comparing Continuous Optimizers in a Black-Box Setting</a><ul>
<li><a class="reference internal" href="#introduction">Introduction</a><ul>
<li><a class="reference internal" href="#why-coco">Why COCO?</a></li>
<li><a class="reference internal" href="#terminology">Terminology</a></li>
</ul>
</li>
<li><a class="reference internal" href="#functions-instances-and-problems">Functions, Instances, and Problems</a></li>
<li><a class="reference internal" href="#runtime-and-target-values">Runtime and Target Values</a><ul>
<li><a class="reference internal" href="#restarts-and-simulated-restarts">Restarts and Simulated Restarts</a></li>
<li><a class="reference internal" href="#aggregation">Aggregation</a></li>
</ul>
</li>
<li><a class="reference internal" href="#general-code-structure">General Code Structure</a></li>
<li><a class="reference internal" href="#test-suites">Test Suites</a></li>
</ul>
</li>
</ul>
<div role="note" aria-label="source link">
<h3>This Page</h3>
<ul class="this-page-menu">
<li><a href="_sources/index.txt"
rel="nofollow">Show Source</a></li>
</ul>
</div>
<div id="searchbox" style="display: none" role="search">
<h3>Quick search</h3>
<form class="search" action="search.html" method="get">
<input type="text" name="q" />
<input type="submit" value="Go" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
<p class="searchtip" style="font-size: 90%">
Enter search terms or a module, class or function name.
</p>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
</div>
</div>
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<div class="section" id="coco-a-platform-for-comparing-continuous-optimizers-in-a-black-box-setting">
<h1>COCO: A platform for Comparing Continuous Optimizers in a Black-Box Setting<a class="headerlink" href="#coco-a-platform-for-comparing-continuous-optimizers-in-a-black-box-setting" title="Permalink to this headline">¶</a></h1>
<SMALL><I>To cite or access this document as <tt>pdf</tt>:</I><BR>
N. Hansen, A. Auger, O. Mersmann, T. Tušar, and D. Brockhoff (2016).
<A HREF="http://arxiv.org/pdf/1603.08785">
COCO: A platform for Comparing Continuous Optimizers in a Black-Box Setting</A>.
<I>ArXiv e-prints</I>,
<A HREF="http://arxiv.org/abs/1603.08785">arXiv:1603.08785</A>.</SMALL><p><a class="reference external" href="https://github.com/numbbo/coco">COCO</a> is a platform for Comparing Continuous Optimizers in a black-box
setting.
It aims at automatizing the tedious and repetitive task of
benchmarking numerical optimization algorithms to the greatest possible
extent.
We present the rationals behind the development of the platform
as a general proposition for a guideline towards better benchmarking.
We detail underlying fundamental concepts of
<a class="reference external" href="https://github.com/numbbo/coco">COCO</a> such as the definition of
a problem, the idea of instances, the relevance of target values, and runtime
as central performance measure.
Finally, we give a quick overview of the basic
code structure and the currently available test suites.</p>
<div class="section" id="introduction">
<h2>Introduction<a class="headerlink" href="#introduction" title="Permalink to this headline">¶</a></h2>
<p>We consider the continuous black-box optimization or search problem to minimize</p>
<div class="math">
<p><img src="_images/math/24791c42af3ae21a1f4b802374fd1c7cc7d997b2.png" alt="f: X\subset\mathbb{R}^n \to \mathbb{R}^m \qquad n,m\ge1"/></p>
</div><p>such that for the <img class="math" src="_images/math/c58e530eae38c5636539b02e8c92a338cc4cee12.png" alt="l" style="vertical-align: 0px"/> constraints</p>
<div class="math">
<p><img src="_images/math/45c8329e3bdc745a003418154b28c9c32fb3ef6d.png" alt="g: X\subset\mathbb{R}^n \to \mathbb{R}^l \qquad l\ge0"/></p>
</div><p>we have <img class="math" src="_images/math/ce6cd1adefd2891c416be6dc7ea4427b9c11da09.png" alt="g_i(\x)\le0" style="vertical-align: -4px"/> for all <img class="math" src="_images/math/3e903bbfeaf63c82eb7c86e73002505dbe6d3b20.png" alt="i=1\dots l" style="vertical-align: -1px"/>.
More specifically, we aim to find, as quickly as possible, one or several solutions <img class="math" src="_images/math/d1304f0b7773d247ff0207c9e036d817d003555b.png" alt="\x" style="vertical-align: 0px"/> in the search space <img class="math" src="_images/math/aa55e6775cdca96cc08ce73fa59a73133fd2d510.png" alt="X" style="vertical-align: 0px"/> with <em>small</em> value(s) of <img class="math" src="_images/math/5654ffc1269ad01c63d539d9c7c99d6dcfa7d039.png" alt="f(\x)\in\mathbb{R}^m" style="vertical-align: -4px"/> that satisfy all above constraints <img class="math" src="_images/math/8f6db44bda25a42ce182b1b908fe26305168addc.png" alt="g" style="vertical-align: -4px"/>.
We generally consider <em>time</em> to be the number of calls to the function <img class="math" src="_images/math/f0307af130a9303d3def5b2c46714d19118bac36.png" alt="f" style="vertical-align: -4px"/>.</p>
<p>A continuous optimization algorithm, also known as <em>solver</em>, addresses the
above problem.
Here, we assume that <img class="math" src="_images/math/aa55e6775cdca96cc08ce73fa59a73133fd2d510.png" alt="X" style="vertical-align: 0px"/> is known, but no prior knowledge about <img class="math" src="_images/math/f0307af130a9303d3def5b2c46714d19118bac36.png" alt="f" style="vertical-align: -4px"/> or
<img class="math" src="_images/math/8f6db44bda25a42ce182b1b908fe26305168addc.png" alt="g" style="vertical-align: -4px"/> is available to the algorithm.
That is, <img class="math" src="_images/math/f0307af130a9303d3def5b2c46714d19118bac36.png" alt="f" style="vertical-align: -4px"/> and <img class="math" src="_images/math/8f6db44bda25a42ce182b1b908fe26305168addc.png" alt="g" style="vertical-align: -4px"/> are considered as a black-box which the algorithm can
query with solutions <img class="math" src="_images/math/8cc5181a2a64f34c353f32d2f4c7192934fac9f1.png" alt="\x\in\mathbb{R}^n" style="vertical-align: -1px"/> to get the respective values
<img class="math" src="_images/math/dc91e4306455a7fc24732bdd4a7f47c6bf7a07a5.png" alt="f(\x)" style="vertical-align: -4px"/> and <img class="math" src="_images/math/23426dc6e04a390b1cd17330c99cf38e9014834b.png" alt="g(\x)" style="vertical-align: -4px"/>.</p>
<p>From these prerequisites, benchmarking optimization algorithms seems to be a
rather simple and straightforward task. We run an algorithm on a collection of
problems and display the results. However, under closer inspection,
benchmarking turns out to be surprisingly tedious, and it appears to be
difficult to get results that can be meaningfully interpreted beyond the
standard claim that one algorithm is better than another on some problems. <a class="footnote-reference" href="#id8" id="id4">[1]</a>
Here, we offer a conceptual guideline for benchmarking
continuous optimization algorithms which tries to address this challenge and
has been implemented within the <a class="reference external" href="https://github.com/numbbo/coco">COCO</a> framework. <a class="footnote-reference" href="#id9" id="id5">[2]</a></p>
<p>The <a class="reference external" href="https://github.com/numbbo/coco">COCO</a> framework provides the practical means for an automatized
benchmarking procedure. Installing <a class="reference external" href="https://github.com/numbbo/coco">COCO</a> (in a shell) and benchmarking an
optimization algorithm, say, the function <code class="docutils literal"><span class="pre">fmin</span></code> from <code class="docutils literal"><span class="pre">scipy.optimize</span></code>
in Python, becomes as simple <a class="footnote-reference" href="#id12" id="id6">[3]</a> as</p>
<div class="code bash highlight-python"><div class="highlight"><pre>$ ### get and install the code
$ git clone https://github.com/numbbo/coco.git # get coco using git
$ cd coco
$ python do.py run-python # install Python experimental module cocoex
$ python do.py install-postprocessing # install post-processing :-)
</pre></div>
</div>
<div class="code bash highlight-python"><div class="highlight"><pre>$ ### (optional) run an example from the shell
$ cp code-experiments/build/python/example_experiment.py .
$ python example_experiment.py # run the current "default" experiment
$ python -m bbob_pproc exdata/... # run the post-processing
$ open ppdata/index.html # browse results
</pre></div>
</div>
<div class="code python highlight-python"><div class="highlight"><pre><span class="ch">#!/usr/bin/env python</span>
<span class="sd">"""Python script to benchmark fmin of scipy.optimize"""</span>
<span class="kn">from</span> <span class="nn">numpy.random</span> <span class="kn">import</span> <span class="n">rand</span>
<span class="kn">import</span> <span class="nn">cocoex</span>
<span class="k">try</span><span class="p">:</span> <span class="kn">import</span> <span class="nn">cocopp</span> <span class="c1"># new (future) name</span>
<span class="k">except</span> <span class="ne">ImportError</span><span class="p">:</span> <span class="kn">import</span> <span class="nn">bbob_pproc</span> <span class="kn">as</span> <span class="nn">cocopp</span> <span class="c1"># old name</span>
<span class="kn">from</span> <span class="nn">scipy.optimize</span> <span class="kn">import</span> <span class="n">fmin</span>
<span class="n">suite</span> <span class="o">=</span> <span class="n">cocoex</span><span class="o">.</span><span class="n">Suite</span><span class="p">(</span><span class="s2">"bbob"</span><span class="p">,</span> <span class="s2">"year: 2016"</span><span class="p">,</span> <span class="s2">""</span><span class="p">)</span>
<span class="n">budget_multiply</span> <span class="o">=</span> <span class="mf">1e4</span> <span class="c1"># use 1e1 or even 2 for a quick first test run</span>
<span class="n">observer</span> <span class="o">=</span> <span class="n">cocoex</span><span class="o">.</span><span class="n">Observer</span><span class="p">(</span><span class="s2">"bbob"</span><span class="p">,</span> <span class="s2">"result_folder: myoptimizer-on-bbob"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">suite</span><span class="p">:</span> <span class="c1"># loop over all problems</span>
<span class="n">observer</span><span class="o">.</span><span class="n">observe</span><span class="p">(</span><span class="n">p</span><span class="p">)</span> <span class="c1"># prepare logging of necessary data</span>
<span class="n">fmin</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">p</span><span class="o">.</span><span class="n">initial_solution</span><span class="p">)</span> <span class="c1"># disp=False would silence fmin output</span>
<span class="k">while</span> <span class="p">(</span><span class="ow">not</span> <span class="n">p</span><span class="o">.</span><span class="n">final_target_hit</span> <span class="ow">and</span> <span class="c1"># apply restarts, if so desired</span>
<span class="n">p</span><span class="o">.</span><span class="n">evaluations</span> <span class="o"><</span> <span class="n">p</span><span class="o">.</span><span class="n">dimension</span> <span class="o">*</span> <span class="n">budget_multiplier</span><span class="p">):</span>
<span class="n">fmin</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">p</span><span class="o">.</span><span class="n">lower_bounds</span> <span class="o">+</span> <span class="p">(</span><span class="n">rand</span><span class="p">(</span><span class="n">p</span><span class="o">.</span><span class="n">dimension</span><span class="p">)</span> <span class="o">+</span> <span class="n">rand</span><span class="p">(</span><span class="n">p</span><span class="o">.</span><span class="n">dimension</span><span class="p">))</span> <span class="o">*</span>
<span class="p">(</span><span class="n">p</span><span class="o">.</span><span class="n">upper_bounds</span> <span class="o">-</span> <span class="n">p</span><span class="o">.</span><span class="n">lower_bounds</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span>
<span class="n">cocopp</span><span class="o">.</span><span class="n">main</span><span class="p">(</span><span class="s1">'exdata/myoptimizer-on-bbob'</span><span class="p">)</span> <span class="c1"># invoke data post-processing</span>
</pre></div>
</div>
<p>After the Python script has been executed, the file <code class="docutils literal"><span class="pre">ppdata/index.html</span></code> can be used
to browse the resulting data.</p>
<p>The <a class="reference external" href="https://github.com/numbbo/coco">COCO</a> framework provides</p>
<blockquote>
<div><ul class="simple">
<li>an interface to several languages in which the benchmarked optimizer
can be written, currently C/C++, Java, Matlab/Octave, Python</li>
<li>several benchmark suites or testbeds, currently all written in C</li>
<li>data logging facilities via the <code class="docutils literal"><span class="pre">Observer</span></code></li>
<li>data post-processing in Python and data browsing through <code class="docutils literal"><span class="pre">html</span></code></li>
<li>article LaTeX templates.</li>
</ul>
</div></blockquote>
<p>The underlying philosophy of <a class="reference external" href="https://github.com/numbbo/coco">COCO</a> is to provide everything that experimenters
need to set up and implement if they want to benchmark a given algorithm
implementation <em>properly</em>.
A desired side effect of reusing the same framework is that data collected
over years or even decades can be effortlessly compared. <a class="footnote-reference" href="#id13" id="id7">[4]</a>
So far, the framework has been successfully used to benchmark well over a
hundred different algorithms by dozens of researchers.</p>
<table class="docutils footnote" frame="void" id="id8" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id4">[1]</a></td><td>One common major flaw is to get no
indication of <em>how much</em> better an algorithm is.
That is, the results of benchmarking often provide no indication of
<em>relevance</em>;
the main output is often hundreds of tabulated numbers interpretable on
an ordinal (ranking) scale only.
Addressing a point of a common confusion, <em>statistical</em> significance is only
a secondary and by no means sufficient condition for <em>relevance</em>.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id9" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id5">[2]</a></td><td>Confer to <a class="reference external" href="https://www.github.com/numbbo/coco">the code basis</a> on Github and the <a class="reference external" href="http://numbbo.github.io/coco-doc/C/">C API documentation</a> for
implementation details.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id12" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id6">[3]</a></td><td>See also <a class="reference external" href="https://github.com/numbbo/coco/blob/master/code-experiments/build/python/example_experiment.py"><code class="docutils literal"><span class="pre">example_experiment.py</span></code></a> which runs
out-of-the-box as a benchmarking Python script.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id13" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id7">[4]</a></td><td>See <a class="reference external" href="https://numbbo.github.io/data-archive/bbob/">here</a> to access all data submitted
to the <a class="reference external" href="https://numbbo.github.io/workshops/">BBOB 2009 GECCO workshop</a>.</td></tr>
</tbody>
</table>
<div class="section" id="why-coco">
<h3>Why <a class="reference external" href="https://github.com/numbbo/coco">COCO</a>?<a class="headerlink" href="#why-coco" title="Permalink to this headline">¶</a></h3>
<p>Apart from diminishing the time burden and the pitfalls, bugs,
or omissions of the repetitive coding task for experimenters, our aim is to
provide a <em>conceptual guideline for better benchmarking</em>. Our setup and
guideline have the following defining features.</p>
<ol class="arabic">
<li><p class="first">Benchmark functions are</p>
<ol class="arabic simple">
<li>used as black boxes for the algorithm; however, they
are explicitly known to the scientific community.</li>
<li>designed to be comprehensible, to allow a meaningful
interpretation of performance results.</li>
<li>difficult to “defeat”, that is, they do not
have artificial regularities that can easily be (intentionally or unintentionally)
exploited by an algorithm. <a class="footnote-reference" href="#id26" id="id18">[5]</a></li>
<li>scalable with the input dimension <a class="reference internal" href="#whi1996" id="id19">[WHI1996]</a>.</li>
</ol>
</li>
<li><p class="first">There is no predefined budget (number of <img class="math" src="_images/math/f0307af130a9303d3def5b2c46714d19118bac36.png" alt="f" style="vertical-align: -4px"/>-evaluations) for running an
experiment, the experimental procedure is <em>budget-free</em> <a class="reference internal" href="#han2016ex" id="id20">[HAN2016ex]</a>.</p>
</li>
<li><p class="first">A single performance measure is used — and thereafter aggregated and
displayed in several ways —, namely <strong>runtime</strong>, <em>measured in
number of</em> <img class="math" src="_images/math/f0307af130a9303d3def5b2c46714d19118bac36.png" alt="f" style="vertical-align: -4px"/>-<em>evaluations</em> <a class="reference internal" href="#han2016perf" id="id21">[HAN2016perf]</a>. This runtime measure has the
advantages to</p>
<ul class="simple">
<li>be independent of the computational platform, language, compiler, coding
styles, and other specific experimental conditions <a class="footnote-reference" href="#id28" id="id22">[6]</a></li>
<li>be independent, as a measurement, of the specific function on which it has
been obtained</li>
<li>be relevant, meaningful and easily interpretable without expert domain knowledge</li>
<li>be quantitative on the ratio scale <a class="footnote-reference" href="#id30" id="id23">[7]</a> <a class="reference internal" href="#ste1946" id="id24">[STE1946]</a></li>
<li>assume a wide range of values</li>
<li>aggregate over a collection of values in a meaningful way <a class="footnote-reference" href="#id31" id="id25">[8]</a>.</li>
</ul>
<p>A <em>missing</em> runtime value is considered as a possible outcome (see below).</p>
</li>
<li><p class="first">The display is as comprehensible, intuitive and informative as possible.
We believe that the details matter.
Aggregation over dimension is avoided, because dimension is a parameter
known in advance that can and should be used for algorithm design decisions.
This is possible without significant drawbacks, because all functions are
scalable in the dimension.</p>
</li>
</ol>
<p>We believe however that in the <em>process</em> of algorithm <em>design</em>, a benchmarking
framework like <a class="reference external" href="https://github.com/numbbo/coco">COCO</a> has its limitations.
During the design phase, usually fewer benchmark functions should be used, the
functions and measuring tools should be tailored to the given algorithm and
design question, and the overall procedure should usually be rather informal and
interactive with rapid iterations.
A benchmarking framework then serves to conduct the formalized validation
experiment of the design <em>outcome</em> and can be used for regression testing.</p>
<table class="docutils footnote" frame="void" id="id26" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id18">[5]</a></td><td>For example, the optimum is not in all-zeros, optima are not placed
on a regular grid, most functions are not separable <a class="reference internal" href="#whi1996" id="id27">[WHI1996]</a>. The
objective to remain comprehensible makes it more challenging to design
non-regular functions. Which regularities are common place in real-world
optimization problems remains an open question.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id28" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id22">[6]</a></td><td>Runtimes measured in <img class="math" src="_images/math/f0307af130a9303d3def5b2c46714d19118bac36.png" alt="f" style="vertical-align: -4px"/>-evaluations are widely
comparable and designed to stay. The experimental procedure
<a class="reference internal" href="#han2016ex" id="id29">[HAN2016ex]</a> includes however a timing experiment which records the
internal computational effort of the algorithm in CPU or wall clock time.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id30" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id23">[7]</a></td><td>As opposed to a ranking of algorithms based on their solution quality
achieved after a given budget.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id31" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id25">[8]</a></td><td>With the caveat that the <em>arithmetic average</em> is dominated by large values
which can compromise its informative value.</td></tr>
</tbody>
</table>
</div>
<div class="section" id="terminology">
<h3>Terminology<a class="headerlink" href="#terminology" title="Permalink to this headline">¶</a></h3>
<p>We specify a few terms which are used later; the sketch after this list shows how they map to objects in the code.</p>
<dl class="docutils">
<dt><em>function</em></dt>
<dd>We talk about an objective <em>function</em> as a parametrized mapping
<img class="math" src="_images/math/bb3b4cc088220f561d9624697122d49cf057f0b2.png" alt="\mathbb{R}^n\to\mathbb{R}^m" style="vertical-align: -1px"/> with scalable input space, <img class="math" src="_images/math/1e35586c03a7b5545f72c54c110d730953536540.png" alt="n\ge2" style="vertical-align: -3px"/>,
and usually <img class="math" src="_images/math/64a1faffe7792d85f77af7923e09a5c0b82bb1fc.png" alt="m\in\{1,2\}" style="vertical-align: -5px"/>.
Functions are parametrized such that different <em>instances</em> of the
“same” function are available, e.g. translated or shifted versions.</dd>
<dt><em>problem</em></dt>
<dd>We talk about a <em>problem</em>, <a class="reference external" href="http://numbbo.github.io/coco-doc/C/coco_8h.html#a408ba01b98c78bf5be3df36562d99478"><code class="docutils literal"><span class="pre">coco_problem_t</span></code></a>, as a specific <em>function
instance</em> on which an optimization algorithm is run.
A problem
can be evaluated and returns an <img class="math" src="_images/math/f0307af130a9303d3def5b2c46714d19118bac36.png" alt="f" style="vertical-align: -4px"/>-value or -vector and, where applicable,
a <img class="math" src="_images/math/8f6db44bda25a42ce182b1b908fe26305168addc.png" alt="g" style="vertical-align: -4px"/>-vector.
In the context of performance assessment, a target <img class="math" src="_images/math/f0307af130a9303d3def5b2c46714d19118bac36.png" alt="f" style="vertical-align: -4px"/>- or
indicator-value is added to define a problem. A problem is considered
solved when the given or the most difficult available target is obtained.</dd>
<dt><em>runtime</em></dt>
<dd>We define <em>runtime</em>, or <em>run-length</em> <a class="reference internal" href="#hoo1998" id="id32">[HOO1998]</a> as the <em>number of
evaluations</em> conducted on a given problem until a prescribed target value is
hit, also referred to as number of <em>function</em> evaluations or <img class="math" src="_images/math/f0307af130a9303d3def5b2c46714d19118bac36.png" alt="f" style="vertical-align: -4px"/>-evaluations.
Runtime is our central performance measure.</dd>
<dt><em>suite</em></dt>
<dd>A test- or benchmark-suite is a collection of problems, typically between
twenty and a hundred, where the number of objectives <img class="math" src="_images/math/862c9e331021da02c140ce6caef6f1d6519f571c.png" alt="m" style="vertical-align: 0px"/> is fixed.</dd>
</dl>
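<p>In the Python module <code class="docutils literal"><span class="pre">cocoex</span></code>, these terms map to concrete objects. A minimal
sketch, using the attribute names from the script above (the exact attribute
set may vary between versions):</p>
<div class="highlight-python"><div class="highlight"><pre>import cocoex

suite = cocoex.Suite("bbob", "", "")     # a suite: a collection of problems
problem = next(iter(suite))              # a problem: one function instance
print(problem.id)                        # encodes function, dimension, instance
print(problem.dimension, problem.number_of_objectives)
y = problem(problem.initial_solution)    # evaluating returns an f-value
print(y, problem.evaluations)            # the evaluation count behind runtime
</pre></div>
</div>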
</div>
</div>
<div class="section" id="functions-instances-and-problems">
<h2>Functions, Instances, and Problems<a class="headerlink" href="#functions-instances-and-problems" title="Permalink to this headline">¶</a></h2>
<p>In the <a class="reference external" href="https://github.com/numbbo/coco">COCO</a> framework we consider <strong>functions</strong>, <img class="math" src="_images/math/79f3bb3d79a8de41a5a700d22d90d16bda92171d.png" alt="f_i" style="vertical-align: -4px"/>, for each suite
distinguished by their identifier <img class="math" src="_images/math/4d7143c57d4cf8653a2aa6b4c3c6c97446fc1592.png" alt="i=1,2,\dots" style="vertical-align: -4px"/> .
Functions are further <em>parametrized</em> by the (input) dimension, <img class="math" src="_images/math/273f486f7ce4023c108fff2c278487a34aaa6321.png" alt="n" style="vertical-align: 0px"/>, and the
instance number, <img class="math" src="_images/math/82ae6a6921ae07dd0e556d8402502aaea53a96d4.png" alt="j" style="vertical-align: -4px"/>.
We can think of <img class="math" src="_images/math/82ae6a6921ae07dd0e556d8402502aaea53a96d4.png" alt="j" style="vertical-align: -4px"/> as an index to a continuous parameter vector setting, as it
parametrizes, among other things, translations and rotations. In practice, <img class="math" src="_images/math/82ae6a6921ae07dd0e556d8402502aaea53a96d4.png" alt="j" style="vertical-align: -4px"/>
is the discrete identifier for single instantiations of these parameters.
For a given <img class="math" src="_images/math/862c9e331021da02c140ce6caef6f1d6519f571c.png" alt="m" style="vertical-align: 0px"/>, we then have</p>
<div class="math">
<p><img src="_images/math/870d97f75649775ba4ea7c242dd34e37f50be4c4.png" alt="\finstance_i \equiv f(n, i, j):\R^n \to \mathbb{R}^m \quad
\x \mapsto \finstance_i (\x) = f(n, i, j)(\x)\enspace."/></p>
</div><p>Varying <img class="math" src="_images/math/273f486f7ce4023c108fff2c278487a34aaa6321.png" alt="n" style="vertical-align: 0px"/> or <img class="math" src="_images/math/82ae6a6921ae07dd0e556d8402502aaea53a96d4.png" alt="j" style="vertical-align: -4px"/> leads to a variation of the same function
<img class="math" src="_images/math/fc410bef70dece978b90fa5bb68c683360db785b.png" alt="i" style="vertical-align: 0px"/> of a given suite.
Fixing <img class="math" src="_images/math/273f486f7ce4023c108fff2c278487a34aaa6321.png" alt="n" style="vertical-align: 0px"/> and <img class="math" src="_images/math/82ae6a6921ae07dd0e556d8402502aaea53a96d4.png" alt="j" style="vertical-align: -4px"/> of function <img class="math" src="_images/math/79f3bb3d79a8de41a5a700d22d90d16bda92171d.png" alt="f_i" style="vertical-align: -4px"/> defines an optimization <strong>problem</strong>
<img class="math" src="_images/math/6677d043e7a3bbb44f3c74922a5efc455f7f0730.png" alt="(n, i, j)\equiv(f_i, n, j)" style="vertical-align: -4px"/> that can be presented to the optimization algorithm. Each problem receives again
an index in the suite, mapping the triple <img class="math" src="_images/math/558b03aade774fffd5caced4fd5e81fb9bcc8f1b.png" alt="(n, i, j)" style="vertical-align: -4px"/> to a single
number.</p>
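<p>This parametrization is reflected in how a suite can be restricted before an
experiment. A sketch, assuming the <code class="docutils literal"><span class="pre">cocoex</span></code> suite options
<code class="docutils literal"><span class="pre">dimensions</span></code>, <code class="docutils literal"><span class="pre">function_indices</span></code>, and
<code class="docutils literal"><span class="pre">instance_indices</span></code> behave as in recent versions:</p>
<div class="highlight-python"><div class="highlight"><pre>import cocoex

# restrict the bbob suite to function i = 1, dimensions n in {2, 10},
# and instances j = 1..5 (option names are assumptions, see above)
suite = cocoex.Suite("bbob", "",
                     "dimensions: 2,10 function_indices: 1 instance_indices: 1-5")
for problem in suite:                  # each problem is one (n, i, j) triple
    print(problem.id, problem.dimension)
</pre></div>
</div>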
<p>As the formalization above suggests, the differentiation between function (index)
and instance index is of a purely semantic nature.
This semantic distinction is, however, important in how we display and
interpret the results. We interpret <strong>varying the instance</strong> parameter as
a natural randomization for experiments <a class="footnote-reference" href="#id35" id="id33">[9]</a> in order to</p>
<blockquote>
<div><ul>
<li><p class="first">generate repetitions on a function and</p>
</li>
<li><p class="first">average away irrelevant aspects of the function definition, thereby providing</p>
<blockquote>
<div><ul class="simple">
<li>generality which alleviates the problem of overfitting, and</li>
<li>a fair setup which prevents intentional or unintentional exploitation of
irrelevant or artificial function properties.</li>
</ul>
</div></blockquote>
</li>
</ul>
</div></blockquote>
<p>For example, we consider the absolute location of the optimum not a defining
function feature. Consequently, in a typical <a class="reference external" href="https://github.com/numbbo/coco">COCO</a> benchmark suite, instances
with randomized search space translations are presented to the optimizer. <a class="footnote-reference" href="#id36" id="id34">[10]</a></p>
<table class="docutils footnote" frame="void" id="id35" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id33">[9]</a></td><td>Changing or sweeping through a relevant feature of the problem class,
systematically or randomized, is another possible usage of instance
parametrization.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id36" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id34">[10]</a></td><td>Conducting either several trials on instances with randomized search space
translations or with a randomized initial solution is equivalent, given
that the optimizer behaves translation invariant (disregarding domain
boundaries).</td></tr>
</tbody>
</table>
</div>
<div class="section" id="runtime-and-target-values">
<h2>Runtime and Target Values<a class="headerlink" href="#runtime-and-target-values" title="Permalink to this headline">¶</a></h2>
<p>In order to measure the runtime of an algorithm on a problem, we
establish a hitting time condition.
We prescribe a <strong>target value</strong>, <img class="math" src="_images/math/dc02ab1d58825bc03d8f3711007c26c0beb55232.png" alt="t" style="vertical-align: 0px"/>, which is an <img class="math" src="_images/math/f0307af130a9303d3def5b2c46714d19118bac36.png" alt="f" style="vertical-align: -4px"/>-value or more generally a
quality indicator-value <a class="reference internal" href="#han2016perf" id="id37">[HAN2016perf]</a> <a class="reference internal" href="#bro2016" id="id38">[BRO2016]</a>.
For a single run, when an algorithm reaches or surpasses the target value <img class="math" src="_images/math/dc02ab1d58825bc03d8f3711007c26c0beb55232.png" alt="t" style="vertical-align: 0px"/>
on problem <img class="math" src="_images/math/301c64935fc6bdae853a720871286bccf6cae341.png" alt="(f_i, n, j)" style="vertical-align: -4px"/>, we say it has <em>solved the problem</em> <img class="math" src="_images/math/6abd084e25cb7cb33a31d01b8f76f4382506956c.png" alt="(f_i, n, j, t)" style="vertical-align: -4px"/> — it was successful. <a class="footnote-reference" href="#id42" id="id39">[11]</a></p>
<p>Now, the <strong>runtime</strong> is the evaluation count when the target value <img class="math" src="_images/math/dc02ab1d58825bc03d8f3711007c26c0beb55232.png" alt="t" style="vertical-align: 0px"/> was
reached or surpassed for the first time.
That is, runtime is the number of <img class="math" src="_images/math/f0307af130a9303d3def5b2c46714d19118bac36.png" alt="f" style="vertical-align: -4px"/>-evaluations needed to solve the problem
<img class="math" src="_images/math/6abd084e25cb7cb33a31d01b8f76f4382506956c.png" alt="(f_i, n, j, t)" style="vertical-align: -4px"/>. <a class="footnote-reference" href="#id43" id="id40">[12]</a>
<em>Measured runtimes are the only way in which we assess the performance of an
algorithm.</em>
Observed success rates are generally translated into runtimes on a subset of
problems.</p>
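<p>As a sketch of how a runtime is read off a single run (reusing the
<code class="docutils literal"><span class="pre">cocoex</span></code> attributes from the script above, with
<code class="docutils literal"><span class="pre">final_target_hit</span></code> assumed to signal the most difficult available target):</p>
<div class="highlight-python"><div class="highlight"><pre>import numpy as np
import cocoex

suite = cocoex.Suite("bbob", "", "")
problem = next(iter(suite))
rng = np.random.default_rng(2)
while not problem.final_target_hit and problem.evaluations &lt; 1e5:
    x = problem.lower_bounds + rng.random(problem.dimension) * (
        problem.upper_bounds - problem.lower_bounds)
    problem(x)                              # one f-evaluation
if problem.final_target_hit:                # successful run
    print("runtime:", problem.evaluations)  # evaluations until the target was hit
else:                                       # runtime undefined, only bounded below
    print("no runtime; lower bound:", problem.evaluations)
</pre></div>
</div>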
<p>If an algorithm does not hit the target in a single run, this runtime remains
undefined — while it has been bounded from below by the number of evaluations
in this unsuccessful run.
The number of available runtime values depends on the budget the
algorithm has used.
Therefore, larger budgets are preferable — however they should not come at
the expense of abandoning reasonable termination conditions. Instead,
restarts should be done <a class="reference internal" href="#han2016ex" id="id41">[HAN2016ex]</a>.</p>
<table class="docutils footnote" frame="void" id="id42" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id39">[11]</a></td><td>Reflecting the <em>anytime</em> aspect of the experimental setup,
we use the term <em>problem</em> in two meanings: as the problem the
algorithm is benchmarked on, <img class="math" src="_images/math/301c64935fc6bdae853a720871286bccf6cae341.png" alt="(f_i, n, j)" style="vertical-align: -4px"/>, and as the problem, <img class="math" src="_images/math/6abd084e25cb7cb33a31d01b8f76f4382506956c.png" alt="(f_i, n, j, t)" style="vertical-align: -4px"/>, an algorithm may
solve by hitting the target <img class="math" src="_images/math/dc02ab1d58825bc03d8f3711007c26c0beb55232.png" alt="t" style="vertical-align: 0px"/> with the runtime, <img class="math" src="_images/math/29a2a393300437c22847c898bec704867934d5e9.png" alt="\mathrm{RT}(f_i, n, j, t)" style="vertical-align: -4px"/>, or may fail to solve.
Each problem <img class="math" src="_images/math/301c64935fc6bdae853a720871286bccf6cae341.png" alt="(f_i, n, j)" style="vertical-align: -4px"/> gives rise to a collection of dependent problems <img class="math" src="_images/math/6abd084e25cb7cb33a31d01b8f76f4382506956c.png" alt="(f_i, n, j, t)" style="vertical-align: -4px"/>.
Viewed as random variables, the events <img class="math" src="_images/math/29a2a393300437c22847c898bec704867934d5e9.png" alt="\mathrm{RT}(f_i, n, j, t)" style="vertical-align: -4px"/> given <img class="math" src="_images/math/301c64935fc6bdae853a720871286bccf6cae341.png" alt="(f_i, n, j)" style="vertical-align: -4px"/> are not
independent events for different values of <img class="math" src="_images/math/dc02ab1d58825bc03d8f3711007c26c0beb55232.png" alt="t" style="vertical-align: 0px"/>.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id43" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id40">[12]</a></td><td>Target values are directly linked to a problem, leaving the burden to
define the targets with the designer of the benchmark suite.
The alternative, namely to present the obtained <img class="math" src="_images/math/f0307af130a9303d3def5b2c46714d19118bac36.png" alt="f" style="vertical-align: -4px"/>- or indicator-values as results,
leaves the (rather insurmountable) burden of interpreting the meaning of these
indicator values to the experimenter or the final audience.
Fortunately, there is an automatized generic way to generate target values
from observed runtimes, the so-called run-length based target values
<a class="reference internal" href="#han2016perf" id="id44">[HAN2016perf]</a>.</td></tr>
</tbody>
</table>
<div class="section" id="restarts-and-simulated-restarts">
<span id="sec-restarts"></span><h3>Restarts and Simulated Restarts<a class="headerlink" href="#restarts-and-simulated-restarts" title="Permalink to this headline">¶</a></h3>
<p>An optimization algorithm is bound to terminate and, in the single-objective case, return a recommended
solution, <img class="math" src="_images/math/d1304f0b7773d247ff0207c9e036d817d003555b.png" alt="\x" style="vertical-align: 0px"/>, for the problem, <img class="math" src="_images/math/301c64935fc6bdae853a720871286bccf6cae341.png" alt="(f_i, n, j)" style="vertical-align: -4px"/>. <a class="footnote-reference" href="#id54" id="id45">[13]</a>
The algorithm thereby solves all problems <img class="math" src="_images/math/6abd084e25cb7cb33a31d01b8f76f4382506956c.png" alt="(f_i, n, j, t)" style="vertical-align: -4px"/> for which <img class="math" src="_images/math/9132780939852d3f51a824c5ac4fecc302013d72.png" alt="f(\x)\le t" style="vertical-align: -4px"/>.
Independent restarts from different, randomized initial solutions are a simple
but powerful tool to increase the number of solved problems <a class="reference internal" href="#har1999" id="id46">[HAR1999]</a> — namely by increasing the number of <img class="math" src="_images/math/dc02ab1d58825bc03d8f3711007c26c0beb55232.png" alt="t" style="vertical-align: 0px"/>-values for which the problem <img class="math" src="_images/math/301c64935fc6bdae853a720871286bccf6cae341.png" alt="(f_i, n, j)" style="vertical-align: -4px"/>
was solved. <a class="footnote-reference" href="#id55" id="id47">[14]</a>
Independent restarts tend to increase the success rate, but they generally do
not <em>change</em> the performance <em>assessment</em>, because the successes materialize at
greater runtimes <a class="reference internal" href="#han2016perf" id="id48">[HAN2016perf]</a>.
Therefore, we call our approach <em>budget-free</em>.
Restarts however “<em>improve the reliability, comparability, precision, and ‘visibility’ of the measured results</em>” <a class="reference internal" href="#han2016ex" id="id49">[HAN2016ex]</a>.</p>
<p><em>Simulated restarts</em> <a class="reference internal" href="#han2010" id="id50">[HAN2010]</a> <a class="reference internal" href="#han2016perf" id="id51">[HAN2016perf]</a> are used to determine a runtime for unsuccessful runs.
Semantically, <em>this is only valid if we can interpret different
instances as random repetitions</em>.
Resembling the bootstrapping method <a class="reference internal" href="#efr1994" id="id52">[EFR1994]</a>, when we face an unsolved problem,
we draw uniformly at random a new <img class="math" src="_images/math/82ae6a6921ae07dd0e556d8402502aaea53a96d4.png" alt="j" style="vertical-align: -4px"/> until we find an instance such that <img class="math" src="_images/math/6abd084e25cb7cb33a31d01b8f76f4382506956c.png" alt="(f_i, n, j, t)" style="vertical-align: -4px"/>
was solved. <a class="footnote-reference" href="#id56" id="id53">[15]</a>
The evaluations done on the first unsolved problem and on all subsequently
drawn unsolved problems are added to the runtime on the last problem and
are considered as runtime on the originally unsolved problem.
This method is applied if a problem instance was not solved and is
(only) available if at least one problem instance was solved.
It allows us to directly compare algorithms with different success probabilities.</p>
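<p>A minimal sketch of this simulated-restart procedure on per-instance records;
the function name and data layout are hypothetical, while the drawing logic
follows the description above:</p>
<div class="highlight-python"><div class="highlight"><pre>import random

def simulated_runtime(failed_evaluations, records, rng=random.Random(3)):
    """records: one (runtime, evaluations) pair per benchmarked instance;
    runtime is None if that instance did not solve the problem."""
    if all(runtime is None for runtime, _ in records):
        raise ValueError("needs at least one solved instance")
    total = failed_evaluations               # cost of the originally unsolved run
    while True:
        runtime, evaluations = rng.choice(records)  # draw an instance uniformly
        if runtime is not None:
            return total + runtime           # the last drawn instance was solved
        total += evaluations                 # add cost of the unsuccessful run

# three instances solved the problem, one did not; the unsolved run used 500
records = [(120, 120), (None, 500), (340, 340), (95, 95)]
print(simulated_runtime(500, records))
</pre></div>
</div>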
<table class="docutils footnote" frame="void" id="id54" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id45">[13]</a></td><td>More specifically, we use the anytime scenario where we consider
at each evaluation the evolving quality indicator value.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id55" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id47">[14]</a></td><td>The quality indicator is always defined such that for a given problem <img class="math" src="_images/math/301c64935fc6bdae853a720871286bccf6cae341.png" alt="(f_i, n, j)" style="vertical-align: -4px"/> the
number of acquired runtime values <img class="math" src="_images/math/29a2a393300437c22847c898bec704867934d5e9.png" alt="\mathrm{RT}(f_i, n, j, t)" style="vertical-align: -4px"/> (hitting a target indicator value <img class="math" src="_images/math/dc02ab1d58825bc03d8f3711007c26c0beb55232.png" alt="t" style="vertical-align: 0px"/>)
is monotonously increasing with the used budget. Considered as random
variables, these runtimes are not independent.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="id56" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id53">[15]</a></td><td>More specifically, we consider the problems <img class="math" src="_images/math/ae95f31e6a988bc22c9d77572fc9468269b004f6.png" alt="(f_i, n, j, t(j))" style="vertical-align: -4px"/> for
all benchmarked instances <img class="math" src="_images/math/82ae6a6921ae07dd0e556d8402502aaea53a96d4.png" alt="j" style="vertical-align: -4px"/>. The targets <img class="math" src="_images/math/bcfd2d20d1baefb0af5e251bb6f3e3a9e4e9a928.png" alt="t(j)" style="vertical-align: -4px"/> depend on the instance
in a way to make the problems comparable.</td></tr>
</tbody>
</table>
</div>
<div class="section" id="aggregation">
<h3>Aggregation<a class="headerlink" href="#aggregation" title="Permalink to this headline">¶</a></h3>
<p>A typical benchmark suite consists of about 20–100 functions with 5–15 instances for each function. For each instance, up to about 100 targets are considered for the
performance assessment. This means we consider at least <img class="math" src="_images/math/d5cebea34deef8a28fa883338aa8bb10f9629976.png" alt="20\times5=100" style="vertical-align: -1px"/>, and
up to <img class="math" src="_images/math/16ec9066ffdaaab3f0437d90ad41128e491c406f.png" alt="100\times15\times100=150\,000" style="vertical-align: -1px"/> runtimes for the performance assessment.
To make them amenable to the experimenter, we need to summarize these data.</p>
<p>Our idea behind an aggregation is to make a statistical summary over a set or
subset of <em>problems of interest over which we assume a uniform distribution</em>.
From a practical perspective this means that we have no simple way to distinguish
between these problems and to select an optimization algorithm accordingly — in
which case an aggregation for a single algorithm would not be helpful —
and that we face each problem with similar probability.
We do not aggregate over dimension, because dimension can and
should be used for algorithm selection.</p>
<p>We have several ways to aggregate the resulting runtimes.</p>
<blockquote>
<div><ul class="simple">
<li>Empirical (cumulative) distribution functions (ECDF). In the domain of
optimization, ECDF are also known as <em>data profiles</em> <a class="reference internal" href="#mor2009" id="id57">[MOR2009]</a>. We
prefer the simple ECDF over the more innovative performance profiles
<a class="reference internal" href="#mor2002" id="id58">[MOR2002]</a> for two reasons.
ECDF (i) do not depend on other (presented) algorithms, that is, they are
unconditionally comparable across different publications, and (ii) let us
distinguish, for the considered algorithm, easy problems from difficult
problems in a natural way. <a class="footnote-reference" href="#id60" id="id59">[16]</a>
We usually display ECDF on the log scale, which makes the area
above the curve and the <em>difference area</em> between two curves a meaningful
concept.</li>
<li>Averaging, as an estimator of the expected runtime. The average runtime
is often plotted against dimension to indicate scaling with dimension.
The <em>arithmetic</em> average is only meaningful if the underlying distribution of
the values is similar.
Otherwise, the average of log-runtimes, or <em>geometric</em> average,
is recommended (both are illustrated in the sketch after this list).</li>
<li>Restarts and simulated restarts, see Section <a class="reference internal" href="#sec-restarts"><span>Restarts and Simulated Restarts</span></a>, do not
aggregate runtimes in the literal sense (runtimes are literally defined only when <img class="math" src="_images/math/dc02ab1d58825bc03d8f3711007c26c0beb55232.png" alt="t" style="vertical-align: 0px"/> was
hit). They aggregate, however, time data to eventually supplement, if applicable,
all missing runtime values.</li>
</ul>
</div></blockquote>
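<p>A sketch of the first two aggregation modes on a small set of hypothetical
runtime values, omitting missing runtimes for simplicity:</p>
<div class="highlight-python"><div class="highlight"><pre>import numpy as np

runtimes = np.array([120, 95, 340, 2200, 15000, 480])  # hypothetical data

# ECDF: fraction of problems solved within a given budget
budgets = np.sort(runtimes)
fractions = np.arange(1, len(budgets) + 1) / len(budgets)
for budget, fraction in zip(budgets, fractions):
    print(f"budget {int(budget):6d}: {fraction:.2f} of problems solved")

print("arithmetic average:", runtimes.mean())  # dominated by large values
print("geometric average:",                    # average of log-runtimes
      np.exp(np.log(runtimes).mean()))
</pre></div>
</div>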
<table class="docutils footnote" frame="void" id="id60" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id59">[16]</a></td><td>When reading a performance profile, a question immediately crossing ones
mind is often whether a large runtime difference is observed mainly because
one algorithm solves the problem very quickly.
This question cannot be answered from the profile.
The advantage (i) over data profiles disappears when using run-length based
target values <a class="reference internal" href="#han2016perf" id="id61">[HAN2016perf]</a>.</td></tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="general-code-structure">
<h2>General Code Structure<a class="headerlink" href="#general-code-structure" title="Permalink to this headline">¶</a></h2>
<p>The code base of <a class="reference external" href="https://github.com/numbbo/coco">COCO</a> consists of two parts.</p>
<dl class="docutils">
<dt>The <em>experiments</em> part</dt>
<dd>defines test suites, allows one to conduct experiments, and provides the output
data. The <a class="reference external" href="http://numbbo.github.io/coco-doc/C">code base is written in C</a>, and wrapped in different languages
(currently Java, Python, Matlab/Octave). An amalgamation technique is used
that outputs two files <code class="docutils literal"><span class="pre">coco.h</span></code> and <code class="docutils literal"><span class="pre">coco.c</span></code> which suffice to run
experiments within the <a class="reference external" href="https://github.com/numbbo/coco">COCO</a> framework.</dd>
<dt>The <em>post-processing</em> part</dt>
<dd>processes the data and displays the resulting runtimes. This part is
entirely written in Python and heavily depends on <a class="reference external" href="http://matplotlib.org/"><code class="docutils literal"><span class="pre">matplotlib</span></code></a> <a class="reference internal" href="#hun2007" id="id63">[HUN2007]</a>.</dd>
</dl>
</div>
<div class="section" id="test-suites">
<h2>Test Suites<a class="headerlink" href="#test-suites" title="Permalink to this headline">¶</a></h2>
<p>Currently, the <a class="reference external" href="https://github.com/numbbo/coco">COCO</a> framework provides three different test suites; a small query sketch follows the list.</p>
<dl class="docutils">
<dt><code class="docutils literal"><span class="pre">bbob</span></code></dt>
<dd>contains 24 functions in five subgroups <a class="reference internal" href="#han2009fun" id="id64">[HAN2009fun]</a>.</dd>
<dt><code class="docutils literal"><span class="pre">bbob-noisy</span></code></dt>
<dd>contains 30 noisy problems in three subgroups <a class="reference internal" href="#han2009noi" id="id65">[HAN2009noi]</a>,
currently only implemented in the <a class="reference external" href="https://numbbo.github.io/coco/oldcode/bboball15.03.tar.gz">old code base</a>.</dd>
<dt><code class="docutils literal"><span class="pre">bbob-biobj</span></code></dt>
<dd>contains 55 bi-objective (<img class="math" src="_images/math/35a82f7fa7e77be3728336c5bd59dc655eb19b57.png" alt="m=2" style="vertical-align: 0px"/>) functions in 15 subgroups <a class="reference internal" href="#tus2016" id="id66">[TUS2016]</a>.</dd>
</dl>
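<p>A sketch to check which suites the installed <code class="docutils literal"><span class="pre">cocoex</span></code> module provides
(the attribute <code class="docutils literal"><span class="pre">known_suite_names</span></code> and <code class="docutils literal"><span class="pre">len()</span></code> support on suites are assumptions
based on recent versions; <code class="docutils literal"><span class="pre">bbob-noisy</span></code> is expected to be absent from the new code base):</p>
<div class="highlight-python"><div class="highlight"><pre>import cocoex

print(cocoex.known_suite_names)        # e.g. ['bbob', 'bbob-biobj', ...]
for name in ("bbob", "bbob-biobj"):
    suite = cocoex.Suite(name, "", "")
    print(name, "contains", len(suite), "problems")
</pre></div>
</div>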
<H2>Acknowledgments</H2><p>The authors would like to thank Raymond Ros, Steffen Finck, Marc Schoenauer,
Petr Posik and Dejan Tušar for their many invaluable contributions to this work.</p>
<p>The authors also acknowledge support by the grant ANR-12-MONU-0009 (NumBBO)
of the French National Research Agency.</p>
<H2>References</H2><table class="docutils citation" frame="void" id="bro2016" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id38">[BRO2016]</a></td><td>D. Brockhoff, T. Tušar, D. Tušar, T. Wagner, N. Hansen, A. Auger, (2016).
<a class="reference external" href="http://numbbo.github.io/coco-doc/bbob-biobj/perf-assessment">Biobjective Performance Assessment with the COCO Platform</a>. <em>ArXiv e-prints</em>, <a class="reference external" href="http://arxiv.org/abs/1605.01746">arXiv:1605.01746</a>.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="han2016perf" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label">[HAN2016perf]</td><td><em>(<a class="fn-backref" href="#id21">1</a>, <a class="fn-backref" href="#id37">2</a>, <a class="fn-backref" href="#id44">3</a>, <a class="fn-backref" href="#id48">4</a>, <a class="fn-backref" href="#id51">5</a>, <a class="fn-backref" href="#id61">6</a>)</em> N. Hansen, A. Auger, D. Brockhoff, D. Tušar, T. Tušar (2016).
<a class="reference external" href="http://numbbo.github.io/coco-doc/perf-assessment">COCO: Performance Assessment</a>. <em>ArXiv e-prints</em>, <a class="reference external" href="http://arxiv.org/abs/1605.03560">arXiv:1605.03560</a>.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="han2010" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id50">[HAN2010]</a></td><td>N. Hansen, A. Auger, R. Ros, S. Finck, and P. Posik (2010).
Comparing Results of 31 Algorithms from the Black-Box Optimization Benchmarking BBOB-2009. Workshop Proceedings of the GECCO Genetic and Evolutionary Computation Conference 2010, ACM, pp. 1689-1696.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="han2009fun" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id64">[HAN2009fun]</a></td><td>N. Hansen, S. Finck, R. Ros, and A. Auger (2009).
<a class="reference external" href="https://numbbo.github.io/`/downloads/download16.00/bbobdocfunctions.pdf">Real-parameter black-box optimization benchmarking 2009: Noiseless functions definitions</a>. <a class="reference external" href="https://hal.inria.fr/inria-00362633">Research Report RR-6829</a>, Inria, updated February 2010.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="han2009noi" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id65">[HAN2009noi]</a></td><td>N. Hansen, S. Finck, R. Ros, and A. Auger (2009).
<a class="reference external" href="https://numbbo.github.io/coco/oldcode/bbobdocnoisyfunctions.pdf">Real-Parameter Black-Box Optimization Benchmarking 2009: Noisy Functions Definitions</a>. <a class="reference external" href="https://hal.inria.fr/inria-00369466">Research Report RR-6869</a>, Inria, updated February 2010.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="han2016ex" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label">[HAN2016ex]</td><td><em>(<a class="fn-backref" href="#id20">1</a>, <a class="fn-backref" href="#id29">2</a>, <a class="fn-backref" href="#id41">3</a>, <a class="fn-backref" href="#id49">4</a>)</em> N. Hansen, T. Tušar, A. Auger, D. Brockhoff, O. Mersmann (2016).
<a class="reference external" href="http://numbbo.github.io/coco-doc/experimental-setup/">COCO: The Experimental Procedure</a>, <em>ArXiv e-prints</em>, <a class="reference external" href="http://arxiv.org/abs/1603.08776">arXiv:1603.08776</a>.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="hun2007" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id63">[HUN2007]</a></td><td>J. D. Hunter (2007). <a class="reference external" href="http://matplotlib.org/">Matplotlib</a>: A 2D graphics environment,
<em>Computing In Science & Engineering</em>, 9(3): 90-95.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="efr1994" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id52">[EFR1994]</a></td><td>B. Efron and R. Tibshirani (1994). <em>An introduction to the
bootstrap</em>. CRC Press.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="har1999" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id46">[HAR1999]</a></td><td>G. R. Harik and F. G. Lobo (1999). A parameter-less genetic
algorithm. In <em>Proceedings of the Genetic and Evolutionary Computation
Conference (GECCO)</em>, volume 1, pages 258-265. ACM.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="hoo1998" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id32">[HOO1998]</a></td><td>H. H. Hoos and T. Stützle (1998). Evaluating Las Vegas
algorithms: pitfalls and remedies. In <em>Proceedings of the Fourteenth
Conference on Uncertainty in Artificial Intelligence (UAI-98)</em>,
pages 238-245.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="mor2009" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id57">[MOR2009]</a></td><td>J. Moré and S. Wild (2009).
Benchmarking Derivative-Free Optimization Algorithms. <em>SIAM J. Optimization</em>, 20(1):172-191.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="mor2002" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id58">[MOR2002]</a></td><td>D. Dolan and J. J. Moré (2002).
Benchmarking Optimization Software with Performance Profiles. <em>Mathematical Programming</em>, 91:201-213.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="ste1946" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id24">[STE1946]</a></td><td>S.S. Stevens (1946).
On the theory of scales of measurement. <em>Science</em> 103(2684), pp. 677-680.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="tus2016" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id66">[TUS2016]</a></td><td>T. Tušar, D. Brockhoff, N. Hansen, A. Auger (2016).
<a class="reference external" href="http://numbbo.github.io/coco-doc/bbob-biobj/functions/">COCO: The Bi-objective Black Box Optimization Benchmarking (bbob-biobj)
Test Suite</a>, <em>ArXiv e-prints</em>, <a class="reference external" href="http://arxiv.org/abs/1604.00359">arXiv:1604.00359</a>.</td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="whi1996" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label">[WHI1996]</td><td><em>(<a class="fn-backref" href="#id19">1</a>, <a class="fn-backref" href="#id27">2</a>)</em> D. Whitley, S. Rana, J. Dzubera, K. E. Mathias (1996).
Evaluating evolutionary algorithms. <em>Artificial intelligence</em>, 85(1), 245-276.</td></tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="related" role="navigation" aria-label="related navigation">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="genindex.html" title="General Index"
>index</a></li>
<li class="nav-item nav-item-0"><a href="#">COCO: COmparing Continuous Optimizers</a> »</li>
</ul>
</div>
<div class="footer" role="contentinfo">
Last updated on Jul 28, 2016.
Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.3.5.
</div>
</body>
</html>