<html>
<head>
<!--
<link rel="stylesheet" type="text/css" href="http://akottr.github.io/css/reset.css" />
<link rel="stylesheet" type="text/css" href="http://akottr.github.io/css/akottr.css" />
-->
<link rel="stylesheet" href="css/themes/smoothness/jquery-ui-1.10.3.custom.min.css">
<link rel="stylesheet" href="css/style.css">
<link href='http://fonts.googleapis.com/css?family=Berkshire+Swash' rel='stylesheet' type='text/css'>
<link rel="stylesheet" type="text/css" href="css/dragtable.css" />
<script src="js/lib/jquery-1.10.2.min.js"></script>
<script src="js/lib/jquery-ui-1.10.3.custom.min.js"></script>
<script src="js/lib/jquery.scrollIntoView.js"></script>
<script src="js/lib/jquery.dragtable.js"></script>
<script src="js/lib/d3.js" charset="utf-8"></script>
<script src="http://ajax.aspnetcdn.com/ajax/jquery.validate/1.11.1/jquery.validate.js"></script>
<script src="js/app/table_input.js"></script>
<script src="js/app/opt_deg_codons.js"></script>
<script src="js/app/main.js"></script>
<script src="js/app/plot.js"></script>
</head>
<body>
<h1>SwiftLib</h1>
<h2>A web-based tool for rapid optimization of degenerate codons</h2>
<div id="accordion">
<h3><a href="#">Details</a></h3>
<div>
<span class=subheader>Purpose</span>
<p>
This program optimizes a degenerate codon library to cover the desired
set of amino acids at several positions while staying within a diversity
limit for the library. It is a fast way to generate small libraries.
</p>
<p>
The typical case for which we imagine SwiftLib to be useful is this: imagine
that you have constructed 100 redesigned models of a particular
protein where you allowed 15 residues to mutate. For position 82 (one of
the residues you allowed to mutate), alanine appeared 10 times, arginine
35 times, lysine 25 times, leucine 20 times, asparagine 5 times, tryptophan
3 times, and valine twice, and similar distributions were seen at the other
14 positions. (You would enter all of these counts in SwiftLib's table below.)
Let's say you're aiming to create a library for yeast display
and want to ensure that you don't exceed a (DNA) diversity of 1e7.
In this case, you would rather have a library that excluded tryptophan at
position 82 than one that excluded arginine. Tryptophan's exclusion represents
<i>error</i>: you wanted something, but you couldn't get it. The goal of
this program is to find the assignment of degenerate codons within the
given diversity limit that yields the minimal error over all positions being
randomized.
</p>
<p>
Additionally, it is possible to ask the algorithm to allow multiple degenerate
codons at one or more positions. You simply indicate the primer boundaries and
how many oligos you are willing to buy. The algorithm will choose the positions at which
to use multiple degenerate codons to get the best coverage. The number of oligos
that must be purchased to cover the randomized residues
that lie inside the same primer boundaries is the product of the number of degenerate
codons used at each of those positions. E.g., if there are three residues that are part
of the same stretch using 2, 3, and 4 degenerate codons, then to cover all combinations,
2*3*4 = 24 oligos would have to be purchased. One of the more expensive parts of
considering multiple degenerate codons is enumerating all combinations of degenerate
codons; allowing <i>i</i> degenerate codons at a single position requires looking at
<i>(15<sup>3</sup>)<sup>i</sup></i> combinations. For this reason we do not recommend
using more than 4 degenerate codons at any one position.
</p>
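<!--
The oligo and enumeration counts described in the paragraph above can be
checked with a few lines of JavaScript. This is only an illustrative sketch,
not SwiftLib's own code: the function names oligosForStretch and
combinationsToEnumerate are hypothetical, and the figure of 15 degenerate
bases per codon position is taken from the (15^3)^i expression in the text.

```javascript
// Hypothetical helpers illustrating the counts described in the text above.

// Number of oligos needed for one primer stretch: the product of the number
// of degenerate codons used at each randomized position in that stretch.
function oligosForStretch(dcsPerPosition) {
  return dcsPerPosition.reduce(function (product, n) {
    return product * n;
  }, 1);
}

// Number of combinations examined when i degenerate codons are allowed at a
// single position: (15^3)^i, as stated in the text.
function combinationsToEnumerate(i) {
  return Math.pow(Math.pow(15, 3), i);
}

console.log(oligosForStretch([2, 3, 4]));  // 24
console.log(combinationsToEnumerate(1));   // 3375
console.log(combinationsToEnumerate(2));   // 11390625
```

The growth from 3375 to 11390625 between one and two degenerate codons shows
why the enumeration cost rises so quickly with each additional codon.
-->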
<span class=subheader>Algorithm</span>
<p>
The program works by dynamic programming. If all of the errors are integers,
then it is possible to ask, for each position, what degenerate codon has
the smallest diversity given that it produces a given error. The smallest
library for positions [1..<i>i</i>] given a particular error can be readily
computed using a simple recurrence. The best library is the one with the smallest
error with a diversity below the desired cap. The running time is <i>O(n<sup>2</sup>m<sup>2</sup>)</i> for
<i>n</i> positions and <i>m</i> error gradations. In the case above, n=15, and
m=100 (the maximum error would be 100 given by a codon that doesn't contain any
of the desired amino acids; the larger the maximum error the longer the running time).
Analysis for the multiple-degenerate-codon algorithm is slightly more complicated
and will be published shortly.
</p>
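<!--
The recurrence described in the paragraph above can be sketched roughly as
follows. This is a simplified illustration under stated assumptions, not the
code in js/app/opt_deg_codons.js: the function name smallestErrorWithinCap and
the input layout minDiv[i][e] (smallest codon diversity at position i that
yields error e, or Infinity if none does) are hypothetical, errors are assumed
to be non-negative integers, and the sketch reports only the best error and
diversity rather than the codons that achieve them.

```javascript
// minDiv[i][e]: assumed precomputed smallest diversity of any degenerate
// codon at position i that produces error e (Infinity if none exists).
function smallestErrorWithinCap(minDiv, maxErrorPerPosition, cap) {
  var n = minDiv.length;
  var maxTotal = n * maxErrorPerPosition;

  // best[e]: smallest library diversity over the positions processed so far
  // whose total error is exactly e.
  var best = new Array(maxTotal + 1).fill(Infinity);
  best[0] = 1; // no positions yet: one (empty) sequence, zero error

  for (var i = 0; i < n; i++) {
    var next = new Array(maxTotal + 1).fill(Infinity);
    for (var e = 0; e <= maxTotal; e++) {
      if (best[e] === Infinity) continue;
      for (var ei = 0; ei <= maxErrorPerPosition; ei++) {
        var d = minDiv[i][ei];
        if (d === undefined || d === Infinity) continue;
        // Library diversity is the product of per-position diversities.
        var candidate = best[e] * d;
        if (candidate < next[e + ei]) next[e + ei] = candidate;
      }
    }
    best = next;
  }

  // The best library is the one with the smallest error under the cap.
  for (var err = 0; err <= maxTotal; err++) {
    if (best[err] <= cap) return { error: err, diversity: best[err] };
  }
  return null; // nothing fits within the diversity cap
}

// Toy example: two positions, per-position errors from 0 to 3.
var toy = [
  [8, 4, Infinity, 1],
  [6, 2, Infinity, 1]
];
console.log(smallestErrorWithinCap(toy, 3, 20)); // { error: 1, diversity: 16 }
console.log(smallestErrorWithinCap(toy, 3, 10)); // { error: 2, diversity: 8 }
```

Tightening the cap from 20 to 10 forces the toy library to accept a larger
error in exchange for a smaller diversity, which is the trade-off the
algorithm resolves.
-->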
<p>
Privacy? This program is implemented in JavaScript and therefore runs inside your browser.
It does not send any data anywhere. You do not need to worry about anyone decoding the
library you're creating.
</p>
<span class=subheader>Input</span>
<p>
The input for SwiftLib is a table of the positions that you would like
to vary in your library, with a numeric preference for each amino acid that
you would like to favor or disfavor at each position. So, given the
above example, the input would be a table with 15 columns (one for each residue),
and the numeric preference for each amino acid would be the number of occurrences
of that amino acid at that position. Aside from using positive integers to favor
an amino acid, one can also use negative integers to disfavor an amino acid
at a given position. Furthermore, one can use the '*' and '!' wildcards to
specify that the amino acid is required ('*') or forbidden ('!') at the given position.
Empty fields in the table will be treated as if they contain a '0'.
For convenience, SwiftLib allows the creation of this table manually, through a CSV
format, or through a collection of FASTA-formatted sequences.
</p>
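<!--
The cell conventions described above (integers, '*', '!', and empty cells
treated as 0) could be interpreted along these lines. This is a hedged sketch
only: parseCell and the shape of the objects it returns are hypothetical and
are not taken from js/app/table_input.js, and whether SwiftLib accepts forms
such as a leading plus sign is not stated in the text, so the sketch rejects
them.

```javascript
// Hypothetical parser for one amino-acid cell of the input table.
function parseCell(text) {
  var s = String(text).trim();
  if (s === "") return { kind: "weight", value: 0 }; // empty cell counts as 0
  if (s === "*") return { kind: "required" };        // must be present
  if (s === "!") return { kind: "forbidden" };       // must be absent
  if (/^-?\d+$/.test(s)) {
    // Positive integers favor the amino acid; negative ones penalize it.
    return { kind: "weight", value: parseInt(s, 10) };
  }
  return { kind: "invalid" };                        // ill-formatted cell
}

console.log(parseCell("35").value);  // 35
console.log(parseCell("-2").value);  // -2
console.log(parseCell("").value);    // 0
console.log(parseCell("*").kind);    // required
console.log(parseCell("!").kind);    // forbidden
console.log(parseCell("abc").kind);  // invalid
```
-->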
<span class=subheader>Source Code</span>
<p>
SwiftLib is implemented in JavaScript, so the source code can be downloaded by opening
the JavaScript console in your browser (e.g. on a Mac in Chrome, alt-command-j). If you
are interested in the integer-linear programming (ILP) solution described in the
paper, a tarball with a set of python scripts that can be used to generate the ILP
inputs and for processing the GLPK solver's outputs can be downloaded
<a href="ilp_for_mdc.tar.gz">here</a>.
</p>
<span class=subheader>Citation</span>
<p>
If you use this program in your work, please cite:<br>
Jacobs, Yumerefendi, Kuhlman & Leaver-Fay
<i>SwiftLib: rapid degenerate-codon-library optimization through dynamic programming</i> (2014)
Nucleic Acids Research, doi: 10.1093/nar/gku1323
[<a href="http://nar.oxfordjournals.org/content/early/2014/12/24/nar.gku1323.full?keytype=ref&ijkey=sLSrzFzUHlEi4gz">Link</a>]
[<a href="http://nar.oxfordjournals.org/cgi/reprint/gku1323?ijkey=sLSrzFzUHlEi4gz&keytype=ref">PDF</a>]
</p>
<p>
Please direct questions to: leaverfa at email dot unc dot edu.
</p>
</div>
</div>
<br>
<div id="tabs">
<ul>
<li><a href="#manual_tab">Manual input</a></li>
<li><a href="#csv_tab">CSV input</a></li>
<li><a href="#fasta_tab">FASTA input</a></li>
<li><a href="#clustal_tab">ClustalW input</a></li>
<li><a href="#msf_tab">MSF input</a></li>
</ul>
<div id=manual_tab class=hscroll>
<!--<form id="mainForm" method="get" action="">-->
<div id=table_aacounts class=table_input>
<table class=mytable id=aacounts>
<thead>
<!--
<tr class=firstrow>
<th></th>
<th>Delete</th>
</tr> -->
<tr class=firstrow>
<th></th>
<th class="accept">Drag</th>
</tr>
</thead>
<tbody>
<tr class=firstrow>
<td data-header="resnum">Residue Number:</td>
<td data-header="seqpos1"><input class=seqposcell type="text" name="test" size=4></td>
</tr>
<tr id=primerboundary_row>
<td data-header="Primer" >Primer Boundary</td>
<td><input class=primercell type="text" name="primer" size=4 value="-" ></td>
</tr>
<tr id=maxdcs_row>
<td data-header="maxdcs" >Max DCs</td>
<td><input class=maxdccell type="text" name="maxdc" size=4 value="1" ></td>
</tr>
<tr>
<td>A (Alanine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>C (Cysteine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>D (Aspartic Acid)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>E (Glutamic Acid)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>F (Phenylalanine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>G (Glycine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>H (Histidine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>I (Isoleucine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>K (Lysine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>L (Leucine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>M (Methionine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>N (Asparagine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>P (Proline)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>Q (Glutamine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>R (Arginine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>S (Serine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>T (Threonine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>V (Valine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>W (Tryptophan)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>Y (Tyrosine)</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
<tr>
<td>STOP</td>
<td>
<input class=aacountcell type="text" name="test" size=4>
</td>
</tr>
</tbody>
</table>
</div>
<div class=instructions>
<span class=subheader>Manual input instructions</span>
<ul>
<li> Each column in the table below represents one position, and each cell in the column (except for the first) represents one amino acid at that position.
<li> The "Add Position" button will add a new column to the right.
<li> The "Delete Position" button will delete the right-most column. All data in the column will be lost when you hit this button, so be careful.
<li> Columns may be dragged left or right by dragging on the word "Drag" at the top of the column.
<li> You must provide a label for each column in the "Residue Number" row. This is for output purposes and any label will do (e.g. 208).
<li> Positive integers should be given in the table for amino acids that you want to appear.
<li> Negative integers can be given for amino acids you want to penalize (but not forbid).
<li> A star (*) can be given for amino acids that you want to require.
<li> An exclamation point (!) can be given for amino acids that you want to forbid.
<li> Empty cells will be treated as if they contain a "0" (except for the "Residue Number" row).
<li> Any ill-formatted cell on this page will be highlighted in pink.
<li> An upper bound on the library size must be given. This can be given in scientific notation (e.g. 1e7).
</ul>
<span class=subheader>Multiple degenerate codon instructions</span>
<ul>
<li> Activate the multiple-degenerate-codon algorithm by clicking on the "Allow Mult. Deg. Codons" button below.
This will add two rows to the table.
<li> The user must define primer boundaries in the "Primer Boundary" row.
<li> Indicate the first residue in a primer with the "|" (pipe) symbol.
<li> Indicate that a residue belongs in the same primer as the previous residue with the "-" (dash) symbol.
<li> The maximum number of degenerate codons ("Max DCs") to consider at each position should be an integer;
enumerating the combinations of degenerate codons when more than 2 are requested at any position can
be very time consuming. 5 is the recommended maximum.
<li> Allowing multiple degenerate codons means purchasing extra oligos (primers). Indicate the number of
oligos you are willing to purchase total in the "Maximum Primers Total" box below the table. This number
should be more than the number of stretches you have, and will be increased to the number of stretches
you have if a smaller value is given. If you have 5 stretches and say you are willing to purchase 6 oligos,
then the algorithm will only be able to use multiple degenerate codons at a single position.
</ul>
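<!--
The primer boundary symbols and the oligo budget rule in the list above can
be illustrated with the following sketch. It is an assumption-laden example,
not SwiftLib's implementation: stretchesFromBoundaries and effectiveMaxPrimers
are hypothetical names, and treating a leading "-" as the start of the first
stretch is an assumption made here because the table's default boundary value
is "-".

```javascript
// Group column indices into primer stretches from the "Primer Boundary" row.
// "|" marks the first residue of a primer; "-" continues the previous one.
function stretchesFromBoundaries(marks) {
  var stretches = [];
  marks.forEach(function (mark, idx) {
    if (mark !== "|" && mark !== "-") {
      throw new Error("Unrecognized primer boundary symbol: " + mark);
    }
    // Assumption: a leading "-" also opens the first stretch.
    if (mark === "|" || stretches.length === 0) {
      stretches.push([idx]);
    } else {
      stretches[stretches.length - 1].push(idx);
    }
  });
  return stretches;
}

// Every stretch needs at least one oligo, so a smaller request is raised
// to the number of stretches, as described in the list above.
function effectiveMaxPrimers(requested, stretchCount) {
  return Math.max(requested, stretchCount);
}

var stretches = stretchesFromBoundaries(["|", "-", "-", "|", "-"]);
console.log(JSON.stringify(stretches));                 // [[0,1,2],[3,4]]
console.log(effectiveMaxPrimers(1, stretches.length));  // 2
console.log(effectiveMaxPrimers(6, 5));                 // 6
```

In the last call, 5 stretches with a budget of 6 leaves a single spare oligo,
which is why only one position could then receive a second degenerate codon.
-->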
<span class=subheader>Saving Data</span>
<p>
To save the data you have entered in the table below for future sessions, switch to the "CSV Input" tab.
This will display your data in a "comma separated value" (CSV) format. Copy your data
out of the text box that appears and put it in a document for later use. You may load data into the table
by pasting the data from a previously-saved table and clicking on the "Update Table" button (which is
only visible in the "CSV Input" tab). If you get an error message saying that it found 46 columns instead of
23, make sure you clear out the text window before you paste your data in it.
</p>
</div>
<div id=global_settings>
<button id="allow_mult_dcs">Allow Mult. Deg. Codons</button>
<button id="add">Add Position</button>
<button id="delcol">Delete Position</button>
<br/><br/>
<div>
<label for=libsize_upper>*Library size upper limit:</label>
<input id=libsize_upper title="Required field. The maximum allowable DNA diversity for the generated degenerate codons" type=text size=8 value="">
</div>
<div>
<label for=nsolutions># Requested Solutions:</label>
<input id=nsolutions title="Optional field. The number of solutions to return. The optimal solution will always be returned, so this field can be left blank to indicate that only the optimal solution should be returned." type=text size=8>
</div>
<div>
<label for=stop_codon_penalty>Universal stop codon penalty:</label>
<input id=stop_codon_penalty title='Optional field. Stop codon penalty applied at all positions; overrides penalties given in the table. Must be either empty, an integer, or an exclamation mark ("!").' type=text size=8>
</div>
<div id=max_primers_total_div>
<label for=max_primers_total>Maximum primers total:</label>
<input id=max_primers_total title="How many primers are you willing to purchase to construct your library?" size=8 value="">
</div>
<!-- <div>
<label for=max_primers_per_stretch>Maximum primers for a single stretch:</label>
<input id=max_primers_per_stretch title="How many primers are you willing to purchase for a single stretch? Set this to the same value that you've set for the max primers total if you do not want to put any restriction on the number of primers to be considered for a single stretch." size=8 value="4">
</div> -->
<br>
<button id="launchbutton">Generate library!</button>
</div>
<!--</form>-->
</div>
<div id=csv_tab>
<div id=text_aacounts class=table_input>
<textarea id=csvaacounts rows="22" cols="50" ></textarea>
<br>
<button id="table_from_csv">Update Table</button>
<div id="update_result"></div>
</div>
<div class=instructions>
<span class=subheader>CSV input instructions</span>
<p>
To use the "comma separated value" (CSV) format, simply paste your comma-delimited data into the textbox.
The input text <b>must</b> take the same form as the manual input table (22 rows, the first
being the residue positions, the next 20 being the amino acids in alphabetical order by one-letter
code, the last being the stop penalty for that position). Once completed, click the 'Update Table'
button to populate the table with your inputs.
</p>
<p>
Once populated, one must still set the diversity bounds and the universal stop codon penalty (if desired).
</p>
</div>
</div>
<div id=fasta_tab class=table_input>
<div class=table_input>
<textarea id=fasta rows="30" cols="50"></textarea>
<br>
<button id="table_from_fasta">Update Table</button>
<div id="fasta_errors"></div>
</div>
<div class=instructions>
<span class=subheader>FASTA input instructions</span>
<p>
To use the FASTA input, paste FASTA formatted sequences in the textbox. All sequences
must be the same length. Once completed, clicking the 'Update Table' button will populate
the input table with the frequency of each amino acid at each position in the sequence.
Positions which never vary are excluded from the table.
</p>
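<!--
The FASTA behavior described above (per-position amino acid frequencies, with
invariant positions dropped) might look roughly like this. It is a minimal
sketch rather than the code behind the "Update Table" button: countsFromFasta
is a hypothetical name, the returned structure is invented for illustration,
and the 1-based position labels are an assumption.

```javascript
// Count amino acids per alignment column from FASTA text, skipping columns
// where every sequence has the same residue.
function countsFromFasta(fastaText) {
  var sequences = [];
  fastaText.split(/\r?\n/).forEach(function (line) {
    line = line.trim();
    if (line === "") return;
    if (line.charAt(0) === ">") {
      sequences.push(""); // a header line starts a new sequence
      return;
    }
    if (sequences.length === 0) {
      throw new Error("Sequence data found before the first header line");
    }
    sequences[sequences.length - 1] += line.toUpperCase();
  });
  if (sequences.length === 0) return [];

  var length = sequences[0].length;
  sequences.forEach(function (seq) {
    if (seq.length !== length) {
      throw new Error("All sequences must be the same length");
    }
  });

  var columns = [];
  for (var pos = 0; pos < length; pos++) {
    var counts = {};
    for (var k = 0; k < sequences.length; k++) {
      var aa = sequences[k].charAt(pos);
      counts[aa] = (counts[aa] || 0) + 1;
    }
    // Keep only positions that vary across the sequences.
    if (Object.keys(counts).length > 1) {
      columns.push({ position: pos + 1, counts: counts });
    }
  }
  return columns;
}

var table = countsFromFasta(">a\nACD\n>b\nAKD\n>c\nAKD\n");
console.log(JSON.stringify(table));
// [{"position":2,"counts":{"C":1,"K":2}}]
```
-->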
<p>
Once populated, one must still set the diversity bounds and the universal stop codon penalty (if desired).
</p>
</div>
</div>
<div id=clustal_tab class=table_input>
<div class=table_input>
<textarea id=clustal rows="30" cols="50"></textarea>
<br>
<button id="table_from_clustal">Update Table</button>
<div id="clustal_errors"></div>
</div>
<div class=instructions>
<span class=subheader>ClustalW input instructions</span>
<p>
To use the ClustalW input, paste a <a href="http://web.mit.edu/meme_v4.9.0/doc/clustalw-format.html">ClustalW formatted</a>
sequence alignment in the textbox. Please note that any position in the alignment that has only a single amino acid
will be removed from the table.
</p>
<p>
Once populated, one must still set the diversity bounds and the universal stop codon penalty (if desired).
</p>
</div>
</div>
<div id=msf_tab class=table_input>
<div class=table_input>
<textarea id=msf rows="30" cols="50"></textarea>
<br>
<button id="table_from_msf">Update Table</button>
<div id="msf_errors"></div>
</div>
<div class=instructions>
<span class=subheader>MSF input instructions</span>
<p>
To use the MSF input, paste an MSF-formatted sequence alignment in the textbox.
</p>
<p>
Once populated, one must still set the diversity bounds and the universal stop codon penalty (if desired).
Please note that any position in the alignment that has only a single amino acid
will be removed from the table.
</p>
</div>
</div>
</div>
<br>
<br><br>
<div id="update_result"></div>
<div id="plotdiv"></div>
<div id="current_selected_output"></div>
<div id="resultdiv"></div>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-62410241-2', 'auto');
ga('send', 'pageview');
</script>
</body>
</html>
<!-- Local Variables:
sgml-basic-offset : 4
js-indent-level: 4
indent-tabs-mode: nil
End: -->