forked from SuLab/ab_cluster
Compiling
=========
Tested on Linux with g++ 4.8.2.
Running make will create an executable named "cluster".
Usage
=====
usage: ./cluster [options...] input.json output.txt
Options:
  --min-center-size=INTEGER: set the minimum center size to INTEGER.
      Controls the speed-vs-accuracy tradeoff.
      Reasonable values range from 10 to N/10000, where N is the
      number of input sequences (e.g. 10 to 100 when processing
      1M antibodies).
      Lower values are generally more accurate.
      Higher values are generally faster, but values that are
      too high run extremely slowly.
  --min-center-size=disabled: disable megaclustering.
      Warning: very slow and memory hungry!
      Don't use with large datasets.
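
The rule of thumb above for --min-center-size can be sketched in Python. This is only an illustration, not part of the cluster program: the function name is hypothetical, and both the integer division and the clamping of the upper bound to 10 for small inputs are assumptions.

```python
def suggested_min_center_size_range(n_sequences):
    """Return (low, high) following the README's rule of thumb: 10 to N/10000.

    Assumption: for inputs below 100,000 sequences, N/10000 falls under 10,
    so the upper bound is clamped to the lower bound.
    """
    low = 10
    high = max(low, n_sequences // 10000)
    return low, high


low, high = suggested_min_center_size_range(1_000_000)
print(f"--min-center-size between {low} and {high}")
# prints "--min-center-size between 10 and 100"
```

For 1M sequences this reproduces the 10 to 100 range quoted above.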
Test files
==========
A test input file and its corresponding output file can be downloaded here:
https://copy.com/9NUMjTVtbarFXmU8
To run the program on the test input:
./cluster 1M.json 1M.out
* The input file, 1M.json, contains 1 million antibody (Ab) sequence objects.
* The output file, 1M.out, is a text file in which each row gives the cluster number assigned to the corresponding sequence in the input file (in the same order).
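
Since the output file has one cluster assignment per row, a short Python sketch can summarize it. This is a hedged example, not part of the cluster program: the function name is hypothetical, blank lines are skipped as an assumption, and labels are kept as strings so that nothing is assumed about their numeric format.

```python
from collections import Counter


def cluster_sizes(path):
    """Count how many sequences were assigned to each cluster label.

    Assumes one cluster label per non-blank line, as described for 1M.out.
    Labels are kept as strings rather than parsed as integers.
    """
    with open(path) as handle:
        labels = [line.strip() for line in handle if line.strip()]
    return Counter(labels)
```

Calling `cluster_sizes("1M.out").most_common(5)` would then list the five largest clusters with their sizes, and `len(cluster_sizes("1M.out"))` would give the number of distinct clusters.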