From e8e317d94b1c01daf2656a921e76474026f2678a Mon Sep 17 00:00:00 2001
From: Robert Edgar
Date: Sun, 9 Jun 2024 10:12:32 -0700
Subject: [PATCH 1/2] Update README.md

---
 README.md | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index c03ef1b..f30ff53 100644
--- a/README.md
+++ b/README.md
@@ -1,22 +1,26 @@
-### Reseek
+

-Reseek is a novel protein structure alignment algorithm which doubles sensitivity in protein homolog detection
+Reseek is a novel protein structure alignment algorithm which improves sensitivity in protein homolog detection
 compared to state-of-the-art methods including DALI, TM-align and Foldseek with improved speed over Foldseek, the
-fastest previous method.
+fastest previous method.
+
 Reseek is based on sequence alignment where each residue in the protein backbone is represented by a
 letter in a novel “mega-alphabet” of 85,899,345,920 (∼10^11) distinct states.
 This approach enables rapid construction of multiple alignments of thousands of structures
 using the pair-HMM method in Muscle5.
 
+Method sensitivity was measured on the SCOP40 benchmark using superfamily as the truth standard, focusing
+on the regime with false-positive error rates <10 per query, corresponding to E<10 for an ideal E-value.
+
 This is a preview beta release, new features and improved documentation will hopefully follow soon.
 Feedback is welcome via github Issues.
 
 All-vs-all alignment (excluding self-hits)
-    reseek -search STRUCTS -mode MODE -output hits.txt 
+    reseek -search STRUCTS -mode MODE -output hits.tsv 
 
-Search query against database
-    reseek -search Q_STRUCTS -db DB_STRUCTS -mode MODE -output hits.txt
+Search query structures against database
+    reseek -search Q_STRUCTS -db DB_STRUCTS -mode MODE -output hits.tsv
 
 Align two structures
     reseek -search NAME1.pdb -db NAME2.pdb -mode MODE -aln aln.txt
@@ -24,12 +28,12 @@ Align two structures
 Output options for -search
    -aln FILE     # Alignments in human-readable format
    -output FILE  # Hits in tabbed text format with 8 fields:
-                 #   Evalue Query Target Qstart Qend Tstart Tend CIGAR
+                 #   Evalue Query Target
                  # (More output formats coming soon)
 
 Search and alignment options
-  -mode MODE     # veryfast|fast|sensitive|verysensitive (required)
-  -evalue E      # Max E-value (default 10)
+  -mode MODE     # veryfast|fast|sensitive (default fast)
+  -evalue E      # Max E-value (default report all alignments)
   -omega X       # Omega accelerator (floating-point)
   -minu U        # K-mer accelerator (integer)
   -gapopen X     # Gap-open penalty (floating-point >= 0, default 1.1)
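
Reviewer note on the `-output` change: since `-evalue` now defaults to reporting all alignments, downstream scripts may want to apply their own cutoff. The sketch below reads a hits file and keeps hits at or below a chosen E-value. It assumes the file is tab-separated with the three columns listed in this hunk (Evalue, Query, Target) in that order and no header row; the name `read_hits`, the `Hit` tuple, and the default cutoff of 10 are illustrative and not part of reseek.

```python
import csv
from typing import Iterator, NamedTuple


class Hit(NamedTuple):
    evalue: float
    query: str
    target: str


def read_hits(path: str, max_evalue: float = 10.0) -> Iterator[Hit]:
    """Yield hits with E-value <= max_evalue from a 3-column TSV file.

    Assumed layout (not confirmed by the README): Evalue, Query, Target,
    tab-separated, no header row.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 3:
                continue  # skip blank or malformed lines
            try:
                evalue = float(row[0])
            except ValueError:
                continue  # skip a header row or non-numeric first field
            if evalue <= max_evalue:
                yield Hit(evalue, row[1], row[2])
```

Typical use would be `for hit in read_hits("hits.tsv", max_evalue=1e-3): ...`. A generator keeps memory flat for large all-vs-all runs.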

From 54a716a29beffc613b5df8f9a9c14cc89ca617c2 Mon Sep 17 00:00:00 2001
From: Robert Edgar 
Date: Sun, 9 Jun 2024 10:14:34 -0700
Subject: [PATCH 2/2] Update README.md

---
 README.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/README.md b/README.md
index f30ff53..3082571 100644
--- a/README.md
+++ b/README.md
@@ -6,8 +6,6 @@ fastest previous method.
 
 Reseek is based on sequence alignment where each residue in the protein backbone is represented by a 
 letter in a novel “mega-alphabet” of 85,899,345,920 (∼10^11) distinct states.
-This approach enables rapid construction of multiple alignments of thousands of structures
-using the pair-HMM method in Muscle5.
 
 Method sensitivity was measured on the SCOP40 benchmark using superfamily as the truth standard, focusing
 on the regime with false-positive error rates <10 per query, corresponding to E<10 for an ideal E-value.
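
Reviewer note on the benchmark sentence: for an ideal E-value, the mean number of false-positive hits per query at cutoff E equals E, which is why a rate of <10 false positives per query corresponds to E<10. The sketch below computes that rate from labelled hits. The tuple layout, the name `false_positives_per_query`, and the toy data are assumptions for illustration; this is not the benchmark code used for Reseek.

```python
from typing import Iterable, Tuple

# (evalue, is_true_positive). Labels would come from the truth standard,
# e.g. query and target in the same SCOP superfamily. Hypothetical layout.
LabelledHit = Tuple[float, bool]


def false_positives_per_query(hits: Iterable[LabelledHit],
                              n_queries: int,
                              max_evalue: float) -> float:
    """Mean number of false-positive hits per query with E-value <= max_evalue."""
    if n_queries <= 0:
        raise ValueError("n_queries must be positive")
    n_fp = sum(1 for evalue, is_tp in hits
               if evalue <= max_evalue and not is_tp)
    return n_fp / n_queries


# Toy data: four hits pooled over two queries.
toy_hits = [(1e-3, True), (0.5, False), (4.0, False), (12.0, False)]
rate = false_positives_per_query(toy_hits, n_queries=2, max_evalue=10.0)
```

With a well-calibrated E-value, `rate` measured over many queries would be close to `max_evalue`; a large gap in either direction suggests the reported E-values are over- or under-estimates.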