-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Repair illegal substitutions to an inversion #3
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3 +/- ##
==========================================
+ Coverage 88.96% 89.04% +0.08%
==========================================
Files 5 5
Lines 1803 1817 +14
Branches 127 127
==========================================
+ Hits 1604 1618 +14
Misses 72 72
Partials 127 127
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for PR. I added some comments. And you should add test cases to clj-hgvs.core-test/repair-hgvs-str-test
.
src/clj_hgvs/repairer.cljc
Outdated
@@ -87,6 +87,12 @@ | |||
(string/replace s #"([-\d+*]+_[-\d+*]+)([A-Z]+)?>([A-Z]+)" "$1del$2ins$3") | |||
s)) | |||
|
|||
;; c.233_234TC>CT -> c.233_234delinsCT | |||
(defn ^:no-doc substitutions->delins* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot understand the necessity of this repair rule. The repair case written as the comment is covered by existing substitutions->delins
.
(require '[clj-hgvs.core :as hgvs]
'[clj-hgvs.repairer :as repairer])
(hgvs/repair-hgvs-str "c.233_234TC>CT" [repairer/substitutions->delins])
;;=> "c.233_234delTCinsCT"
If you want to omit the deletion bases, you can use :show-bases? false
option.
(-> (hgvs/parse "c.233_234delTCinsCT")
(hgvs/format {:show-bases? false}))
;;=> "c.233_234delinsCT"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your review.
I removed substitutions->delins*
function.
src/clj_hgvs/repairer.cljc
Outdated
;; c.2210_2211CA>TG -> c.2210_2211inv | ||
(defn ^:no-doc basis->inv | ||
[s kind] | ||
(let [base-pairs {\A \T \T \A \C \G \G \C} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An inversion mutation is not appeared in protein mutations.
(if (#{:genome :mitochondria :coding-dna :non-coding-dna :circular-dna :rna}
kind)
...
s)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I set the condition for exclude :protein
src/clj_hgvs/repairer.cljc
Outdated
@@ -102,6 +108,19 @@ | |||
(string/replace s #"(del|dup|inv)\d+$" "$1") | |||
s)) | |||
|
|||
;; c.2210_2211CA>TG -> c.2210_2211inv | |||
(defn ^:no-doc basis->inv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to substitutions->inv
and add it to above substitution->delins
in built-in-repairers
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed to substitutions->inv
and added it in built-in-repairers
.
src/clj_hgvs/repairer.cljc
Outdated
[_ region b a] (re-find #"(\d+_\d+)([A-Z]+)>([A-Z]+)" s)] | ||
(if (= a | ||
(->> b | ||
(map base-pairs) | ||
reverse | ||
(apply str))) | ||
(str "c." region "inv") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex does not cover all coordinate patterns. Additionally, I recommend using replace
fn.
(letfn [(revcomp [s]
(->> (reverse s)
(map {\A \T \T \A \C \G \G \C})
(apply str)))]
(string/replace s
#"([-\d+*]+_[-\d+*]+)([A-Z]{2,})>([A-Z]{2,})"
(fn [[s coords del ins]]
(if (= del (revcomp ins))
(str coords "inv")
s))))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed the regex to cover all coordinate patterns, and changed process with using replace
fn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, your fix is perfect, but a test is still missing (#3 (review)).
Could you add c.2210_2211CA>TG
-> c.2210_2211inv
test case to just above the following line?
clj-hgvs/test/clj_hgvs/core_test.cljc
Line 286 in a920313
;; substitution->delins |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I plan to merge it after @federkasten approval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay in reviewing.
I have checked. No additional comments. Thanks! 👍
I added two repairer functions to correct illegal hgvs strings.
Could you review this?