-
-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for list of bigrams #23
Conversation
This comment has been minimized.
This comment has been minimized.
index.js
Outdated
var value_ = String(value).toLowerCase() | ||
var alt = String(alternative).toLowerCase() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m quite sure this is making your later code never run though?
It does two things: cast to string, and lowercase.
The casting could be done when the value isn’t a string
The lowercase could be done for each bigram maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick review! Ah yeah, there is a bug, but for a different reason -- the later code is run because left
and right
are assigned to the original input params (value
and alternative
), rather than value_
and alt
. However, the bigram inputs wouldn't be case-insensitive. Made another commit that fixes the case sensitivity, but still probably not optimal from a readability perspective. Will try to come up with a more pleasant to read refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I think I've got a more readable solution committed.
readme.md
Outdated
|
||
// bigrams may also be passed as input arguments for improved efficiency | ||
// when analyzing the same strings repeatedly, for example, when | ||
// comparing the text of each file in a directory with the text of | ||
// each file in another directory. | ||
|
||
import {bigram} from 'n-gram' | ||
|
||
const bigramifiedString1 = bigram('abc') // ['ab', 'bc'] | ||
const bigramifiedString2 = bigram('xyz') // ['xy', 'yz'] | ||
|
||
diceCoefficient(bigramifiedString1, bigramifiedString2) // => 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don‘t think this needs to be in the Use section. but it should probably be in the API section, that arrays of strings are allowed, and a note that they should be bigrams?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the explanation to another code section, if that works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some prose suggestions. Rest looks good!
Co-authored-by: Titus <[email protected]>
Co-authored-by: Titus <[email protected]>
Co-authored-by: Titus <[email protected]>
released, thanks! |
This PR implements #22 to skip "bigram-ifying" if an input is already a bigram by checking if the input is an array.
Used nested ternaries for the logic -- would understand if you'd prefer not having those, though.