Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for list of bigrams #23

Merged
merged 9 commits into from
Nov 10, 2021
Merged

Conversation

khxu
Copy link
Contributor

@khxu khxu commented Nov 6, 2021

This PR implements #22 to skip "bigram-ifying" if an input is already a bigram by checking if the input is an array.

Used nested ternaries for the logic -- would understand if you'd prefer not having those, though.

@codecov-commenter

This comment has been minimized.

index.js Outdated
Comment on lines 11 to 12
var value_ = String(value).toLowerCase()
var alt = String(alternative).toLowerCase()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m quite sure this is making your later code never run though?
It does two things: cast to string, and lowercase.
The casting could be done when the value isn’t a string
The lowercase could be done for each bigram maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick review! Ah yeah, there is a bug, but for a different reason -- the later code is run because left and right are assigned to the original input params (value and alternative), rather than value_ and alt. However, the bigram inputs wouldn't be case-insensitive. Made another commit that fixes the case sensitivity, but still probably not optimal from a readability perspective. Will try to come up with a more pleasant to read refactoring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I think I've got a more readable solution committed.

readme.md Outdated
Comment on lines 33 to 44

// bigrams may also be passed as input arguments for improved efficiency
// when analyzing the same strings repeatedly, for example, when
// comparing the text of each file in a directory with the text of
// each file in another directory.

import {bigram} from 'n-gram'

const bigramifiedString1 = bigram('abc') // ['ab', 'bc']
const bigramifiedString2 = bigram('xyz') // ['xy', 'yz']

diceCoefficient(bigramifiedString1, bigramifiedString2) // => 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don‘t think this needs to be in the Use section. but it should probably be in the API section, that arrays of strings are allowed, and a note that they should be bigrams?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the explanation to another code section, if that works.

Copy link
Member

@wooorm wooorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some prose suggestions. Rest looks good!

readme.md Outdated Show resolved Hide resolved
readme.md Outdated Show resolved Hide resolved
readme.md Outdated Show resolved Hide resolved
khxu and others added 3 commits November 9, 2021 08:01
Co-authored-by: Titus <[email protected]>
Co-authored-by: Titus <[email protected]>
Co-authored-by: Titus <[email protected]>
@wooorm wooorm changed the title Allow bigram as input Add support for list of bigrams Nov 10, 2021
@wooorm wooorm merged commit da9b223 into words:main Nov 10, 2021
@wooorm
Copy link
Member

wooorm commented Nov 10, 2021

released, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants