Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src/detector.js: Does not properly drop TLD #5409

Closed
trn1ty opened this issue Sep 7, 2021 · 5 comments
Closed

src/detector.js: Does not properly drop TLD #5409

trn1ty opened this issue Sep 7, 2021 · 5 comments
Assignees
Labels
bug Bug in the software of this repository

Comments

@trn1ty
Copy link
Contributor

trn1ty commented Sep 7, 2021

https://github.com/MetaMask/eth-phishing-detect/blob/master/src/detector.js#L63

domainToParts will return input.split('.').reverse(). domainPartsToFuzzyForm takes this domainParts list, removes the first element, and returns that list reversed and joined with full stops. The issue here is that a transform like domainPartsToFuzzyForm(domainToParts("1.2.3")) will return "1.2" regardless of whether "2.3" is actually the TLD of the website.

This can both result in false negatives and false positives. If I'm fuzzymatching "foo", and I'm on a site "foo.africa.com", "foo.africa.com" won't be flagged as spam because the detector will check the Levenshtein distance between "foo" and "foo.africa" (7). But if I'm fuzzymatching "bar", and I'm on "b.br.com", a false positive will occur because the Levenshtein distance between "bar" and "b.br" is 2. While the latter example isn't too plausible in the wild, the former example could certainly be used to exploit an unaware user - just take an African domain name registered under .com and register it under .africa.com.

I don't see an easy, consistent solution to this besides checking against a live list of TLDs which would really bog the detector down.

@trn1ty
Copy link
Contributor Author

trn1ty commented Sep 7, 2021

Perhaps the user should be more alert, and the naive approach to dropping the TLD can be taken, but the comment should probably be updated to mention the flaws in using that method.

@gravelcycles
Copy link
Contributor

Thanks for bringing this up @devenblake. Let me think through the implications and will get back to you

@trn1ty trn1ty added the bug Bug in the software of this repository label Jan 13, 2022
@AlexHerman1
Copy link
Collaborator

@devenblake is this still relevant?

@legobeat
Copy link
Contributor

legobeat commented Jan 26, 2023

example matching b.example.com is not a false positive here - that part makes sense as is.

The point on non-top "TLDs" (.co.uk would be more familiar) is still valid.

The fate of the fuzzylist in question and expanding it to target new domains should probably be put on hold, at least for now?

@trn1ty
Copy link
Contributor Author

trn1ty commented Jan 29, 2023

It's been a little while!

The fate of the fuzzylist in question and expanding it to target new domains should probably be put on hold, at least for now?

This is a much broader topic than this filed issue covers.

is this still relevant?

Yes, it still works the same way according to a quick check of the detector code though the detector has changed a lot since this issue was filed.

I didn't like how the detector dropped the TLD for a long time but I think it's adequate for a majority of uses so I'm not super opposed to it. I think there's room for better options though.

@legobeat legobeat closed this as not planned Won't fix, can't repro, duplicate, stale Feb 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug in the software of this repository
Projects
None yet
Development

No branches or pull requests

5 participants