binary search prework #7


Conversation

janiceshiu
Owner

No description provided.


@robot-dreams robot-dreams left a comment


Looks great! I haven't seen the info tracking before, that seems very useful.

class BinarySearch():
"""
Hashing and binary search both address the same problem: finding an item in a collection. What are some trade-offs between the two strategies?
* You can only use binary search on something you can sort by some metric. You can use hashing on almost anything, possibly? Theoretically you could use pointers to point to whatever it is, and if it's a whole object perhaps you hash the memory location? I'm not sure, actually, and now that I mention it I wonder how that is done in languages that allow, say, values to be entire objects.


This is correct.

In Java for example (where the key can be an entire object), you're expected to define a custom hashCode function, which typically looks at the different fields of the object in order to generate the hash value.
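
Since this PR is in Python, here is a rough sketch of the analogous hook there: a class defines `__hash__` (together with `__eq__`) so that its instances can be used as dict keys. The `Point` class and its fields are hypothetical, made up purely for illustration.

```python
class Point:
    """Hypothetical key type, used only to illustrate custom hashing."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Two points are equal when all of their fields match.
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Build the hash from the same fields that define equality,
        # so that equal objects always produce equal hash values.
        return hash((self.x, self.y))


lookup = {Point(1, 2): "a"}
# A distinct but equal object finds the same entry.
found = lookup[Point(1, 2)]
```

Hashing the fields rather than the memory location is what lets two separately constructed but equal objects land on the same entry.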

* Assuming the contents of the collection don't change, hashing might be faster for lookups because it takes constant O(1) time amortized, whereas binary search is generally O(log n).


How did you conclude that it's "amortized"?

When might you want to pick one over the other?
If sorting takes (at minimum) O(n log n) time, and binary search takes O(log n) time, under what circumstances might it be worth it to sort a collection in order to perform binary search?
* When you need to sort far less often than you need to search. There's probably some mathematical formula one can come up with, but I'm not sure. I suppose if you binary search X times, that's O(X log n) time total, and sorting once is O(n log n), so you have O((X + n) log n) time, where X is the number of times you binary search the sorted list and n is the number of items in the list. You'd probably have to add this up over the number of times you re-sort the list, taking into account the number of new items, the number of searches on each newly sorted list, etc. If the list is really small, or really large and only searched once or twice, linear search in O(n) time is probably the better choice. In those cases constants probably come into play.
* I'd really like to know if there is some mathematical way to calculate whether it is more worthwhile to sort and then binary search, to just do linear search in O(n) time, or to use hashing.


Well, you can just roughly say "you break even if you need to do around log n searches", right?

If you want to be more exact, here's one thing you might try:

n log n + X log n = X n
n log n = X (n - log n)
X = (n log n) / (n - log n)

If you want to be REALLY exact you can also introduce the relevant constant factors, but I don't think this is too useful.
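
To see how close the exact expression is to the rough "around log n" rule, here is a small sketch that just evaluates the formula above. The function name `break_even_searches` is made up for this example, and it assumes base-2 logarithms, a cost of n per linear search, and no constant factors.

```python
import math


def break_even_searches(n):
    """Number of searches X at which sort + binary search matches linear search.

    Evaluates X = (n log n) / (n - log n), assuming log base 2 and
    ignoring constant factors.
    """
    log_n = math.log2(n)
    return (n * log_n) / (n - log_n)


# Print n, log2(n), and the break-even X side by side for a few sizes.
for n in (16, 1024, 1_000_000):
    print(n, round(math.log2(n), 2), round(break_even_searches(n), 2))
```

As n grows, the denominator is dominated by n, so X approaches log n.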
