binary search prework #7
base: master
Looks great! I haven't seen the info tracking before; that seems very useful.
class BinarySearch():
    """
    Hashing and binary search both address the same problem: finding an item in a collection. What are some trade-offs between the two strategies?
    * You can only use binary search on something you can sort by some metric. You can use hashing on almost anything, possibly, since in theory you could use pointers to point to whatever it is. If it's a whole object, perhaps you hash the memory location? I'm not sure, actually, and now that I mention it, I wonder how that is done in languages that allow, say, values to be entire objects.
This is correct.
In Java, for example (where the key can be an entire object), you're expected to define a custom hashCode function, which typically looks at the different fields of the object in order to generate the hash value.
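Since the code under review is Python, here is a rough Python analogue of that idea: a class that defines `__eq__` and `__hash__` over its fields so that instances can serve as dictionary keys. The `Point` class and its fields are hypothetical and only illustrate the pattern; nothing like it appears in this PR.

```python
class Point:
    """Hypothetical example class; the name and fields are illustrative only."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Two points are equal when all of their fields match.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Combine the same fields used by __eq__ so equal objects hash equally.
        return hash((self.x, self.y))


locations = {Point(1, 2): "home"}
found = locations[Point(1, 2)]  # a distinct but equal object finds the entry
```

Hashing a tuple of the fields is one common choice, not the only one. The property that matters is that objects which compare equal also produce the same hash.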
* Assuming the contents of the collection don't change, hashing might be faster for lookups because it's constant O(1) time amortized, whereas binary search is generally O(log n).
How did you conclude that it's "amortized"?
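For reference, the two lookup strategies being compared could be sketched as follows. This is a minimal illustration using the standard library's `bisect` module and the built-in `set`; the helper name `binary_search_contains` and the sample data are assumptions, not code from this PR.

```python
from bisect import bisect_left


def binary_search_contains(sorted_items, target):
    """Return True if target is in sorted_items (which must already be sorted)."""
    # bisect_left finds the leftmost insertion point in O(log n) comparisons.
    i = bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target


items = [3, 8, 15, 23, 42]  # already sorted, as binary search requires
as_set = set(items)         # hash-based collection; no ordering needed

in_sorted = binary_search_contains(items, 23)  # O(log n) lookup
in_hashed = 23 in as_set                       # O(1) lookup on average
```

The binary search needs the list to stay sorted, while the set only needs its elements to be hashable.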
When might you want to pick one over the other?
If sorting takes (at minimum) O(n log n) time, and binary search takes O(log n) time, under what circumstances might it be worth it to sort a collection in order to perform binary search?
* When the number of times you need to sort is far smaller than the number of times you need to search. There's probably a mathematical formula for this, but I'm not sure what it is. If you binary search X times, that's O(X log n) time in total, and sorting once is O(n log n), so you have O((X + n) log n) time, where X is the number of binary searches on the sorted list and n is the number of items in the list. You'd probably have to add this up over the number of times you re-sort the list, taking into account the number of new items, the number of searches on each newly sorted list, etc. If the list is really small or really large and only searched once or twice, linear search in O(n) time is probably the better choice. In those cases constants probably come into play.
* I'd really like to know whether there is a mathematical way to calculate whether it is more worthwhile to sort and then search, to just do a linear search in O(n) time, or to use hashing.
Well, you can just roughly say "you break even if you need to do around log n searches", right?
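If you want to be more exact, here's one thing you might try: set the cost of sorting once plus X binary searches equal to the cost of X linear searches, then solve for X: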
n log n + X log n = X n
n log n = X (n - log n)
X = (n log n) / (n - log n)
If you want to be REALLY exact you can also introduce the relevant constant factors, but I don't think this is too useful.
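The break-even formula above can be checked numerically. The sketch below ignores constant factors and assumes base-2 logarithms (the comment does not specify a base); the function names are hypothetical.

```python
import math


def break_even_searches(n):
    """X = (n log n) / (n - log n), using log base 2 (an assumption)."""
    log_n = math.log2(n)
    return (n * log_n) / (n - log_n)


def sort_then_search_cost(n, x):
    # Cost of sorting once plus x binary searches, without constant factors.
    return n * math.log2(n) + x * math.log2(n)


def linear_search_cost(n, x):
    # Cost of x linear searches, without constant factors.
    return x * n


n = 1024
x_star = break_even_searches(n)  # just over log2(n) = 10 for this n
```

For n = 1024 the break-even point is slightly above 10 searches, which agrees with the rough estimate of around log n searches: at 10 searches linear search is still cheaper under this model, and at 11 searches sorting first wins.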