add intra_distance_score evaluation #103
Open
mokarakaya wants to merge 13 commits into maciejkula:master from mokarakaya:intra_distance
Changes from 6 commits
bc960a1 add intra_distance_score evaluation (mokarakaya)
797c79d fix styling issues. (mokarakaya)
2ca8b43 fix styling (mokarakaya)
b7118a5 apply review comments for intra_distance_score (mokarakaya)
4148715 remove unused imports (mokarakaya)
a32d76e revert changes in test_precision_recall (mokarakaya)
316f101 styling (mokarakaya)
1c35e6d fix review comments. (mokarakaya)
448f891 Merge branch 'master' of https://github.com/maciejkula/spotlight into… (mokarakaya)
2cab76f fix review comments for intra_distance (mokarakaya)
a62fc2c fix travis styling build (mokarakaya)
3a36f76 fix documentation (test is replaced with user_ids) (mokarakaya)
6896c20 intra_distance - calculate lengths only once (mokarakaya)
@@ -201,3 +201,65 @@ def rmse_score(model, test):
    predictions = model.predict(test.user_ids, test.item_ids)

    return np.sqrt(((test.ratings - predictions) ** 2).mean())


def intra_distance_score(model, test, train, k=10):
    """
    Compute IntraDistance@k diversity of a set of recommended items which is defined
    as the average pairwise distance of the items in the set.

    In early definitions, it's called average dissimilarity [2]
    It's best known as average intra-list distance [1]

    .. [1] Castells, P., Hurley, N.J. and Vargas, S., 2015.
       Novelty and diversity in recommender systems. In Recommender Systems Handbook (pp. 881-918).
       Springer, Boston, MA.

    .. [2] Hurley, N. and Zhang, M., 2011.
       Novelty and diversity in top-n recommendation--analysis and evaluation.
       ACM Transactions on Internet Technology (TOIT), 10(4), p.14.

    Distance between items i,j is calculated as;
    1 - intersection(i,j) / length(i) * length(j)

    Parameters
    ----------

    model: fitted instance of a recommender model
        The model to evaluate.
    test: :class:`spotlight.interactions.Interactions`
        Test interactions.
    train: :class:`spotlight.interactions.Interactions`, optional
        Train interactions. If supplied, scores of known
        interactions will not affect the computed metrics.
    k: int or array of int,
        The maximum number of predicted items
    Returns
    -------

    (IntraDistance@k): numpy array of shape (len(users), len(k * (k-1) / 2)
        A list of distances between each item in recommendation
        list with length k for each test user.
    """

    distances = []
    test = test.tocsr()
    mat = train.tocoo().T.tocsr()
    lengths = mat.getnnz(axis=1)
    for user_id, row in enumerate(test):
        predictions = -model.predict(user_id)
        rec_list = predictions.argsort()[:k]
        distance = [
            _get_distance(mat, lengths, first_item, second_item)
            for i, first_item in enumerate(rec_list)
            for second_item in rec_list[(i + 1):]
        ]
        distances.append(distance)
    return np.array(distances)


def _get_distance(mat, lengths, first_item, second_item):
    numerator = np.in1d(mat[first_item].indices, mat[second_item].indices, assume_unique=True).sum()
    denominator = lengths[first_item] * lengths[second_item]
    distance = numerator / denominator
    return 1 - distance

Review comment on `mat = train.tocoo().T.tocsr()`:
Is the

Review comment on `distance = [`:
I personally find nested list comprehensions very confusing. Could we use nested for loops here?
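For illustration, the nested-for-loop form requested in the review comment on the list comprehension could look roughly like the sketch below. It is not the code that ended up in the PR. To stay dependency-free, it assumes a hypothetical `item_users` mapping from item id to the set of users who interacted with it, standing in for the rows of the sparse `mat`; `get_distance` and `pairwise_distances` are illustrative names. The distance follows what `_get_distance` computes in the diff, that is, the overlap divided by the product of the two lengths.

```python
def get_distance(item_users, first_item, second_item):
    # Same arithmetic as _get_distance in the diff:
    # 1 - |overlap| / (length(i) * length(j)).
    # Assumes both items have at least one interaction (no zero division guard).
    first = item_users[first_item]
    second = item_users[second_item]
    return 1 - len(first & second) / (len(first) * len(second))


def pairwise_distances(item_users, rec_list):
    # Nested for loops over every unordered pair in the recommendation list,
    # equivalent to the nested list comprehension in the diff.
    distance = []
    for i, first_item in enumerate(rec_list):
        for second_item in rec_list[(i + 1):]:
            distance.append(get_distance(item_users, first_item, second_item))
    return distance


# Hypothetical toy data: item id -> set of user ids who interacted with it.
item_users = {0: {0, 1}, 1: {1, 2}, 2: {2, 3}}
result = pairwise_distances(item_users, [0, 1, 2])
print(result)  # [0.75, 1.0, 0.75]
```

A list of k items yields k * (k - 1) / 2 pairs, which is where the second dimension of the documented return shape comes from; here k = 3 gives three distances.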
Review comment:
What do we need `test` for? For knowing which users to compute the predictions for? Maybe a cleaner way of doing the same would be to allow the user to pass in an optional array of user ids for which the metric should be computed.
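A minimal sketch of what this suggestion might look like, under stated assumptions: `top_k_per_user`, `num_users`, and the `ConstantModel` stub are hypothetical names made up for illustration, and only the `model.predict(user_id)` call and the `argsort()[:k]` ranking mirror the diff. A later commit message in this PR ("test is replaced with user_ids") suggests a change along these lines was made, but the code below is not taken from it.

```python
import numpy as np


class ConstantModel:
    """Hypothetical stand-in for a fitted model: every user gets the same scores."""

    def __init__(self, scores):
        self._scores = np.asarray(scores, dtype=float)

    def predict(self, user_id):
        return self._scores


def top_k_per_user(model, num_users, user_ids=None, k=10):
    # user_ids is optional: when omitted, rank items for every user.
    if user_ids is None:
        user_ids = np.arange(num_users)
    rec_lists = []
    for user_id in user_ids:
        # Negate the scores so that argsort puts the highest score first.
        predictions = -model.predict(user_id)
        rec_lists.append(predictions.argsort()[:k])
    return np.array(rec_lists)


model = ConstantModel([0.1, 0.9, 0.5, 0.3])
all_users = top_k_per_user(model, num_users=3, k=2)
some_users = top_k_per_user(model, num_users=3, user_ids=np.array([2]), k=2)
```

With this shape of API the caller no longer has to pass a full interactions object just to say which users matter; the pairwise distances would then be computed per recommendation list as in the diff.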