flag divergent metadata to try and avoid creating inconsistencies in the library #161
…rack to allow reuse
realised I didn't mention this in the commit message or PR: this check is done on import to the library. it uses the warning functionality added in 0749fad, and generates a warning if any of the artists or the anime title is either missing, or is both not in the database and has a Levenshtein closeness above 0.7 to another artist/anime already in the library (after converting everything to lowercase). in the latter case the warning suggests the anime with the highest Levenshtein closeness, for ease of correcting the metadata.

this does make an import somewhat more expensive on the server side, since it now requires generating the full list of artists and anime titles. there is no impact on performance of the rest of the site, however, and all other views remain unchanged (e.g. the track display does not flag up any similar anime). a possible way to improve performance would be to cache the result of calling |
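For anyone skimming, the check described in this comment could be sketched roughly as below. This is only an illustration, not the code in this PR: `levenshtein_distance`, `closeness` and `closest_near_match` are hypothetical names, and "closeness" is assumed here to mean 1 minus the edit distance divided by the longer string's length, which may not match the ratio the PR actually uses.

```python
from typing import Iterable, Optional


def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def closeness(a: str, b: str) -> float:
    """Assumed definition: 1.0 for identical strings, 0.0 for nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - levenshtein_distance(a, b) / longest


def closest_near_match(
    name: str, known: Iterable[str], threshold: float = 0.7
) -> Optional[str]:
    """Return the closest existing name if `name` is new but suspiciously similar.

    Returns None when `name` is empty (missing metadata would be reported
    separately), when it is already in `known`, or when nothing scores
    above `threshold`.
    """
    if not name:
        return None
    lowered = name.lower()
    best, best_score = None, threshold
    for candidate in known:
        if candidate == name:
            return None  # exact match: already in the library, nothing to flag
        score = closeness(lowered, candidate.lower())
        if score > best_score:
            best, best_score = candidate, score
    return best
```

With a library containing "Hanazawa Kana", calling `closest_near_match("Hanazawa Kanna", library)` would return "Hanazawa Kana" as the suggested correction, while an exact match or an unrelated name returns `None`.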
this now has a test suite. since the tests ended up being quite long, I moved the existing tests into a directory so I could use a separate file for the …

something that could potentially be added if desired is flagging tracks being added that do not have a Composer.
The "which is" carried an implied "[in the library]", but that was very non-obvious because the tense disagrees with the start of the sentence. The new wording is more concise while conveying the same information. A thought that now occurs to me is that in principle there could potentially be a link to the artist page for that artist at that point. Artists are checked individually as parsed by |
ah, that makes sense; @theshillito would a link to the artist page be useful, or are you just gonna dive back into itunes at that point? i read 'flag up artists written in non-canonical order' as meaning that lists of multiple artists (like |
Ah I see; no, the intent was to flag when individual humans had their names written in a different way to how it was already in the library (e.g. a track uploaded as Kana Hanazawa when the library is full of Hanazawa Kana). I don't think I've noticed any problems with long artist lists being shuffled? |
This responds to a request from @euricaeris on Slack for a way to flag tracks on import when they contain metadata that doesn't match something already in the library but almost does, to avoid misgrouping artists and anime.
A couple of supporting changes were needed to do this neatly: pulling out the functions supporting `all-the-anime` and `all-the-artists` into the `Track` model, and changing `role_detail` such that if a track title has brackets that look like a role but don't actually match a regex, then the call to `role_detail()` doesn't raise `AttributeError`.

I'd have preferred not to have to pass `all_anime_titles` and `all_artists` in to `metadata_consistency_checks`, but couldn't find a neat way of caching it that would be invalidated once `update_library` returned.

~~No automated tests, but I tested manually with a `songlibrary.xml` from a few months ago and ironed out the bugs (and it seems to do what was intended.)~~ There are now unit tests for (I think) all new functionality and some of the existing `update_library` functionality and all pass.
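To illustrate the `role_detail` change mentioned above: an `AttributeError` of this kind typically comes from calling a method on the `None` that `re.match` returns when nothing matches. A minimal sketch of the guard follows. The pattern and the `safe_role_detail` name are hypothetical and are not the regex or function in this repository.

```python
import re
from typing import Dict

# Hypothetical pattern, for illustration only: "<anime> <role>",
# where the role is OP/ED/IN optionally followed by digits.
ROLE_PATTERN = re.compile(r"^(?P<anime>.+?) (?P<role>(?:OP|ED|IN)\d*)$")


def safe_role_detail(role_string: str) -> Dict[str, str]:
    """Parse a role string, returning {} instead of raising when it doesn't match."""
    match = ROLE_PATTERN.match(role_string)
    if match is None:
        # Without this check, match.groupdict() would raise AttributeError,
        # because re.match returns None for non-matching input.
        return {}
    return match.groupdict()
```

Something like `safe_role_detail("live version")` then yields an empty dict rather than blowing up the import.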