Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discogs (and other metadata sources) polluting mb_albumid #604

Open
a9y opened this issue Mar 11, 2014 · 12 comments
Open

Discogs (and other metadata sources) polluting mb_albumid #604

a9y opened this issue Mar 11, 2014 · 12 comments
Labels
bug bugs that are confirmed and actionable

Comments

@a9y
Copy link
Contributor

a9y commented Mar 11, 2014

The discogs plugin sets album_id of the AlbumInfo class to a discogs identifier, which beets.autotag.apply_metadata assigns to mb_albumid. The problem is that everywhere else it's assumed to be a MusicBrainz identifier.

@sampsyo
Copy link
Member

sampsyo commented Mar 11, 2014

This is a great point; reusing the MBID fields is an abuse that can cause weird problems. We should instead set a flexible attribute. Alas, we currently don't have a mechanism for setting flexible attributes from metadata matches, but we should definitely add this. (In fact, doing so could simplify the AlbumInfo/TrackInfo types and related construction/application logic.)

@sampsyo sampsyo added the bug label Mar 11, 2014
@a9y
Copy link
Contributor Author

a9y commented Mar 13, 2014

I played around with a few solutions, but none were ideal.

The first was to store data_source as a flexible attribute, and then rename the mb_XXX fields to something more abstract like ds_XXX. The user then knows to check the value of data_source before working with those fields. The downside here is that it would require a database migration, and any external plugins that used those fields would stop working.

The second idea was to change AlbumInfo and TrackInfo to look more like this (not tested):

class AlbumInfo(object):
    def __init__(self, **kwargs):
        values = {}
        values.update(kwargs)
        object.__setattr__(self, 'values', values)

    def __getattr__(self, item):
        return self.values.get(item, None)

    def __setattr__(self, key, value):
        self.values[key] = value

In this scenario the data-source plugins could set their own fields (to be stored as flexible attributes) when instantiating AlbumInfo. Discogs could set di_albumid, di_trackid, ... Beatport could set bp_albumid, bp_trackid, ...

The advantage here is that you're not limited to just the one data-source.

@sampsyo
Copy link
Member

sampsyo commented Mar 13, 2014

Absolutely; the second approach is definitely the way to go! The AlbumInfo and TrackInfo objects should be able to carry along arbitrary field settings that are then applied in apply_metadata. The only trick is that they should also have some fields that are not just dict keys because some fields are required or need special handling—see the apply_metadata code for those fields.

@dengste
Copy link

dengste commented Aug 29, 2015

I just wanted to add that this is also true for the artist_id. If an album is identified via discogs, it will be tagged with discogs' artist-id. I stumbled upon this because when I import my library into Ampache, it will frequently have the same artist twice - one with the Musicbrainz id, and one with Discogs', and I have to manually fix that.

@skupjoe
Copy link

skupjoe commented Feb 23, 2017

It's also worth noting that the Dicogs data source does not supply mb_trackid information, so (by default) the Duplicates plugin will flag anything that shares null entries for this field as duplicate. Meaning, all Discogs-tagged tracks will be flagged as a duplicate.

The workaround is to additionally add a third parameter to the Duplicate plugin to lastly analyze Title.

@nicolahinssen
Copy link

@sampsyo Asked me to leave my thoughts here. #2763

My suggestion is to create the following fields for Discogs (TXXX is customizable so that wouldn't be a problem).

Old New
TXXX:MusicBrainz Album Artist Id=1560798 TXXX:Discogs Album Artist Id=1560798
TXXX:MusicBrainz Album Id=3175577 TXXX:Discogs Album Id=3175577
TXXX:MusicBrainz Album Release Country=Netherlands TXXX:Discogs Album Release Country=Netherlands
TXXX:MusicBrainz Album Type=Album TXXX:Discogs Album Type=Album
TXXX:MusicBrainz Artist Id=1560798 TXXX:Discogs Artist Id=1560798

@dbogdanov
Copy link
Contributor

Agreed with @nicolahinssen on adding special fields for Discogs. This is a feature I would love to see.

More generally, it will be great to store metadata from other sources (Bandcamp, Beatport, etc.) in the same way. The plugins retrieving the metadata should be able to define special fields themselves. Probably there won't be a consensus on naming of the fields, therefore this naming could be configurable by a user.

@dbogdanov
Copy link
Contributor

The only trick is that they should also have some fields that are not just dict keys because some fields are required or need special handling—see the apply_metadata code for those fields.

Should then all AlbumInfo and TrackInfo fields be kept, while adding a dict too?

@sampsyo
Copy link
Member

sampsyo commented Apr 29, 2018

I would actually be in favor of abandoning all the hand-coded attributes and just moving over complete to an unstructured dict-like object. It could simplify changes in the future quite a bit, and I don't think we get anything out of the current approach to the built-in set of fields.

@bastidest
Copy link

This is causing issues with other software (funkwhale) relying on the MB UUID format. They expect the fields to be valid UUIDs, not integers:

image

@martinkirch
Copy link

Hello,
It seems this bug is still here. It has weird side effects: I discovered it after digging why absubmit fails for almost my whole library. The explanation is absubmit needs an UUID to build the submission URL. I guess it affects a few other plug-ins.

Maybe the simplest solution would be setting flexible attributes, as previously suggested

Discogs could set di_albumid, di_trackid, ... Beatport could set bp_albumid, bp_trackid, ...

?

@Neurrone
Copy link
Contributor

Neurrone commented Feb 3, 2022

Another use case that I'm running into is tagging an audiobook collection with metadata from Audible. Because of the hard-coded fields, I don't think its possible for me to store additional metadata that is not already defined in one of the existing fields.

Wondering if there are any plans to fix this in the near / mid term?

Edit: I realized my comment may not fit this issue, perhaps I should move it to #2866 instead.

@sampsyo sampsyo changed the title Discogs polluting mb_albumid Discogs (and other metadata sources) polluting mb_albumid May 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug bugs that are confirmed and actionable
Projects
None yet
Development

No branches or pull requests

9 participants