Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fromfilename: Fails to detect <track> <title> filename patterns #4561

Closed
hsandt opened this issue Nov 26, 2022 · 12 comments · Fixed by #4600
Closed

fromfilename: Fails to detect <track> <title> filename patterns #4561

hsandt opened this issue Nov 26, 2022 · 12 comments · Fixed by #4600
Labels
bug bugs that are confirmed and actionable

Comments

@hsandt
Copy link

hsandt commented Nov 26, 2022

Problem

Running this command in verbose (-vv) mode:

$ beet import path/to/unknown_music_tracks
path/to/unknown_music_tracks (5 items)
No matching release found for 5 tracks.
For help, see: https://beets.readthedocs.org/en/latest/faq.html#nomatch
[S]kip, Use as-is, as Tracks, Group albums, Enter search, enter Id, aBort?
$ u

to Use as-is, will led to this problem:

the track numbers are not recognized. For instance, if I run beet ls:

$ beet ls
 -  - 01 深き森にて
 -  - 02 緋龍の翼
 -  - 03 風の国ルイヴェスタ
 -  - 04 終りなき狂詩曲
 -  - 05 無慈悲な夜に

the track numbers are still part of the files.

If I disable fromfilename plugin, the whole output is empty (no track number nor track name).

Here's a link to the music files that trigger the bug (if relevant):

Use any unknown track, such as a dummy sound file you would create, or rename the folder/files so artist/track cannot be identified (this is however a real use case; I'm simply downloading lesser known tracks from indie bands, etc.)

Setup

  • OS: Ubuntu 20.04
  • Python version: 3.8.10
  • beets version: 1.6.0
  • Turning off plugins made problem go away (yes/no): no

My configuration (output of beet config) is:

directory: ~/path/to/Music
import:
    copy: no
    write: no
plugins: fromfilename

Note: write: yes didn't help. In fact, it didn't even seem to change the track names, but that's another issue. I assume beet ls alone should show expected information.

@sampsyo sampsyo added the needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." label Nov 26, 2022
@sampsyo
Copy link
Member

sampsyo commented Nov 26, 2022

Can you please include the filenames you're importing? The fromfilename plugin relies on very specific patterns in the names to detect various fields.

@hsandt
Copy link
Author

hsandt commented Nov 27, 2022

Folder: 風の国ルイヴェスタ
File names:
01 深き森にて.ogg
02 緋龍の翼.ogg
03 風の国ルイヴェスタ.ogg
04 終りなき狂詩曲.ogg
05 無慈悲な夜に.ogg

(more exactly, I tried both with underscore "01_[name].ogg" which was the original name, and space "01 [name].ogg")

@sampsyo
Copy link
Member

sampsyo commented Nov 28, 2022

Thanks for the extra info!

Something does indeed look odd in our fromfilename list of patterns. We try each pattern sequentially, in order:

for pattern in PATTERNS:

However, our patterns appear in this order:

r'^(?P<title>.+)$',
r'^(?P<track>\d+)[\s.\-_]+(?P<title>.+)$',

That is, we try considering the entire filename as just the title first—and only if that doesn't work, we try splitting it into a track number and a title. It really seems like this should go in the other order.

If you have a moment, any chance you could give this a try with those two lines swapped?

@sampsyo sampsyo changed the title beet import track unknown from MusicBrainz with "Use as is" fails to detect track number even with plugins: fromfilename fromfilename: Fails to detect <track> <title> filename patterns Nov 28, 2022
@sampsyo sampsyo added bug bugs that are confirmed and actionable and removed needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." labels Nov 28, 2022
@wisp3rwind
Copy link
Member

Interestingly, that ordering has been in place for a really long time by now: https://github.com/beetbox/beets/pull/399/files

@hsandt
Copy link
Author

hsandt commented Dec 10, 2022

If you have a moment, any chance you could give this a try with those two lines swapped?

Yes, it works with "[\s.-_]<title>" before "<title>". I also noticed the underscore is recognized as valid separator in this pattern, so I tried again with the original file names, and it works (I had to use a custom format to display the track numbers):

In the folder containing the tracks:

$ beet import .

path/to/unknown_music_tracks (5 items)
No matching release found for 5 tracks.
For help, see: https://beets.readthedocs.org/en/latest/faq.html#nomatch
[S]kip, Use as-is, as Tracks, Group albums, Enter search, enter Id, aBort? u
$ beet ls
 -  - 深き森にて
 -  - 緋龍の翼
 -  - 風の国ルイヴェスタ
 -  - 終りなき狂詩曲
 -  - 無慈悲な夜に
$ beet ls -f '$album - $artist - $track : $title'
 -  - 01 : 深き森にて
 -  - 02 : 緋龍の翼
 -  - 03 : 風の国ルイヴェスタ
 -  - 04 : 終りなき狂詩曲
 -  - 05 : 無慈悲な夜に

So with this, I just need to learn how to apply beet library metadata back to the actual track files, and I should be able to update my music tags very fast!

I'll let you have a look whether there are other lines to reorder while we're at it though, and if the reordering won't have bad side effects we didn't think of (isn't there a list of unit tests with dummy track names to test on?)

@sampsyo
Copy link
Member

sampsyo commented Dec 10, 2022

Indeed; I can't think of any bad side effects—we should probably just do the reordering!

@JOJ0
Copy link
Member

JOJ0 commented Dec 22, 2022

I swapped the order of the regex as suggested here and created a draft PR. I would like to take the opportunity to discuss this change and even further improve this plugin.

This is what we would have now, skipping the first 3 patterns:

  ...
  r'^(?P<track>\d+)[\s.\-_]+(?P<title>.+)$',
  r'^(?P<title>.+)$',
  r'^(?P<track>\d+)\s+(?P<title>.+)$',
  r'^(?P<title>.+) by (?P<artist>.+)$',
  r'^(?P<track>\d+).*$',

Am I mistaken that the regex r'^(?P<title>.+)$ would, if nothing matched before, always match? The last 3 patterns in this list can never be reached. .+ means anything, at least one character. I suggest it should be moved to the very end of the list as a last resort match.

@JOJ0
Copy link
Member

JOJ0 commented Dec 22, 2022

Another idea I'm testing and actually using day to day for quite a few months already is additional info logging within the plugin: 339aed2

It helps to see what fromfilename is actually doing. Right now it is a blackbox that tries to do good but sometimes even worsens and confuses the search terms. At least with better logging, the user would know that with the messed up tags/filenaming present, there is no chance that beets will find anything anywhere. Let's take an example that occurs to me often. This file has not tags at all and the artistname starts with digits:

$ beet import ~/Sync/beets_test_music/bad_tag/60\ minute\ man\ -\ together.mp3
fromfilename: Artist replaced with: 60 minute man 
fromfilename: Bad title replaced with:  together
fromfilename: Track replaced with: 60

/home/jojo/Sync/beets_test_music/bad_tag/60 minute man - together.mp3 (1 items)
Finding tags for album "60 minute man  - ".
Candidates:
1. The Black Eagles - Brimstone And Fire / 60 Minutes Man (23.8%) (album, artist, tracks, ...) (Discogs, Vinyl, UK, Ethnic, ETH 38)
2. The Untouchables - 60 Minute Man / Everybody's Laughing (23.0%) (album, artist, tracks, ...) (Discogs, Vinyl, 1960, US, Madison Records, M 139)
3. The Black Eagles - Brimstone & Fire / 60 Minutes Man (22.6%) (album, artist, tracks, ...) (Discogs, Vinyl, 1974, US, Black Eagle, RK-3)
4. The Trammps - 60 Minute Man (21.8%) (album, artist, tracks, ...) (Discogs, Vinyl, 1976, Netherlands, Buddah Records, BDA - 321)
5. Missy Elliott Featuring Ludacris - One Minute Man (20.3%) (album, missing tracks, artist, ...) (Discogs, Vinyl, 2001, Europe, Elektra, 7559-67244-0)
# selection (default 1), Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort, eDit, edit Candidates? 

The logging is info level because I want it that way. I don't see a point in having to always run beets using -v, the cognitive load is just to high when running longer beets import sessions with -v.

I could imagine though that this additional info logging could be enabled with a config directive. I know the output is not pretty right before the importer prompts but it is just very helpful for a user to see why search results are such a mess.

JOJ0 added a commit to JOJ0/beets that referenced this issue Dec 22, 2022
@JOJ0
Copy link
Member

JOJ0 commented Dec 22, 2022

Taking the logging idea a step further (just for learning how this plugin works) using something like that: JOJ0@cc82852, , we see that with a lot of patterns the number in the artist name is considererd to be the track number:

 dev  (beets)    ~/git/beets   all_in  beet import ~/Sync/beets_test_music/bad_tag/60\ minute\ man\ -\ together.mp3
This item: 60 minute man - together
Has a match with pattern: ^(?P<artist>.+)[\-_](?P<title>.+)$
The groupdict: {'artist': '60 minute man ', 'title': ' together'}

fromfilename: Artist replaced with: 60 minute man 
fromfilename: Bad title replaced with:  together
This item: 60 minute man - together
Has a match with pattern: ^(?P<track>\d+)[\s.\-_]+(?P<artist>.+)[\-_](?P<title>.+)$
The groupdict: {'track': '60', 'artist': 'minute man ', 'title': ' together'}

fromfilename: Track replaced with: 60
This item: 60 minute man - together
Has a match with pattern: ^(?P<track>\d+)[\s.\-_]+(?P<title>.+)$
The groupdict: {'track': '60', 'title': 'minute man - together'}

This item: 60 minute man - together
Has a match with pattern: ^(?P<title>.+)$
The groupdict: {'title': '60 minute man - together'}

This item: 60 minute man - together
Has a match with pattern: ^(?P<track>\d+)\s+(?P<title>.+)$
The groupdict: {'track': '60', 'title': 'minute man - together'}

This item: 60 minute man - together
Has a match with pattern: ^(?P<track>\d+).*$
The groupdict: {'track': '60'}


/home/jojo/Sync/beets_test_music/bad_tag/60 minute man - together.mp3 (1 items)
Finding tags for album "60 minute man  - ".
Candidates:

@JOJ0
Copy link
Member

JOJ0 commented Dec 22, 2022

I'm sorry I don't know really where I'm going with this but maybe somewhere around here it could be distinguished that in such a case we shouldn't take the track number as real: https://github.com/beetbox/beets/blob/master/beetsplug/fromfilename.py#L102-L126

Or maybe all the regexes could be more explicitly written to better find out that we have an artist name starting with a digit and it's not the track number.

Not sure, but let's leave my proposal as: What do you think about the logging improvement suggestion? #4561 (comment)

@wisp3rwind
Copy link
Member

I doubt there's a real solution to the "60 minute man" example within the simple design that this feature currently has: There's no way to distinguish being part of the artist name or the track number, since the plugin only really looks at tracks in isolation.

About the logging, I personally wouldn't like this as-is due to interrupting the regular importer UI, as you already note. Maybe let's get this in as debug logs, such that at least this information is available if requested (which already improves upon the current state). Displaying warnings/etc during the import flow is a feature where it'd be really nice to come up with a generic solution (maybe delaying messages until a good point, maybe some terminal control trickery to move them above the current prompt, ...?). I'm not fond of the idea to introduce a configuration option; we already have enough of them to control the actual behavior of beets, let alone to fine-tune logging.

@JOJ0
Copy link
Member

JOJ0 commented Dec 28, 2022

Thanks for taking the time to think about all this!

Looking through these regex patterns again and again I came to a similar thought that it seems not really solveable for artist names starting with digits. I also realised that it's not such of a big deal. Most importantly artist and title are detected correctly, even with the example file from above.

Regarding the logging. Thanks as well, I updated the PR. I will remove Draft state when I thint it's ready for a final review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug bugs that are confirmed and actionable
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants