spotify, deezer: Store IDs in dedicated fields instead of MusicBrainz fields#6349

Open
aaronk6 wants to merge 1 commit into beetbox:master from aaronk6:spotify-deezer-fix

@aaronk6
Contributor

@aaronk6 aaronk6 commented Feb 6, 2026

Description

Previously the Spotify and Deezer plugins wrote their IDs into the MusicBrainz fields. This prevents those files from being imported into Music Assistant, which checks for valid MusicBrainz IDs during import and refuses to import those files (error: “Invalid MusicBrainz identifier”).

I considered fixing this in Music Assistant, but it seemed cleaner to store Spotify and Deezer IDs in separate fields from the start.

This change adds dedicated fields when tagging and leaves the MusicBrainz ID fields untouched:

  • Spotify: spotify_track_id, spotify_album_id, spotify_artist_id
  • Deezer: deezer_track_id, deezer_album_id, deezer_artist_id
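As a rough illustration of the naming scheme listed above (not the code in this PR), the dedicated field names can be derived from the data source. The helper `dedicated_id_fields` is hypothetical; only the resulting field names come from the list above.

```python
def dedicated_id_fields(data_source, track_id, album_id, artist_id):
    """Build per-source ID fields, leaving the mb_* fields untouched.

    Hypothetical helper for illustration only; not part of beets.
    """
    prefix = data_source.lower()  # e.g. "Spotify" -> "spotify"
    return {
        f"{prefix}_track_id": track_id,
        f"{prefix}_album_id": album_id,
        f"{prefix}_artist_id": artist_id,
    }


# Sample values taken from the Deezer-tagged file shown later in this thread.
fields = dedicated_id_fields("Deezer", "3876259001", "930705291", "158760742")
print(fields)
```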

I’m curious whether the current behavior is considered a feature or a bug. That is, if there are any cases where populating MusicBrainz ID fields with non-MusicBrainz values is actually useful.

If this turns out to be controversial, making it configurable could be a solution, allowing users to preserve the old behavior.

Looking forward to your thoughts!

To Do

  • Documentation. Could potentially be extended, e.g., with commands to update existing files tagged before this change.
  • Changelog.
  • Tests.

@aaronk6 aaronk6 requested a review from a team as a code owner February 6, 2026 22:53
@github-actions

github-actions bot commented Feb 6, 2026

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@codecov

codecov bot commented Feb 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.97%. Comparing base (d65e37c) to head (81cd3da).
⚠️ Report is 461 commits behind head on master.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6349      +/-   ##
==========================================
+ Coverage   68.95%   68.97%   +0.01%     
==========================================
  Files         140      140              
  Lines       18685    18699      +14     
  Branches     3058     3060       +2     
==========================================
+ Hits        12885    12897      +12     
- Misses       5153     5154       +1     
- Partials      647      648       +1     
Files with missing lines Coverage Δ
beetsplug/deezer.py 20.68% <100.00%> (+4.02%) ⬆️
beetsplug/spotify.py 52.55% <100.00%> (+0.40%) ⬆️

@semohr
Contributor

semohr commented Mar 26, 2026

This is a long-standing issue and, in my view, an inconsistency that stems from Beets’ shift from MusicBrainz as the primary metadata source to treating it as just one of several common metasource plugins.

You are doing two things here which I would split into two PRs/issues imo.

  1. Introducing changes to AlbumInfo and ItemInfo, which will propagate into the database (see here).
  2. Adding new metadata to the files using mediafile.

Regarding 1

Have you considered migration paths for older databases that currently store Spotify or Deezer IDs in MusicBrainz fields?

If we change this behavior, existing databases would need to be migrated to the new format; otherwise, we risk introducing a significant number of inconsistencies. Planning a robust migration strategy is essential to maintain data integrity across different metadata sources.

For context: this is one reason we haven’t tackled this yet. Fully migrating the behavior will likely snowball and require careful coordination with all metasource plugins to ensure consistency.


Regarding 2

This seems like a useful addition if there is software that actually reads these tags and does something meaningful with them. Can you provide examples of software that uses them? If not, I would lean toward not adding the metadata to the files.

What you describe as the issue seems more like a bug to me: album_id should not be written to the files as a MusicBrainz tag if the data_source isn't musicbrainz. Investigating why this happens in the first place, and removing or fixing that behavior, might be a better way forward.
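One way to picture that narrower fix is a filter applied only to what gets written to the file, leaving the database untouched. This is a minimal sketch under assumptions: `tags_to_write` and `MB_ID_FIELDS` are hypothetical names, the data source is assumed to be available as a string such as "MusicBrainz" or "Deezer", and none of this reflects how beets actually writes tags.

```python
# Field names mentioned in this thread; the set itself is illustrative.
MB_ID_FIELDS = {"mb_trackid", "mb_albumid", "mb_artistid", "mb_albumartistid"}


def tags_to_write(fields, data_source):
    """Return the fields that should end up in the audio file.

    Hypothetical helper: drops mb_* ID fields unless the item was
    tagged from MusicBrainz. The input mapping is left unchanged.
    """
    if data_source.lower() == "musicbrainz":
        return dict(fields)
    return {k: v for k, v in fields.items() if k not in MB_ID_FIELDS}


item_fields = {
    "title": "Capri",
    "mb_trackid": "3876259001",
    "mb_albumid": "930705291",
}
print(tags_to_write(item_fields, "Deezer"))
```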

@JOJ0 added the core label Apr 4, 2026
@snejus added the deezer, spotify, and musicbrainz labels Apr 5, 2026
@aaronk6
Contributor Author

aaronk6 commented Apr 5, 2026

@semohr Thanks for looking into this! I agree, let’s split this up. I’ll create a new PR that ensures that non-MusicBrainz IDs aren’t written to the mb_… fields anymore. That’s indeed the real issue here, as it causes problems with other clients such as Music Assistant.

For the migration strategy, my suggestion is adding a note to the changelog that recommends running the following command to clear everything from mb_trackid, mb_albumid, mb_artistid, and mb_albumartistid that doesn’t look like a UUID:

RE='^(?![0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$).+'
beet modify --yes mb_trackid! "mb_trackid::$RE"
beet modify --yes mb_albumid! "mb_albumid::$RE"
beet modify --yes mb_artistid! "mb_artistid::$RE"
beet modify --yes mb_albumartistid! "mb_albumartistid::$RE"
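To sanity-check which values that pattern would select, here is a small Python sketch. It assumes the pattern is applied with Python's `re.search` and default (case-sensitive) flags; the sample UUID is made up, and `would_be_cleared` is a hypothetical helper name.

```python
import re

# Same pattern as the RE variable in the commands above.
NON_UUID = re.compile(
    r"^(?![0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$).+"
)


def would_be_cleared(value):
    """True if the value is non-empty and not a lowercase UUID."""
    return NON_UUID.search(value) is not None


samples = [
    "158760742",                             # numeric ID, as seen in this thread
    "01234567-89ab-cdef-0123-456789abcdef",  # made-up UUID-shaped value
    "01234567-89AB-CDEF-0123-456789ABCDEF",  # same value in uppercase
    "",                                      # empty field
]
for value in samples:
    print(repr(value), would_be_cleared(value))
```

Under these assumptions, numeric IDs are selected while lowercase UUIDs and empty fields are not. An uppercase UUID would also be selected, since the character classes only cover lowercase hex digits.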

In a second step, I’ll see if it’s actually useful to introduce new fields for third-party metadata provider IDs. My gut feeling is that it might be good to keep them, even if no client supports them (yet). It could be a nice addition for player clients to link to the respective service.

Let me know your thoughts.

@snejus
Member

snejus commented Apr 5, 2026

As @semohr mentioned, this is a long-standing and very thorny issue. We currently use mb_albumid, mb_artistid and mb_trackid to store IDs regardless of the data source.

musicbrainz currently populates <data_source>_album_id fields whenever it finds something relevant, but this is the closest we've gotten to 'multi data source' storage. Any further expansion here requires a lot of careful extra thought, because we're talking about album, artist and track IDs.

@aaronk6
Contributor Author

aaronk6 commented Apr 5, 2026

@snejus Got it, thanks for the explanation. I didn’t realize that musicbrainz is already populating <data_source>_album_id, but I do see the complexity now.

I’ve opened a new PR that just prevents non-MusicBrainz IDs from going into the mb_* fields.

@snejus
Member

snejus commented Apr 5, 2026

The new PR unfortunately renders beets useless for items tagged with a non-MusicBrainz data source. I don't see a viable solution to this issue for now (one which doesn't require a huge breaking change to our central ID storage mechanism).

In the future, I imagine we will need to introduce fields such as album_id, artist_id and track_id to replace the corresponding mb_* fields and act as the source-of-truth IDs for the current data_source. In addition, we will need a way to store the corresponding IDs from each data source (potentially a separate table). Only then can we 'free up' the mb_* fields for MusicBrainz only.
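To make that idea concrete, a rough sketch of such a layout using an in-memory SQLite database might look like the following. Everything here is hypothetical: the table names, columns, the `lookup_id` helper, and the sample UUID are assumptions for illustration and do not describe the beets schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Generic ID fields hold the IDs for the item's current data_source.
    CREATE TABLE items (
        id INTEGER PRIMARY KEY,
        data_source TEXT NOT NULL,
        track_id TEXT,
        album_id TEXT,
        artist_id TEXT
    );
    -- Separate table: one row per (item, data source, kind of ID).
    CREATE TABLE external_ids (
        item_id INTEGER NOT NULL REFERENCES items(id),
        data_source TEXT NOT NULL,
        kind TEXT NOT NULL,
        value TEXT NOT NULL,
        PRIMARY KEY (item_id, data_source, kind)
    );
    """
)
conn.execute(
    "INSERT INTO items VALUES (1, 'Deezer', '3876259001', '930705291', '158760742')"
)
conn.executemany(
    "INSERT INTO external_ids VALUES (?, ?, ?, ?)",
    [
        (1, "Deezer", "track", "3876259001"),
        (1, "MusicBrainz", "track", "01234567-89ab-cdef-0123-456789abcdef"),
    ],
)


def lookup_id(conn, item_id, data_source, kind):
    """Return the stored ID for one data source, or None if absent."""
    row = conn.execute(
        "SELECT value FROM external_ids"
        " WHERE item_id = ? AND data_source = ? AND kind = ?",
        (item_id, data_source, kind),
    ).fetchone()
    return row[0] if row else None


print(lookup_id(conn, 1, "MusicBrainz", "track"))
print(lookup_id(conn, 1, "Spotify", "track"))
```

With a layout like this, a MusicBrainz-only tag could be filled from the per-source table and would simply stay empty when no MusicBrainz ID is known.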

@aaronk6
Contributor Author

aaronk6 commented Apr 6, 2026

@snejus Sorry for jumping ahead here, and thanks for the detailed explanation. I’ll hold off for now. If there’s anything I can help with as part of the bigger overhaul, feel free to let me know.

@semohr
Contributor

semohr commented Apr 9, 2026

Just asking: Wasn't one of the issues also that the mb_id is written to the audio files' metadata even when the data source is not musicbrainz? Since I can see this causing issues for other programs, maybe it is worth investigating. The fix for this issue should also be more contained, as it is only about what is written to the audio files' metadata.


In my reply above, that was point 2.

@aaronk6
Contributor Author

aaronk6 commented Apr 9, 2026

Yes, from my point of view, the fact that non-MusicBrainz IDs end up in the MusicBrainz tags in the file is the main problem here.

Example:

$ mutagen-inspect /music/Artists/Julio\ Lovetrain/2026\ Capri\ \(single\)/01\ Julio\ Lovetrain\ -\ Capri.m4a
-- /music/Artists/Julio Lovetrain/2026 Capri (single)/01 Julio Lovetrain - Capri.m4a
- MPEG-4 audio (ALAC), 222.00 seconds, 1411200 bps (audio/mp4)
… truncated … 
----:com.apple.iTunes:MusicBrainz Album Artist Id=MP4FreeForm(b'158760742', <AtomDataType.UTF8: 1>)
----:com.apple.iTunes:MusicBrainz Album Comment=MP4FreeForm(b'', <AtomDataType.UTF8: 1>)
----:com.apple.iTunes:MusicBrainz Album Id=MP4FreeForm(b'930705291', <AtomDataType.UTF8: 1>)
----:com.apple.iTunes:MusicBrainz Album Release Country=MP4FreeForm(b'', <AtomDataType.UTF8: 1>)
----:com.apple.iTunes:MusicBrainz Album Status=MP4FreeForm(b'', <AtomDataType.UTF8: 1>)
----:com.apple.iTunes:MusicBrainz Album Type=MP4FreeForm(b'single', <AtomDataType.UTF8: 1>)
----:com.apple.iTunes:MusicBrainz Artist Id=MP4FreeForm(b'158760742', <AtomDataType.UTF8: 1>)
----:com.apple.iTunes:MusicBrainz Release Group Id=MP4FreeForm(b'', <AtomDataType.UTF8: 1>)
----:com.apple.iTunes:MusicBrainz Release Track Id=MP4FreeForm(b'', <AtomDataType.UTF8: 1>)
----:com.apple.iTunes:MusicBrainz Track Id=MP4FreeForm(b'3876259001', <AtomDataType.UTF8: 1>)
----:com.apple.iTunes:MusicBrainz Work Id=MP4FreeForm(b'', <AtomDataType.UTF8: 1>)
… truncated …

MusicBrainz Album Artist Id, MusicBrainz Album Id, MusicBrainz Artist Id, and MusicBrainz Track Id are filled with Deezer IDs (same applies to Spotify if that was used for tagging). Music Assistant rejects these files:

2026-04-08 19:34:57.555 ERROR (MainThread) [music_assistant.Filesystem (local disk) [Library]] Error processing Artists/Julio Lovetrain/2026 Capri (single)/01 Julio Lovetrain - Capri.m4a - Invalid MusicBrainz identifier: 158760742

So users who use beets for tagging (with Spotify or Deezer metadata) cannot play their files in Music Assistant. This is where I was coming from when creating this PR.
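For illustration, a check along these lines flags exactly the tags listed above. This is a sketch, not Music Assistant's actual validation: `invalid_mbid_tags` is a hypothetical helper, and it assumes a valid MusicBrainz identifier is a lowercase, hyphenated UUID and that empty tags are ignored.

```python
import re

UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def invalid_mbid_tags(tags):
    """Return names of non-empty ID tags whose value is not UUID-shaped."""
    return [
        name
        for name, value in tags.items()
        if value and UUID_RE.fullmatch(value) is None
    ]


# ID tags and values copied from the mutagen-inspect output above.
tags = {
    "MusicBrainz Album Artist Id": "158760742",
    "MusicBrainz Album Id": "930705291",
    "MusicBrainz Artist Id": "158760742",
    "MusicBrainz Release Group Id": "",
    "MusicBrainz Track Id": "3876259001",
}
print(invalid_mbid_tags(tags))
```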

Want me to create a separate bug report for this?

@snejus
Member

snejus commented Apr 9, 2026

> Want me to create a separate bug report for this?

Yes please!

@aaronk6
Contributor Author

aaronk6 commented Apr 10, 2026

#6519

Labels

core Pull requests that modify the beets core `beets` deezer deezer plugin musicbrainz musicbrainz plugin spotify spotify plugin


4 participants