this post was submitted on 16 Jun 2025
 
The original post: /r/datahoarder by /u/cheater00 on 2025-06-16 13:47:37.

Hey all, I was wondering if anyone had ideas on how to recognize that a specific YouTube URL is a piece of music, meaning a song, album, EP, live set, etc. I'm trying to write a user script (i.e. a browser add-on that runs on the website) that does specific things when music is detected. Specifically, I normally watch YT videos at 2-3x speed to save time on spoken-word videos, but since playback defaults to 2x, I have to manually slow down every piece of music.

I thought this would be a good place to ask since (1) a lot of people here download YT videos to their drives, and (2) those who do might learn something from this thread that helps them auto-classify their downloads, which would make the thread valuable to the community.

I don't care about edge cases like someone blogging for 50% of the time and then switching to music, or someone's phone recording of a concert. I just want to cover the most common case, which is someone uploading a full piece of music to YouTube. I would like to do it without downloading the audio first or doing any CPU-heavy processing. Any ideas?

One thing I thought of was to use the transcripts feature. Some videos have transcripts and others don't, so it's not perfect, but it can help decide. If a video with music in it has a transcript, the lines covering the moments where music plays contain [Music]. So the algorithm might be something like:

    check_video_is_music():
        if is_a_short:
            // music shorts are unusual, at least in my part of YouTube
            return False

        if has_transcript:
            if (more than 40% of lines contain the string [Music]):
                return True
        else:
            // the operator <|> returns the leftmost non-null value
            // if everything else fails we default to True
            return check_music_keywords() <|> check_music_fuzzy() <|> True
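A minimal Python sketch of the two building blocks used above: the share of transcript lines containing [Music], and the `<|>` "first non-null" fallback. It assumes the transcript has already been obtained as a list of line strings (how to fetch it is not covered here), and the names `music_line_ratio` and `first_non_null` are made up for illustration.

```python
from typing import Callable, Iterable, Optional


def music_line_ratio(transcript_lines: list[str]) -> float:
    """Fraction of transcript lines that contain the literal marker [Music]."""
    if not transcript_lines:
        return 0.0
    hits = sum(1 for line in transcript_lines if "[Music]" in line)
    return hits / len(transcript_lines)


def first_non_null(
    checks: Iterable[Callable[[], Optional[bool]]], default: bool
) -> bool:
    """Emulate `a <|> b <|> default`: return the first result that is not None."""
    for check in checks:
        result = check()
        if result is not None:
            return result
    return default


# Stand-in data and checks, purely illustrative.
lines = ["[Music]", "[Music]", "hello everyone", "[Music]"]
print(music_line_ratio(lines) > 0.4)  # True (3 of 4 lines)
print(first_non_null([lambda: None, lambda: False], default=True))  # False
```

Passing the checks as callables means later checks only run when earlier ones could not decide, which matches how `<|>` is described in the comment.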

    check_music_keywords():
        // this function checks the title and description for
        // keywords that indicate the video is or isn't music

        if title contains one of these as a word: "EP", "Album", "Mix", "Live Set", "Concert":
            return True
        if title contains a year between 1950 and 3 years ago:
            return True
        if title contains a YMD string:
            return True
        if description contains a decade (like "90s", "2000s", etc):
            return True
        if description contains a music genre descriptor (e.g. Jazz, Techno, Trance, etc):
            return True
        // a list of the most common music genres can probably be generated somehow

        if description contains "News":
            return False

        // not sure what other words might be useful to decide "this is definitely
        // not music". happy to hear suggestions. maybe I should analyze the titles
        // of all the channels I subscribe to, check word frequency, and learn
        // from that.

        return Null // we couldn't decide either way, continue to other checks
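The keyword check above could be sketched in Python with regular expressions, roughly as follows. Several details are assumptions rather than anything settled in the post: matching is case-insensitive, a "YMD string" is taken to look like `2025-06-16` (also with `.` or `/` as separators), and `GENRES` is only a three-entry placeholder list.

```python
import re
from datetime import date
from typing import Optional

MUSIC_TITLE_WORDS = ["EP", "Album", "Mix", "Live Set", "Concert"]
GENRES = ["Jazz", "Techno", "Trance"]  # placeholder; a real list would be longer

# Assumed date shape: YYYY-MM-DD, also accepting "." or "/" as separators.
YMD_RE = re.compile(r"\b\d{4}[-./]\d{1,2}[-./]\d{1,2}\b")
# Matches "90s" as well as "1990s" / "2000s".
DECADE_RE = re.compile(r"\b(?:(?:19|20)\d0s|\d0s)\b")
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")


def contains_word(text: str, words: list[str]) -> bool:
    """True if any entry appears as a whole word or phrase, ignoring case."""
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def title_has_past_year(title: str, today: date) -> bool:
    """True if the title has a year between 1950 and three years before today."""
    latest = today.year - 3
    return any(1950 <= int(y) <= latest for y in YEAR_RE.findall(title))


def check_music_keywords(
    title: str, description: str, today: date
) -> Optional[bool]:
    """Return True/False when a keyword decides it, None when undecided."""
    if contains_word(title, MUSIC_TITLE_WORDS):
        return True
    if title_has_past_year(title, today):
        return True
    if YMD_RE.search(title):
        return True
    if DECADE_RE.search(description):
        return True
    if contains_word(description, GENRES):
        return True
    if contains_word(description, ["News"]):
        return False
    return None  # undecided, continue to other checks


today = date(2025, 6, 16)
print(check_music_keywords("Some Band - Live Set 2019", "", today))      # True
print(check_music_keywords("Weekly update", "Tech news roundup", today))  # False
print(check_music_keywords("My vlog", "Just talking", today))             # None
```

The `\b` word boundaries are what keep "Tech news" from matching the genre "Techno", which is the "as a word" requirement from the pseudocode.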

    check_music_fuzzy():
        if vid_length < 30 seconds:
            // probably just a short
            return False
        elif vid_length < 6 minutes:
            // almost all songs are under 6 minutes
            // see [1], [2]
            return True
        elif vid_length between 6 minutes and 20 minutes:
            // probably a youtube video
            return False
        elif vid_length > 20 minutes:
            // few people who make youtube videos longer than 20 minutes disable transcripts
            return True
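For reference, the same length buckets as a small Python function. This is only a sketch: it takes the length in seconds, and it treats a video of exactly 20 minutes as part of the middle bucket, which is an assumption since the pseudocode leaves that boundary open.

```python
def check_music_fuzzy(length_seconds: float) -> bool:
    """Guess 'music or not' from video length alone, using the buckets above."""
    if length_seconds < 30:
        return False  # probably just a short
    if length_seconds < 6 * 60:
        return True  # most songs are under 6 minutes
    if length_seconds <= 20 * 60:
        return False  # probably a regular video (20:00 itself included by assumption)
    return True  # long video; the reasoning assumes it has no transcript


print(check_music_fuzzy(215))   # True
print(check_music_fuzzy(600))   # False
print(check_music_fuzzy(3600))  # True
```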

If anyone has suggestions for other heuristics I could use to improve the fuzzy check, I would be very happy to hear them. Or maybe you have some other way of deciding whether the video is music, e.g. by using the YouTube API in some manner?
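One API-based signal that might fit here, as far as I know: the YouTube Data API's video resource includes `snippet.categoryId`, and the Music category has ID "10". The sketch below assumes the JSON from a `videos.list` request with `part=snippet` has already been fetched and parsed into a dict (the request itself needs an API key and is left out), and `check_music_category` is a made-up name. Since uploaders pick the category themselves, it only answers "yes" or "can't tell".

```python
from typing import Optional

# Assumption: "10" is the ID of the Music category in the YouTube Data API.
MUSIC_CATEGORY_ID = "10"


def check_music_category(videos_list_response: dict) -> Optional[bool]:
    """Return True if the first item is in the Music category, else None.

    None means "undecided": plenty of music is uploaded under other
    categories, so a non-music category is not treated as proof.
    """
    items = videos_list_response.get("items", [])
    if not items:
        return None
    category_id = items[0].get("snippet", {}).get("categoryId")
    return True if category_id == MUSIC_CATEGORY_ID else None


# Hand-written stand-in for a parsed API response.
sample = {"items": [{"snippet": {"title": "Some Album", "categoryId": "10"}}]}
print(check_music_category(sample))  # True
```

Because it returns None when undecided, it could be placed in front of the keyword and fuzzy checks in the same fallback chain.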

Another option I have is to create a Firefox add-on and basically designate a single Firefox window for opening all the YouTube music I'll listen to. Then I can tell that add-on to always set YouTube videos to 1x speed in that window.

Thanks for any suggestions

[1] https://www.intelligentmusic.org/post/duration-of-songs-how-did-the-trend-change-over-time-and-what-does-it-mean-today

[2] https://www.statista.com/chart/26546/mean-song-duration-of-currently-streamable-songs-by-year-of-release/
