Scientific Study Settles Once and For All Which Words are the Most and Least Metal
Ah, “metal.” What is it really? As anyone who frequents metal websites knows all too well, the definition can vary greatly and has been the source of much debate over the decades, with every seemingly clear defining quality (e.g. “distorted guitars”) having an easily identifiable counter-example (e.g. “Opeth”). Shit, something can be “metal” even if it isn’t music, like our YouTube show “That’s So Metal!”
But now some Internet whippersnapper has attempted to define what it is to be metal by that old forgotten bastion of music: lyrics. By analyzing lyrics from 222,623 metal songs by 7,364 bands spread over 22,314 albums culled from DarkLyrics.com, an ex-physicist and data scientist who identifies himself simply as Iain has posted an exhaustive study on his website DegenerateState.org of the most and least metal words in the English language.
Iain’s piece is fascinating, and some of his analyses of metal lyrics — and the formulas used to derive those analyses — are, frankly, a bit over my head.
But this much is easy enough to understand for casual metalheads and data scientists alike: a comparison of metal lyrics to the general English language to produce a list of the most and least metal words:
One approach might be to look at how the relative frequency of words changes between the metal lyrics and the English language in general. To do this we need some sort of measure of what “standard” English looks like, and given I’m using NLTK for text processing, an easy comparison is to the Brown corpus, a collection of documents published in 1961 covering a range of different genres (although it should be pointed out, no lyrics).
To make a comparison between the two corpora, I define an arbitrary measure of “Metalness”, $M_w$, for each word $w$,
$$M_w = \log\left(\frac{N_w^{metal}}{N_w^{brown}}\right)$$
where $N_w^{metal}$ is the frequency of occurrences of word $w$ in my corpus of metal lyrics and $N_w^{brown}$ is the frequency of occurrences of word $w$ in the Brown corpus. To prevent us being skewed by rare words, we take only words which occur at least five times in each corpus.
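For the curious, here is a rough sketch of how a score like that could be computed in Python. To be clear, this is not Iain's actual code. It assumes “Metalness” is the log of the ratio between a word's relative frequency in the metal lyrics and its relative frequency in the reference corpus, and it assumes the five-occurrence cutoff applies to raw counts. The function name `metalness`, the `min_count` parameter, and the toy word lists are all made up for illustration.

```python
import math
from collections import Counter


def metalness(metal_tokens, reference_tokens, min_count=5):
    """Score each shared word by the log ratio of its relative frequencies.

    Assumption: "frequency" means count divided by corpus size, and a word
    must appear at least `min_count` times in BOTH corpora to be scored.
    """
    metal_counts = Counter(w.lower() for w in metal_tokens)
    ref_counts = Counter(w.lower() for w in reference_tokens)
    metal_total = sum(metal_counts.values())
    ref_total = sum(ref_counts.values())

    scores = {}
    for word, n_metal in metal_counts.items():
        n_ref = ref_counts.get(word, 0)
        # Skip rare words so a handful of occurrences cannot skew the ranking.
        if n_metal < min_count or n_ref < min_count:
            continue
        scores[word] = math.log((n_metal / metal_total) / (n_ref / ref_total))
    return scores


# Toy stand-ins for the two corpora (invented data, for illustration only).
toy_metal = ["burn"] * 40 + ["the"] * 50 + ["committee"] * 5 + ["sorrow"] * 5
toy_reference = ["burn"] * 5 + ["the"] * 50 + ["committee"] * 40 + ["sorrow"] * 3

scores = metalness(toy_metal, toy_reference)

# Print words from most to least metal.
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word:10s} {score:+.2f}")
```

With real data, the two token lists would come from the scraped lyrics and from something like NLTK's `nltk.corpus.brown.words()`. A positive score means a word shows up more often in metal lyrics than in everyday English, a negative score means the opposite, and a score near zero means the word is about equally common in both.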
The top and bottom 20 metal words are shown in the table below, along with their “Metalness”.
I don’t wanna ruin the fun for you, so head on over to DegenerateState.org to read the results. But I’ll give you this as a little teaser: some of the most metal words include “eternity,” “breathe” and “demons” while some of the least metal words are “secretary,” “university” and “approximately.” Sounds about right!
Thanks: Mark M-R, Max Frank