A couple of days ago I was strolling home listening to Arctic Monkeys when I realized how cool it would be to visualize all their lyrics in Tableau. Again, the output was ready in my head by the time I walked through the door of my apartment, but the way leading there was full of roadblocks.
Go back with me to 505 to see how I ended up creating this viz of the most frequent words used in their lyrics! (The funniest one is Brick by Brick, if you ask me…)
Major obstacle: How to create the database?
The first problem was that the lyrics are broken down into sections, and I’d need one big hell of a line for each song to be able to split the words by the space delimiter. Approximately an hour later I realized that Notepad++ has a join lines feature that could do this for me with a relatively small amount of manual work.
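For anyone who would rather script this step than click through it, here is a minimal sketch of the same join-lines idea in Python. It is not what I actually did (that was Notepad++), and the function name and sample text are made up for illustration.

```python
def join_lyric_lines(raw_lyrics: str) -> str:
    """Collapse a multi-line, multi-section lyric into one long line."""
    # str.split() with no arguments splits on any run of whitespace,
    # so line breaks and the blank lines between sections all vanish.
    return " ".join(raw_lyrics.split())


# Placeholder text standing in for a song broken into sections.
sample = "first line of the verse\nsecond line\n\nfirst line of the chorus\n"
print(join_lyric_lines(sample))
```

Each song ends up as a single space-separated string, which is exactly the shape needed for splitting on the space delimiter afterwards.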
The next preparation step was to load the .csv file into Excel and break the lines into words. Special characters needed to be replaced by nothing, but the system refused to identify question marks, so I had to scroll through ~15,000 lines to filter them out (as well as the parts that stuck together despite my order to separate).
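A side note on the question marks: in Excel's Find and Replace, `?` is a wildcard that matches any single character, which may be why it refused to cooperate; searching for `~?` targets the literal character. The whole cleanup could also be sketched in Python, as below. This is an assumption-laden alternative to my Excel route, with hypothetical function names, and it treats every character that is not a letter, digit, or apostrophe as a separator.

```python
import re
from collections import Counter


def split_into_words(line: str) -> list[str]:
    """Lowercase a joined lyric line and split it into clean words."""
    # Normalize curly apostrophes so "don’t" and "don't" count as one word.
    lowered = line.lower().replace("’", "'")
    # Anything that is not a letter, digit, or apostrophe becomes a space,
    # which also separates words glued together by punctuation.
    cleaned = re.sub(r"[^a-z0-9']+", " ", lowered)
    # Drop apostrophes left dangling at word edges (e.g. quoted words).
    words = [w.strip("'") for w in cleaned.split()]
    return [w for w in words if w]


def word_counts(lines: list[str]) -> Counter:
    """Count word frequencies across all song lines."""
    counts = Counter()
    for line in lines:
        counts.update(split_into_words(line))
    return counts


print(word_counts(["Do you? Do you, do"]).most_common(2))
```

The resulting counts can be written out as a two-column file (word, frequency) and loaded into Tableau as the data source.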
Minor obstacle: What to add as colors to the words?
I decided to code them by sentiment, giving red to negative and green to positive words (leaving neutral ones black). The problem was that 9 out of 10 sentiment lexicons were not working, and the one I stuck with had some discrepancies, so I had to override its judgement manually.
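The coloring logic boils down to a lookup with a manual override layer on top. Here is a rough sketch of that idea; the tiny lexicon and the override entry are invented purely for illustration and are not taken from the lexicon I used.

```python
# Hypothetical mini-lexicon; a real one would have thousands of entries.
LEXICON = {"good": "positive", "bad": "negative", "crying": "negative"}

# Hypothetical manual corrections that win over the lexicon's judgement.
OVERRIDES = {"crying": "neutral"}

# Color scheme from the viz: red negative, green positive, black neutral.
COLORS = {"negative": "red", "positive": "green", "neutral": "black"}


def color_for(word: str) -> str:
    """Return the display color for a word, checking overrides first."""
    key = word.lower()
    # Words missing from both tables fall back to neutral.
    sentiment = OVERRIDES.get(key, LEXICON.get(key, "neutral"))
    return COLORS[sentiment]


for word in ["good", "bad", "crying", "brick"]:
    print(word, color_for(word))
```

Keeping the overrides in their own table means the original lexicon stays untouched, so it is easy to see afterwards which judgements were changed by hand.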
Finally it all added up, and even though this is not my most complex or beautiful viz, it’s still something I’ve learnt a lot from.