I’ve recently been mulling over the question, “Why does Genius suck so much?” and the implications that it has.

Genius is the playlist generation tool in Apple’s iTunes music software. You choose a song that you’re in the mood for, and it creates an entire playlist of similar songs. Essentially, it’s a recommender system: if you like x, you’ll like y. The problem is that you get a very narrow point of view, with very little genre skipping and no pleasantly clever surprises.

What sets Genius apart from other song recommender systems is that it’s essentially powered by the crowds. Apple has the luxury of a rich data set of listening habits and ratings, and that data appears to factor heavily into the recommendations. Indeed, algorithmic playlist generators were producing better results years before Genius came on the scene. So, what does this mean for the crowd?

The fact that computers can be better than humans at understanding art is off-putting. I’m still working through this problem, but here are some thoughts toward untangling it.

Ratings data is emotionless. When you rate a song 1 or 5, you’re giving it a universal ‘like’ or ‘dislike’. This data doesn’t factor in the mood of the song or the emotion of the listener; it is all very removed from circumstance. As I suggested to Bill Turkel, perhaps such simple crowd-based recommendations are better for high-level suggestions, like artists you may like, but useless at the micro-level (unless the data the crowd is contributing is more specific to the topic of the recommendations). In contrast, technology can be quite effective at interpreting the types and patterns of sound that represent an emotion. Certainly it can’t easily understand whether a song is good, but if you want a slow, jazzy rock song, that’s fairly achievable. In this, music recommendation is fairly unique: sound is easier to interpret than thousands of movie plots or millions of book themes would be.
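To make that contrast concrete, here’s a minimal sketch. The feature names and numbers are invented, but it shows how even a crude feature space lets you ask for a mood directly, which a bare 1-to-5 rating never can.

```python
# A flat rating carries no mood, while even a few audio features (values here
# are invented) let you ask for "slow and mellow" directly.
import math

# Hypothetical per-song audio features, scaled 0-1.
songs = {
    "Song A": {"tempo": 0.30, "energy": 0.25, "brightness": 0.40},
    "Song B": {"tempo": 0.85, "energy": 0.90, "brightness": 0.70},
    "Song C": {"tempo": 0.35, "energy": 0.30, "brightness": 0.35},
}

# A flat rating: the same "5 stars" whether you were sad, driving, or dancing.
ratings = {"Song A": 5, "Song B": 5, "Song C": 3}

def distance(a, b):
    """Euclidean distance between two feature dicts."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))

# "Slow, mellow" expressed as a point in feature space.
mood_query = {"tempo": 0.3, "energy": 0.2, "brightness": 0.4}

# Rank songs by closeness to the mood, something the ratings alone can't do.
for title, feats in sorted(songs.items(), key=lambda s: distance(mood_query, s[1])):
    print(title, round(distance(mood_query, feats), 3), "rating:", ratings[title])
```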

Despite this, perhaps the most-cited example of a good music recommender is Pandora, an internet radio service based on the Music Genome Project (MGP). The MGP does use humans to categorize songs: professionals tag each song with over 400 attributes, and an algorithm weights the values. Pandora’s success shows that humans are indeed effective at understanding music, provided they’re looking at it in the right way.
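Roughly speaking, you can picture that attribute-and-weights approach like the sketch below. The attributes, weights, and scoring are my own invention for illustration, not Pandora’s actual method, which isn’t public.

```python
# A rough sketch of attribute-weighted similarity in the spirit of the MGP
# description above. Attributes, weights, and scoring are invented.

# Each song gets scored (0-1) on a handful of expert-assigned attributes.
catalog = {
    "Seed Song":   {"syncopation": 0.7, "vocal_grit": 0.9, "acoustic": 0.2, "minor_key": 0.6},
    "Candidate 1": {"syncopation": 0.6, "vocal_grit": 0.8, "acoustic": 0.3, "minor_key": 0.5},
    "Candidate 2": {"syncopation": 0.1, "vocal_grit": 0.2, "acoustic": 0.9, "minor_key": 0.1},
}

# Weights express which attributes matter most for "sounds similar".
weights = {"syncopation": 1.0, "vocal_grit": 1.5, "acoustic": 0.8, "minor_key": 1.2}

def similarity(a, b):
    """Weighted closeness: 1 minus the weighted average absolute difference."""
    total_weight = sum(weights.values())
    diff = sum(weights[k] * abs(a[k] - b[k]) for k in weights)
    return 1 - diff / total_weight

seed = catalog["Seed Song"]
for title in ("Candidate 1", "Candidate 2"):
    print(title, round(similarity(seed, catalog[title]), 3))
```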

There’s also the effect of popular media that makes human-based recommendations unbalanced. If a lot of people like Coldplay, the range of music that Coldplay will be recommended for will be broad. This creates an echo loop where popular music simply grows more popular. Conversely, it is very difficult for new music to enter the loop. If everybody who likes The Strokes also likes Yeah Yeah Yeahs, the recommender will reinforce this, brushing aside any similar new bands.
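A toy simulation makes the echo loop visible: if a recommender favours bands in proportion to how often they’re already played, and every recommendation becomes another play, the absolute gap between established and new bands keeps growing. The numbers here are invented.

```python
# Recommend in proportion to current popularity; the popular bands pull
# further ahead while a new band barely moves.
import random

random.seed(0)
plays = {"Coldplay": 1000, "The Strokes": 400, "New Band": 5}

for _ in range(10000):
    # Pick a band to recommend with probability proportional to its play count...
    bands = list(plays)
    pick = random.choices(bands, weights=[plays[b] for b in bands])[0]
    # ...and every recommendation becomes another play.
    plays[pick] += 1

print(plays)  # the absolute gap between popular and new only widens
```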

However, such problems come down to how the algorithm is balanced. Last.fm, which tracks all of its users’ listened music, is fairly effective at recommending similar music. Also, because of their detailed information on what a user has listened to, they can suggest less-listened-to songs. Though they don’t offer playlist generation, I wouldn’t put this beyond their abilities.
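One hedged sketch of why listening histories are so useful: simply counting which songs co-occur in the same users’ histories already gives a similarity signal, without a single rating. This is a generic co-occurrence approach, not a claim about how Last.fm actually works; the histories below are made up.

```python
# Count which songs show up together in the same listeners' histories and use
# that as a similarity signal.
from collections import Counter
from itertools import combinations

# Hypothetical listening histories (one set of songs per user).
histories = [
    {"The Strokes - Reptilia", "Yeah Yeah Yeahs - Maps", "Interpol - Obstacle 1"},
    {"The Strokes - Reptilia", "Yeah Yeah Yeahs - Maps"},
    {"The Strokes - Reptilia", "Interpol - Obstacle 1"},
]

# Count how often each pair of songs appears in the same history.
co_occurrence = Counter()
for history in histories:
    for a, b in combinations(sorted(history), 2):
        co_occurrence[(a, b)] += 1

# Songs most often heard alongside a given seed.
seed = "The Strokes - Reptilia"
neighbours = Counter()
for (a, b), count in co_occurrence.items():
    if a == seed:
        neighbours[b] += count
    elif b == seed:
        neighbours[a] += count

for song, count in neighbours.most_common():
    print(song, count)
```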

So where do crowds factor in here? If anything, Pandora suggests that this is best left to professionals. Certainly, you can’t get that sort of exhaustiveness with crowds. The answer may lie in reliability. Large groups would only be able to make much simpler connections, but on a larger and more verified scale. When I make a playlist with Lou Reed’s Walk on the Wild Side, I always follow it with Urge Overkill’s Girl, You’ll Be a Woman Soon. The songs are linked very little, but something in me recognizes the similarly cool feeling they both give me. If you could somehow capture millions of these sorts of links, that could lead somewhere.
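As a rough sketch of what capturing those links might look like, assuming a hypothetical feed of crowd-contributed “played X, then played Y” transitions (which no service I know of actually exposes), you could aggregate them into a weighted graph of song-to-song links:

```python
# Aggregate crowd-contributed "what did you play next?" pairs into a weighted
# follow graph. The transition data below is entirely invented.
from collections import defaultdict

# Hypothetical crowd-contributed transitions: (song played, song played next).
transitions = [
    ("Lou Reed - Walk on the Wild Side", "Urge Overkill - Girl, You'll Be a Woman Soon"),
    ("Lou Reed - Walk on the Wild Side", "Urge Overkill - Girl, You'll Be a Woman Soon"),
    ("Lou Reed - Walk on the Wild Side", "David Bowie - Queen Bitch"),
]

# Build the weighted graph: edge weight = how many people made that jump.
follow_graph = defaultdict(lambda: defaultdict(int))
for current, nxt in transitions:
    follow_graph[current][nxt] += 1

# The most common crowd-chosen follow-ups for a seed song.
seed = "Lou Reed - Walk on the Wild Side"
for nxt, weight in sorted(follow_graph[seed].items(), key=lambda kv: -kv[1]):
    print(nxt, weight)
```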