A few days ago, I successfully defended my MA thesis on crowd motivation. As I polish up a final copy for print, here is my draft. Details forthcoming.

Why Bother? Examining the Motivations of Users in Large-Scale Crowd-Powered Online Initiatives

As this is not the final copy, please do not distribute it. Rather, link back to this page, so that I can provide the final when available.

Everyone has their own stories, no? The time that Wikipedia made you understand something a textbook couldn’t,

Stack Overflow is a programming help site. The site is heavily controlled by its community; as users gain points and achievements for contributing to the site, they gain more administrative power.

I’ll admit, I went through a period where I watched the questions page daily, looking for something I could answer. However, I found that every single time, somebody gave a better answer much more quickly. I’m sure that is eventually discouraging, but in my experience it made me competitive. If you don’t get the first answer, at least get a higher-rated response. Programming is an endless process of puzzles and problem-solving, so it’s good to know that there’s a place where one in need can get help within minutes. As Jeff Atwood notes in the tweet above, it’s quality help too.

Clay Shirky has a wonderful talk about this phenomenon, wherein he argues that passion (or ‘love’) is a resource that is unfairly underestimated. He uses the example of support for the open-source programming language Perl: in the absence of commercial support, there are more than enough passionate Perl users online to fill the need. Watch it below.

In recent months, there have been two large-scale crowdsourcing projects launched with the involvement of well-known internet personalities. I’m referring to Hunch (headed by Caterina Fake of Flickr fame) and Kickstarter (advised by Andy Baio of Upcoming.org and Waxy.org fame). While these projects are very different, it seems only appropriate for them to share a post.

Kickstarter is a site that lets projects collect funding pledges from users. One can add a project with a funding goal and set tiered rewards for patrons. You’re not expected to pay your pledge unless the project goal is met. Hunch is a crowdsourced decision tree site. Users creating content add decision-making questions to a decision, ones that affect the outcome of the final recommendation.
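The pledge mechanic described above can be sketched in a few lines. This is only a toy model of all-or-nothing funding with tiered rewards, not Kickstarter's actual system; the `Project` class, its method names, and the sample tiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Project:
    """A hypothetical all-or-nothing funding project (illustrative only)."""
    goal: float
    # Reward tiers: minimum pledge amount -> reward description.
    tiers: Dict[float, str] = field(default_factory=dict)
    # Running pledge total per patron; nothing is charged yet.
    pledges: Dict[str, float] = field(default_factory=dict)

    def pledge(self, patron: str, amount: float) -> None:
        """Record a pledge without charging it."""
        self.pledges[patron] = self.pledges.get(patron, 0.0) + amount

    def total_pledged(self) -> float:
        return sum(self.pledges.values())

    def reward_for(self, patron: str) -> Optional[str]:
        """Return the highest tier the patron's pledge qualifies for."""
        amount = self.pledges.get(patron, 0.0)
        eligible = [minimum for minimum in self.tiers if amount >= minimum]
        return self.tiers[max(eligible)] if eligible else None

    def settle(self) -> Dict[str, float]:
        """Charge every pledge if the goal was met; otherwise charge nobody."""
        if self.total_pledged() >= self.goal:
            return dict(self.pledges)
        return {}
```

With a goal of 100 and pledges of 60 and 15, `settle()` returns an empty dictionary, so nobody pays; once a third patron adds 30, every pledge comes due at once. That threshold is what lets a patron commit without worrying about backing a project that never gets off the ground.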

Had either of these been released elsewhere, the project would not have had any particularly extraordinary impact. Indeed, the paradox of crowdsourcing is that the idea alone will not sell the project, because of the difficult barrier of critical mass. Critical mass, however, is itself dependent on having sold the idea already. In other words, a new crowdsourcing project can have a great idea in principle, but potential users are aware that it will be difficult to achieve in practice. Such doubt is self-propagating, because knowing that you have doubts means admitting that others may too, further eroding confidence in the objective.

A site like Fake’s former creation Flickr was able to grow a community on the back of a valuable individual experience. The community was not vital to the experience of Flickr. Kickstarter and Hunch, however, have no individual experience. If there were only one user, that user would have little to do (admitting that Hunch’s employee-created content can only go so far).

With strong talent behind both of the sites, there is no shortage of cleverness in their mechanics. However, the key to their first few months lies in the trusted celebrity behind them. Like doubt, confidence can be self-propagating. Baio and Fake have an audience already in place, and seeing that audience can swing cynics from “great idea but wouldn’t work” to thinking “just maybe.”

“The prolonged, indiscriminate reviewing of books is a quite exceptionally thankless, irritating and exhausting job. It not only involves praising trash but constantly inventing reactions towards books about which one has no spontaneous feelings whatever. The reviewer, jaded though he may be, is professionally interested in books, and out of the thousands that appear annually, there are probably fifty or a hundred that he would enjoy writing about.

… Seeing the results, people sometimes suggest that the solution lies in getting book reviewing out of the hands of hacks. Books on specialized subjects ought to be dealt with by experts, and on the other hand a good deal of reviewing, especially of novels, might well be done by amateurs. Nearly every book is capable of arousing passionate feeling, if it is only a passionate dislike, in some or other reader, whose ideas about it would surely be worth more than that of a bored professional. But, unfortunately, as every editor knows, that kind of thing is very difficult to organize.”

George Orwell, “Confessions of a Book Reviewer.” Tribune (1946)

Orwell, one would presume, would view aggregate amateur review as a boon, not a threat.

“Scepticism about Wikipedia’s basic viability made some sense back in 2001; there was no way to predict, even with the first rush of articles, that the rate of creation and the average quality would both remain high, but today those objections have taken on the flavor of the apocryphal farmer beholding his first giraffe and exclaiming, ‘Ain’t no such animal!’ Wikipedia’s utility for millions of users has been settled; the interesting questions are elsewhere.”
– Clay Shirky, Here Comes Everybody, p.117

In my work on crowdsourcing, my advisors warn me to be careful of how I speak about Wikipedia around academics, because scholars are still divided on it. Clay Shirky’s quote perfectly encapsulates the situation: if it is clear that it works and that it works well, the question shouldn’t be “does it work?” Rather, we should be asking why it works. Kevin Kelly suggests that Wikipedia is “impossible in theory, but possible in practice”: shouldn’t we be tweaking our theories then? Perhaps, then, the issue is that if an expert were to praise Wikipedia as reliable, they would undermine society’s need for experts. Larry Sanger, creator/co-founder of Wikipedia, says no, but it’s certainly food for thought.

I’ve recently been mulling over the question, “Why does Genius suck so much?” and the implications that it has.

Genius is the playlist generation tool in Apple’s iTunes music software. You choose a song that you’re in the mood for, and it creates an entire playlist of similar songs. Essentially, it’s a recommender system: if you like x, you’ll like y. The problem is that you get a very narrow point of view, with very little genre skipping and no pleasantly clever surprises.

What sets Genius apart from other song recommender systems is that it’s essentially powered by the crowds. Apple has the luxury of a rich data set of habits and ratings, and it appears to factor heavily into the recommendations. Yet algorithmic playlist generators were creating better results years before Genius came on the scene. So, what does this mean for the crowd?

The fact that computers can be better than humans in understanding art is off-putting. I’m still working through this problem, but here are some thoughts toward untangling it.

Ratings data is emotionless. When you rate a song 1 or 5, you’re giving it a universal ‘like’/‘dislike’. This data doesn’t factor in the mood of the song or the emotion of the listener. It is all very removed from circumstance. As I suggested to Bill Turkel, perhaps such simple crowd-based recommendations are better for high-level suggestions, like artists you may like, but useless at the micro-level (unless the data that crowds are contributing is more specific to the topic of recommendations). In contrast, technology can be quite effective at interpreting the types and patterns of sound which represent an emotion. Certainly it can’t easily understand whether a song is good, but if you want a slow, jazzy rock song, that’s fairly achievable. This is something in which music recommendation is fairly unique, as sound is easier to interpret than thousands of movie plots or millions of book themes would be.

Despite this, perhaps the most-cited example of a good music recommender is Pandora, which is an internet radio based on the Music Genome Project (MGP). The MGP does use humans to categorize songs, having professionals tag each song with over 400 tags and using an algorithm to weigh the values. Pandora’s success shows that humans are indeed effective at understanding music, given that they’re looking at it in the right way.

There’s also the effect of popular media that makes human-based recommendations unbalanced. If a lot of people like Coldplay, the range of music that it will be recommended for will be broad. This additionally creates an echo loop where popular music simply grows in popularity. Conversely, it is very difficult for new music to enter the loop. If everybody that likes The Strokes likes Yeah Yeah Yeahs, the recommender will reinforce this, brushing aside any similar new bands.

However, such problems are limited to the balance of the algorithm. Last.fm, which tracks all of the music its users listen to, is fairly effective in recommending similar music. Also, because of their detailed information on what a user has listened to, they can suggest less-listened-to songs. Though they don’t offer playlist generation, I wouldn’t put this beyond their abilities.
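To make the popularity effect and the "balance of the algorithm" point concrete, here is a minimal sketch of a co-occurrence recommender. It is not how Genius or Last.fm actually work; the listening data is invented toy data, "New Band" and "Pop Star" are placeholder names, and dividing by an artist's overall popularity is just one simple rebalancing choice among many.

```python
from collections import Counter
from itertools import permutations

# Invented toy data: each inner list is one listener's library.
LIBRARIES = [
    ["The Strokes", "Coldplay", "Yeah Yeah Yeahs"],
    ["The Strokes", "Coldplay", "Yeah Yeah Yeahs"],
    ["The Strokes", "Coldplay"],
    ["The Strokes", "New Band"],
    ["Coldplay", "Pop Star"],
    ["Coldplay", "Pop Star"],
    ["Coldplay"],
]


def co_occurrence(libraries):
    """Count how many listeners have each ordered pair of artists,
    and how many listeners have each artist at all."""
    pairs = Counter()
    popularity = Counter()
    for library in libraries:
        artists = set(library)
        popularity.update(artists)
        pairs.update(permutations(sorted(artists), 2))
    return pairs, popularity


def recommend(seed, pairs, popularity, normalize=False):
    """Rank artists that co-occur with `seed`.

    With normalize=False the score is the raw co-occurrence count, which
    favours artists that are popular with everyone. With normalize=True
    the count is divided by the candidate's overall popularity.
    """
    scores = {}
    for (a, b), count in pairs.items():
        if a == seed:
            scores[b] = count / popularity[b] if normalize else count
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda artist: (-scores[artist], artist))


pairs, popularity = co_occurrence(LIBRARIES)
raw = recommend("The Strokes", pairs, popularity)
balanced = recommend("The Strokes", pairs, popularity, normalize=True)
```

On this toy data, `raw` puts Coldplay first simply because nearly everyone owns it, while `balanced` moves the placeholder "New Band" to the top and Coldplay to the bottom. The normalized score is noisy for artists with only a listener or two, which is part of why the balance is hard to get right.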

So where do crowds factor in here? If anything, Pandora suggests that this is best left to professionals. Certainly, you can’t get that sort of exhaustiveness with crowds. The answer may lie in reliability. Large groups would be able to make much simpler connections, but on a larger and more verified scale. When I make a playlist with Lou Reed’s Walk on the Wild Side, I always follow it with Urge Overkill’s Girl, You’ll Be a Woman Soon. The songs are linked very little, but there’s something in me that recognizes the similarly cool feeling they give me. If you could somehow capture millions of these sorts of links, that could lead somewhere.

The absolute most important part of collective action is the collective. At the same time, it is the most difficult and unpredictable piece of such an effort. For many otherwise good ideas, the lack of a crowd deals a critical blow to their success.

In collecting a crowd, one should consider the incentives that motivate that crowd. However, collecting a crowd is not the only way to gain one. What I’m talking about is repurposing a crowd.

While obviously not an option for everybody, repurposing a crowd offers, in many cases, the best chances for crowdsourced success. This may entail piggybacking functions onto your widely-used product (especially useful if you’re Google or Yahoo, which most of us aren’t), but it can also mean borrowing from somebody else’s (as with the Facebook platform). The point comes down to this: it’s hard to provide incentive for users to add a new site to their online routine, but it is much easier to utilize the sites that they’re already visiting for other purposes.

Here are three directions to consider.

1. Making an audience into participants

Sometimes you have to make do with what you have. When you’re a local radio station competing with a television network, “what you have” can seem frustratingly limited. When it comes to traffic reporting, the difference may be that the big guys have helicopters in the sky and cameras on busy roads. How can you compete with that?

For radio stations, the answer lies in what they do have: an audience actively experiencing the traffic. What has emerged from this is traffic reporting based on tips that listeners phone in from their mobile phones. This crowdsourced model helps narrow the competitive gap that expensive technologies create.

The magic lies in the clever mobilization of listeners. There is already a large, dormant audience armed with the information that a traffic watch needs. Giving that audience a voice is all that is needed to get the information. A similar example is Are You Being Gouged from WNYC.

2. Crowdsource as a feature, not as the main event

Some of the most useful examples of crowdsourcing in the wild formed on the backs of other products. The gold standard here is the tagging feature of Flickr. Users have no rules forced upon them on how to encode their information. They just want to put up their photos, and tagging is something that is simply offered for them to stay organized. However, on the larger scale, all the users that do end up using tags help create an extremely semantically relevant corpus of images (and, as I’ve mentioned before, the first place you should go when looking for images).

Returning to the earlier example of traffic information, such “incidental crowdsourcing” is being tested in using cellphone tower information to determine how fast traffic is moving. Simply by having a cellphone in the cupholder, a driver is contributing data. Similar to this are e-commerce recommendation engines, where simply by surfing a site, a user contributes to an algorithm for predicting what similar users would want.

3. Reimagine Popular Actions

In this form of repurposing crowds, the question comes down to how one can squeeze extra juice out of something already being used. This is the approach that Luis von Ahn’s projects take, especially well epitomized in reCaptcha and the ESP Game. If people are already using captchas, why not have them also digitize scanned texts? If people like to relax with an online game, why not also have them encode image metadata?
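As a rough illustration of how a captcha can double as a digitization tool, here is a simplified sketch of the general idea: pair a word whose answer is already known with a scanned word that is not, and count a user's reading of the unknown word only when they get the known one right. This is not reCaptcha's actual implementation; the `Digitizer` class, the `VOTES_NEEDED` threshold, and the word identifiers are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Assumed threshold: how many matching readings we want before
# trusting a transcription of an unknown word.
VOTES_NEEDED = 3


class Digitizer:
    """Hypothetical vote collector for unknown scanned words."""

    def __init__(self):
        # unknown word id -> tally of the readings users have typed
        self.votes = defaultdict(Counter)

    def submit(self, control_answer, control_truth, unknown_id, unknown_answer):
        """Check the known (control) word. If the user passes, also count
        their reading of the unknown word as one vote."""
        passed = control_answer.strip().lower() == control_truth.lower()
        if passed:
            self.votes[unknown_id][unknown_answer.strip().lower()] += 1
        return passed

    def transcription(self, unknown_id):
        """Return the agreed reading once enough users concur, else None."""
        tally = self.votes.get(unknown_id)
        if not tally:
            return None
        answer, count = tally.most_common(1)[0]
        return answer if count >= VOTES_NEEDED else None
```

The user only ever experiences the captcha they were going to fill in anyway; the transcription is a by-product of an action that was already popular.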

Before starting a crowd-assisted project, don’t bet on people finding their way to it. Think about how existing groups and communities can be used and you’re much more likely to succeed.

In early June 2007, I shared the following on Twitter:

Idea: ‘YouShould’, a suggestion site where people write open letter suggestions of ideas for companies, authors, and services

There had been two things on my mind. The first was the potential benefit to consumers that such feedback could allow. I was inspired by Gmail’s suggestion page, where one can suggest what they would like to see implemented in Gmail next. Google appears to take it seriously, too, listing past suggestions that have already been implemented. The other reason for my idea was that I had been brainstorming for my senior thesis, which was beginning in September. However, once September rolled around, “YouShould” was crushed by the release of the similarly named Should Do This. While perhaps not exactly what I had imagined, it was pretty darn close. I don’t believe in reinventing the wheel, so I dropped the project.

Turns out, dropping the project was probably a good idea. After Should Do This, there came IdeaScale, and CrowdSound, and Suggestion Box, and UserVoice, FeatureList, Fevote and CollabAndRate.com. All of these had different approaches to the same concept: getting feedback from customers. Turns out I wasn’t alone in the concept.

Unfortunately, as tends to be the rule, none of these services seem to have gained any traction. Interesting on paper, they did not offer enough return to attract critical mass and make the idea succeed. One reason is that, with unsolicited advice, users do not gain a sense of contribution. One thinks, what are the odds that a company cared enough to seek out these websites? Users want to offer their thoughts and suggestions, but they also want to be heard. It’s like that wonderful game my aunt always played with the kids: “who can stay quiet for the longest”. Sneaky, yes, but we certainly stayed quiet for longer than we would have simply for its own sake. This is why general suggestion boards have been failing, and crowd-suggestion businesses have been moving into infrastructure, offering tools that enable businesses to ask their customers themselves.

How many times have you liked a television show, and found yourself lamenting the fact that (unless you’re directly being asked by Nielsen or BBM) your patronage does not actually register? The broadcast system that television uses is by definition clunky: it transmits only one way, from one to many, without a direct capacity for information feedback. This simple concept was outlined in the Shannon-Weaver model of communication back in the 1940s. However, while the flow of source > encoder > message > channel > decoder > receiver is adequate for describing technology, attempts to apply it to human communication have notably fallen short. It is simply not true to our nature, not reflective of how humans negotiate meaning. The transmission model is not limited to the delivery of television and radio signals. In a way, our entire consumer culture attempts this few-to-many transmission. Business online, however, exists within a system constructed to be (though not always realized as) many-to-many. Feedback is the nature of the internet. If you’d like to see organic cotton shirts at the Gap, the time investment in telling them so would discourage casual contributions. More likely, your feedback would be much more crude: shopping elsewhere, in which case the Gap is left trying to figure out why you did so. In contrast, a Gmail user conscious of the idea solicitation page can quickly send in a thought when they have it.

“It’s not the cost we’re looking at, it’s how we are making the application better for the consumer” —Jari Pasanen, Nokia VP for innovation acceleration (BusinessWeek.com)

In “How Nokia Users Drive Innovation”, BusinessWeek outlines Nokia’s solicitation of its users for ideas, and the success that they have been having. Other companies that do so are Starbucks (My Starbucks Idea), Salesforce (Salesforce IdeaExchange), and Dell (Dell IdeaStorm). In these examples, communities have formed around supporting and expanding on ideas. A cynical observer would suggest that these companies are looking for free business advice. The reality, however, is that it is in the best interest of customers to help build better products for themselves. Companies are constantly looking for feedback, and those who respond are the people to whom the company adapts. This idea is nothing new; what has emerged is the persistence and tenacity of users in doing so when given the proper tools.

A while ago, I suggested checking out Jeff Howe’s book excerpts, and tried to summarize some of the best parts. As “the best parts” grew quite long, I had to cut the post, leaving later bits unpublished.

With the book out now, I’m digging those back up. Here’s the excerpt on Chapter 5, where things start getting interesting, and some of my thoughts.

Chapter 5: The Rise and Fall of the Firm: Turning Community Into Commerce

Chapter 5 touches upon an oft-understated quality of crowds: their natural connection to community. Crowds are rarely groups of disparate human beings. Rather, they form around common connections in varying degrees of community.

Howe explains to us that communities are changing: not for the worse, but toward the more efficient. A grievance we often hear about modern society is the erosion of neighbourhood communities. However, geographically-defined communities, in their pre-World War II heyday, were popular because they were the most accessible common-interest groups (the common interest being the location). As new tools became available, humans have found ways of being community members with broader groups, bound by interests beyond geography. Thus, it is not the case that individuals in our culture are sliding into new depths of isolation. Rather, our communities are simply moving onto less visible ground. In a great observation that I had not considered, Howe notes that now, with the ease of social connection provided by digital tools, “new types of communities have materialized that are both local and wired at the same time”.

Chapter 5 also looks at the successful online efforts of the Cincinnati Enquirer, particularly through the CincyMoms community blogs. It is a good look at do-or-die changes in publishing. There’s also a gem of information I wanted to highlight lest you miss it. In regards to a reader-submission feature on the Enquirer’s website:

The words “GetPublished” feature prominently on every Enquirer Web page. The results land in Parker’s queue, and they almost never resemble anything commonly considered journalism. “It used to read, ‘Be a Citizen Journalist,’” Parker says. “And no one ever clicked on it. Then we said, ‘Tell Us Your Story,’ and still nothing. For some reason, ‘GetPublished’ were the magic words.” The Enquirer considers the feature to be an unequivocal success.

If you have been sorting through this blog, you may have noticed a pattern: I rarely write about services founded on crowdsourcing as a business model. I write about small experiments, or incidental crowdsourcing, but not about the myriad crowdsourcing startups that have appeared since this blog began over a year ago.

There’s a reason for this: they rarely interest me.

There’s a time and a place for crowdsourcing, and what I love is when it is used to achieve something that cannot otherwise be created. There’s also a soft spot for cleverness in concept. However, we increasingly see social for social’s sake.

Now, bad ideas are only a fraction of sites. Many others are simply not thought out in a way that lets them be successful and, unfortunately for sites built on a foundation of crowds, success and usefulness are invariably linked.

I want crowdsourcing as business to succeed, I really do, but thus far it has been most successful by accident or by incident. THAT is where the story is: in understanding this phenomenon. Clearly we have the tools, but are still working on the trade.