Crowdsourcing


“The prolonged, indiscriminate reviewing of books is a quite exceptionally thankless, irritating and exhausting job. It not only involves praising trash but constantly inventing reactions towards books about which one has no spontaneous feelings whatever. The reviewer, jaded though he may be, is professionally interested in books, and out of the thousands that appear annually, there are probably fifty or a hundred that he would enjoy writing about.

… Seeing the results, people sometimes suggest that the solution lies in getting book reviewing out of the hands of hacks. Books on specialized subjects ought to be dealt with by experts, and on the other hand a good deal of reviewing, especially of novels, might well be done by amateurs. Nearly every book is capable of arousing passionate feeling, if it is only a passionate dislike, in some or other reader, whose ideas about it would surely be worth more than those of a bored professional. But, unfortunately, as every editor knows, that kind of thing is very difficult to organize.”

George Orwell, “Confessions of a Book Reviewer.” Tribune (1946)

Orwell, one would presume, would view aggregate amateur review as a boon, not a threat.


In early June 2007, I shared the following tweet:

Idea: ‘YouShould’, a suggestion site where people write open letter suggestions of ideas for companies, authors, and services

There had been two things on my mind. The first was the potential benefit to consumers that such feedback could allow. I was inspired by Gmail’s suggestion page, where one can suggest what they would like to see implemented in Gmail next. Google appears to take it seriously, too, listing past suggestions that have already been implemented. The other reason for my idea was that I had been brainstorming for my senior thesis, which was beginning in September. However, once September rolled around, “YouShould” was crushed by the release of the similarly named Should Do This. While perhaps not exactly what I had imagined, it was pretty darn close. I don’t believe in reinventing the wheel, so I dropped the project.

Turns out, dropping the project was probably a good idea. After Should Do This, there came IdeaScale, and CrowdSound, and Suggestion Box, and UserVoice, FeatureList, Fevote and CollabAndRate.com. Each of these took a different approach to the same concept: getting feedback from customers. Turns out I wasn’t alone in the idea.

Unfortunately, as tends to be the rule, none of these services seem to have gained any traction. Interesting on paper, they never generated enough return to attract the critical mass the idea needed to succeed. One reason is that, with unsolicited advice, users do not gain a sense of contribution. One thinks: what are the odds that a company cared enough to seek out these websites? Users want to offer their thoughts and suggestions, but they also want to be heard. It’s like that wonderful game my aunt always played with the kids: “who can stay quiet the longest”. Sneaky, yes, but we certainly stayed quiet for longer than we would have simply for its own sake. This is why general suggestion boards have been failing, and why crowd-suggestion businesses have been moving into infrastructure, offering tools that let businesses solicit their customers directly.

How many times have you liked a television show, and found yourself lamenting the fact that, unless you’re directly being asked by Nielsen or BBM, your patronage does not actually register? The broadcast system that television uses is by definition clunky: it transmits only one way, from one to many, without a direct capacity for feedback. This simple concept was outlined in the Shannon-Weaver model of communication back in the 40s. However, while the flow of source > encoder > message > channel > decoder > receiver is adequate for describing technology, attempts to apply it to human communication have notably fallen short. It simply is not reflective of how humans negotiate meaning. The transmission model is not limited to the delivery of television and radio signals; in a way, our entire consumer culture attempts this few-to-many transmission. Business online, however, exists within a system constructed to be (though not always realized as) many-to-many. Feedback is the nature of the internet. If you’d like to see organic cotton shirts at the Gap, the time it would take to tell the company so discourages casual contribution. More likely, your feedback would be much more crude: shopping elsewhere, in which case the Gap is left trying to figure out why you did so. In contrast, a Gmail user conscious of the idea solicitation page can quickly send in a thought when they have it.

“It’s not the cost we’re looking at, it’s how we are making the application better for the consumer” —Jari Pasanen, Nokia VP for innovation acceleration (BusinessWeek.com)

In “How Nokia Users Drive Innovation“, Business Week outlines Nokia’s solicitation of its users for ideas, and the success that they have been having. Other companies that do so are Starbucks (My Starbucks Idea), Salesforce (SalesForce IdeaExchange), and Dell (Dell IdeaStorm). In these examples, communities have formed around supporting and expanding on ideas. A cynical observer would suggest that these companies are looking for free business advice. The reality, however, is that it is in the best interest of customers to help build better products for themselves. Companies are constantly looking for feedback, and the customers who respond become the people the company adapts to. This idea is nothing new; what has emerged is the persistence and tenacity of users in doing so when given the proper tools.

I was recently forwarded a link to ReCaptcha, and was stunned to realize that I had never written about it. Stunned because ReCaptcha was one of the main sparks of my interest in crowdsourcing.

ReCaptcha is a tool out of Carnegie Mellon, headed by Luis von Ahn (mentioned previously here). To understand reCaptcha, one needs to understand captchas. A captcha is a human verification tool that displays an image with a string of warped characters. The task is to type those characters into an input box. Because image recognition is so hard to automate, this task more or less confirms that you are a human and not a bot. Of course, spammers can hire low-wage captcha crackers, but captchas nonetheless introduce an enormous hurdle to online spam and other automated cons.

ReCaptcha is an improvement on the original concept. Amongst other accessibility improvements, reCaptcha’s primary innovation is that it helps digitize old books. That’s right, digitize old books. Rather than offering randomly warped words, reCaptcha instead offers the user words from scanned books that computer recognition is having trouble with. This assists the various efforts to digitize (and in the process preserve and recover) libraries of aging books.
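The mechanics behind this pairing are simple enough to sketch. Assuming a reCaptcha-style flow with one control word (whose answer is already known) and one unknown scanned word, a minimal Python sketch might look like the following; the function names and the three-vote threshold are illustrative, not reCaptcha’s actual design:

```python
from collections import Counter, defaultdict

# word_id -> tally of human readings for that scanned word
tallies = defaultdict(Counter)

def check_submission(control_answer, control_typed, word_id, unknown_typed):
    """Accept the solver only if the control word matches; if so,
    count their reading of the OCR-unreadable word as one vote."""
    if control_typed.strip().lower() != control_answer.lower():
        return False  # failed the control word: discard the reading
    tallies[word_id][unknown_typed.strip().lower()] += 1
    return True

def transcription(word_id, min_votes=3):
    """Return the consensus reading once enough independent votes agree."""
    if not tallies[word_id]:
        return None
    reading, votes = tallies[word_id].most_common(1)[0]
    return reading if votes >= min_votes else None
```

Requiring several independent solvers to agree before a word is accepted is the same independent-confirmation idea that recurs throughout crowdsourced labeling.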

The brilliance of this cannot be overstated. The tool takes an action that millions of people already need to do, and appropriates that manpower into something useful. Perhaps the best parallel is to solar energy. The sun is an energy source that is completely wasted in urban areas. It is everywhere, constantly beaming energy onto the earth, and the cleverness of solar cells allows people to capture this potential (constant and otherwise wasted) and convert it into something greatly useful. Anybody who has ever been awed by solar energy can understand the exciting potential that reCaptcha represents in technology.

There are mainly two ways in which crowds are utilized in crowdsourced efforts.

The first is what I’d call the “million monkeys” strategy. Quite simply, this is the appeal to the crowd for the one or few with a commodity (be it information or material) that you need. “With a group this large, I’m bound to find what I need!” Greater size and diversity offers a bigger box to sift through, but ultimately it is a few individuals that matter, rather than the crowd itself.

The million monkey strategy is common online today. Skill marketplaces such as 99Designs, iStockPhoto, and Yahoo! Answers reward the best provider of a solution to a problem while others watch from the sidelines. The Internet does not make anything newly achievable here, since the right connection at the right time never required a grandiose group, but the number of minds online has greatly increased the odds of making that ideal connection.

Yet consider the example of 99Designs, where people offer bounties for their design needs, and then choose the best submission as the winner of the challenge and recipient of the payment. A few hundred dollars for a job may seem fair for an entry-level designer, but it looks much smaller once one considers the discarded man-hours of the unpaid submitters. Spec (‘speculative’) work is frowned upon by professionals for its devaluing nature, and the same concern applies to much million-monkey crowdsourcing: it skins the cow for the leather but leaves the meat to rot.

More exciting is truly collaborative crowdsourcing, because it represents possibilities for collective creation and problem-solving that have never been seen before. In its purest form, such crowdsourcing allows thousands to each contribute a small part towards a bigger picture (recall the metaphor of Ten Thousand Cents). Oftentimes, such crowdsourcing overcomes traditional organizational dilemmas, such as costs and management. For example, categorizing images en masse, as is done actively with the ESP Game and incidentally with Flickr tagging, could not possibly have been done at any reasonable rate of return before the arrival of the Internet.

As modern communication technology encourages crowds to grow larger while becoming more streamlined, what new problems will they come to solve? My communications education has painted groups as collectively blunt, but it now seems that this was a product of primitive communication techniques; online, the “individual” is much more a part of the whole. If we continue to repurpose individual minds in new combinations, the results will be something not often characteristic of society: fresh.

As both the million monkey and truly collaborative approaches draw on the same source, a crowd, projects need not be bound to a single approach, nor are they. For example, the ever-popular Threadless uses a million monkeys system for t-shirt submissions alongside a generally democratic system of voting (mixed with some managerial liberties). If there is a community with one goal, it can be repurposed for another. And thus we arrive at a topic for another day: the reworking of existing communities.

Bill Dunphy recently directed me to Galaxy Zoo, an astronomy project out of Oxford that had flown under my radar. Most basically, Galaxy Zoo offers millions of sky images to the public, asking visitors to classify galaxies. Like many “artificial artificial intelligence” tasks, this is something immense in scale yet hard to computerize.

It’s interesting to note the seriousness with which the researchers approached Galaxy Zoo: the data itself was the primary purpose, and the word “experiment” is rarely used in describing the method.

By all accounts, the study was a success.  Here are some notes culled from the first two papers stemming from the project.

Statistics

As of November 28th, 2007, GZ had over 36 million classifications (called ‘votes’ herein) for 893,212 galaxies from 85,276 users. (Land et al. 2)

“… we are able to produce catalogues of galaxy morphology which agree with those compiled by professional astronomers to an accuracy of better than 10%” (Lintott et al. 9).

User Reliability

For greater reliability, two methods were employed. First, each galaxy was classified numerous times, and the researchers would “only use objects where one class-type receives a significant majority of the votes.” This technique of independent confirmation is used in most such undertakings, as it limits the impact of individual unreliable or malicious users.
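That “significant majority” filter is easy to sketch. The 80% cutoff below is an assumed illustration, not the papers’ actual criterion:

```python
from collections import Counter

def consensus(votes, threshold=0.8):
    """Keep a galaxy's label only if one class receives a
    significant majority of its votes; otherwise discard it."""
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label if n / len(votes) >= threshold else None
```

A galaxy with a split vote simply yields no label, trading coverage for reliability.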

There was also a weighted ranking, where “users who tended to agree with the majority have their votes up-weighted, while users who consistently disagreed with the majority have their votes down-weighted” (Land et al. 2). However, the researchers did not see much difference, and chose to concentrate on the unweighted results.
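The weighting idea can be sketched as scoring each user by agreement with the unweighted majority. This is a simplified illustration of the scheme Land et al. describe, not their actual formula:

```python
from collections import Counter, defaultdict

def user_weights(classifications):
    """classifications: list of (user, galaxy, label) tuples.
    Weight each user by the fraction of their votes that agree
    with the unweighted per-galaxy majority."""
    by_galaxy = defaultdict(list)
    for user, galaxy, label in classifications:
        by_galaxy[galaxy].append(label)
    majority = {g: Counter(v).most_common(1)[0][0]
                for g, v in by_galaxy.items()}
    agree, total = defaultdict(int), defaultdict(int)
    for user, galaxy, label in classifications:
        total[user] += 1
        if label == majority[galaxy]:
            agree[user] += 1
    return {u: agree[u] / total[u] for u in total}
```

These weights would then scale each user’s votes in a second tallying pass; as the researchers found, when most users are reliable the reweighting changes little.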

“More than 99.9% of the galaxies classified as MOSES ellipticals which are in the Galaxy Zoo clean sample are found to be ellipticals by Galaxy Zoo.” (Lintott et al. 2)

Bias and Validity

Numerous types of bias were recorded, notably colour bias (where more experienced users can classify based on the prior tendencies of a specific colour) and spiral bias. The latter, as noted in Galaxy zoo finds people are screwed up, not the Universe, appears to be a product of human psychology: users tended to classify galaxies as rotating counterclockwise, when in theory CW and CCW should be about equal.

To investigate this, the creators ran a bias sample, with occasional monochromatic, vertically-mirrored, and diagonally-mirrored images. We see the same in Luis von Ahn’s projects, where decoys are used in reCaptcha and the ESP Game to help determine reliability. The GZ researchers note the Hawthorne Effect, in that “users may be more cautious with their classifications if they think that they are being tested for bias” (Lintott et al. 9). However, considering the example of reCaptcha, which offers one real word and one decoy, perhaps such an effect can be utilized fully.
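The mirrored decoys give a direct handle on the bias: mirroring flips a spiral’s true handedness, so an unbiased crowd should label mirrored images CW about as often as it labels normal images CCW. A sketch of that comparison (the label strings and the simple difference measure are my own illustration, not the GZ team’s analysis):

```python
def spiral_bias(labels_normal, labels_mirrored):
    """Estimate CCW over-reporting by comparing the CCW rate on
    normal images against the CW rate on mirrored ones; the two
    should match if users are unbiased."""
    ccw_normal = labels_normal.count("ccw") / len(labels_normal)
    cw_mirrored = labels_mirrored.count("cw") / len(labels_mirrored)
    return ccw_normal - cw_mirrored
```

A result near zero suggests the sky really does favour one handedness; a large positive gap points the finger back at the classifiers.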

Participation

To get as many users as possible, simplicity and a low barrier to entry were extremely important considerations in the design.  “Visitors to the site were asked to read a brief tutorial giving examples of each class of galaxy, and then to correctly identify a set of ‘standard’ galaxies. …Those who correctly classified more than 11 of the 15 standards were allowed to proceed to the main part of the site. The bar to entry was kept deliberately low in order to attract as many classifiers to the site as possible” (Lintott et al. 3).

User Prolificacy

The majority of users classified around 30 galaxies. As the following chart shows, however, some went into the tens of thousands. Even though the core purpose of such a system is to use crowds to distribute individual time obligations, it is very beneficial to accommodate the “super-users”, who do hundreds of times as much work as the casual user.

Links

Galaxy Zoo

Bad Astronomy: AAS #14: Galaxy zoo finds people are screwed up, not the Universe

Betsy Devine: Ox, Docs Shocks!

Back in January, when I demonstrated the Mechanical Turk to my Crowdsourcing students, I would show them one particularly cryptic project. It was simply two boxes. The one on the left held an apparently zoomed-in image, while the one on the right was blank. With a simple brush, you were asked to redraw the image on the right. Colours were chosen with a colour dropper, and an adjustable ghost image in the right box made tracing easy. We all knew that we were creating a larger image, and guessed it was an art project, but I did not think it could possibly turn out very effective.

I was wrong. The results of that project have surfaced, in the form of “Ten Thousand Cents“. Turns out we were drawing a one hundred dollar bill.

The total labor cost to create the bill, the artwork being created, and the reproductions available for purchase are all $100. The work is presented as a video piece with all 10,000 parts being drawn simultaneously. The project explores the circumstances we live in, a new and uncharted combination of digital labor markets, “crowdsourcing,” “virtual economies,” and digital reproduction.

This project serves as a brilliant metaphor for the normalizing power of crowds. When you open up a project to the masses, governance becomes extremely difficult: anybody is given the ability to contribute erroneous information. However, as the community of contributors grows, things balance out despite the fouls. Consider opinion-based efforts such as Digg and Travelocity: eventually, the best items shine through. That is why Wikipedia is so reliable considering the circumstances: thousands of editors are better than one. So how is Ten Thousand Cents relevant?

Still Ben, right?
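That balancing-out has a classical formalization in Condorcet’s jury theorem: if each independent contributor is right more often than not, the majority becomes nearly certain to be right as the crowd grows. The exact probability follows from the binomial distribution, sketched here for an odd-sized crowd:

```python
from math import comb

def majority_correct(n, p):
    """Probability that an odd-sized crowd of n contributors, each
    independently correct with probability p, produces a correct
    majority answer (Condorcet's jury theorem)."""
    k = n // 2 + 1  # smallest winning majority
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))
```

With individuals who are only 60% reliable, a crowd of a hundred is already far more dependable than any one of its members, which is exactly the circumstance Wikipedia and Ten Thousand Cents exploit.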

Dolores Labs is a new service that helps clients crowdsource their projects online. Specializing in the Mechanical Turk, Dolores Labs has put online two fun example studies.

The first is a classification of Sports Illustrated covers over the past thirty years. Covers were classified by the race of the athletes featured and the sport featured. Having recently led coding for a school study (involving a 2-week census of Digg.com front page stories), I can certainly appreciate how appropriate the Turk is for coding with such straightforward, reliable variables.

The second example is even more fascinating. Providing Turkers with thousands of random colours, Dolores simply asked each worker to name the colour. What resulted is a fascinating dataset of human-interpreted colour descriptions. You see the common colour names pop up, but more interesting is how the workers utilized language to describe the colours that were more difficult to classify.

Essentially, Dolores Labs is a crowdsourcing consulting company. Even though they provide deeper services than simple advice, their main commodity is knowledge of how crowdsourcing works. There are good ways to mobilize crowds and incorrect or useless ways to do so, and as we come to realize that, crowdsourcing moves beyond a mere trend and into a bona fide tool. The existence of a group that specializes in understanding the process shows crowdsourcing maturing within culture as a viable method of analysis.

Dolores Labs isn’t even the first to sell expertise on crowdsourcing. Amsterdam-based CreativeCrowds have been doing a similar thing for a while. Like Dolores Labs, they also give back to the public, not in the form of test data but in their phenomenal blog, CrowdSourcingDirectory. Both companies are approaching this the right way, and I hope to see more from both in the future.
