Profiles


Bill Dunphy recently directed me to Galaxy Zoo, an astronomy project out of Oxford that had flown under my radar.  Most basically, Galaxy Zoo presents the public with images of nearly a million galaxies and asks visitors to classify them.  Like many “artificial artificial intelligence” tasks, this is work that is immense in scale but hard to computerize.

It’s interesting to note the seriousness with which the researchers approached Galaxy Zoo: the scientific catalogue was the primary purpose, and the word “experiment” is rarely applied to the method itself.

By all accounts, the study was a success.  Here are some notes culled from the first two papers stemming from the project.

Statistics

As of November 28th 2007, GZ had over 36 million classifications (called ’votes’ herein) for 893,212 galaxies from 85,276 users. (Land et al. 2)

“… we are able to produce catalogues of galaxy morphology which agree with those compiled by professional astronomers to an accuracy of better than 10%” (Lintott et al. 9).

User Reliability

For greater reliability, two methods were employed.  First of all, each galaxy was classified numerous times, and the researchers would “only use objects where one class-type receives a significant majority of the votes.” This technique of independent confirmation is used in most such undertakings, as it limits the impact of individual unreliable or malicious users.
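
To make the idea concrete, here is a minimal sketch of how such a “significant majority” filter could work; the 80% threshold is illustrative, not necessarily the cut the Galaxy Zoo team used.

```python
from collections import Counter

MAJORITY_THRESHOLD = 0.8  # illustrative cut-off, not necessarily the one GZ used

def classify_galaxy(votes):
    """Return a class only if one label wins a significant majority of votes.

    `votes` is a list of labels (e.g. 'elliptical', 'spiral_cw', 'spiral_ccw',
    'merger') submitted independently by different users for one galaxy.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= MAJORITY_THRESHOLD:
        return label
    return None  # no consensus; leave the galaxy out of the clean catalogue

# 9 of 10 independent classifications agree, so the galaxy is accepted.
print(classify_galaxy(['elliptical'] * 9 + ['spiral_cw']))  # -> elliptical
```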

There was also a weighted ranking, where “users who tended to agree with the majority have their votes up-weighted, while users who consistently disagreed with the majority have their votes down-weighted” (Land et al. 2).  However, researchers did not see much difference, and chose to concentrate on the unweighted results.
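
For the curious, here is one plausible way such weighting could be implemented; the iteration scheme and the weight formula are my own illustration, not the procedure described in Land et al.

```python
from collections import defaultdict

def weight_users(votes, n_iterations=5):
    """Iteratively up-weight users who agree with the (weighted) majority.

    `votes` is a list of (user_id, galaxy_id, label) tuples.
    """
    weights = defaultdict(lambda: 1.0)
    for _ in range(n_iterations):
        # Weighted consensus per galaxy.
        tallies = defaultdict(lambda: defaultdict(float))
        for user, galaxy, label in votes:
            tallies[galaxy][label] += weights[user]
        consensus = {g: max(t, key=t.get) for g, t in tallies.items()}

        # Each user's new weight is their agreement rate with the consensus.
        agree, total = defaultdict(int), defaultdict(int)
        for user, galaxy, label in votes:
            total[user] += 1
            agree[user] += (label == consensus[galaxy])
        weights = defaultdict(lambda: 1.0,
                              {u: agree[u] / total[u] for u in total})
    return weights
```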

“More than 99.9% of the galaxies classified as MOSES ellipticals which are in the Galaxy Zoo clean sample are found to be ellipticals by Galaxy Zoo.” (Lintott et al. 2)

Bias and Validity

Numerous types of bias were recorded.  Notably, colour bias (where more experienced users may classify based on the known tendencies of galaxies of a given colour) and a spiral (spin-direction) bias were noted.  The latter, as noted in Galaxy zoo finds people are screwed up, not the Universe, appears to be a product of human psychology: users tended to classify galaxies as rotating counterclockwise, when in theory CW and CCW should be about equally common.

To investigate this, the creators ran a bias sample, with occasional monochromatic, vertically-mirrored, and diagonally-mirrored images.  We see this done in Luis von Ahn’s projects, with decoys used in reCAPTCHA and the ESP Game to help determine reliability.  The GZ researchers note the Hawthorne Effect, in that “users may be more cautious with their classifications if they think that they are being tested for bias” (Lintott et al. 9).  However, considering the example of reCAPTCHA, which pairs a word whose answer is known with one that is not, perhaps such an effect can be put to full use.
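
As a toy illustration of why the mirrored control images are useful (the statistic below is mine, not the papers’): mirroring should turn every genuinely counterclockwise spiral into a clockwise one, so an unbiased crowd’s CCW fractions on the two samples should sum to roughly one. A persistent surplus points to the classifiers, not the sky.

```python
def ccw_excess(original_labels, mirrored_labels):
    """Estimate handedness bias from a mirrored bias sample.

    Both arguments are lists of 'cw' / 'ccw' answers for the same galaxies,
    one set from the original images and one from their mirror images.
    With no human bias, the two CCW fractions should add up to about 1.0.
    """
    frac_ccw = lambda labels: sum(l == 'ccw' for l in labels) / len(labels)
    return frac_ccw(original_labels) + frac_ccw(mirrored_labels) - 1.0

# A positive value means an overall excess of counterclockwise answers.
print(ccw_excess(['ccw', 'ccw', 'cw', 'ccw'], ['ccw', 'cw', 'cw', 'ccw']))
```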

Participation

To get as many users as possible, simplicity and a low barrier to entry were extremely important considerations in the design.  “Visitors to the site were asked to read a brief tutorial giving examples of each class of galaxy, and then to correctly identify a set of ‘standard’ galaxies. …Those who correctly classified more than 11 of the 15 standards were allowed to proceed to the main part of the site. The bar to entry was kept deliberately low in order to attract as many classifiers to the site as possible” (Lintott et al. 3).
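
The entry gate itself amounts to a simple threshold check; here is a one-function sketch (the function and parameter names are mine).

```python
def may_proceed(user_answers, standard_answers, required=11):
    """Admit users who correctly classify more than `required` of the 15 standards."""
    correct = sum(a == s for a, s in zip(user_answers, standard_answers))
    return correct > required
```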

User Prolificacy

The majority of users classified around 30 galaxies.  As the following chart shows, however, some went into the tens of thousands.  Even though spreading the workload across a crowd, so that no individual carries much of it, is the core purpose of such a system, it is very beneficial to accommodate the “super-users”, who do hundreds of times as much work as the casual user.

Links

Galaxy Zoo

Bad Astronomy: AAS #14: Galaxy zoo finds people are screwed up, not the Universe

Betsy Devine: Ox, Docs Shocks!

Sometimes, crowdsourcing isn’t the answer. The cute new experiment cumul.us is one of those instances.

What makes it simple and accurate is that it collects weather forecasts from several sources and combines them to give you a more accurate average, using the idea of the “wisdom of crowds”. In short, cumul.us is the “wisdom of clouds”. Not only does it draw data from meteorological sources, but people can also make predictions themselves.
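
The core mechanism is just an average across sources; here is a minimal sketch (the source names and temperatures below are made up, and the equal default weights are an assumption; a site like cumul.us could instead weight each source by its track record).

```python
def combined_forecast(forecasts, weights=None):
    """Blend high-temperature forecasts from several sources into one number.

    `forecasts` maps a source name to its predicted temperature. By default
    every source counts equally; passing `weights` lets historically accurate
    sources (or aggregated user guesses) count for more or less.
    """
    weights = weights or {source: 1.0 for source in forecasts}
    total = sum(weights[source] for source in forecasts)
    return sum(weights[source] * temp for source, temp in forecasts.items()) / total

# Hypothetical inputs: two forecast feeds plus the averaged user predictions.
print(combined_forecast({'feed_a': 21.0, 'feed_b': 23.0, 'user_guesses': 26.0}))
```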

Sounds okay, right? The problem is that meteorological sources are much more advanced than human predictions. The site’s purpose is akin to having a crowdsourced calculator, where users pitch in on what the actual answer might be in case the computer gets it wrong. Meteorological prediction is fully embedded in calculation, and while experts help, the common human touch is absolutely unnecessary. The other function of the site is for users to say what they’ll be wearing (jeans, skirt, etc.), another function that needs no crowd mentality whatsoever.

Crowdsourcing is exciting and shows phenomenal potential for the future development of society. However, as cumul.us fails to do, a useful crowdsourcing model should follow these rules:

  • the task being replaced is predominantly abstract (cognitive) rather than logical, e.g. image semantics
  • the wisdom of the crowd should show improvement over the wisdom of one, e.g. Wikipedia

This blog is called “Crowdstorming”, as in crowdsourced brainstorming. What interests me are the ideas that result from sharing information and collaborating on it at large scale. Of course, we probably won’t get a blunter interpretation of ‘crowdsourced brainstorming’ than the Human Brain Cloud. Essentially, it is a collective brain cloud: one of those webbed mind maps that are so effective in brainstorming.

At Human Brain Cloud, users are given a word and asked to submit the first word they think of. Each submission will in turn pop up for somebody else, and the ‘game’ (as the author refers to it) keeps branching out. Users can flag words as ‘junk’, ‘not a word’, or as a ‘misspelling’. Scoreboards and charts keep users interested and provide a bit of motivation.

Beyond the actual encoding of the data, the processed output is quite slick. The site creates a really cool (and potentially useful) searchable web. Hmm… that adjective needs more emphasis: it’s really cool.

The program does suffer in the area of junk. Some users use it to build a web of connected English words, while others use it more like the lightning round of Family Feud, simply yelling out the most immediate thought. As such, there is a mix of phrases and words, meaning a term such as Peanut Butter will have the redundant branches ‘jelly’ and ‘and jelly’. Luckily, each word has a legitimacy scale: when it is flagged it loses points, and the trouble words eventually get weeded out by the community.
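
Here is a rough sketch of how such a flag-and-score scheme might be kept; the data structure and the scoring constants are my own guesses, not how the actual site works. Each word-to-response edge gains a point per submission, loses points per flag, and disappears once the community has voted it down.

```python
class AssociationGraph:
    """Toy model of a crowd-built word-association web with legitimacy scores."""

    def __init__(self):
        self.edges = {}  # (word, response) -> legitimacy score

    def submit(self, word, response):
        """Each independent submission strengthens the association."""
        key = (word.lower(), response.lower())
        self.edges[key] = self.edges.get(key, 0) + 1

    def flag(self, word, response, penalty=2):
        """Flags ('junk', 'not a word', 'misspelling') weaken it; dead edges are dropped."""
        key = (word.lower(), response.lower())
        if key in self.edges:
            self.edges[key] -= penalty
            if self.edges[key] <= 0:
                del self.edges[key]  # weeded out by the community

graph = AssociationGraph()
graph.submit('peanut butter', 'jelly')
graph.submit('peanut butter', 'and jelly')  # the redundant phrasing
graph.flag('peanut butter', 'and jelly')    # eventually pruned by flags
```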

A year ago, NYU professor Jay Rosen announced the launch of NewAssignment.net, an initiative in open-source, crowd-created journalism. Now, in partnership with Wired.com, they are publishing the results of their first project online.

Assignment Zero is their first experiment in “pro-am” (professional/amateur) journalism: journalism run by the public rather than the media. Articles do not by any means follow the wiki model (they are still mainly written individually), but what has been handed to the audience is the power over how these articles come together. It’s more like a social democracy than an anarchy. Assignment Zero is an attempt at journalism without strings: an audience-run newsroom. The audience thinks up the stories, chooses them, contributes to research, hires reporters, and takes on other such responsibilities. In addition to having no tangible interest at stake (perhaps debatable, given the intrinsic motivations of those attracted to the project), a crowd model will encourage better journalism through apt journalist assignments and through reputations for quality. Those who produce better work can make more money. Oh, did I mention that journalists are paid? This effort is as much ‘pro’ as it is ‘am’, and for professionals to be such, their profession needs to be a day job.

Jay Rosen writes that the project was inspired by Chris Allbritton, a journalist who snuck into Iraq on funding raised over the Internet. His website, Back to Iraq, was born of nearly $15,000 in contributions from his audience. Rosen was intrigued by the grander implications of “alert publics [that] hire their own correspondents and share the results with the world, cutting ‘the media’ out entirely”, and NewAssignment.net came to life.

Key players in the project are not shying away from Assignment Zero’s flaws. Reviews of the project chiefly cycle through different ways of saying that it was a “successful failure” and optimistically defend the “learning” as the main point of the project. David Cohn’s roundup of reactions says it better than I can, though the actual analysis published as part of AZ provides this insight: “if Assignment Zero failed to clear the especially high bar it set for itself, the fact it produced so large a body of work still speaks to the considerable potential of crowdsourced journalism” (Did Assignment Zero Fail? A Look Back, and Lessons Learned).

It seems that one of the most grievous issues was mission creep: the problem of a project growing quantitatively while its reach shrinks qualitatively. Overambition, in other words. This problem, one many of us are familiar with, increases immensely under the communal governance of crowdsourcing. I could probably fill a whole post on the topic (link to come!).

For the most part, the project’s goal of becoming the most comprehensive resource of writing on crowdsourcing was too broad, and the project suffered from a lack of direction. This is especially apparent in the output: 7 essays and 80 interviews, far short of the goal of 80 articles. For the time being, the interviews have no foreseeable future as a basis for expanded work, though their high quality gives them value as raw material for a “great, synthetic essay”, Rosen suggests.

I encourage you to check out the first five published works of Assignment Zero because, thus far, they’re pretty good. A ‘successful failure’, if you will.

With Assignment Zero wrapping up, the next NewAssignment.net project in the works is OffTheBus, a presidential campaign tracking initiative done in conjunction with The Huffington Post.

As always, you can also go out and help.

Resources:
NewAssignment.Net
Assignment Zero
PressThink, Jay Rosen’s blog
OffTheBus

Earlier this year, Turing Award-winning computer scientist Jim Gray set out on a short sailing trip. He did not return. As search parties tried unsuccessfully to find him, the effort was taken online.

News spread like wildfire, with some of the West Coast’s smartest people aiding the effort. At the request of Google and NASA, an imaging satellite was directed over the Pacific route Gray had been sailing. Time was of the essence, and the satellite produced thousands of images. With these two issues in mind, Amazon offered their form of help: they put the data on the Turk.

Amazon Mechanical Turk is a service launched in 2005 as a solution for tasks too difficult or impossible for computers to achieve. Billed as “Artificial Artificial Intelligence”, the Turk offers an infrastructure for paid crowdsourcing. A requester is given a simple interface for splitting work into smaller tasks, called HITs. The HITs are given a value and posted on the site, where the Turk’s userbase can accept or ignore them. Successful completion gives the user monetary credit, which can then be converted to Amazon credit or simply transferred to a bank account.

In the search for Jim Gray, volunteer Turk workers examined images from the satellite. They were given a reference of what his yacht would look like and, photo by photo, they marked whether anything unusual appeared. Each photo was assessed a number of times.
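
Sketching that workflow (the batch size, number of reviews, and agreement threshold below are assumptions, not anything Amazon or the searchers published): the frames are bundled into HITs that several workers each review, and only the frames that multiple workers independently mark as unusual get passed on for follow-up.

```python
def build_hits(photo_ids, photos_per_hit=10, assignments_per_hit=5):
    """Bundle satellite frames into HITs, each to be reviewed by several workers."""
    return [{'photos': photo_ids[i:i + photos_per_hit],
             'assignments': assignments_per_hit}
            for i in range(0, len(photo_ids), photos_per_hit)]

def flag_candidates(responses, min_agreement=3):
    """Keep only the frames that several workers independently marked as unusual.

    `responses` is a list of (worker_id, photo_id, saw_something) tuples
    collected from the completed assignments.
    """
    counts = {}
    for _, photo_id, saw_something in responses:
        if saw_something:
            counts[photo_id] = counts.get(photo_id, 0) + 1
    return [photo_id for photo_id, n in counts.items() if n >= min_agreement]
```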

Photo-semantics analysis is a prime use for crowdsourcing. Since its inception, Mechanical Turk has hosted many similarly clever projects. Computers have limits, and the theory under which Mechanical Turk was conceived, to create a human-intelligence-powered computer, has been the guiding light of its most successful projects. I’ll revisit some of the best examples in the future. Of course, the service has also been exploited for less than saintly purposes, something that I will also return to.

Unfortunately, the Jim Gray story did not yield a happy ending. Yet technology’s ability to mobilize such a large group of volunteers, eager to help but unable to join the physical search, makes our future seem all the more hopeful.

For more information on Amazon Mechanical Turk, see Mechanical Turk on Wikipedia and the Mechanical Turk FAQ page.