
I’ve been working on a chapter and I’ve let this blog slide. I just wanted to quickly mention something I noticed while checking in on the evolution of the Wikipedia article on the unfolding situation in Iran.

Wikipedia, though built on a philosophy of radical openness and communication, has had to set limits on who can and cannot edit the site as the project has grown to be a larger and more tempting target for vandalism. An exchange between two editors discussing the article on the current protests in Iran suggests that these limits may shape how events are reported on Wikipedia.

The first editor requests that the article be placed under what is called “semi-protection,” in which only users with accounts on Wikipedia (and only accounts that have what is called “auto-confirmed” status, meaning they are not brand new and have met certain other basic criteria for trustworthiness) can edit the page. A second editor objects to this request, writing:

I don’t see that much vandalism currently, and we should know full well that the only people who can actually speak Farsi or know anything about what is going on are going to be IPs and new accounts, if they can get past Wikipedia restrictions on open proxies and TOR sites that is. So let’s not protect this at all.

This second editor argues that those most qualified to edit this article, presumably Iranian nationals (currently under-represented on Wikipedia, which tends to skew western and white like many net projects), might be blocked from doing so by semi-protection. Furthermore, s/he points out that some with useful information to add to the article might be prevented from contributing by “Wikipedia restrictions on open proxies and TOR sites.” That is, by restrictions that prevent Wikipedia from being edited using techniques that conceal one’s actual IP address. Wikipedia restricts editing by those using these techniques since they could be used by vandals to evade efforts to block their attempts to damage the site. However, as this editor points out, these techniques might be the only means available for Iranian dissidents to evade government censorship and contact the outside world. Thus Wikipedia’s defenses against vandals might inadvertently silence their voices as well.

Not an indictment of Wikipedia, by any means, but something to think about.

under: Uncategorized

Tasty Theory Clusters!

Posted by: | March 26, 2009 | 15 Comments |

I know some of you Lit-Theory heads out there are hungry for some name-droppy goodness after my last couple of mostly-data posts. Here are two subsections from my theory-frame chapter, where I try to lay out theoretical justification for my research questions:

4.1 The Answers of Cultural Studies and Peer Production to the Problem of Property

In their efforts to confront the perceived failures of both industrial capitalism and state socialism, Peer Production and Cultural Studies must each confront the issue of private property. Both Peer Production and Cultural Studies have often been critical of private property regimes and the inequalities of power and privilege these regimes can serve to create. They differ, however, in at least one crucial respect: Peer Production has, for the most part, carefully limited its critique of private property to intellectual property systems, whereas Cultural Studies has tended to mount a broader assault on the notion of private property. This contrast raises important questions for this dissertation to investigate.

It is not difficult to find Peer Production theorists drawing a “bright line,” to use the legal term, between the intellectual property regimes they wish to modify or overthrow and the physical property regimes they wish to leave intact. In Code and Other Laws of Cyberspace, Lessig makes the case that physical and intellectual property are fundamentally different:

The law has a good reason, then, to give me an exclusive right over my personal and real property. If it did not, I would have little reason to work to produce it. Or if I did work to produce it, I would then spend a great deal of my time trying to keep you away. It is better for everyone, the argument goes, if I have an exclusive right to my (rightly acquired) property, because then I have an incentive to produce it and not waste all my time trying to defend it.

Things are different with intellectual property. If you “take” my idea, I still have it. If I tell you an idea, you have not deprived me of it. An unavoidable feature of intellectual property is that its consumption, as the economists like to put it, is “non-rivalrous.” Your consumption does not lessen mine.

(Lessig, 1999, p 131)

The special qualities of intellectual property, Lessig goes on to argue, are what warrant treating it differently from physical property. Thus, when Lessig argues that intellectual property should be less strictly protected in favor of nurturing an informational commons, his arguments should be understood to apply solely to intellectual property; they are not, in Lessig’s understanding, to be expanded to apply to property in general.

Yochai Benkler, at the opening of his Wealth of Networks, draws similar limits around the scope of his theory of “social production.” In the industrial age, he writes, societies were forced to make difficult choices between different priorities because of the hard limits of physical property. The different qualities of information and the information-based economy are what may allow some of these limits to be surpassed.

Predictions of how well we will be able to feed ourselves are always an important consideration in thinking about whether, for example, to democratize wheat production or make it more egalitarian. Efforts to push workplace democracy have also often floundered on the shoals – real or imagined – of these limits, as have many plans for redistribution in the name of social justice. Market-based, proprietary production has often seemed simply too productive to tinker with. The emergence of the networked information economy promises to expand the horizons of the feasible in political imagination.

(Benkler, 2006, p 8)

Later, Benkler more concretely limits the scope of social production, writing:

There are no noncommercial automobile manufacturers. There are no volunteer steel foundries. You would never choose to have your primary source of bread depend on voluntary contributions from others. Nevertheless, scientists working at noncommercial research institutes funded by nonprofit educational institutions and government grants produce most of our basic science. Widespread cooperative networks of volunteers write the software and standards that run most of the Internet and enable what we do with it. […] What is it about information that explains this difference?

(Benkler, 2006, p 35)

What explains the difference, Benkler goes on to tell us, are exactly the qualities of information that Lessig identifies in his work. Benkler and Lessig, along with other theorists of Peer Production (Boyle, 1996; FIND MORE), draw the same bright line around their critique of private property. The digital commons, for these thinkers, is to be for information only; physical goods should be left to the property regimes that, in their view, have served well to distribute goods and give incentives for production.

Cultural Studies, on the other hand, tends to draw on its Marxist roots to make a more sweeping criticism of private property systems, both intellectual and physical. Take, for example, this passage from Negri and Hardt, where they suggest that the rupture opened up by the difficulty of regulating intellectual property might be fruitfully extended to undermine other private property regimes.

It seems to us that, in fact, today we participate in a more radical and profound communality than has ever been experienced in the history of capitalism. The fact is that we participate in a productive world made up of communication and social networks, interactive services, and common languages. Our economic and social reality is defined less by the material objects that are made and consumed than by co-produced service and relationships. Producing increasingly means constructing cooperation and communicative commonalities.

The concept of private property itself, understood as the exclusive right to use a good and dispose of all wealth that derives from it, becomes increasingly nonsensical in this new situation. There are ever fewer goods that can be possessed and used exclusively in this framework; it is the community that produces and that, while producing, is reproduced and redefined. The foundation of the classic modern conception of private property is thus to a certain extent dissolved in the postmodern mode of production.

One should object, however, that this new social condition of production has not at all weakened the juridical and political regimes of private property. The conceptual crisis of private property does not become a crisis in practice, and instead the regime of private expropriation has tended to be applied universally. This objection would be valid if not for the fact that, in the context of linguistic and cooperative production, labor and the common property tend to overlap. Private property, despite its juridical powers, cannot help becoming an ever more abstract and transcendental concept and thus ever more detached from reality.

(Negri and Hardt, 2000, p 302)

This notion, that the instability of intellectual property does not simply call for information to be treated differently, but should rather lead to a destabilization of all forms of private property is echoed by many authors within the field of Cultural Studies (Wark 2004; Terranova 2004; Ross 2006; Dyer-Witheford, 1999).

In most cases, this desire for a broader destabilization of private property relations seems to stem from Cultural Studies authors’ understanding of the relations of production engendered by private property as inherently unjust and exploitative. Ross, for example, argues that, by limiting their critique of property to intellectual property, those involved in the “copyfight” movement to restrict the scope of copyright law ignore the possible effects of privately owned physical infrastructure on the digital commons, as well as the needs of the physical laborers involved in the production and maintenance of this infrastructure. He writes, “Because they are generally ill-disposed to state intervention, FLOSS [for Free, Libre and Open Source Software, yet another name given to FOSS] engineers, programmers, and their advocates have not explored ways of providing a sustainable infrastructure for the gift economy they tend to uphold. Nor have they made it a priority to speak to the interests of less-skilled workers who lie outside of their ranks.” (Ross, 2006, p 747)

Ross’s critique complicates the ethics of Peer Production in an important way, highlighting how it may be blind to some of the ways its vision of a better society enabled by shared information resources could be limited by leaving systems of physical property intact. It is also important to point out two things which, in turn, complicate Ross’s critique. First is the fact that, under some circumstances, some Peer Production theorists have made the case for government intervention in the realm of building the physical infrastructure of the information society. For example, Yochai Benkler recently penned an editorial for the left-leaning blog Talking Points Memo in which he urges the Obama administration to use government funds to subsidize the construction of greater broadband internet capacity (Benkler, 2009). Second, some who have studied the Peer Production phenomenon have made the case that those involved in the Peer Production movement may limit the scope of their political arguments, not out of an ideological aversion to the state, but out of a pragmatic desire to build new working coalitions across old ideological boundaries. Gabriella Coleman writes that her study of the Debian Linux community suggests that, “Most hackers, however, recognize that since each developer has personal opinions about politics, it behooves them not to attribute a universal political message to their work as this may lead to unnecessary project strife and interfere with the task at hand: the production of superior software.” (Coleman, 2005, p 8)

This dissertation will attempt to answer several questions about how this tension between the possible dangers of limiting the critique of private property to intellectual property and the possible dangers of expanding this critique plays out in the Peer Production communities of the Wikimedia constellation.

– What effects, if any, does the private ownership of the physical infrastructure of information production have on the Wikimedia constellation?

– Do we see evidence of what Nick Dyer-Witheford calls “divisions between immaterial, material and immiserated labor” within the Peer Production community of the Wikimedia constellation? That is, how well or poorly are the interests of those not involved in the production of intellectual property represented?

4.2 Identity, Difference and Peer Production. What are the politics of the Peer Production Heterotopia?

Michel Foucault’s notion of the heterotopia is one of the many theoretical resources he developed that Cultural Studies scholars have found widely useful. The heterotopia is a sort of special social space in which ordinary social relationships may be suspended and new possibilities explored. Heterotopias are distinct from utopias, which are wholly imaginary, in that they are actual, realized social spaces. As Foucault puts it:

There are also, and probably in every culture, in every civilization, real places, actual places, places that are designed into every institution of society, which are sorts of actually realized utopias in which the real emplacements, all the other real emplacements that can be found within the culture are, at the same time, represented, contested, and reversed, sorts of places that are outside all places, although they are actually localizable.

(Foucault, 1984, p 178)

Two things are important to note in this definition. First is the way that the heterotopia provides a sort of mirror for society, a place in which all the other “emplacements,” a word Foucault here defines as a social entity “defined by the relation of proximity between points or elements” (Foucault, 1984, p 176)[1], of a given society can be “represented, contested and reversed.” Second is the fact that Foucault stresses that heterotopias, unlike their utopian counterparts, are “actually localizable.” That is to say, a heterotopia must be embedded within, and thus structured by, the society it reflects.

Foucault goes on to list a variety of examples of the heterotopia. In the modern era, he writes, “museums and libraries are heterotopias in which time never ceases to pile up and perch on its own summit, whereas in the seventeenth century, and up to the end of the seventeenth century still, museums and libraries were the expression of an individual choice. By contrast, the idea of accumulating everything, the idea of constituting a sort of general archive, the desire to contain all times, all ages, all forms, all tastes in one place […] all of this belongs to our modernity.” (Foucault, 1984, p 182) This goal of creating a “general archive” can also be seen reflected in Wikipedia and the Wikimedia constellation. The home page of the Wikimedia Foundation tells visitors that the Foundation is committed to “a world in which every single human being can freely share in the sum of all knowledge.” (Wikimedia Foundation, 2009)

Furthermore, Foucault lists several principles that define what he believes to be the most important qualities of heterotopias. Among these, he writes that “the heterotopia has the ability to juxtapose in a single real place several emplacements that are incompatible in themselves.” (Foucault, 1984, 181) This ability of heterotopias to bring together otherwise incompatible, or even hostile, elements has been seized on by other Cultural Studies scholars as a politically interesting feature of the heterotopia. Music scholar Josh Kun has written about the musical form of the heterotopia, what he calls “audiotopias.” These audiotopias are “sonic spaces of affective utopian longings where several sites normally deemed incompatible are brought together not only in the space of a particular piece of music itself, but in the production of social space and mapping of geographical space that makes music possible.” (Kun qtd in Gopinath, 2005, p 42) Gayatri Gopinath expands this idea in turn, arguing that, by permitting the temporary peaceful coexistence of otherwise incompatible social groups, audiotopias have played an important role in creating spaces of relative freedom for some queer members of the South Asian diaspora. Gopinath describes an “outdoor Summerstage concert in New York City’s Central Park in July 1999,” where

the female Sufi devotional singer Abida Parveen’s powerful stage presence delighted a large, predominately South Asian crowd that, for a brief moment, reterritorialized Central Park into a vibrant space of South Asian public culture. While many of the women in the audience remained seated as Parveen’s voice soared to ever greater heights of ecstasy and devotion, throngs of mostly working-class, young and middle-aged South Asian Muslim men crowded around the stage, singing out lyrics in response to Parveen’s cues, their arms aloft, dancing joyously arm in arm and in large groups. Sufism, a form of Islamic mysticism in which music plays a central role in enabling the individual to commune with the divine, has a long history of homoerotic imagery in its music and poetry. The sanctioned homosociality/homoeroticism of the Qawaali space in effect enabled a group of men from the South Asian Lesbian and Gay Association, gay-identified South Asian men, to dance together with abandon; indeed they were indistinguishable from the hundreds of men surrounding them. […] They were thus producing a “queer audiotopia” to extend Josh Kun’s notion of an audiotopia, in that they were conjuring forth a queer sonic landscape and community of sound that remapped Central Park into a space of queer public culture, the locus of gay male diasporic desire and pleasure.

(Gopinath, 2005, p 59)

Gopinath here demonstrates the political potential of the audiotopia, and thus of the heterotopia more generally. By permitting the coexistence of things that would otherwise be difficult to mix, the heterotopia allowed for a traditionally less-powerful queer community to “poach” a public space for itself.

Yochai Benkler has suggested that Wikipedia may itself be a space in which the historically less powerful may have an opportunity to express themselves in the same space as the historically powerful. Benkler points out that, unlike many other, more commercially oriented encyclopedias, Wikipedia permits the coexistence of a variety of points of view. By way of example, he compares the articles covering Mattel’s Barbie doll in commercial encyclopedias and Wikipedia. With the exception of the Encyclopedia Britannica, Benkler finds that the commercial encyclopedias tend to provide a one-sided view of Barbie, stressing the commercial success of the doll and the diversity of her wardrobe. (Benkler, 2006, p 287-288) While the best of these encyclopedias, “includes bibliographic references to critical works about Barbie,” Benkler writes, “the textual references to cultural critique or problems she raises are very slight and quite oblique.” (Benkler, 2006, p 288)

In contrast, both Britannica and Wikipedia provide a more complete examination of the range of different opinions regarding Barbie’s cultural significance and function. Britannica, Benkler tells us, provides “a tightly written piece that underscores the critique of Barbie, both on body dimensions and its relationship to the body image of girls, and excessive consumerism.” “It also,” Benkler continues, “makes clear the fact that Barbie was the first doll to give girls a play image that was not focused on nurturing and family roles, but was an independent professional adult […]. The article also provides brief references to the role of Barbie in a global market economy.” (Benkler, 2006, p 288) Wikipedia, for its part, contains an article on Barbie that, “provides more or less all the information provided in the Britannica definition […] and adds substantially more material from within Barbie lore itself and a detailed time line of the doll’s history.” (Benkler, 2006, p 288) Benkler goes on to demonstrate how Wikipedia not only presents readers with a diversity of opinion on Barbie, but also provides a space for many different authors to contribute to the Barbie article. The various interactions and negotiations between these authors are preserved on special pages attached to the article known as history and talk pages (more on these later). Benkler argues that, by permitting this ongoing negotiation and making it available to readers, Wikipedia embodies a “social conversation” model of cultural production that can help to ameliorate the ability of corporate interests to force other voices out of commercial Mass Media. (Benkler, 2006, p 289-290)

Thus, Benkler’s case for the benefits of “social conversation” cultural production mirrors the political possibilities of the heterotopia as developed by Kun and Gopinath. Working from this point of contact, this dissertation will attempt to examine the politics of the heterotopia as they actually play out in the Wikimedia constellation. Foucault argued that the heterotopia was defined by being “actually localizable,” which I take to mean that the heterotopia, unlike the utopia, is not wholly free to suspend power relations, but rather can rearrange them only partially, contingently, and within limits. Gopinath’s example shows this clearly. The public space “poached” by queer subjects at the concert she describes is temporary, limited, and by definition invisible. To attempt to better understand what the limits of heterotopian political possibility might be within the Wikimedia constellation, this dissertation will attempt to answer the following questions.

– In what ways can we observe “otherwise incompatible” forms of expression, politics, and identity co-existing within the Wikimedia constellation?

– What are the terms of this coexistence? To what extent are historical power relationships actually suspended within the space of the Wikimedia constellation and how is any such suspension achieved?

[1] A notion not unlike Latour’s “network.”

under: Diss Fragments

Besides Speedy Deletion and Proposed Deletion, there is a third, even more deliberative and time-intensive process for removing Wikipedia articles. This third and most formal deletion process involves the article being Nominated for Deletion, at which point a section will be created for the article on the page listing articles being considered for deletion (AfD – Articles for Deletion). On the AfD page, editors will debate the relative merits of either keeping the article on Wikipedia or deleting it. This debate is often extensive and can be dense with Wikipedia shorthand, jargon, and abbreviated references to Wikipedia policy pages. This debate, Wikipedia policy tells us, is an attempt to reach “rough consensus” rather than a “majority vote.” (Wikipedia: Guide to deletion) In practice, this means that administrators have considerable latitude to consider the relative merits of arguments made for or against deletion by various editors, rather than being bound to simply follow majority opinion. When an administrator determines that consensus has been reached, he or she will “close” debate on the AfD page as either “Keep” (to retain the article) or “Delete” (to remove it), ending debate. This administrator will then go on to carry out deletion of the page if necessary.

Google and other search engines clearly play an explicit role in many debates on the AfD page. Of the roughly 30 articles that were Nominated for Deletion in my sample from October 1 and 2, 2008, 8 have AfD entries in which one or more editors cite Google or other search engine results as evidence for either retaining or deleting an article. In several cases, editors say they make use of Google Scholar, Google Books, Google News and other specialized search products in an attempt to find sources and either establish or discount notability for the subject of the article in question. In two cases, the shorthand “ghits” for “google hits” is used by editors.

Since AfD discussions are retained on Wikipedia indefinitely, the AfD process provides a rich supply of data with which I can easily expand my original sample to more extensively study the role of Google in AfD debates. The complete listing of all of the Articles listed for Deletion on October 1, 2008 gives 108 articles that were Nominated for Deletion on that date. (Wikipedia:Articles for deletion/Log/2008 October 1) Of the 108 AfD debates listed, 35 include comments by editors that explicitly reference the use of Google search products as a means of establishing whether or not a given page should be deleted. Interestingly, only 5 AfD debates appear to include discussion of search testing that does not mention Google products, either by name or using the “ghits” shorthand. Looking at the AfD listings for the first days of the remaining two months of 2008, November and December, suggests that the October 1 listing is fairly normal for this time period on Wikipedia. The November 1, 2008 and December 1, 2008 AfD listings give 99 AfD debates with 19 explicitly citing Google products and 119 AfD debates with 29 explicitly citing Google products, respectively. (Wikipedia:Articles for deletion/Log/2008 November 1, Wikipedia:Articles for deletion/Log/2008 December 1) In both cases, only a handful of debates discuss search tests without invoking Google at some point.

These numbers would certainly seem to support the broad assertion that Google plays an important role in deciding if material on Wikipedia will be deleted or retained. About 1 in 4 of the 326 total AfD discussions listed on these 3 days had at least one editor invoke Google as evidence. To get a better idea of exactly what role Google was playing in these debates, I closely examined the arguments made by editors in the 35 AfD debates that explicitly invoked Google products in the October 1, 2008 AfD listing. Doing so demonstrates that, while Google clearly is an important element in the AfD process, editors are not simply counting Google hits to determine whether or not a given subject “exists”; rather, editors engage in a relatively sophisticated process of reading and interpreting the output of Google searches in the process of making their decisions.
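The arithmetic behind the “1 in 4” figure can be checked directly; a minimal sketch, using only the per-day counts cited above:

```python
# Per-day AfD counts cited in the text: total debates logged that day,
# and how many of those debates explicitly invoked Google products.
afd_logs = {
    "2008-10-01": {"debates": 108, "cite_google": 35},
    "2008-11-01": {"debates": 99, "cite_google": 19},
    "2008-12-01": {"debates": 119, "cite_google": 29},
}

total_debates = sum(day["debates"] for day in afd_logs.values())
total_google = sum(day["cite_google"] for day in afd_logs.values())
share = total_google / total_debates

print(f"{total_google} of {total_debates} debates cite Google ({share:.1%})")
# 83 of 326 debates cite Google (25.5%)
```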

Wikipedia editors may be guided in this process, at least in part, by a section of a Wikipedia essay (NOTE TO TORI: Wikipedia “essays” are on site commentary pieces written by editors that attempt to provide advice on best practices for editing articles and maintaining the site. I should probably define all terms like this in my introduction, perhaps when I am discussing data sources and methodology.) entitled, “Arguments to avoid in deletion discussions.” (Wikipedia:Arguments to avoid in deletion discussions) One Wikipedia editor involved in the AfD discussions of October 1, 2008 explicitly references the subsection of this essay which provides guidance on using Google to provide evidence for AfD debates. This section, which is subtitled “Google test,” provides Wikipedia editors with the following advice:

Although using a search engine like Google can be useful in determining how common or well-known a particular topic is, a large number of hits on a search engine is no guarantee that the subject is suitable for inclusion in Wikipedia. Similarly, a lack of search engine hits may only indicate that the topic is highly specialized or not generally sourceable via the internet. One would not expect to find thousands of hits on an ancient Estonian god. The search-engine test may, however, be useful as a negative test of popular culture topics which one would expect to see sourced via the Internet. A search on an alleged “Internet meme” that returns only one or two distinct sources is a reasonable indication that the topic is not as notable as has been claimed.

Overall, the quality of the search engine results matters more than the raw number. A more detailed description of the problems that can be encountered using a search engine to determine suitability can be found here: Wikipedia:Search engine test.

Note further that searches using Google’s specialty tools, such as Google Book Search, Google Scholar, and Google News are more likely to return reliable sources that can be useful in improving articles than the default Google web search.

(Wikipedia: Arguments to avoid in deletion discussions)

Several things are significant about this language. First, the simple presence of a sub-section within this essay devoted solely to the “Google test” speaks to how important and prominent a tool Google is for Wikipedia editors. Not only is Google the only search engine mentioned by name in “Arguments to avoid in deletion discussions,” it is the only source mentioned by name. The language provided here endorsing the use of “Google’s specialty tools” can only serve to increase the influence of services like Google Books and Google Scholar within Wikipedia.

Second, “Arguments to avoid in deletion discussions” includes specific language attempting to dissuade Wikipedia editors from using raw Google hits as a means of establishing whether or not a given article should be retained on Wikipedia. Of course, this does not mean that editors never invoke Google hits in deletion debates. For example, an editor calling for the deletion of an article on “Magic Bars” writes, “No sources to indicate notability. All I could find on Google News were articles about bars where magicians work.” (Wikipedia:Articles for deletion/Log/2008 October 1) In another example, an editor suggested that an article on “cat repellers” be renamed to “cat repellants” as this term, “gets more G[oogle] hits.” (Wikipedia:Articles for deletion/Log/2008 October 1) However, the discussions often indicate that these arguments are not solely reliant on Google to determine the worthiness of a given article, but rather are connected to doubts editors have based on the text of the article itself. In the case of “magic bars,” the discussion indicates that the article was a recipe for a sort of food, raising doubts among Wikipedia editors who see recipes as outside the scope of Wikipedia’s stated goal of encyclopedic knowledge production. In another case, an article on “Maxbashing” was nominated for deletion on the grounds of, “Fails Notability, Google yields few results. Written more as an advertisement rather than a substantial encyclopedic article.” (Wikipedia:Articles for deletion/Log/2008 October 1) Here the article’s advertisement-like tone (another editor calls the article “self-promotion”) is cited as an important consideration, along with the lack of Google results.

Furthermore, the guidelines provided in “Arguments to avoid in deletion discussions” do a fairly good job of noting Google’s biases (especially its propensity to give greater weight to recent popular culture) and advising editors where hit counting may or may not be useful. The discussions on the AfD list for October 1, 2008 suggest that editors are taking this guidance under consideration. Many of the AfD entries in which the raw number of Google results for a given subject is advanced as an argument for either deleting or retaining an article involve subjects drawn from recent popular culture, especially living artists and recently released or upcoming works of popular culture. In one particularly interesting example, an editor arguing for the deletion of an article on the Harvard University “Bionumbers” project writes that the article should be removed, “or now. It appears to be a legit project run by a Harvard lab […] and it seems to be creating some sort of a buzz based on plain google search results […]. But as I understand it, the project is very new and was started in the Spring 2008. A more careful look at the google search result show that there is no sibstantial [sic] coverage yet by reliable sources.” (Wikipedia:Articles for deletion/Log/2008 October 1) Here, a Wikipedia editor is clearly arguing that raw Google hits may be unduly influenced by “buzz” about a very recent topic, and that this bias should be corrected for by a close reading of the search results.

There is one more interesting facet of the “Google test” language included in “Arguments to avoid in deletion discussions.” This language is included in a section on “Notability fallacies,” further demonstrating the link between search engines (especially Google) and Wikipedia editors’ perceived need to establish that subjects are notable enough to warrant inclusion in an encyclopedia. This link is also reinforced by many discussions on the October 1, 2008 AfD list, which draw upon Google while engaging with the issue of notability. In fact, of the 35 discussions on the October 1, 2008 AfD page that explicitly reference Google, only 4 do not center around a debate over the subject’s notability.

These deletion discussions also give important clues as to what might be driving Wikipedia editors’ perceived need to establish that article subjects are sufficiently notable to merit inclusion in Wikipedia. After all, as Wikipedia itself notes, “Wiki is not paper”; that is to say, the effective limits on what may be stored in Wikipedia are far less pressing than those of a traditional paper encyclopedia. Unlike a paper encyclopedia, Wikipedia need not worry about printing costs, how much space it takes up on a shelf, or the ability of readers to find information with only an index of subjects. Instead, Wikipedia is distributed through inexpensive digital means, and readers can use internal and external hyperlinks, Wikipedia’s internal search engine, and the services of external search engines (like Google) to lead them to the information they need. Wikipedia’s official policy, of course, limits this theoretically endless ability to collect and organize information; in the typically flippant prose of Wikipedia policy, “Wikipedia is not an indiscriminate collection of information” (Wikipedia:What Wikipedia is not). The general guideline is that information on Wikipedia should be “encyclopedic,” but since Wikipedia has already expanded to cover many topics ignored by traditional encyclopedias (such as the pages devoted to the major characters from the popular cartoon “Transformers”), clearly the meaning of what is and is not “encyclopedic” is constantly being re-negotiated by Wikipedia editors.

The contents of deletion discussions suggest that Wikipedia editors may be driven to police Wikipedia articles on the grounds of notability by their desire to establish and maintain Wikipedia’s status as a reliable and accurate source of information. While they often cite notability concerns in deleting articles on subjects without a significant presence in reliable, third-party news publications, editors are clearly also worried that such articles may be outright hoaxes. In one example, an editor arguing for the deletion of an article on a movie entitled “Tattoos: A Scattered History” writes that the entries on this movie on the IMDB (Internet Movie Database) should not be counted as establishing the movie’s notability since,

Anyone can add anything they want for IMDB. Someone once wrote that Saw IV would star Jessica Alba and feature Jigsaw’s baby. That stayed up there for at least a week. If anything, it’s worse than Wikipedia as it’s a lot easier to remove false information from Wikipedia than it is for IMDB. On another note, having an IMDB entry doesn’t equal notability…I can think of a lot of IMDB entries that if they were to become articles on Wikipedia they would fail an AFD.

(Wikipedia:Articles for deletion/Log/2008 October 1)

While the nomination for deletion of this article cited concerns about notability, the editor above shows how these concerns connect to editors’ attempts to prevent false information from remaining on Wikipedia. Notability becomes a means for editors to remove suspected falsehoods without violating Wikipedia’s central “Neutral Point of View” policy, which holds that Wikipedia does not, by definition, present “one point of view as ‘the truth'” (Wikipedia: Five pillars). Instead of asserting that a given article is “false,” editors may instead assert that it is simply not notable by virtue of its lacking a presence in large mainstream media sources.

In addition, editors seem to be particularly concerned with preventing Wikipedia from becoming a space in which small artists, businesspeople, and others promote their own projects using information that cannot be confirmed in reliable sources. Several debates on the October 1, 2008 AfD list use the term “Myspace musician” as a derogatory term for a musician who lacks recognition outside of self-promotional material posted on social-networking sites like Myspace. One such case of self-promotional media being discounted is that of the article on “Carlos Sepuluveda,” which was deleted after being nominated on the grounds that, “Clearly non-notable, as a google search turns up nothing aside from Youtube/blogs/Myspace.” (Wikipedia:Articles for deletion/Log/2008 October 1) In another case, an editor writes that an article on the CRG West company should be retained despite the fact that “Much of the GoogleNews hits are press releases, which are not usually considered to count for notability,” since, “there are a few gems buried within.” (Wikipedia:Articles for deletion/Log/2008 October 1) In these examples, we see Wikipedia editors attempting to prevent companies from influencing their coverage on Wikipedia by issuing press releases, and artists from influencing their coverage by self-promotion.

The above cases demonstrate one technique used by Wikipedia editors to critically read Google results: they scan these results for patterns suggesting self-promotion, such as the disproportionate presence within the results of press releases, or of information sources like Myspace or IMDB where subjects may be posting information about themselves. This ability of Wikipedia editors to read, rather than simply count, Google search results tends to undermine some of Arno’s claims as to how Wikipedia editors use Google. In his article, he tells us that he intends to follow up on his original experiment, in which a hoax text was rejected from Wikipedia on the grounds that its subject lacked Google results. “Later on,” he writes, “I’ll try to create an other hoax. This time, I’ll make sure I use (fake) sources and there will be something about it to be found on Google. I have to use an other computer, Wikipedia files your IP-address.” (Arno, 2008) Apparently Arno believes such a hoax will have a better chance of being retained on Wikipedia. However, since Wikipedia editors tend to discount the sort of Google results Arno would be able to generate (message board pages, social networking pages, blogs), it seems likely that these Google results would be disregarded and the hoax article would again be deleted. It is perhaps telling that Arno has yet to publish the results of this follow-up.

The ability of Wikipedia editors to read Google results critically guards against those who would attempt to insert false information into Wikipedia, even if would-be hoaxers were equipped with the sort of motivated network that might succeed in generating Google results. It also guards against those who would attempt to crudely manipulate Wikipedia content for profit. Take, for example, Ron Goodden, who describes himself as an “Atlanta-based freelance copywriter and editor,” and advertises on his website that writing “custom Wikipedia articles” is one service he is able to provide (Goodden, 2008). He has also advertised this service on the classified-ads site Craigslist under the headline, “Let Me Put You In Wikipedia,” bragging, “as a long-time Wikipedia contributor I have been able to consistently help individuals, companies and organizations gain the recognition and advantage they deserve from this new-age encyclopedia – and usually within 48 hours of placing their order!” (Goodden, 2009) Goodden’s website provides a sample Wikipedia article he claims responsibility for. This article, which documents New York fashion designer Junko Yoshioka, is still available on Wikipedia, perhaps because it cites major publications such as People magazine and the Spanish newspaper El Mundo. (Junko Yoshioka) It is, however, as of this writing, less than three months old, and may yet be challenged on notability grounds. The user responsible for this article, presumably Mr. Goodden’s account, dates back to October 2005, but is responsible for creating only four articles, including the article on Ms. Yoshioka, and has edited only a handful of others, suggesting Mr. Goodden’s advertisement may overstate his abilities. (User Contributions for ChulaOne)

Mr. Goodden’s bravado aside, the presence of those like him, who might attempt to insert information into Wikipedia for profit, is of concern to Wikipedia editors. The ability of editors to critically read Google results guards against this sort of manipulation, since editors check for patterns in Google results, such as an overwhelming presence of press releases, promotional material, or material from blogs and other easily manipulated sites, that might indicate a subject (or someone hired by a subject) was attempting to use Wikipedia for promotional purposes. Unless such a promotional editor is able to produce reliable sources testifying to the subject’s significance, it does not seem likely that simply having prominent Google results would provide Wikipedia editors with sufficient evidence to retain an article.

However, while this protects Wikipedia from manipulation by for-profit and self-promoting editors, it also ironically ties Wikipedia to many of the traditional media organizations that it is often seen as being in opposition to. It is the presence or absence of these traditional media organizations within search results that often decides whether or not an article is deleted. For example, in successfully arguing for the retention of an article on the “Milwaukee Ale House,” one editor writes, “there is detailed coverage of the pub in several books. […] There is also substantial coverage in local newspaper, Milwaukee Journal Sentinel: 246 hits in googlenews [sic]” (Wikipedia:Articles for deletion/Log/2008 October 1) This reliance on traditional print media as a means for establishing notability is also seen in the widespread reliance on Google Books and Google Scholar within deletion debates, and the explicit approval given to these tools in the guidelines provided by “Arguments to avoid in deletion discussions.”

While this reliance on print media may make Wikipedia more reliable, it may also prevent it from properly assessing the notability of artists and works of art hailing from some subcultures. The best example of this found in the AfD list for October 1, 2008 is the article on a punk band known as “Bankrupt,” which was deleted after a long and passionate discussion. The editor nominating the article for deletion writes, “I still believe the article fails WP:MUSIC because almost all of the links provided are for very niche type sites. Also, none of the albums the band has released have articles. They are on a minor label and haven’t charted as far as I can see. I did a Google search but many were for entirely different bands called Bankrupt.” An editor arguing for retaining the article attempts to refute these accusations, writing:

Five new sources have been added, and quotes from reviews suggesting that the band also qualifies for notability criterion no.7. of WP:MUSIC

– 7. Has become the most prominent representative of a notable style or of the local scene of a city; besides – 1. It has been the subject of multiple non-trivial published works whose source is independent from the musician/ensemble itself and reliable

It is not stated here that the reference cannot be a “niche” publication. Several of these publications are considered as reliable sources in the punk community. Ox fanzine is the No.1 punk rock magazine of Germany.

I’ll create pages for the band’s albums. They may be on a minor label, but their recent releases are available worldwide on iTunes and Amazon.

Please note that a band is notable if it meets ANY of the notability criteria, therefore charting is not an obligatory criterion.

Regarding your argument of Google search: please do a search on last.fm. The only band called Bankrupt that comes up with over 15,000 listeners is this one. You can also search MySpace for Bankrupt for similar results.

(Wikipedia:Articles for deletion/Log/2008 October 1)

These arguments, however, fail to convince the other editors, who continue to contend that these sources are unreliable, even after the above editor attempts to explain that, “Ox Fanzine has published 80 issues since 1988, and is the largest punk rock fanzine in Germany. Moloko Plus is another major German punk rock fanzine with over 30 issues released. Distorted Magazine from the UK is a very unique flash-based online magazine with over 20 issues published. Est.hu is a major Hungarian entertainment portal. Southspace.de, thepunksite.com, and kvakpunkrock.cz are all punk music portals with hundreds of reviews published, and having a significant readership. Also, Left Of The Dial (USA) was originally a respected print zine, before the author decided to go on as a blog.” (Wikipedia:Articles for deletion/Log/2008 October 1) One editor arguing for deletion writes, “Running a quick google search turns up the classic ‘myspace own website irrelevant’ results,” suggesting that perhaps the same critical Google reading skills used by Wikipedia editors to prevent for-profit and self-promotional articles are here being used to discount sub-cultural sources in the absence of mainstream media acceptance. Ultimately, the article is deleted.

under: Diss Fragments

Speedy Deletion is, as I have already mentioned, not the only way to remove an article from Wikipedia. There also exist two other deletion procedures, both of which are designed to allow for a more deliberative deletion process. One of these processes is known as “Proposed Deletion” and usually abbreviated onsite, in the way so many things in the Wikipedia world are, as PROD. In the Proposed Deletion process, articles that editors believe should be deleted are “tagged” with a special template (that is to say, the template code is added to the article). This template displays a large, red warning box at the head of the article and provides space for the editor proposing deletion to list his or her concerns with the article, as well as information about the deletion process. If another editor objects to the article being deleted, he or she may remove the template. The template asks that an objecting editor also document his or her concerns on the article’s talk page, though in practice this does not always happen. If the template remains on the article for five days without being removed, any administrator may then delete the article.

Of the new articles created on October 1-2, 2008, I was able to find evidence that 19 were involved in the Proposed Deletion process at some point. Of these, one article showed direct evidence of having been deleted based on evidence provided by a search test. Five others listed “notability” concerns as at least part of the reasoning behind the proposal for deletion. However, the quantitative data here are probably not very good. Deletion Proposals may be added to an article at any time and, if the proposed deletion is contested, later removed without leaving any evidence that could be found without an exhaustive review of the article history. I was not able to conduct such investigations on the articles captured for my data blog. Therefore, it is possible that other articles in the data set may have been marked as proposed for deletion without my knowledge. Furthermore, not every article deleted after a proposed deletion lists the concerns that initially prompted the deletion proposal in the deletion logs, and thus information on these concerns was lost when the page was deleted.

These difficulties aside, the qualitative data provided by the deletion concerns I was able to recover are interesting. The one article proposed for deletion that shows evidence of search test use in the comments recorded in the deletion log suggests that a variety of non-Google specialized search engines may be employed by Wikipedia editors for search tests. The comments section of the entry in the deletion log for the article titled “Count von count band” reads: “WP:PROD, reason was ‘Band with three EPs and no other claim in article of meeting WP:MUSIC. No hits at metacritic; no listing at allmusic.’.” (http://en.wikipedia.org/w/index.php?title=Special:Log&page=Count+von+count+band) This comment states that this article on a musical group was deleted based on a proposed deletion (WP:PROD, for Wikipedia Policy: Proposed Deletion) because an editor felt it did not satisfy Wikipedia’s guidelines for what constitutes a “notable” musical artist or group (the shorthand for this is WP:MUSIC). Among the arguments forwarded to support this is a search test: specifically, the editor notes that this band name does not yield results when he or she searches the specialized Metacritic and Allmusic search engines, which are dedicated to tracking large databases of information regarding popular music. This demonstrates that Google is not the only search tool utilized by Wikipedia editors. It also demonstrates that Google is not the only large corporate player providing tools that Wikipedia editors find valuable, as Metacritic and Allmusic are owned by CBS Interactive and the Macrovision Corporation, respectively.

To expand on the limited data on the proposed deletion process provided by my initial examination of new articles created on October 1-2, 2008, I examined articles listed as being subject to a Proposed Deletion on March 5, 2009. Articles that have the Proposed Deletion template added to them are placed in a Wikipedia category, gathering them on a single page for easy inspection, at least until the template is removed or the page is deleted. When I first accessed the page, at 19:40 UTC on March 5, 2009, 73 articles were listed as proposed for deletion. Of these, six possessed Proposed Deletion tags that listed a lack of search results as part of the reason the article should be considered for deletion. Four of the six indicated that Google was the search engine employed in the search test; two of these provided links to the relevant Google results so other editors could confirm the findings. One editor went to the trouble of providing links not only to the standard Google search results, but also to the results of Google News, Books, and Scholar searches.
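The tallying described above can be pictured as a simple coding pass over PROD rationales. The sketch below is illustrative only: the keyword lists are my assumptions, and the counts in this study were produced by manual reading, not by any script.

```python
# Illustrative sketch: code each PROD rationale for two themes,
# search-test use and notability concerns. The keyword lists are
# assumptions for demonstration, not the actual coding scheme used.
SEARCH_KEYWORDS = ("google", "ghits", "gnews", "search result")
NOTABILITY_KEYWORDS = ("notab", "wp:n", "wp:music", "wp:bio")

def code_rationale(text):
    """Flag whether a deletion rationale mentions a search test
    and/or a notability concern."""
    t = text.lower()
    return {
        "search_test": any(k in t for k in SEARCH_KEYWORDS),
        "notability": any(k in t for k in NOTABILITY_KEYWORDS),
    }

def tally(rationales):
    """Count how many rationales hit each theme."""
    counts = {"search_test": 0, "notability": 0}
    for rationale in rationales:
        for theme, hit in code_rationale(rationale).items():
            counts[theme] += hit
    return counts
```

Running `tally` over a batch of rationale strings yields theme counts of the kind reported here (e.g. six of 73 tags citing search results).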

Furthermore, these six articles provide more evidence that search testing plays a key role in establishing that a topic has sufficient “notability” to be the subject of a Wikipedia article. In five of the six cases in which editors proposed deletion for articles because of a lack of search results, they explicitly stated they were concerned that the subjects of these articles were not notable. In the sixth case, an article purporting to describe a condition called “Samms disease,” the editor proposing deletion writes, “no medicine related links turn up on a Google search with the title. Please provide reliable references.” This comment does not make explicit why the editor finds a lack of Google results to be reason to delete the article, but it strongly suggests he or she feels that, in the absence of references provided within the article itself, the lack of Google search results might indicate a hoax. The link between notability and Google search results is made most clearly by the editor proposing that the article on St. Peter’s Syrian Orthodox Church, Auckland be deleted. This editor writes, “A search for references has failed to find significant coverage in reliable sources in order to comply with notability requirements. This has included web searches for news coverage, books, and journals, which can be seen from the following links,” and then proceeds to provide links to the search results for the string “St. Peter’s Syrian Orthodox Church, Auckland” on Google’s standard web search, as well as Google News, Books, and Scholar.
The editor then goes on to conclude, “Consequently, this article is about a subject that appears to lack sufficient notability.” Since 47 of the 73 articles proposed for deletion on March 5, 2009 listed concerns about the notability of their subjects as at least part of the reason for deletion, and since using search tests to establish notability is in accordance with Wikipedia’s guidelines, Google and other search engines may have played a role in many more of these proposed deletions.
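The bundle of links this editor pasted into the deletion proposal (web, News, Books, and Scholar searches for one quoted string) can be generated mechanically. The sketch below is an assumption about how such links might be built; the URL templates approximate Google's query patterns and are not taken from the editor's actual links.

```python
from urllib.parse import quote_plus

# Approximate query-URL patterns for the four Google properties the
# editor linked to. These templates are assumptions and may change.
SEARCH_TEMPLATES = {
    "web":     "https://www.google.com/search?q={q}",
    "news":    "https://news.google.com/search?q={q}",
    "books":   "https://www.google.com/search?tbm=bks&q={q}",
    "scholar": "https://scholar.google.com/scholar?q={q}",
}

def search_test_links(subject):
    """Build the set of search-test links an editor might paste into
    a deletion proposal for a given article subject, as a quoted
    phrase search on each Google property."""
    q = quote_plus(f'"{subject}"')
    return {name: tmpl.format(q=q) for name, tmpl in SEARCH_TEMPLATES.items()}
```

For example, `search_test_links("Samms disease")` produces one link per property, each searching for the exact quoted phrase.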

under: Uncategorized

Ghits in the Wild

Posted by: | March 5, 2009 | No Comment |

Captured this just a few minutes ago. This screen capture shows a page created today that has since been proposed for deletion. The editor making the proposal uses the small number of Ghits and Gnews (Google News) hits as grounds for establishing the article as non-notable.

A screen cap is a bad way to do this, but it was the only quick and dirty way to record it… since it is likely to all be deleted!

under: Uncategorized

Sick Day Update

Posted by: | March 5, 2009 | No Comment |

I’d hoped to have several more pages up here today, but I’ve been struggling with a chest cold. Because some folks are watching this space for timely data, I’m going to post a few quick notes here just to communicate some basic information. I’ll come back tomorrow and pretty this up and expand my thinking.

Articles that are not eligible for the speedy deletion process can be deleted through a more time-intensive process. This process involves the article being Nominated for Deletion, at which point a section is created for the article on the page listing articles being considered for deletion (AfD, Articles for Deletion). On the AfD page, editors debate the relative merits of either keeping the article on Wikipedia or deleting it. When an administrator determines that consensus has been reached, he or she will “close” the AfD discussion as either “Keep” (to retain the article) or “Delete” (to remove it), ending debate. This administrator will then carry out the deletion of the page if necessary.

Google and other search engines clearly play an explicit role in many deletion debates. Of the roughly 30 articles from my October 1-2, 2008 sample that were Nominated for Deletion, 8 have AfD entries in which one or more editors cite Google or other search engine results as evidence for either retaining or deleting an article. In several cases, editors say they make use of Google Scholar, Google Books, Google News, and other specialized search products in an attempt to find sources and either establish or discount notability for the subject of the article in question. In two cases, editors use the shorthand “ghits” for “Google hits.”

More qualitative write-up on what exactly is said, and what I think it means, as soon as my lungs clear. For now, here are URLs to the relevant AfD entries for those who are curious:

http://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/A_Loo_with_a_View

http://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Bionumbers

http://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Brandahn

http://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Cat_Repellers

http://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Jonas_%C3%B6stlund

http://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/ManhattanGMAT

http://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Tim_Palmer_(NZ)

under: Uncategorized

Over the next several days I’m going to be posting some of my findings about and analysis of the relationship between search engines (specifically Google) and Wikipedia. In a post on the University of Amsterdam’s “Masters of Media” blog (here), Arno de Natris suggests that Wikipedia writers use Google as a primary means of determining whether or not something exists. “If you are not on Google,” he writes, “you never existed.” He presents as evidence a hoax text he attempted to insert into Wikipedia, which was rejected via this Google test.

My results suggest there is something to Arno’s argument, but that things may be a bit more complex than that one case might make them appear. Just to get things rolling, I would like to post tonight some quick numbers extracted from my data blog (here), which is a recording of all (or at least nearly all) of the articles that users attempted to create on Wikipedia in the 24-hour period between 8am October 1, 2008, and 8am October 2, 2008. I recorded these by setting an RSS reader to download Wikipedia’s new-article RSS feed once every five minutes for that 24-hour period (it is possible this may have resulted in a few articles being lost). These data are useful because they allow us to track which of the articles created in this 24-hour period were later deleted, even if they were deleted extremely quickly.
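The capture method just described, polling the new-article feed on a fixed interval and merging each poll into a running log, can be sketched roughly as follows. This is a hypothetical reconstruction: the actual capture used an off-the-shelf RSS reader, and the feed URL, polling constants, and `fetch` callable are all assumptions.

```python
import time

# Assumed location of Wikipedia's new-pages feed (hypothetical; the
# actual capture described here was done with an RSS reader).
FEED_URL = "https://en.wikipedia.org/w/index.php?title=Special:NewPages&feed=rss"
POLL_SECONDS = 5 * 60  # poll every five minutes

def record_batch(log, entries):
    """Merge one poll's worth of (title, timestamp) entries into the
    running log, keyed by title so that overlapping feed windows do
    not create duplicates. Returns the number of genuinely new titles."""
    new = 0
    for title, timestamp in entries:
        if title not in log:
            log[title] = timestamp
            new += 1
    return new

def capture(fetch, hours=24):
    """Poll the feed for the given period. `fetch` is a callable that
    returns the current feed entries (network code omitted here)."""
    log = {}
    for _ in range(int(hours * 3600 / POLL_SECONDS)):
        record_batch(log, fetch())
        time.sleep(POLL_SECONDS)
    return log
```

Keying the log by title is what makes overlapping five-minute polls safe, though (as noted above) any article created and deleted entirely between two polls would still be missed.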

There were 1043 attempts to create articles during this 24-hour period. When I rechecked the articles approximately five months later, on March 2, 2009, I found that 410 articles had been deleted and 633 retained.

Of the 410 articles deleted, the vast majority, about 350, were subject to what is called a “speedy delete” process. A bit of explanation as to what that means. Deleting an article from Wikipedia is a bit of a big deal. A wiki’s great strength is that it keeps records of everything, every edit to an article, so as to make the wiki process transparent and to allow for bad edits to be reversed. For this reason, deleting a well-established article requires a quite deliberative process, involving on-site debate and consensus seeking.

However, as Wikipedia became larger, and a more attractive target for vandalism, it quickly became apparent that this deliberative process could not keep up with the vast numbers of articles being created. So for certain categories of offense, it was agreed that any administrator could “Speedily Delete” a recent article, so long as they noted that the article fit into one of the agreed-upon categories. Today these “Criteria For Speedy Deletion” (CSD) encompass dozens of possible reasons for nixing an article, organized into several broad groups.

Of the 350 articles created in my test period that were ultimately subject to Speedy Deletion, far and away the most common criterion cited was CSD-A7, which accounted for 134 articles Speedily Deleted. CSD-A7 is explained by the policy page describing the CSD criteria that would have been in effect on October 1-2, 2008 as pertaining to:

An article about a real person, organization (band, club, company, etc.), or web content that does not indicate why its subject is important or significant. This is distinct from questions of verifiability and reliability of sources, and is a lower standard than notability; to avoid speedy deletion an article does not have to prove that its subject is notable, just give a reasonable indication of why it might be notable. A7 applies only to articles about web content or articles on people and organizations themselves, not articles on their books, albums, software and so on. Other article types are not eligible for deletion by this criterion. If controversial, as with schools, list the article at Articles for deletion instead.

That is to say, CSD – A7 is intended to allow for the speedy deletion of articles about real people or organizations when the text of the article does not contain any meaningful assertion of why said person or group might be important enough for listing in an encyclopedia. Take, for example, this charming attempt to add an article on someone named “Sam Hoza:”

Sam Hoza iz a stud. yezzir

Clearly, this sort of petty vandalism, the equivalent of scratching one’s name onto the digital wall that is Wikipedia, can be speedily deleted by a Wikipedia admin without any real need to do much research work.

However, other articles that were Speedily Deleted under CSD-A7 during the test period are not so clearly lacking in claims of importance or significance. Take, for example, this attempt, also deleted under CSD-A7, to write an article on a Kurdish artist named “Hozan Kawa”:

Hozan ‘Kawa’ is a very popular Kurdish artist. He has a great love for Kurdish songs and poems. Kawa was born in a small place around Kurdistan of Turkey, instead (Palas) related to the city (Múshe). The city is famous for artists as Zeynike, Shakiro, Resho, Hüseyine Múshe and many other people.

His mother (Guleya Elif) said to him one day. Son after my death, I want you to remember never forget: If you want me to be proud of you and you lift my head up, where I shall love you. You must introduce the beautiful voice for the Kurdish people. They will never forget you and there you will be a legend in Music.

He has 11 siblings in her family. His father did many services to help him. He was a strong piece of ‘Kawa’ who would succeed to become a great artist.

‘Kawa’ started his studies in Kurdish. Besides his studies he began to learn Turkish. In Kawa’s world, he further in music. He really invested everything to be a professional Artist.

1987 in the city (Múshe), did he started a music group in the process to a large and useful artist carrier.

In 1995 he traveled to Europe and the country France. After a week in France, he started his career artist. In Newroz (1996) he was known by the group (Berxwedan). Now among the Kurdish, he is a big familiar face.

Between group Berxwedan he began to make his first album by the name ‘Ava Evine’. His album camed out in the year 2001. After that he published his second album ‘Taya Dila’.

Kawa eventually also gaved the third album ‘Ez ú Tu’. His second and third albums did make him to a amazing great artist. His voice is gold worth listening to. It’s just really fantastic for our Kurdish opinion.

‘Kawa’ has released the fourth album also more information about it coming later.

This article clearly makes a claim as to its subject’s significance: it claims its subject is a “popular Kurdish artist.” Clearly, the admin who deleted this article had to make a judgement call as to whether or not this claim was to be taken seriously. That is to say, they had to make a judgement call as to whether or not Hozan Kawa was, in fact, a notable Kurdish artist. Technically, such a decision would fall outside the letter of CSD-A7 and call for the more deliberate deletion process. However, given Wikipedia editors’ perceived need for a rapid means of deleting what they see as “vanity pages” posted by musical groups, authors, artists, and others, it is perhaps not surprising to see CSD-A7 pressed into service as a mechanism for Speedily Deleting articles of these sorts deemed not notable.

It is here that the use of search engines enters the picture. Two of the suggested uses of “Search Engine Tests” for policing articles on Wikipedia, as given by Wikipedia’s “Search Engine Test How-To Guide” are:

3. Genuine or hoax – Identifying if something is genuine or a hoax (or spurious, unencyclopedic)

4. Notability – Confirm whether it is covered by independent sources or just within its own circles.

Thus, it seems reasonable that, given the large number of articles Speedily Deleted under CSD-A7, and given the evidence that CSD-A7 is being used by admins as a means of Speedily Deleting articles on subjects they believe are non-notable (perhaps even hoaxes), testing using Google and other search engines might play an important role in informing their decision-making process in some of these cases. This is especially so given the short period of time in which Speedy Delete decisions are often made, which makes the rapid information retrieval a search engine affords even more attractive.

A further 17 articles were deleted under CSD-G12, which is to be used for cases of “blatant copyright infringement.” Detecting such infringement is another of the suggested uses of Search Engine Tests listed in the how-to guide, and the use of search engines was suggested by Jimmy Wales as long ago as 2001 as a means of discovering copyrighted material being inserted into Wikipedia.

We can see that many Speedy Deletes may be informed by the use of Search Engines. While in some of these cases Speedy Deletes may occur because Google suggests the subject of an article “does not exist,” in other cases it may be that Google results simply imply the subject’s existence is not important enough to deserve listing in an encyclopedia.

This can also be seen in many cases of the more deliberative deletion process. (CONTINUED TOMORROW)

under: Uncategorized

Definitions: Peer Production

Posted by: | March 3, 2009 | 15 Comments |

(Note: In the following Fragment of my Introduction I define how I will use the term “Peer Production”)

Both “peer production” and “Cultural Studies” are terms which require some definition before moving forward. I borrow the term “peer production” from the work of law professor Yochai Benkler, who devotes his 2006 magnum opus, The Wealth of Networks, to explaining this idea and a related notion, “social production,” and to arguing that these new productive forms are playing an increasingly important role in our society. In this book, Benkler defines “peer production” as “a new modality of organizing production: radically decentralized, collaborative and nonproprietary; based on sharing resources and outputs among widely distributed, loosely connected individuals who cooperate with each other without relying on either market signals or managerial commands.” (Benkler, 2006, p. 60) This dissertation will seek to examine several terms of Benkler’s definition more closely, and will suggest that some of the terms he uses to describe peer production may need to be revised. However, Benkler’s sense that “a new modality of production” is arising in conjunction with digital media is one supported by a variety of other scholars (Shirky, 2008; Lessig, 2001; Von Hippel, 2005; Jenkins, 2006) and is an important starting point for my analysis.
Benkler identifies two other key features of peer production that should be noted here. First, he points out that peer production is based around “sharing resources and outputs,” that is to say, peer production is based around a “commons,” a set of resources free for (as Benkler sees it) anyone to use. In fact, Benkler sometimes calls “peer production” “commons-based peer production.” (Benkler, 2006, p. 60) I have chosen the shorter version for the sake of simplicity. Second, Benkler argues that peer production is based on “loosely connected individuals who cooperate with each other without relying on either market signals or managerial commands.” Thus, those participating in this mode of production are “peers,” lacking formal systems of hierarchy amongst themselves. While several things may complicate both of these notions, the peer production ideals Benkler articulates represent a break from the usual assumptions of contemporary capitalist political economy.
The term peer production also needs to be located within a particular historical context to be meaningful. Throughout history, a wide variety of methods of production have been experimented with, many of which incorporated forms of shared property and non-hierarchical organization. I will use peer production to refer to a method which has evolved in the developed world in conjunction with digital media over the course of the last 30 years. The history of this productive mode can be traced to its origins in the Free/Open Source Software (FOSS) movement. “The quintessential instance of commons-based peer production,” Benkler writes, “has been free software.” (Benkler, 2006, p. 63) He later argues that, “Free software has played a critical role in the recognition of peer production, because software is a functional good with measurable qualities.” (Benkler, 2006, p. 64) Legal scholar Lawrence Lessig, perhaps the best known advocate for modifying Intellectual Property law to make it more favorable to the practices of peer production, credits Free Software founder Richard Stallman as the source of many of his own theoretical insights. In his introduction to his 2004 book Free Culture, Lessig writes, “the inspiration for the title and for much of the argument of this book comes from the work of Richard Stallman and the Free Software Foundation.” (Lessig, 2004, p. xv) The Free Software Foundation’s General Public License was the first intellectual property licensing scheme designed to guarantee that work released into a shared commons of information resources could not later be appropriated from that commons as private property, an important foundation for peer production (Benkler, 2006; Lessig, 2004; Kelty, 2008; Stallman, 2002).
The ability of volunteers spread throughout the world to coordinate work on the free operating system called Linux was one of the first examples of peer production in action, and one that convinced many that a new and interesting productive method was emerging (Benkler, 2006; Weber, 2004; Lessig, 2004; Raymond, 2000a).


The failings of Mass Media have been widely remarked upon, practically since their inception. “Radio,” Bertolt Brecht famously complained in his 1932 essay The Radio as an Apparatus of Communication, “is one sided when it should be two. It is purely an apparatus for distribution, for mere sharing out.” Brecht argued that, if it could overcome this limitation, “radio would be the finest possible communication apparatus in public life.” Sadly for Brecht and others like him, this was not to be. Instead, radio, shaped by the physical limits of spectrum scarcity, by the tight political control exercised by states over the public airwaves, and by the economic limits imposed by expensive transmission equipment, became a one-way mass medium, in which only a few had the privilege to speak to the many.
Decades after Brecht made his remarks, a medium with the potential to achieve the sort of two-way communication he hoped for in radio was emerging, though not from a place Brecht would have thought to look for it. In the early 60s, the U.S. Defense Department’s Advanced Research Projects Agency (ARPA) was looking for a way to link researchers and computing resources in widely dispersed sites throughout the United States. Drawing on the theoretical insights of Norbert Wiener’s Cybernetics, which suggested ways humans and machines could be linked in two-way, interactive circuits of communication, ARPA researchers designed a suite of technologies allowing for the interconnection of otherwise incompatible computers and computer networks. (Hafner, 1998) By the late 80s, the “network of networks,” first called the ARPAnet (after the agency) and later the Internet (for the “internetworking” protocols that made it work), was thriving, linking together academics from around the world and across disciplines in multi-way conversations. Alongside the Internet, clusters of electronic bulletin boards and teletext messaging services – such as San Francisco’s famous WELL, and the French Minitel system – were allowing users to share information, expertise, and life experience with each other. Author Howard Rheingold, working from his experiences on the WELL, argued that these users were in the process of forming “virtual communities” based on their electronic communications. As isolated bulletin boards and services linked up to the growing Internet in the last years of the 1980s and early 1990s, these “virtual communities” promised (or threatened) to fuse into one, vast “two-way radio.”
By the beginning of the 21st century, however, the pendulum seemed to have swung back the other way, towards one-to-many mass media. The Internet’s “killer app,” the World Wide Web, ushered in an age of readily available information, but it also helped create an environment in which the Internet served primarily as a distribution mechanism for this information. As a short piece in the December 9, 2001 edition of the New York Times puts it, “despite the popular conception of the Internet as our most interactive medium, on the great majority of Web pages the interaction all goes in one direction.” (Johnson, 2001) However, the piece goes on to note that “an intriguing new subgenre of sites, called WikiWikiWebs,” are bucking this trend by creating sites where “users can both read and write” (Johnson, 2001). The Times describes Wikis as “communal gardens of data,” in which volunteer participants work together to grow and nurture site content. It is particularly interested in calling attention to “the most ambitious Wiki project to date,” an attempt to apply “this governing principle to the encyclopedia, that Enlightenment-era icon of human intelligence.” The name of this project: Wikipedia.
The origins of Wikipedia date to January of 2001, when project co-founder Larry Sanger announced its existence in an informal post to the mailing list of a prior web-based encyclopedia project, Nupedia. Sanger asked readers of his post to “Humor me. Go there and add a little article. It will take all of five or ten minutes.” (Sanger, 2001a) When the Times took note of the project about a year later, Wikipedia had already grown to encompass 16,000 articles (Johnson, 2001). Seven years later, in 2008, the English-language Wikipedia included over two million articles, and the project has grown to include dozens of other languages, many with thousands or hundreds of thousands of articles of their own.
These projects have not only grown in size, they have also taken on considerable cultural visibility and significance. This is perhaps most clearly demonstrated by the case of Sarah Palin’s Wikipedia article. On August 29, 2008 John McCain, then the Republican nominee to be President of the United States, announced that he had chosen Alaska governor Sarah Palin to be his running-mate. News programming on the night of the 29th was dominated by this development, understandably so, given that the nearly unknown Palin was only the second woman to be nominated to run for the Vice-Presidency of the United States, and the first to run as a Republican. Among the many stories to air was a small segment on National Public Radio’s program “All Things Considered” discussing changes made to the article on Palin in the online encyclopedia, Wikipedia (Noguchi, 2008). The NPR story contended that the Wikipedia article on Palin had undergone a frantic round of editing the night before McCain made his announcement (an announcement that took almost everyone by surprise), and had been altered in a way that tended to enhance Palin’s image. These facts led one Wikipedia editor to suspect the page had been changed by someone connected to the campaign.
The fact that a national news program was willing to devote time to the ins and outs of Wikipedia editing, a practice which can border on the arcane, on the day of a historically significant political announcement is one indication of the cultural salience Wikipedia has achieved in its seven years of existence. Furthermore, if the allegations of tampering are true, it would suggest that a national political campaign may have considered Wikipedia important enough to take the trouble of altering it to show their Vice Presidential candidate in the best light before she was publicly announced. In one sense, this work was not in vain, since Wikipedia’s own statistics indicate that the article on Sarah Palin was viewed 2.5 million times on the day of her announcement alone, with additional views over the next two days bringing the article up to a grand total of four million views for the last three days of the month of August (Wikipedia Article Traffic Statistics 200808). The month of September would see the article viewed another six million times (Wikipedia Article Traffic Statistics 200809). The campaign erred, however, if they hoped that these readers would be seeing their preferred revision of the article, since a flurry of editing activity quickly changed the information presented, as Wikipedia volunteers tried to bring the article into line with their professed ideal of “neutral” information.
For those in academe, it comes as no surprise that people are turning to Wikipedia for information in large numbers. Almost all in the teaching profession can relate an anecdote about students for whom the search engine and Wikipedia article are the first, last and only tools of research, much to their professors’ chagrin. The high ranking given to Wikipedia articles by the popular search engine Google almost certainly helps to make Wikipedia a popular source of knowledge. Siva Vaidhyanathan cites a 2008 article in the Chronicle of Higher Education which reports that a study done at the Hoover Institution finds that searches for 100 common “terms from prominent U.S. and world history textbooks” return Wikipedia as “the No. 1 hit […] 87 times out of 100” (Troop, cited in Vaidhyanathan, 2008). Vaidhyanathan goes on to note that he has “been trying to understand the rather rapid rise of Wikipedia entries in Google searches starting in 2007. In mere months, every search I did went from generating no Wikipedia results to having them at the top of the list.” (Vaidhyanathan, 2008) On a less anecdotal level, Nielsen’s Net Ratings finds that web traffic to Wikipedia has grown “nearly 8,000 percent” over the five year period from 2003-2008 and that “four of the five top referring sites to Wikipedia […] are search engines” (Nielsen Online, 2008). Web traffic data collection service Alexa lists Wikipedia as number eight in its list of the 500 sites receiving the most web traffic, with only major search engines, the social networking behemoths Facebook and Myspace, and Microsoft’s MSN network receiving a higher volume of traffic.
If Wikipedia were just an Internet-accessible encyclopedia, its growing popularity as a source of knowledge would not be terribly remarkable. However, the “interactive” nature of how Wikipedia is produced, noted by the Times only a few months after the project’s inception and demonstrated by the activity on Sarah Palin’s article, marks a profound shift from the traditional methods of encyclopedia writing. Unlike past encyclopedias, such as the renowned Britannica, which were authored and edited by a hierarchically organized group of professional writers and editors working for a single firm, Wikipedia is produced by a loosely organized, largely egalitarian group of volunteers. Furthermore, once a traditional encyclopedia is published, the information presented in its articles becomes a fixed object of knowledge, at least until such a time as properly authorized experts are assembled to produce a new edition; Wikipedia, in contrast, has no fixed editions, rather the information on the site remains fluid, alterable at any time by (almost) anyone with a web-browser. Thus, Wikipedia represents not merely another collection of knowledge, but an example of a whole new means of creating knowledge, perhaps even a new “regime of truth,” to use Michel Foucault’s term.
Wikipedia did not invent this means of creating knowledge all on its own. Rather, it builds off of a larger movement I will call Peer Production, following Yochai Benkler, who has produced the most comprehensive theory of this form of production thus far (Benkler, 2006). Another important site where Peer Production takes place is the world of Free/Open Source Software (FOSS). Advocates of Collaborative Peer Production, including James Boyle, Yochai Benkler, Larry Lessig, and Henry Jenkins, hold that, under certain conditions, the traditional capitalist organizational methods of markets and firms may not be necessary for the process of producing information – whether that information is computer software or encyclopedia entries. Rather, if information is treated as a common resource, rather than as strictly controlled individual property, decentralized production will flourish as a variety of actors produce and share information for a variety of reasons. At first glance, this is exactly what appears to be happening on Wikipedia, where a wide variety of actors from all corners of the globe and from many walks of life: political operatives, concerned citizens, devoted fans, passionate scholars, and many others, contribute to the project on a volunteer basis and for their own reasons. The products of their labor are held in common, using a legal license developed by the Free Software Foundation called the GFDL (GNU Free Documentation License) that ensures that no one can treat Wikipedia articles as their exclusive property. The actual, physical computers that enable Wikipedia to exist are owned and operated by a not-for-profit foundation, called the Wikimedia Foundation.
This decentralized, anti-propertarian method of producing information would, on its surface, seem to be of great interest to the academic discipline known as Cultural Studies. After all, the theorists Cultural Studies draws on have often concerned themselves with criticizing how mass-culture lends itself to processes of domination and exploitation: from Adorno and Horkheimer’s early work on the mesmerizing effects of the “culture industry,” to Stuart Hall’s exploration of how the consumers of mass-media might construct “oppositional codes” allowing them to resist the ideological biases of these media, to Deleuze and Guattari’s celebration of the subversive potential of the “rhizome,” which connects everything to everything else, over the top-down hierarchy of the “tree.” Wikipedia, and the Peer Production form in general, would seem to have great potential to be put to use in the cause of the greater freedom and social justice Cultural Studies seems devoted to. Indeed, those involved in Peer Production have often themselves made similar calls for freedom and social justice. Wikipedia co-founder Jimmy Wales has often been heard to remark that the goal of the Wikipedia project is to “deliver the sum total of human knowledge to every human being on earth,” and Wikipedia’s explicit policy of including multiple points of view on a variety of topics seems broadly compatible with Foucault’s notion, cherished by Cultural Studies, of the “heterotopia” in which different systems of knowledge and power co-exist freely.
On closer examination, however, the relationship between Collaborative Peer Production and Cultural Studies is more complex. Whereas Cultural Studies, for all of its attempts to re-invent Marx, remains a discipline in which Marxist language and thought remain an important heritage, the practitioners of Collaborative Peer Production tend to treat Marx and Marxism as nothing but specters to be exorcised. Collaborative Peer Production, for all of its apparent post-modernism, remains a space where modernity and liberalism remain cherished ideals (Kelty 2008, Coleman 2004). Finally, whereas Collaborative Peer Production is a practical method for producing information, Cultural Studies is a largely theoretical discipline.
These differences, however, are exactly why Collaborative Peer Production and Cultural Studies can be, and should be, fruitfully combined. The theoretical insights of Cultural Studies, unbound by any practical need to “get things done,” can provide the means for pushing Collaborative Peer Production beyond its current envelope, and push it to more completely live up to its idealistic goals of freedom and justice. Collaborative Peer Production, in turn, can provide Cultural Studies with practical methods that might be employed in the cause of theoretical ideals, and real-world experience that can help refine and expand otherwise abstract theoretical notions. To understand how this might be achieved, first we must understand a bit more of the history of both Cultural Studies, and Peer Production.


Blog/Diss Reboot

Posted by: | February 22, 2009 | 52 Comments |

This blog has been inactive for quite a while, as things with the Diss have been in flux a bit. I’ve refocused my direction and have a new introduction. The language posted earlier will most likely be used for my upcoming digital art installation “Cyborg Vision: Situating Ourselves in Digital Space,” presented as part of the Battleground States Conference here at BGSU.

I’m going to try to start posting pieces of the newly rebooted Diss here on a regular basis in the coming weeks. Watch this space! Fragmentary but regular information to follow!

