Florian Cramer

Peer-to-Peer Services: Transgressing the archive (and its maladies?)

At the peak of their popularity, just before they were shut down by court orders, Napster and Audiogalaxy were probably the most extensive public music archives of all time. Napster, the first popular incarnation of peer-to-peer data exchange services on the Internet, was also the first global archive to consist of nothing more than the sum total of temporarily connected private archives: an archive without any sort of permanent existence, changing by the second, its catalogue synchronously revised and rewritten. While older Internet services such as the World Wide Web took shape according to the conventional topologies of the archive and the library as places (*sites*), each with its own organizational schemata and access codes, peer-to-peer services realized something Jacques Derrida had predicted with prophetic accuracy as early as May 1994 in *Archive Fever* (*Mal d'Archive*), an essay that makes much of the media-theory blather about the Internet written since seem outdated:

"But the example of email is privileged in my opinion for a more important and more obvious reason: because electronic mail, even more than the fax, is on the way to transforming the entire public and private space of humanity, and first of all the limit between the private, the secret (private or public), and the public or the phenomenal. This is not only a technique, in the ordinary and limited sense of the term: at an unprecedented rhythm, in quasi-instantaneous fashion, this instrumental possibility of production, of printing, of conservation, and of destruction of the archive must inevitably be accompanied by juridical and thus political transformations. These will affect nothing less than property rights, publishing and reproduction rights."(1)

Even more than email, peer-to-peer networks such as Napster, Gnutella, Kazaa and Freenet now show how radically the archive is being transformed by the digital transmission and storage of data. The fleeting, individual point-to-point data transfer of email is coupled with the voluminous, globally accessible data storage of FTP servers and the World Wide Web. This combination calls the traditional location and the traditional architecture of the archive into question more radically than any information technology before it, including Ted Nelson's ultimately centralized concept of *hypertext*.(2)

The archive is classically defined as a location at which artifacts and documents, selected from external sources according to institutionally defined criteria, are arranged internally and placed in relation to one another. In other words: every archive manages, first, archived data and, second, the metadata of the archiving, often in the form of a catalogue. Because data usually already contain metadata, or paratexts (books, for example, have tables of contents and indices; paintings, signatures; digital texts, markup codes and headers), they exhibit microstructures of archiving which in turn must be integrated into the metadata of the archive. The metadata of archiving is thus potentially infinitely complex: its order-within-order can be extended indefinitely, as an endless chain of metadata of metadata of metadata, in the form of comprehensive catalogues, concordances, search engines and meta search engines.
As anyone who has ever programmed a database or a software interface knows, the complexity of metadata and its encoding grows exponentially the more perfect, scalable and supposedly user-friendly access to the data becomes. In this way archiving becomes a second text that threatens to overwrite what has been archived, potentially wiping out the difference between the data object and the metadata. Jorge Luis Borges's *Library of Babel*, according to the speculation of its first-person narrator, combinatorially contains all the books that ever were, and with them all of their descriptions and catalogues, but also all counter-arguments against and antitheses of these descriptions and catalogues; even in its merely imaginary totality, the order of knowledge collapses. Borges's story is also referenced in Simon Biggs's software artwork *Babel*, a reprogramming of the Anglo-American Dewey decimal classification as a cartographic Web browsing system, so that, as the American Net art curator Steve Dietz writes, it becomes a "conflation of cataloging and navigation, of metadata (the cataloging information) and data (the website itself)".(3) The poetics and aesthetics of self-realizing metadata is also the theme of the *Periodical Journal of Bibliography*, published in the early 1990s by Grant Covell in Cambridge, Massachusetts, which exclusively records fictitious books.

Besides its data and metadata, an archive must also establish its rules of operation. Access codes are written: opening hours, user identity cards and agreements, house rules, architectural borders and niches and, on the Internet, passwords, bandwidth limits, licenses. With the migration of access to data networks, the coding of house rules and classical architecture shifts to the machine-written control structures of software algorithms. Of course, the well-guarded, secret access to an archive is just as much a code of access as the radically open one. The anti-copyright appropriation of the sequestered Net art server hell.com by the Net art plagiarists 0100101110101101.org, for example, did not erase the codes of access but replaced more visible barriers with less obvious ones. So every archive is coded at least three times over: first, in its archived data; second, in its metadata; and third, in its rules of access.
Derrida addresses the question of who writes these codes when he begins *Mal d'Archive* with the assertion that the archive "attains its meaning, its only meaning, through the Greek *arkheion*: initially a house, a domicile, an address, the residence of the superior magistrates, the archons, those who commanded."(4) He thus thinks of the archive only as an official institution and overlooks its unofficial offshoots: the private archive as the place where private obsessions are collected, but also borderline zones between the official and the private such as Harald Szeemann's *Museum of Obsessions*, which Szeemann claims is "not an institution but a 'life task'" while, on the other hand, it has already been institutionalized by his book of the same name, published by Merve.(5) As opposed to Derrida's archontic archive, the private archive, first, hides its location and its discourse, and the museum of obsessions, secondly, defines both location and discourse negatively and contradictorily, with a discourse of the refusal of discourse.(6) What seems to apply to all kinds of archives, however, is Derrida's assertion that documents in archives are only "kept and classified by virtue of a privileged topology"(7): a private archive privileges its topology in the very sense that it keeps its data and metadata from the public, and the museum of obsessions bears its privileged topology in its very title, not because it collects obsessions but because it obsessively collects.

The history of the Internet, too, can be read as a history of archiving topologies and of the relocation of privileges, constantly redefining the borders of the official, the private and the obsessive. First of all, all client-server architectures of the Internet are privileged topologies in the sense of Derrida's analysis of the classic archive. Their archons are called system administrators and, at the level of the authorities, standardization committees such as the Internet Engineering Task Force (IETF), the Internet Corporation for Assigned Names and Numbers (ICANN), the World Wide Web Consortium (W3C) and the Institute of Electrical and Electronics Engineers, Inc. (IEEE).(8) The infrastructure of Internet network protocols, particularly the fundamental protocol TCP/IP, could not function without the centralized assignment of and control over network addresses by ICANN and the administration of hierarchically organized databases like that of the Domain Name System (DNS), which maps, for example, the name www.google.com to the IP address 216.239.39.101 (a lookup sketched in code below). This makes the Internet itself its own primary archive. If one reads IP addresses and domain names as primary and secondary titles or call numbers, then these numbers serve as their own object data and metadata. The archiving system thus pre-exists the content that is presumed to be stored within it. It is agnostic as far as the stored data is concerned, even as it allows any number of layers of transport and access topologies on top of its essential structure.
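The DNS lookup mentioned above can be performed in a few lines of code. The following sketch is a present-day illustration, not drawn from the essay's sources: it asks the hierarchical DNS database to translate a domain name into its call number, the IP address.

    # Minimal illustration of a DNS lookup: the hierarchical Domain Name
    # System is asked to resolve a name into its "call number", the IP
    # address. The address cited in the text was Google's in 2002;
    # a lookup today returns different numbers.
    import socket

    def resolve(name: str) -> str:
        """Return the IP address currently registered for `name` in the DNS."""
        return socket.gethostbyname(name)

    print(resolve("www.google.com"))  # e.g. 216.239.39.101 in 2002

The catalogue metaphor is literal here: the resolver consults a chain of name servers, from the root servers downward, each of which is an administratively controlled database.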
Among these transport and access layers are email, telnet, FTP, the World Wide Web and, most recently, peer-to-peer services conceived for PCs.(9) Telnet and FTP are among the oldest Internet services; they allow users to operate a server via a remote terminal or to download data from it.(10) Both work according to the logic of the archontically controlled archive: they are centrally supervised by a systems administrator; they occupy, via their Net addresses, privileged spaces that can also be physically located; they order data and metadata according to the hierarchical structure of file systems; and they employ access codes in the form of user accounts, passwords and read/write permissions. The World Wide Web, by comparison, is not structured any differently from an FTP server, but at the level of its (centrally standardized) document format and URL addressing scheme it creates a third level of abstraction on top of TCP/IP and the server access protocol, suggesting to its readers a decentralized archive when in fact it only creates a meta-index of self-enclosed archive spaces: *sites*. The World Wide Web also delineates a topological border between privately held data and open data publication, namely the administratively controlled storage space of the server and the fact that its documents are usually only readable, not writable. This border is made manifest between the PC hard drive and the server hard drive. 0100101110101101.org, artists who systematically explore archives and the borders between the private and the public on the Internet, draw attention to this border by turning it inside out, particularly in their work *life_sharing*, in which they store all their private data, including incoming email, on their public Web server.

In theory, peer-to-peer networks are defined as counter-models to server architectures; in fact, not only do all peer-to-peer services on the Internet rely on TCP/IP routing tables, whose central archive is "Root Server A" run by ICANN, they are also often based on servers themselves. The oldest example is Usenet, which since 1986 has offered discussion forums such as alt.artcom or de.comp.os.unix on its own protocol layer; another is the chat service IRC, running since 1988.(11) The data transmitted on Usenet and IRC have no fixed storage location but wander from server to server in a chain of hand-offs. With their own client software (newsreaders and IRC programs), users connect to these servers and thus take part only indirectly in the peer-to-peer data exchange. By the early 1990s, the kinds of data that would later shape data exchange via peer-to-peer clients were already circulating on IRC and Usenet servers: pornographic images and illegally copied software.(12) But the expansion of such private collections of obsessions into public archives was hampered by the architecture of Usenet, which decouples temporary data transmission from local storage and grants systems administrators a wide range of control mechanisms: the blocking of specific forums, for example, restrictive access codes for servers, and the retroactive erasure of data fed into the network by users.

Napster, the first peer-to-peer service on the Internet conceived for PCs with dial-up access, was also the first to change these rules. Napster made its users aware that every home PC connected to the Net is not merely a terminal for surfing the Web or reading email but also a potential server.
Downloading via Napster took no detour via a server but occurred directly between two users' PCs. Brecht's and Enzensberger's media utopias, which envisioned receivers as potential broadcasters,(13) became a reality with Napster. Yet Napster, too, was based on a client-server architecture: all the data offered to the Net by user PCs was indexed on a central server. The Napster archive was indeed composed only of temporarily connected private archives, but its data and metadata were decoupled, and the catalogue remained symbolically and physically located at the institution napster.com. The story of Napster is therefore a prime example of how control over an archive lies not in control of the data itself but in control of metadata and topologies. From the beginning, the range of downloads available via Napster was limited in that, of all the data offered up by users, the catalogue server recognized only audio files in mp3 format. By artificially reducing its metadata to file names ending in ".mp3", the catalogue in fact blocked out everything it was not predetermined to recognize. The lawsuits of the music and copyright industry, which can be read as rewritings of Napster's software code by recourse to the overriding code of law,(14) were first used to remove copyright-protected songs from Napster's index and, shortly afterward, to shut down the catalogue itself, ultimately closing Napster's Internet service altogether.

After Napster, Gnutella became the most successful Internet-based peer-to-peer service, in no small part because it radically did away with the central server and with the differentiation between client and server. Here not only data but also metadata circulates among the connected PCs. Queries to the index, or catalogue, are handed along and answered by all connected computers, following the principle of a message passed along a telephone chain (a code sketch of this flooding follows the query list below). Gnutella therefore has no single point of failure and cannot be shut down the way Napster was. This tactical advantage of the Gnutella software architecture has its accompanying disadvantage, however, in the sheer volume of traffic that results when every search query circulates among the connected computers. Gnutella also does away with the restriction of peer-to-peer data exchange to mp3 audio files. A brief, random glance at Gnutella queries in November 2002 turned up:

chasey lain fuck on the beach.mpeg
it.mpeg
all leisure suit larry games.zip
n64 emulator with 11 games.zip
hiphop - dead prez - hip hop.mp3
hiphop - das efx - real hip hop.mp3
cypress hill - insane in the brain.mp3
addict mp3
neon genesis evangelion - episode 05 - 06 avi
beach candy sextravaganza part 1.mpg
kama_sutra_lesson_2.mpg
leann rimes - life goes on (1)
perl 5 by example - ebook.pdf
animal passion avi
jackass the movie avi
formula51 - samuel l. jackson
sex pistols anarchy in the uk 1 mp3
harry potter 2 chamber of secrets avi

Even taking into account that this is a random sampling, it aligns rather well with the public image of Gnutella: six of the eighteen queries are for pop songs in mp3 format, four for porn videos, two for digitized mainstream Hollywood fare, two for TV series, two for computer games and one for a programmer's manual.
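The flooding principle described above can be modelled in a few lines of code. The sketch below is a deliberately naive simulation, not Gnutella's actual wire protocol: each node, at once client and server, matches a query against the names of its shared files (a file counts as a hit only if every search term occurs in its name, which is also why the format-agnostic searches discussed below work at all) and hands the query on to its neighbours until a hop counter, the TTL, expires. All node names and file lists here are invented for illustration.

    # A deliberately naive simulation of Gnutella-style query flooding,
    # not the actual wire protocol: each node matches a query against its
    # shared file names and forwards it until the hop counter (TTL) expires.
    from __future__ import annotations

    class Node:
        def __init__(self, name: str, shared_files: list[str]):
            self.name = name
            self.shared_files = shared_files
            self.neighbours: list[Node] = []

        def matches(self, query: str) -> list[str]:
            # Conjunctive keyword match on file names (metadata, not content):
            # a file is a hit only if every search term occurs in its name.
            terms = query.lower().split()
            return [f for f in self.shared_files
                    if all(t in f.lower() for t in terms)]

        def search(self, query, ttl, seen=None):
            """Flood the query through the network; return (node, file) hits."""
            seen = set() if seen is None else seen
            if ttl == 0 or self.name in seen:
                return []                    # hop limit reached or loop detected
            seen.add(self.name)
            hits = [(self.name, f) for f in self.matches(query)]
            for peer in self.neighbours:     # hand the query on, telephone-chain style
                hits += peer.search(query, ttl - 1, seen)
            return hits

    # Usage: three PCs, each client and server at once, with no central catalogue.
    a = Node("a", ["cypress hill - insane in the brain.mp3"])
    b = Node("b", ["dsc010015.jpg"])
    c = Node("c", ["perl 5 by example - ebook.pdf"])
    a.neighbours, b.neighbours, c.neighbours = [b], [c], [a]
    print(a.search("insane brain", ttl=3))   # hit on node a itself
    print(a.search("dsc jpg", ttl=3))        # hit one hop away, on node b

The single point of failure disappears, but so does any guarantee of completeness: whatever lies beyond a query's TTL horizon simply does not exist for the searcher.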
If one were to rearrange the list alphabetically (addict anarchy animal beach brain candy chamber dead evangelion formula51 fuck games genesis hiphop insane jackass kama_sutra leisure life neon passion pistols real secrets sex sextravaganza), the result would be a dictionary of the shared interests of Gnutella users, which would also serve as an everyday poetics of obsession. And this is how *minitasking*,(15) a software program that visualizes Gnutella network nodes, queries and results on the computer screen, attains its (albeit unintentional) pataphysical irony from the real-time topography of obsessive search terms.

The global museum of obsessions that Gnutella creates by uniting private archives might seem trivial at first glance. But it is less what is on offer than the means of access that determines its triviality. In the summer of 2002, for example, combinations of Gnutella queries turned up pirated copies of Jorge Luis Borges's *Ficciones* and of novels by Vladimir Nabokov and Thomas Pynchon, along with recordings of the music of Stockhausen and La Monte Young that had not been commercially available for ages. But beyond highbrow culture on the one hand and mainstream pop, movies, video and porn on the other, further private archives of obsessions released via Gnutella turn up only when one searches not for content but, agnostically, for data formats. Combining the search terms "DSC", "MVC" and "jpg", for example (the conjunctive file-name matching sketched above), calls up unrenamed image files created by Sony digital cameras and often leads to a surreal collective archive of digital amateur photography whose aesthetic range spreads from Walker Evans to Nobuyoshi Araki. The obscenities in particular are often surprisingly neither dull nor pornographic, as when, in the anonymously circulating digital image dsc010015.jpg, the optical blurring of a body becomes its sexual focus. When the sum of small archives of obsessions is no longer determined by the individual collector and the topology of the collection, but is instead a snapshot of the moment, composed of coincidental correspondences between search terms and file names, the filtering function of metadata in peer-to-peer archives becomes especially clear.(16)

Such ways of accessing digital codes, whether they are later presented as text, audio or images or executed as algorithms, can lead to unique artistic forms, as the musical genre known as *bastard pop* shows, in which mainstream pop songs are digitally (and usually anonymously) sliced and spliced together. One characteristic of bastard pop is an aesthetic of intentional dilution and of the juxtaposition of opposites, as in Girls on Top's *I Wanna Dance With Numbers*, a combination of Whitney Houston's vocals with the electro pop of Kraftwerk. That bastard pop arose along with Internet peer-to-peer services is hardly a coincidence; the anonymous remixers usually gather their musical material, as well as their music software, from Gnutella and co. Bastard pop, then, is the first popular musical form to emerge from the Internet and its globalized private archives, and it reciprocates its origins in those archives with its aesthetic of plagiarism. But the political dialectic of the switch from receiving to broadcasting apparatuses is also revealed in bastard pop: from a legal perspective, peer-to-peer archivists are no longer private persons but publishers, and their data collections are no longer private obsessions but a mass-media distribution of copyright-protected content.
It is not only legally but also technically that the difference between storage and retrieval on the one hand and mass-media transmission on the other collapses in networked computers, unless one were arbitrarily to draw limits based on the length of the wires. Derrida likewise attributes to the classic archive, which can be clearly located and is controlled by a clearly defined authority, an "institutional transfer from the private to the public"(17), a transfer that becomes thoroughly problematic in peer-to-peer archives and is increasingly negotiated even at the level of the algorithmic coding of software architectures and access topologies. The next evolutionary stage of Internet-based peer-to-peer services, still experimental today, may be anonymous architectures such as Freenet and GNUnet, which anonymize their users and encrypt data with strong cryptography. Moreover, they pass along from computer to computer not only, as Gnutella does, the metadata of search queries, but the data itself. So it is not only the archive of offered data and its self-generating metadata that is in flux, but also the locations where the data is stored (a sketch of such content-addressed storage follows below). Shutting out any surveillance and control by third parties is the developers' self-proclaimed goal. The Freenet homepage reads: "Freenet is free software designed to ensure true freedom of communication over the Internet. It allows anybody to publish and read information with complete anonymity. Nobody controls Freenet, not even its creators, meaning that the system is not vulnerable to manipulation or shutdown"(18), while the GNUnet developers define their project as "anonymous censorship-resistant file-sharing".(19) Local provider administrators can block these services by blocking their TCP/IP ports, but even this can be circumvented with a certain level of skill by steganographically "tunneling" Freenet or GNUnet traffic through other Net protocols, such as requests for Web pages or email transfers.

Does this spell the end of all privileged topologies of the archive? Certainly not. First, all peer-to-peer archives privilege certain information and uses by replacing the classic synchrony of the archive, its artifacts arrested in an ideally timeless spatial order, with a diachrony: a radical momentariness and instability of the archive. The unity of the individual museums of obsessions evaporates into momentary states and the proximities of the network's search terms. Second, the newer peer-to-peer networks do not change the privileged status of metadata, that is, of file names, as the archive's only, if unreliable, access register. The attempts of the music and copyright industry to sabotage peer-to-peer archives with tactically false file names attached to trash data are a mere foretaste of problems to come. And finally, the architecture of the archive remains a privilege of the programmer-archons, even when their free software is released under the GNU copyleft (as in the cases of Freenet and GNUnet). Which is why the claim that "nobody controls Freenet, not even its creators" is just as naive as the assumption that anonymity can be achieved through cryptographic "privacy" alone, an assumption undermined by every private photo that accidentally leaks onto the Net.
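How radically such architectures decouple data from its storage location can be illustrated with content-hash addressing, the scheme underlying Freenet's CHK keys: a file is inserted under a cryptographic digest of its own bits, so that the key verifies what is retrieved but betrays nothing about where, under what name, or by whom it is stored. The following is a toy model, not Freenet's actual key format or routing; it uses SHA-256 and a single dictionary standing in for the network's many anonymous nodes.

    # A toy model of content-hash addressing in the spirit of Freenet's
    # CHK keys (the real key format and routing differ): data is stored
    # and retrieved under the hash of its own content, not under a file
    # name or a server address.
    import hashlib

    store = {}  # stands in for many anonymous, encrypted network nodes

    def insert(data: bytes) -> str:
        """Store data under its content hash and return that hash as the key."""
        key = hashlib.sha256(data).hexdigest()
        store[key] = data
        return key

    def retrieve(key: str) -> bytes:
        """Fetch data by key; the key itself verifies what was received."""
        data = store[key]
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("corrupted or tampered data")
        return data

    key = insert(b"telling my whole life with my files")
    print(key[:16], "->", retrieve(key))

In this sketch the file name, Gnutella's unreliable access register, disappears entirely; the only metadata left is the key itself.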
The true borders between the private and the public thus end up migrating to the level of personal computer file systems, or, more precisely, to the border marked by the directory (folder) that is opened up, releasing its contents, including all subdirectories, for peer-to-peer downloads. These borders become all the more precarious the more classical mnemotechnical systems the computer absorbs (from calendars and photo albums to correspondence), recoding and refining them as software, and the more records can be united on a single storage medium thanks to the growth of storage capacity (which, unlike the growth of computing speed along the trajectory of Moore's Law, has not yet received its fair share of attention). The PC is thus not only increasingly becoming a warehouse of biographical traces; it is becoming a biography in the literal sense of the writing of a life. Hard disks are becoming identity protocols, their data intimate stories. The lyrics of Roberta Flack's soul hit of 1973, "Telling my whole life with his words, / Killing me softly with his song", would be no less persuasive, if not quite as lovely, rewritten as "Telling my whole life with my files / Killing me softly with my hard disk". It is quite conceivable that no artist's complete works and biography will be able to be written without a hard disk "dump", a bit-by-bit copy of its contents, provided that, in the meantime, storage technologies have not failed, backup copies have not been destroyed and partial biographies have not been softly killed off. Word may have gotten around that such a head crash is a mnemotechnical and often an economic meltdown; that it is also becoming a cultural meltdown is less widely known. But wherever systems of data security fail, file-sharing networks, with their unsystematic means of data transfer, could well become the backup media of the future and the underground of cultural memory.

(For Gert Mattenklott)

Footnotes:

(1) Jacques Derrida, *Mal d'Archive* [Der95], pp. 17-18: "Mais je privilégie aussi l'indice du E-mail pour une raison plus importante et plus évidente: parce que le courrier électronique est aujourd'hui, plus encore que le Fax, en passe de transformer tout l'espace public et privé de l'humanité, et d'abord la limite entre le privé, le secret (privé ou public) et le public ou le phénoménal. Ce n'est pas seulement une technique, au sens courant et limité du terme: à un rythme inédit, de façon quasi instantanée, cette possibilité instrumentale de production, d'impression, de conservation et de destruction de l'archive ne peut pas ne pas s'accompagner de transformations juridiques et donc politiques. Celles-ci affectent, rien de moins, le droit de propriété, le droit de publier et de reproduire."

(2) Centralized in the sense of his concept of a universally agreed-upon format for documents and metadata, of a centrally controlled "transcopyright" and of a centralized handling of fees.

(3) Steve Dietz, Reverse Engineering the Library, on Simon Biggs's *Babel*, http://hosted.simonbiggs.easynet.co.uk/babel/intro.htm

(4) [Der95], p. 9: "le sens de 'archive', son seul sens, lui vient de l'arkheîon grec: d'abord une maison, un domicile, une adresse, la demeure des magistrats supérieurs, les archontes, ceux qui commandaient."

(5) [Sze81], p. 125

(6) loc. cit., p. 127 and p. 136

(7) [Der95], p. 13: "[...] les documents, qui ne sont pas toujours des écritures discursives, ne sont gardés et classés au titre de l'archive qu'en vertu d'une topologie privilégiée."
(8) A more detailed analysis of the political regulation of the Internet is given by the political scientist Jeanette Hofmann in [Hof00].

(9) These are often falsely assumed to be networks in their own right because their protocol layers and user programs deviate from the topology of the underlying TCP/IP protocol layer.

(10) The telnet specification RFC 318 dates back to 1972, the FTP specification RFC 454 to 1973.

(11) The Usenet transport protocol NNTP is specified in RFC 977; the IRC protocol was first specified in 1993 in RFC 1459.

(12) One early document on this is Ursula Ott's article on pornography in university networks in the feminist magazine *Emma* of December 1991.

(13) See [Enz70] and [Bre32].

(14) For more on the analogy of software code and law, see [Les00].

(15) http://www.minitasking.com

(16) The World Wide Web, by contrast, has no real-time index; its search engines approximate one via their full-text indices, but blur the difference between indexed data and indexed metadata by assuming a one-to-one relationship between the two.

(17) [Der95], p. 13

(18) http://www.freenetproject.org

(19) http://www.gnu.org/software/GNUnet/

References:

[Bre32] Brecht, Bertolt: Der Rundfunk als Kommunikationsapparat. In: Werke. Frankfurt am Main: Suhrkamp, 1992 (1932), pp. 552-557

[Der95] Derrida, Jacques: Mal d'Archive. Paris: Éditions Galilée, 1995. English translation: Archive Fever. A Freudian Impression. In: Diacritics, Summer 1995, Vol. 25, No. 2, pp. 9-63

[Enz70] Enzensberger, Hans M.: Baukasten zu einer Theorie der Medien. In: Kursbuch 20 (1970), pp. 159-186

[Hof00] Hofmann, Jeanette: Und wer regiert das Internet? In: Kubicek, Herbert; Braczyk, Hans-Joachim; Klumpp, Dieter; Roßnagel, Alexander (eds.): Global@home. Jahrbuch Telekommunikation und Gesellschaft 2000. Heidelberg: Hüthig, 2000, pp. 67-78

[Les00] Lessig, Lawrence: Code and Other Laws of Cyberspace. New York: Basic Books, 2000

[Sze81] Szeemann, Harald: Museum der Obsessionen. Berlin: Merve, 1981