Plans by the Massachusetts Institute of Technology (MIT) look set to transform the way in which academics publish and archive their results and raw data.

On 4 November, the institute will launch its DSpace electronic archive, a joint venture with technology firm Hewlett-Packard of Palo Alto, California.

The ultimate goal of the system is to capture MIT's entire intellectual output in a stable archive, and to extend it to create a seamless global network of similar archives at other research institutions. These multiple databases could be searched as if they were a single entity, and specialized collections could be built by drawing data from participating archives.

More than 40 academic institutions worldwide are considering adopting DSpace, and seven universities, including the University of Cambridge in Britain, are installing and evaluating the system. The move towards such archives is in part driven by libraries, which are looking for cheaper alternatives to costly academic journals.

The technology involved is not rocket science, admits MacKenzie Smith, associate director for technology at MIT Libraries and head of the DSpace project. The main innovation is in transforming the often anarchic sprawl of researchers' websites into a professionally managed system, she says. Documents uploaded onto DSpace must be carefully tagged with 'metadata' codes, such as subject keywords, which will help search engines to navigate the database.

DSpace should also spare MIT researchers much of the hassle of setting up and maintaining their own websites. The system supports a huge variety of file formats, allowing users to make anything from preprints to medical images available online. It also lets users attach other features to their documents, such as software that can manage access to restricted material.

But Smith warns that getting busy faculty members to provide high-quality metadata on their material is a big obstacle, and could deter many would-be participants. “You find that users are not beating down your door,” says Eric Van de Velde, director of library information technology at the California Institute of Technology, who has developed a similar, smaller-scale institutional archive.

To get enough data onto the system to make it useful, DSpace has grouped content around individual departments or labs that have well-organized publishing activities. The launch version of DSpace includes five of these 'early adopter' communities, among them MIT's Department of Ocean Engineering and MIT Press. The groups have already placed about 750 documents on the database.

Smith says that a major demand from researchers is the storage of their raw data, such as medical images and other large data files. “A lot of these deserve to be managed and preserved as much as the publications that result from them,” she says. She is also talking to publishers about the possibility of linking papers in their journals to supplementary information and data stored in the DSpace network of archives.

DSpace may provide a more solid foundation for preprints than existing servers, and it can also support journals and new forms of information sharing, which could help libraries to cut journal subscription costs. In the long term, many academic centres feel that they need to retain control of more of what they produce, and DSpace may be a first step in that direction.

http://www.dspace.org