Sir

As a member of the small community of people working on malarial parasites, I am a constant user of sequence databases from the consortia at Sanger Centre, TIGR and Stanford (Nature 405, 601–602; 2000). I am deeply grateful to these centres, which release all the data freely on the Internet, even in raw form, as soon as they are available.

According to the data-release policy of these three consortia, “these data will assist colleagues in their research, particularly in the search for genes and the studies of the genes' biological function” and “provide investigators with information that may jump-start biological experimentation”. New perspectives are being added to virtually all the ongoing research projects and many more projects will start, as well as other ‘brute force’ approaches such as proteomics and SAGE (serial analysis of gene expression). All these data together will be integrated by bioinformaticists into more sophisticated and complete databases.

I sympathize with those who say that a period of “non-hypothesis science” has started, but I don't agree with them. Good scientists will go on making good hypotheses, and the vastness of available data will only eliminate time-consuming and painstaking benchwork. As things stand, much laboratory work contributes to the efforts of the sequencers by providing clues for annotation and function. In the near future, many other hints will help to clarify the fine structure of the sequenced regions, for example when introns and exons are present that are too small to be revealed by the currently used algorithms, or when regulatory regions need to be identified.

For these reasons, I found the quarrel between the sequencing consortia and laboratory researchers reported in your News article very worrying. It would of course be wrong for a tribe of ‘annotators’ to develop, exploiting raw data only for their own publications instead of helping the sequencers. But, again, the release policy states that any investigator should ask permission of the consortia to publish data based on the content of the databases, and referees should easily be able to distinguish between a genuine contribution and data piracy. In this regard, the PlasmoDB database of Roos and colleagues represents an important tool for all of us and cannot be described simply as an “Internet portal” or connected with the concept of “piracy”.

Scientific careers are largely based on the amount and quality of publications, but it would be impossible for every new chromosome of every organism to be published in the highest-impact journals such as Nature and Science. Perhaps an online journal or supplement dedicated to sequencing and annotation work should be launched.

The most serious aspect of this situation is, however, not mentioned in the malaria article but is well put in another News report on the same pages: “Drive for more genomes threatens mouse sequence” (Nature 405, 602–603; 2000).

Craig Venter of Celera Genomics is quoted as stating that the mouse genome is almost correctly assembled, directed by the blueprint of the human genome — that is, by fundamental work done by public enterprise and non-profit consortia. In spite of this, Venter says Celera's mouse genome will be restricted to subscribers.