Introduction

Viruses, the majority of which are phages (viruses that infect bacteria), are abundant and important components of marine ecosystems. Phages control microbial abundance, influence community composition through lysis of specific host organisms (Fuhrman and Schwalbach, 2003; Weinbauer and Rassoulzadegan, 2004), and are a critical link in global biogeochemical cycles (Wommack and Colwell, 2000; Weinbauer, 2004). In addition, phages contribute to bacterial diversity through transduction and lysogenic conversion (Jiang and Paul, 1998; Faruque et al., 1999; Dinsdale et al., 2008; Paul, 2008; Rohwer and Thurber, 2009). Direct counts with epifluorescence microscopy and flow cytometry indicate that there are approximately 10⁷ virus-like particles in each milliliter of surface seawater (Marie et al., 1999; Wommack and Colwell, 2000). The majority of these viruses are believed to be double-stranded DNA (dsDNA) viruses (Wommack and Colwell, 2000; Weinbauer and Rassoulzadegan, 2004) containing genomes ranging from 25 to 70 kilobases (kb) in length (Steward et al., 2000; Sandaa, 2008).

Current knowledge of marine viruses is highly biased toward dsDNA viruses. The small genome sizes of single-stranded DNA (ssDNA) viruses (1–9 kb) (LeClerc, 2002; Fauquet et al., 2005) have been a major obstacle for the examination of marine ssDNA viruses, potentially leading to an underestimation of viral abundance and diversity in the oceans. Typical methods for direct viral counts, such as epifluorescence microscopy and flow cytometry, are unable to enumerate these viruses because of the weak fluorescence signal produced (Tomaru and Nagasaki, 2007). The small, circular ssDNA genomes of these viruses also exclude them from pulsed-field gel electrophoresis studies of viral community diversity (Steward, 2001). Moreover, early metagenomic studies utilized methods that only captured the diversity of the dsDNA viral community (Breitbart et al., 2002, 2004), excluding the ssDNA viruses.

In spite of these methodological limitations, several recent discoveries have suggested that ssDNA viruses are more prevalent in the marine environment than previously recognized. In 2005, the Chaetoceros salsugineum nuclear inclusion virus, a ssDNA virus that infects a bloom-forming diatom, was the first ssDNA virus to be discovered in the marine environment through infection studies of algal cultures with natural viral communities (Nagasaki et al., 2005). Subsequently, another ssDNA virus was identified that infected a related diatom species (Tomaru et al., 2008). In addition, a ssDNA prophage was obtained in culture through induction of a Synechococcus isolate from the Gulf of Mexico (McDaniel et al., 2006). However, at this time, the number of marine ssDNA viruses that have been cultured in the laboratory is extremely limited. Metagenomic studies including a multiple displacement amplification step, which is known to enrich for small, circular, ssDNA genomes (Haible et al., 2006; Kim et al., 2008), have identified numerous sequences from ssDNA viruses (including both phages and eukaryotic viruses) in marine environments (Angly et al., 2006; Wegley et al., 2007; Desnues et al., 2008; Rosario et al., 2009). However, the ecology of the ssDNA viruses identified through metagenomic sequencing is completely unknown.

Here, we describe two complete marine ssDNA phage genomes belonging to the Microviridae, which were reconstructed from a viral metagenome from 80 m depth at the Bermuda Atlantic Time-series Study (BATS) site in the Sargasso Sea. The depth distribution of each of these ssDNA phages at the BATS site was determined at several time points over a 2-year period through PCR amplification of their replication initiation protein (Rep) gene. Finally, the diversity of one of the phage groups was examined in temporal samples collected from the BATS site, as well as spatial samples from transects through the North Atlantic Ocean.

Materials and methods

ssDNA phage genome assembly and analysis

Previously sequenced viral metagenomic data from 80 m depth in June 2005 from the BATS site (Hydrostation S: 32°10′N, 64°30′W) (Angly et al., 2006) in the northwestern Sargasso Sea were assembled in SeqMan (Lasergene DNASTAR, Madison, WI, USA) using criteria of 95% identity over 35 nt. All contiguous sequences (contigs) longer than 1000 nt were examined by BLAST to identify contigs with similarity to known Microviridae. Several contigs with similarities to the chlamydiaphage-like Microviridae were identified; however, no contigs exhibited similarities to the φX174-like phages. Two complete circular genomes related to the chlamydiaphage-like Microviridae were identified (SARssφ1 and SARssφ2; GenBank accession numbers HQ157198 and HQ157199), as well as two contigs representing partial genome sequences. For the two complete genomes, PCR was performed using several sets of primers designed throughout the genome to verify the genome sequence assembly. Putative Open Reading Frames (ORFs) of at least 300 nt with both start and stop codons were identified in SeqBuilder (Lasergene DNASTAR) and BLASTp was performed to determine similarity to sequences in GenBank (Altschul et al., 1997).
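The ORF criterion above (at least 300 nt, with both a start and a stop codon) can be illustrated with a minimal Python sketch. This is not the SeqBuilder algorithm. The function names are hypothetical, and the choices to accept only ATG as a start codon, to count the stop codon in the ORF length, and to treat the sequence as linear are assumptions made for illustration.

```python
# Hypothetical sketch of an ORF scan; not the SeqBuilder implementation.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=300):
    """Return (strand, start, end, length) for ATG..stop ORFs >= min_len nt.

    Coordinates are 0-based and end-exclusive on the strand that was scanned.
    Assumptions: ATG is the only start codon, the length includes the stop
    codon, and the sequence is linear (ORFs spanning the origin of a circular
    genome are not reported).
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first in-frame ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    length = i + 3 - start
                    if length >= min_len:
                        orfs.append((strand, start, i + 3, length))
                    start = None
    return orfs
```

For a circular genome such as those described here, one simple workaround (also an assumption, not the authors' procedure) would be to scan the sequence concatenated with itself and discard duplicate hits.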

To determine the relationship of the novel phage genomes within the Microviridae, a phylogenetic tree was constructed from the family's three conserved ORFs (the major capsid, the minor capsid pilot and the Rep). Each set of protein sequences was aligned separately using the MUSCLE web server (http://www.ebi.ac.uk/Tools/muscle/index.html) hosted by the European Bioinformatics Institute (Edgar, 2004a, 2004b) and the aligned sequences were manually curated in BioEdit (Hall, 1999). The best-fitting amino acid substitution models were identified for the individual gene and concatenated alignments using ProtTest (Drummond and Strimmer, 2001; Guindon and Gascuel, 2003; Abascal et al., 2005). For the concatenated alignment, the LG+I+G+F model was implemented in PhyML to construct the maximum likelihood trees with the approximate likelihood ratio test as support for the branches (Guindon and Gascuel, 2003; Anisimova and Gascuel, 2006; Guindon et al., 2010). Phylogenetic trees were also constructed separately for each ORF (data not shown), which agreed with the concatenated alignment.

Sample collection and processing

To examine the depth distributions and temporal variability of the ssDNA phages, water samples were collected from the BATS station (31°40′N, 64°10′W) in August–October 2007, March 2008, June–December 2008 and July 2009 using Niskin bottles.

Polyethylene glycol precipitation (Sambrook et al., 1989) was used to concentrate viruses from the August–October 2007 samples. Briefly, 100 ml of whole seawater were 0.22 μm filtered, frozen and stored at −20 °C until processing. Samples were thawed at room temperature, solid polyethylene glycol 8000 was added at a 10% w/v ratio and samples were incubated overnight at 4 °C in the dark. Samples were centrifuged at 11 000 × g for 45 min at 4 °C, and the DNA was extracted from the pellet using a formamide extraction (Sambrook et al., 1989).

Samples of the viral community from June–August 2008, October 2008, December 2008 and July 2009 were processed using the small-scale filtering protocol described by Culley and Steward (2007). Whole seawater samples (50 ml) were filtered through a 0.22 μm Sterivex filter (Millipore Corp., Billerica, MA, USA) and then onto a 0.02 μm Anotop filter (Whatman, Maidstone, UK), which was frozen at −80 °C until extraction. DNA was extracted from the Anotop filters using a Masterpure complete DNA and RNA purification kit (Epicenter, Madison, WI, USA) with a protocol slightly modified from the manufacturer's instructions as previously described (Culley and Steward, 2007).

Large-scale viral concentrates (150–250 l) were collected from the surface and 100 m at the BATS site in March 2008 and September 2008. Each water sample was concentrated to a volume of <500 ml using a 100 kD tangential flow filtration unit (Hollow fiber filtration cartridge; GE Healthcare, Westborough, MA, USA). The viral concentrates were then filtered through a 0.22 μm Sterivex filter (Millipore) to remove bacteria, and stored at 4 °C until processing. Viruses were further concentrated through polyethylene glycol precipitation (as described above) and resuspended in 8 ml of 0.02 μm filtered seawater. The polyethylene glycol-precipitated viral concentrates were then loaded onto cesium chloride density gradients, ultracentrifuged at 61 000 × g for 3 h at 4 °C, and the 1.2–1.5 g ml⁻¹ fraction was collected (Thurber et al., 2009). DNA was extracted from the purified viral fraction using a formamide extraction (Sambrook et al., 1989).

Samples from two separate transects through the Northern Atlantic Ocean were taken to examine the biogeography of the ssDNA phages. Viral concentrates were processed from 20 l of seawater from station BVMS-8 (41°0.059′N, 64°59.869′W) in April 2008 using the large-scale viral concentrate protocol. The small-scale filtering protocol was used to extract viral DNA from samples taken in October 2008 at sites BV42-3 (30°40′N, 64°19′W), BV42-5 (28°40′N, 64°37′W) and BV42-9 (24°40′N, 65°13′W).

PCR amplification, cloning and sequencing

DNA from each sample was amplified using isothermal strand displacement with random hexamer primers and Phi29 DNA polymerase (GenomiPhi V2 DNA Amplification Kit; GE Healthcare, Chalfont St Giles, UK) following the manufacturer's protocol. It is known that this method preferentially amplifies small, ssDNA circular genomes in mixed communities by two to three orders of magnitude (Haible et al., 2006; Kim et al., 2008); therefore, no attempts were made to quantify the abundance of the ssDNA phage genomes.

To confirm that PCR-quality DNA was successfully extracted from each viral sample, PCR for the portal protein (g20) of cyanomyophages was performed on all samples (Sullivan et al., 2008). Primers CPS1.1 and CPS8.1 were used in 25 μl reactions consisting of 1 μl GenomiPhi-amplified template, 1X Sigma REDTaq buffer (10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.1 mM MgCl2, 0.01% gelatin), 1 μM of each primer, 0.2 mM dNTPs and 1 unit of Sigma REDTaq DNA polymerase (Sigma-Aldrich, St Louis, MO, USA). The cyanomyophage reaction conditions consisted of an initial denaturation step of 94 °C for 5 min, followed by 35 cycles of 94 °C for 1 min, 35 °C for 1 min and 72 °C for 1 min; then a final elongation step at 72 °C for 10 min. Negative controls were performed without target DNA for each set of reactions to ensure that there was no spurious amplification.

Primer3 (Rozen and Skaletsky, 2000) was used to design primers to the putative Rep of the newly described ssDNA phage genomes. The primers for the SARssφ1 genome were based on a consensus of one complete genome (SARssφ1) and two additional large contigs, whereas the primers for the SARssφ2 genome were designed to be specific to that genome. The primer sequences for SARssφ1 were SARssconsF (5′-TCATAYACTGCGTAATAWACTTTCTKC-3′) and SARssconsR (5′-CGAATTATATATATCMCCCGAATTRSA-3′). PCR was carried out in 50 μl reactions, using 1X Sigma REDTaq buffer, 1 μM of each primer, 0.2 mM dNTPs, and 1 unit of Sigma REDTaq polymerase (Sigma-Aldrich). The touchdown PCR reaction consisted of the following steps: 95 °C for 5 min, 40 cycles of (94 °C for 1 min, an annealing step of 60 °C that decreased by 0.5 °C each cycle, then 72 °C for 3 min), followed by a final extension step at 72 °C for 10 min. The SARssφ1 PCR product was 695 bp in length. The primers for SARssφ2 were SARssφ2_3371F (5′-AACACAAGCGGAAGACCACT-3′) and SARssφ2_4473R (5′-TGTTTAGCTGGCGGTTTCTT-3′). Reactions were carried out in 25 μl volumes, using 1X Apex Red Taq buffer (750 mM Tris-HCl, pH 8.5, (NH4)2SO4, 1% Tween20), 1 μM each primer, 0.2 mM dNTPs and 0.5 U Apex Red Taq DNA polymerase. Cycling conditions for these reactions were as follows: 95 °C for 5 min, 40 cycles of 95 °C for 1 min, 63 °C for 1 min and 72 °C for 1 min; followed by a final elongation step of 72 °C for 10 min. Negative controls without template DNA were included for each set of reactions. The SARssφ2 PCR product was 1102 bp in length. PCR products were cloned into the pCR2.1-TOPO vector (Invitrogen, Carlsbad, CA, USA), and the vector was transformed into chemically competent DH5α-T1 cells. Clones were screened by PCR to ensure inserts of the proper size were present, then sequenced with the M13F primer (5′-GTAAAACGACGGCCAG-3′) by Beckman Coulter Genomics (Danvers, MA, USA).
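Two details of the SARssφ1 reaction lend themselves to a short worked example: the consensus primers contain IUPAC ambiguity codes, and the touchdown program lowers the annealing temperature every cycle. The Python sketch below expands a degenerate primer into its unambiguous variants and lists the per-cycle annealing temperatures. The function names are hypothetical. The schedule assumes that the first cycle anneals at 60 °C and that each later cycle is 0.5 °C cooler, which is one reading of the protocol text, not a confirmed thermocycler program.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


def touchdown_annealing(start_c=60.0, step_c=0.5, cycles=40):
    """Annealing temperature (deg C) for each cycle of a touchdown program.

    Assumption: cycle 1 anneals at start_c and each later cycle drops by step_c.
    """
    return [start_c - step_c * cycle for cycle in range(cycles)]


forward_variants = expand_degenerate("TCATAYACTGCGTAATAWACTTTCTKC")  # SARssconsF
reverse_variants = expand_degenerate("CGAATTATATATATCMCCCGAATTRSA")  # SARssconsR
schedule = touchdown_annealing()
```

Each primer carries three two-fold ambiguity codes, so each expands to 8 variants. Under the stated assumption the final (40th) cycle anneals at 40.5 °C.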

SARssφ1 Rep sequence analysis

Vector sequences were removed and the low-quality regions were trimmed using SEQUENCHER 4.5 (Gene Codes, Ann Arbor, MI, USA). Sequences were deposited to GenBank with accession numbers HQ142125–HQ142384. All sequences were aligned using the MUSCLE web server hosted by the European Bioinformatics Institute (Edgar, 2004a, 2004b) and manually curated in BioEdit (Hall, 1999). The best-fitting model of nucleotide substitution was identified for the alignments using jModelTest (Guindon and Gascuel, 2003; Posada, 2008). Phylogenetic trees were then constructed from the nucleotide sequences using the GTR+I+G model in PhyML, and the approximate likelihood ratio test values were examined for branch support (Guindon and Gascuel, 2003; Anisimova and Gascuel, 2006; Guindon et al., 2010).

To examine temporal variability, the alignments from all sequences obtained from the BATS site were visualized in Geneious Pro v5.0.4 (Drummond et al., 2010) to identify patterns of nucleotide substitutions. To examine spatial variability, sequences obtained from each location in 2008 were de-replicated at 99% sequence identity with gaps using FastGroupII (Yu et al., 2006) before alignments were created. The tree for the spatial samples was integrated with the geospatial data using GenGIS v1.06 and MapMaker v1.0 to create the two-dimensional georeferenced phylogenetic tree (Parks et al., 2009).
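As an illustration of what de-replication at 99% identity means, the following Python sketch greedily groups aligned sequences around representatives. It is not the FastGroupII algorithm. The function names are hypothetical, the input is assumed to be pre-aligned sequences of equal length, and a gap opposite a base is counted as a mismatch as one possible reading of "with gaps".

```python
def percent_identity(a, b):
    """Percent identity between two aligned, equal-length sequences.

    Assumption: a gap opposite a base counts as a mismatch; columns that are
    gaps in both sequences are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def dereplicate(aligned, threshold=99.0):
    """Greedy de-replication of a {name: aligned_sequence} mapping.

    A sequence joins the first representative it matches at >= threshold
    percent identity; otherwise it becomes a new representative.
    Returns {representative_name: [member_names]}.
    """
    groups = {}
    for name, seq in aligned.items():
        for rep in groups:
            if percent_identity(seq, aligned[rep]) >= threshold:
                groups[rep].append(name)
                break
        else:
            groups[name] = [name]
    return groups
```

Greedy grouping depends on input order, so a real analysis would normally rely on the published tool and not on this sketch.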

Mantel tests were performed using matrices of pairwise differences to determine if genetic divergence was correlated to the time of sampling or geographic location. The pairwise genetic distances were calculated using the GTR model, which was previously determined to be the best-fitting model of nucleotide substitution, implemented in DAMBE (Xia and Xie, 2001) and then manually formatted as a full matrix using Microsoft Excel. Samples collected from the BATS site in June 2005, August–October 2007, March 2008, August 2008, September 2008 and July 2009 were utilized for the temporal comparisons. For the spatial samples, the pairwise geographic distances were calculated for the sequences obtained from the BVMS-8 (March 2008), Hydrostation S (September 2008), BV42-3, BV42-5 and BV42-9 (October 2008) sites, as well as from the BATS samples from 2008. The BATS data set was limited to samples obtained in 2008 to reduce bias associated with any temporal signal. The latitude and longitude for each sample were input into the Geographic Distance Matrix Calculator (http://biodiversityinformatics.amnh.org/open_source/gdmg/) to generate a full pairwise distance matrix in kilometers. The temporal distance matrix and geographic distance matrix were each compared with genetic distance using a Mantel test as implemented in XLSTAT, with 10 000 permutations of the data to determine significance.
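The logic of this comparison can be sketched in a few lines of Python: compute great-circle distances between sampling coordinates, then test whether two distance matrices are correlated by permuting the rows and columns of one of them. This is not the XLSTAT or Geographic Distance Matrix Calculator implementation. The function names are hypothetical, the haversine formula on a sphere of radius 6371 km is an assumed distance model, and the P-value is one-sided (proportion of permutations with a correlation at least as large as the observed one), which may differ from the software's settings.

```python
import math
import random


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def _upper_triangle(matrix, order):
    """Upper-triangle entries of a square matrix after reordering its objects."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=10000, seed=0):
    """Mantel test between two symmetric distance matrices (lists of lists).

    Returns (r, p), where p is a one-sided permutation P-value.
    """
    identity = list(range(len(dist_a)))
    x = _upper_triangle(dist_a, identity)
    r_obs = _pearson(x, _upper_triangle(dist_b, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel the objects of the second matrix
        if _pearson(x, _upper_triangle(dist_b, perm)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)
```

Permuting whole rows and columns together, instead of shuffling individual matrix entries, preserves the dependence among distances that share a sample, which is the reason a Mantel test is used in place of an ordinary correlation test.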

Results and Discussion

Marine ssDNA phage genome sequences

Two complete ssDNA genomes were assembled from the previously sequenced viral metagenome from 80 m at the BATS site (Angly et al., 2006) and verified by PCR. These are the first completely sequenced ssDNA phages from the marine environment. The novel genomes represent a community genome, as they are a mixture of very closely related viral genomes. Genomic characteristics suggest that these viruses are novel members of the family Microviridae, which are icosahedral viruses with circular ssDNA genomes between 4.4 and 6.1 kb (Fane, 2005). Members of the Microviridae infecting a diverse range of hosts (including proteobacteria, Spiroplasma, Chlamydia) have been isolated and completely sequenced (Sanger et al., 1977; Renaudin et al., 1987; Storey et al., 1989; Lui et al., 2000; Read et al., 2000; Brentlinger et al., 2002; Garner et al., 2004; Rokyta et al., 2006). Genome comparisons have identified two distinct groups of Microviridae (those similar to the chlamydiaphages and those similar to Escherichia coli φX174), leading to the suggestion that ssDNA phages evolve through different mechanisms than dsDNA phages (Brentlinger et al., 2002; Rokyta et al., 2006). Although the Microviridae have been extensively studied, the sequenced representatives infect a small number of bacterial hosts. Expansion of the Microviridae family with environmental sequences will enable a better understanding of their evolutionary mechanisms and ecological impacts.

Both phage genomes (SARssφ1 and SARssφ2) exhibited similarities to known phages of the Microviridae in terms of size, GC content, genome organization and protein sequences. The SARssφ1 genome was 4487 nt with 43.9% GC and the SARssφ2 genome was 4478 nt with 46.9% GC. The genome organizations of these phages were syntenous with the other members of the Microviridae (Figure 1). The phage genomes were most closely related to partial ssDNA phage sequences identified in microbialites through viral metagenomics (Desnues et al., 2008), but also displayed similarities to cultured members of the Microviridae. Maximum likelihood phylogenetic trees of three concatenated ORFs (the major capsid, the minor capsid pilot and the Rep) demonstrated that these two genomes represent novel members of the Microviridae. The SARssφ1 phage genome clustered just outside of the known Chlamydia phages, whereas the SARssφ2 phage genome clustered with Bdellovibrio phage φMH2K (Figure 2). As these two novel ssDNA phages originate from the marine environment and infect unknown hosts, it is interesting that they clearly cluster with the chlamydiaphage subfamily of the Microviridae, and are only distantly related to φX174. This phylogenetic placement supports the divided nature of the Microviridae family, suggesting that intermediates between these two subfamilies may not exist because of the nature of evolution of ssDNA phages (Brentlinger et al., 2002).

Figure 1

Organization of five major genes in the SARssφ1 and SARssφ2 genomes compared with two other members of the Microviridae (Chlamydia phage 1 (Chp1), and Bdellovibrio phage φMH2K).

Figure 2

Phylogenetic relationship of the marine ssDNA phages compared with known members of the Microviridae based on the three conserved ORFs (the major capsid, the minor capsid pilot and the Rep). The phages included in the tree are Escherichia phage φX174 (NC_001422.1), Spiroplasma phage SpV4 (NC_003438.1), Bdellovibrio phage φMH2K (NC_002643.1), the Chlamydia phages 1–4 (NC_001741, NC_002194, NC_008355, NC_007461, respectively), CPAR39 (NC_002180) and phiCPG1 (NC_001998).

Distribution of ssDNA phages throughout the water column at the BATS site

The depth distribution for each of the newly identified ssDNA phage genomes throughout the upper 200 m of the water column at the BATS site was determined using PCR for the Rep gene. The presence of high-quality phage DNA in each sample was verified through PCR for the g20 gene of cyanomyophages (dsDNA phages that infect cyanobacteria), which was found at all depths tested. In contrast to the cyanomyophages, the two ssDNA phages displayed narrower depth distributions, and these distributions were distinct from each other (Table 1). SARssφ1 was most frequently identified at 80 m depth (75% of the samples; n=8), but also recovered in at least 50% of the samples taken from 100 to 120 m. SARssφ1 was identified at depths ranging from 40 to 160 m, but was never recovered from surface waters (0 or 20 m) or the deepest samples (200 m). In contrast, SARssφ2 was most frequently detected at depths of 100 m or greater. SARssφ2 was never detected in samples from 40 m or shallower, and was only sporadically detected at 60 m. Interestingly, at 80 m, where SARssφ1 was most frequently detected, SARssφ2 was never found. SARssφ2 was found in >75% of the samples from 140 m or deeper, including 200 m, where this phage was detected in 80% of the samples (n=5). As 200 m was the deepest depth tested, it is unknown if SARssφ2 was also found deeper in the water column. The absence of both ssDNA phages in surface waters at the BATS site was verified through examination of large-scale (200 l) viral concentrates collected from the surface and 100 m depths at three unique time points. Although both ssDNA phages were consistently identified in the 100 m viral concentrates, neither phage was recovered from the surface water viral concentrates.

Table 1 Presence of SARssφ1 and SARssφ2 at the BATS site for a variety of depths sampled between 2007 and 2009. The presence of high-quality phage DNA in each of the samples was verified through PCR amplification of the cyanomyophage g20 gene

The distinct depth distributions of these two ssDNA phages suggest that they infect different hosts. Although some potential host microorganisms can be found throughout the upper 200 m, highly structured depth distributions, which vary seasonally in accordance with the degree of water column stratification, have been demonstrated for bacterial communities at the BATS site (Morris et al., 2005; Carlson et al., 2009; Treusch et al., 2009). No seasonal differences in the presence or depth distribution of the ssDNA phages were apparent from our data and the hosts for these phages still remain unknown.

Temporal variation in ssDNA phage sequence diversity at the BATS site

A portion of the Rep gene from phages related to SARssφ1 was amplified and sequenced from samples collected from the BATS site in June 2005 and at various time points between August 2007 and July 2009. The Rep gene primer sequences were designed based on the SARssφ1 genome as well as two other contigs assembled from the June 2005 viral metagenome that represented partial genomes of related ssDNA phages. Sequence comparisons revealed the recovery of four major viral types with these primers, which are distinguishable based on phylogenetic comparisons as well as patterns of nucleotide polymorphisms (Figure 3). Type I was only found in 2007, and this type contained the majority of sequences obtained in that year. Type II was the dominant type found in samples collected during August 2008, but also was present in samples from all the other years tested. Type III was dominated by sequences from March to September 2008, but also contained sequences from all the other years tested. Type IV consisted of sequences from 2007 through 2009. As many as three different ssDNA phage types were identified from a single time point.

Figure 3

Temporal variation in the diversity of the Rep gene from SARssφ1 phages at the BATS site in samples collected between June 2005 and July 2009. A maximum likelihood phylogenetic tree was constructed, and a schematic of the DNA alignment was placed adjacent to its corresponding node on the tree. The gray bars in the sequence alignment represent positions where the sequences match to a consensus, whereas the various color bars correspond to nucleotide polymorphisms. Sample names on the phylogenetic tree are color coded by sampling date.

The evolution of ssDNA phages has been a topic of recent interest since studies have demonstrated that ssDNA viruses have the highest per-site mutation rates of all DNA-based systems and overall evolutionary rates nearly as great as RNA viruses (Duffy et al., 2008; Cuevas et al., 2009; Duffy and Holmes, 2009). Comparison of the sequences recovered from different dates demonstrates that the diversity of ssDNA phages at the BATS site changed over time, which was verified through a Mantel test demonstrating a correlation between genetic distance and temporal distance between sample collection (n=209; r(AB)=0.251; P<0.0001). Temporal variation in ssDNA phage sequence diversity is evident from Figure 3. For example, sequences belonging to types I and III were identified in August 2007, but only type I sequences were recovered in September 2007, and 1 month later (October 2007), sequences belonging to types I, II and IV were found. Comparison of the same month in consecutive years also supports changes in the composition of the ssDNA phage community over time. In August 2007, phage sequences belonging to types I and III were found; however, in August 2008, only types II and IV were recovered. Similarly, the September 2007 sample contained only type I sequences, but the September 2008 sample contained types III and IV. Despite these temporal changes, nearly identical sequences (99% nucleotide identity) were recovered from samples collected more than 4 years apart (June 2005 versus July 2009), suggesting that individual ssDNA phage sequences can be stably maintained in the marine environment.

Diversity of the ssDNA phages throughout the North Atlantic

To determine the geographic distribution of ssDNA phages in the North Atlantic Ocean and examine the sequence variability throughout their geographic range, the Rep gene of phages similar to SARssφ1 was amplified from different sites between Massachusetts and Puerto Rico in 2008. Throughout the transects, SARssφ1 was identified at depths of 80–100 m, but never found in surface water samples, consistent with the depth distribution of this phage at the BATS site. The georeferenced tree showed that the Rep sequences generally clustered by location (Figure 4). Sequences obtained from Hydrostation S clustered with sequences from the BATS site, which is likely due to the fact that these sites are only 56 km apart. Divergent sequences appeared with increasing distance from the BATS site, and a Mantel test indicated a positive correlation between genetic distance and geographic distance between sampling locations (n=177; r(AB)=0.573; P<0.0001). Although some sequences from sites BV42-3 and BV42-5 clustered with sequences from the BATS site (types II–IV), many sequences from these southern sites belonged to novel clades (types VI and VII). Each viral type contained a variety of nucleotide polymorphisms, but many of the changes were synonymous (Supplementary Figure 1). Although a diversity of ssDNA phage sequences was recovered from the southernmost (BV42-9; types VII and VIII) and northernmost (BVMS-8; type V) sites, these sequences displayed no genetic overlap with sequences identified from BATS, demonstrating that genetic distance of sequences increased with geographic distance.

Figure 4

Spatial variation in the diversity of the Rep gene from SARssφ1 phages from different sites in the North Atlantic Ocean in 2008. Sequences were de-replicated at 99% sequence identity with gaps, then a maximum likelihood phylogeny was constructed and layered on top of a map. Samples are color-coded based on location. The SARssφ1 phage genome is indicated with a white star, and the other two contigs used to design the primers are also shown in white.

Differences between ssDNA and dsDNA phages

This is the first in-depth analysis of the prevalence and diversity of environmental ssDNA phages over depth, time and space. Several previous studies have examined the biogeographical distribution of specific dsDNA viruses and found that these viruses are widely distributed in nature (Kellogg et al., 1995; Short and Suttle, 2002, 2005; Breitbart and Rohwer, 2004; Labonte et al., 2009; Huang et al., 2010). The recovery of nearly identical sequences from disparate environments throughout the world has led to the suggestion that there is a shared global gene pool for viruses (Breitbart and Rohwer, 2005; Angly et al., 2006). In contrast to the cosmopolitan distribution of some dsDNA phage sequences (Breitbart and Rohwer, 2004; Short and Suttle, 2005; Huang et al., 2010), the ssDNA phages described here displayed narrower (and distinct) depth distributions and their sequence divergence correlated with geographic distance. These findings are consistent with previous observations for ssDNA phages in microbialite systems, which were found to have extremely narrow geographic distributions (Desnues et al., 2008). At present, the reason behind this difference in biogeography between the marine ssDNA phages described here and previously studied dsDNA phages is unknown; however, it may be related to the distribution of their hosts or fundamental differences in viral lifestyle, genome evolution rates or virion stability. Future work needs to elucidate the host ranges of these and other ssDNA phages, and determine the roles of ssDNA phages in marine microbial ecology.