ICoMM
INTERNATIONAL CENSUS OF MARINE MICROBES

ICoMM Benthic Systems Working Group discussions: Southampton Oceanography Center meeting Jan 14-15.

General comment:
The ICoMM Benthos working group initiated their discussions using questions posed by the secretariat to each of the ICoMM working group members (see below and indicated by italics). Within the context of these discussions, the Benthos includes everything from sedimentary systems and seamounts to hydrothermal ridge axes and bare rock ridge flank environments. A secondary circumscription of Benthic organisms might be: 'everything that is driven by "Dark energy"'. Early discussions reflected the working group's collective view that an international census of marine microbes must include more than the counting of microbes by some genetic or physiological criteria. The census must gather contextual information in order to interpret the significance of diversity measurements and to maximize the return of scientific information from investments in the census. We also recognize that it will not be possible to inventory and count all organisms in all samples. However, the information gained from contextual data (e.g. differences between communities from similar environments that are geographically distinct or remote) can inform us about how near we are to obtaining a nearly comprehensive census. As a corollary, if ICoMM defines diversity through measurements of active genes, the environmental context will radically alter microbial expression patterns. For example, the same organism can have different functions and physiologies in different geochemical contexts. We also express two views: a census of what is there is not always a census of what is active, and screening diversity does not lead to self-evident physiological categories or provide evidence on ecosystem function and environmental impact.

Diversity:
The discussion about diversity was challenging because there are many ways to describe genotypic and phenotypic similarities and differences between kinds of microbes. We reject the use of the term species in the classical sense used to describe sexually compatible multicellular organisms. There are no self-evident or objective 'species' criteria, with a unified level of resolution that is equally applicable to questions in microbial ecology, biogeochemistry, and evolution. We are working with different metrics at different scales of resolution; it is unavoidable at present. Comparisons of genes, genomes and function offer different modes for describing microbial diversity. The Technology working group should carefully address existing and future technical capabilities for measuring microbial diversity.

  1. What are the most timely and important questions regarding benthic microbial diversity?
    1. The working group discussed the recurrent theme: Is everything everywhere? Are there evolutionary lineages of microorganisms in marine environments and is it possible to identify biogeographical distribution patterns?
    2. The working group explored the question: Should ICoMM be concerned with genetic, genomic diversity or functional diversity?
      1. Genetic diversity as defined by a single or small number of genes may not always be sufficient for a census for a variety of reasons including:
        1. inadequate information to resolve location or phylotype in molecular trees
        2. conflict between relationships inferred from different genes (either due to horizontal gene transfer or altered rates of evolutionary change).
        3. technical problems associated with different efficiencies of DNA extraction or design of primers for amplifying highly diverged homologues.
      2. Estimates of genomic diversity in microbes are complicated by the potential for large-scale movement of genetic elements between microbial genomes and the technical challenge and expense of collecting genome sequences from microbial populations. Without large scaffolds, the assembly of shotgun environmental sequences from complex microbial populations may be beyond our grasp. The technology group should explore optimal and alternative strategies for gathering information about the genomic context for a gene that is highly conserved or that specifies a particular function.
      3. Since ecosystem parameters will influence functional diversity, contextual information (e.g. water chemistry, temperature) will be essential for interpreting the functional diversity of microbial populations.
    3. How does diversity relate to function and ecosystem processes?
    4. How does the choice of gene influence diversity assessments and inference about presence or absence of functional groups in a complex community?
    5. What scales of heterogeneity (spatial, temporal) are most appropriate for the census?
    6. How can we link diversity at different scales?
    7. What is the optimal measure of microbial diversity for a census of microorganisms? (Other scientific questions may require different measures of diversity)
  2. What metric can be used to describe microbial diversity?
    1. Functional diversity: molecular techniques, e.g. expression profiling and qPCR, can serve as a proxy for the activity of specific functions.
    2. Phylogenetic diversity: rRNA, other conserved genes, or multiple conserved genes.
    3. Genomic diversity: genomic context of genes associated with a conserved gene, or with a cluster of genes that can be identified within large DNA insert libraries by functional assays. Includes sequence analysis of BACs, cosmids and fosmids.
    4. Depending on the research question under investigation, we may need different metrics. For example, if we seek to focus on the genetic diversity of a particular phylotype, it will be necessary to use rapidly evolving sequences. In contrast, a general survey of all organisms in a microbial population will rely upon more slowly evolving gene sequences. In some cases the census will seek information about relative or absolute numbers of different kinds of organisms.
  3. For molecular measures, what are the strengths and weaknesses of single-gene, genomic, and population-level perspectives? Which of these will have the greatest long-term benefits?
    1. Problems associated with single-gene and genomic level perspectives are summarized above. One potential option would be the sequencing of genomes from representative single cells from different environments or several genomes from a single environment. This will require advances in cloning technology and at least a 100-fold reduction in the cost of DNA sequencing.
  4. How should the dynamics of diversity be handled, and should the approach be based on species (phylotypes), population diversity of a phylotype or community diversity?
    1. The working group was less concerned about temporal diversity than diversity at different sites. Temporal diversity is important but may not be possible to monitor in the benthic environment because of challenges of obtaining samples. The technology group should address time-series sampling capabilities in the benthos.
  5. How will approaches to microbial diversity differ from those used in the Census of Marine Life, which focuses mainly on metazoans?
    1. The metric of diversity
      1. Critical discussion of this thorny issue will be necessary. There are no self-evident 'species' criteria, with a unified level of resolution. We are working with different metrics at different scales of resolution; it is unavoidable at present. Can the technology group define metrics using different techniques to guide ICoMM participants?
      2. Depending on the research issue, we may need to use different metrics. E.g. DNA sequence diversity, DNA arrays, expression profiling, qPCR, FISH technology, etc.
      3. Evaluate a number of selected key environments in terms of microbial diversity. Multiple examples should be studied for each selected key environment. Here the key question is to determine whether or not a particular environment selects for a specific microbial population structure.
      4. Single-cell genomics is the ultimate screening tool to evaluate the biogeography issue. This will of course require enormous reductions in the cost of DNA sequencing and annotation, as well as development of technology for producing libraries from single cells. Genomics has the potential to advance our understanding of genetic diversity and may lead to new concepts/definitions of microbial 'species' that may aid in carrying out a census, in the way that MLST has for cultivated organisms but with greater resolution. The technology group should address how to incorporate diversity measurements based upon genomics into a census that also reports on the absolute and/or relative numbers of different kinds of microbes in a sample.
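As one illustration of how a census could report both the number and the relative abundance of different kinds of microbes in a sample, the sketch below computes phylotype richness and the Shannon index from phylotype counts. The working group did not name these indices; the choice of indices, the function names and the example counts are all assumptions made for illustration.

```python
import math


def richness(counts):
    """Number of phylotypes observed at least once in the sample."""
    return sum(1 for n in counts.values() if n > 0)


def relative_abundances(counts):
    """Convert absolute counts into proportions that sum to 1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample contains no counted organisms")
    return {name: n / total for name, n in counts.items()}


def shannon_index(counts):
    """Shannon index H = -sum(p * ln p) over phylotypes with p > 0."""
    props = relative_abundances(counts)
    return -sum(p * math.log(p) for p in props.values() if p > 0)


if __name__ == "__main__":
    # Hypothetical counts of clones per phylotype in one sediment sample.
    sample = {"phylotype_A": 50, "phylotype_B": 30, "phylotype_C": 20}
    print("richness:", richness(sample))
    print("Shannon index:", round(shannon_index(sample), 3))
```

A different research question (for example, diversity within a single phylotype) would call for a different metric, as noted above; this sketch only covers the simple case of counts per phylotype.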

    Other issues:

    The working group discussed experimental strategies and how they could translate into fundable projects. Getting support for 'exploration' will be very difficult in a climate that explicitly encourages hypothesis-driven research. The discussions on Integration outlined a number of experimental paradigms that ICoMM should formulate in terms of testable hypotheses.

    Integration:

    The following questions about Integration were addressed in the context of outlining experimental strategies that could attract funding under the theme 'Links between diversity measurements, environmental conditions and functional diversity'.

    1. What level of biodiversity is necessary to interpret ecological, physiological and process-related observations?
    2. How can process and ecological data inform us about diversity and what levels of information are required?
    3. What are the key scientific questions that a census can address?
    4. Where are the gaps in the investigative framework?
    5. How is diversity related to process stability? Can predictive frameworks be defined?
      1. Links between diversity measurements, environmental conditions and functional diversity.
        1. Strategies
          1. Strategy 1) By linking genetic and physiological diversity ('function') it is possible to formulate hypotheses about links between genotype and phenotype.
          2. Strategy 2) Looking the other way around, an analysis could start with metabolic function and diversity in a given environment, and extrapolate from these data towards potential diversity. What types of organisms can be expected in a given environment? Can the genes discovered in a certain environment account for all functions in this environment? Imagine new gene functions and classes. Do not start with screening diversity and try to fit it into a priori functional categories: the categorical framework may not fit.
          3. Strategy 3) The continuum of environmental conditions and niches may correspond to a continuum of sequence diversity. Several thousand genes may have to be sequenced to document this link.
          4. Strategy 4) Global diversity patterns can be observed in a single species; these may be a consequence of distinct environmental conditions and pressures, or of biogeographic separation.
          5. Strategy 5) Complement genomics with proteomics. Proteomics today gives genomic-level information, which indicates high complexities. Getting good, pure protein in sufficient quantities is the limiting factor. However, in some cases there is useful information to be obtained if a key component is dominating (ANME-1 mcrA protein from Black Sea ANME mats) or biomass is sufficiently high to enable processing and purification.
          6. Strategy 6) Focus on groups with a well-developed toolbox that couples function and diversity. SRBs are an excellent example where the coupling is tight: excellent molecular equipment, primers, arrays. By contrast, fermentations are extremely complex and pervasive, but not much work has been done on them (hydrogenase as a devil's advocate idea). Methanogenesis is a simpler example.

    Biogeography and diversity:

    The working group also discussed the concept of the biogeography of microbes, or microbial distribution.

    1. Principal problem 1). Can we reject 'everything is everywhere' with absolute certainty? A strict rebuttal is not possible, since we don't have a 100% complete census of any microbial habitat. A microorganism may remain undetected by various tools although other datasets indicate its existence. On the other hand, if the hypothesis is correct it eliminates the need to sample every environment to absolute completeness.
    2. Principal problem 2). The speed of evolutionary gene flow vs. environmental exchange determines whether biogeography is possible at all; genetic speciation must be faster than exchange mechanisms. The issue is the strength of dispersal mechanisms that override genetic isolation and the evolution of separate lineages. An example is the phylogenetic separation of freshwater and saltwater bacteria as shallow but separate clusters (Zwaart and Crump 2003).
      1. Strategy 1: Fosmid libraries from geochemically similar environments, to differentiate changes in communities that are not necessarily affected by geochemistry.
      2. Strategy 2: Look for organisms in the wrong place. If organisms show up out of place again and again, that is an indication of an effective environmental dispersal mechanism. Examples are thermophiles in arctic sediments.
      3. Strategy 3: Use same field campaigns for multiple investigators with multiple targets
      4. Strategy 4: Compare the microbial community structure and genomic repertoire for three to ten very similar or nearly identical geochemical habitats.
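The comparison of community structure across several similar habitats (Strategy 4) can be sketched as a pairwise similarity calculation. The example below is a minimal illustration, not a method adopted by the working group: it assumes each site is summarized as counts per phylotype and uses two common measures, Jaccard similarity (presence/absence) and Bray-Curtis dissimilarity (abundance). The site names and counts are hypothetical.

```python
from itertools import combinations


def jaccard_similarity(a, b):
    """Shared phylotypes divided by all phylotypes seen at either site."""
    union = a | b
    if not union:
        return 1.0  # two empty inventories are treated as identical
    return len(a & b) / len(union)


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two count tables (0 = same, 1 = disjoint)."""
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        return 0.0
    shared = sum(min(x.get(k, 0), y.get(k, 0)) for k in set(x) | set(y))
    return 1.0 - 2.0 * shared / total


if __name__ == "__main__":
    # Hypothetical phylotype counts from three geochemically similar seeps.
    sites = {
        "seep_A": {"ANME-1": 40, "SRB-1": 30, "phylotype_X": 5},
        "seep_B": {"ANME-1": 35, "SRB-1": 25, "phylotype_Y": 10},
        "seep_C": {"ANME-1": 10, "phylotype_X": 5, "phylotype_Z": 60},
    }
    for (name1, c1), (name2, c2) in combinations(sites.items(), 2):
        j = jaccard_similarity(set(c1), set(c2))
        bc = bray_curtis(c1, c2)
        print(f"{name1} vs {name2}: Jaccard={j:.2f}, Bray-Curtis={bc:.2f}")
```

If geochemically similar sites repeatedly show low similarity, that would be one line of evidence against 'everything is everywhere'; incomplete sampling can produce the same pattern, so such numbers need the contextual data discussed elsewhere in these notes.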

    Sampling, prioritization and coordination with other programs:

    1. Schedules, locations and priorities. This topic should be addressed by SAC and/or the Technology working group, as well as by participants at the general meeting.
    2. Relationship to sampling program in the Census of Marine Life and other ongoing or currently planned sampling efforts. This topic should be addressed by SAC and/or Technology working group
    3. Are there mileposts that will logically define phases of the project? This topic should be addressed by SAC and/or Technology working group
    4. What observations are needed at each sampling site? Density of sampling: a key question or hypothesis is 'Is everything everywhere?', where 'everything' refers either to all major lineages (a genetic definition) or to all metabolic capabilities (a functional definition). If we could demonstrate unequivocally that everything is everywhere, it would reduce the requirements for sampling. This topic was addressed under Integration.
    5. How should we address temporal variations?
    6. How should we address spatial heterogeneity, particularly with regard to commensal populations and chemosynthetic environments (e.g. seeps, whale falls, wood falls)? This topic should be addressed by SAC and/or Technology working group as well as by participants in general meeting. The Benthos systems working group did not define detailed spatial sampling strategies for particular sites but the discussions captured the idea that several seeps, vents, the sea floor and geographically distributed sites including sediments with similar water chemistries should be sampled and compared.

    Databases:

    The following questions guided the Benthos working group's discussions of databases. The Technology working group may wish to address these issues in greater detail.

    1. What is the structure of the information that will be produced?
    2. What are the specific database needs for benthic systems?
    3. What are the preferred techniques for carrying out a census in benthic systems?
    4. How can databases be structured to facilitate communication?

    US databases funded by NSF (Note: NSF database contact Peter Cornellon, URI) are generally judged according to their technical merits rather than their utility to the investigator. ICoMM will emphasize content and a format that facilitates access. Currently, ICoMM relies upon MICROBIS, which provides image-rich information about protists and a microbial name-serving capability. It also functions as a 'traffic cop' in the sense that it provides links between its content and relevant information residing in other databases accessible through the web.

    Within CoML there is a primary database, OBIS (plus Eurobis), as well as web sites for each of the field projects. The ChEss database is undergoing revision and will be integrated into OBIS. ICoMM must be compatible with OBIS and other CoML web sites. It should strive to share taxonomic information and naming capability with other CoML databases, particularly for zooplankton, which houses some of the protists.

    Micro*scope:

    ICoMM website: MICROBIS, based on Micro*scope (contact: Paddy Patterson). This has received diverse funding (principally through NASA); the system is developed and growing, currently with approximately 3 million names.

    • Non-molecular at present, image rich database of microorganisms and a detailed compilation of names that can be organized according to alternative taxonomic schemes.
    • Access by name, habitat, but at present focused on Protists
    • Image linked to other relevant sites on the web
    • [Image placeholder: Database web-resources]

      Different classifications can be added at a later stage (see later phased approach)

      NOTE: The star site concept is very useful: software can be exported to other sites and new databases synchronized back to Micro*scope. However, accuracy remains at risk if there is not a single gatekeeper in charge of the information.

      Volunteer submissions of images and data via a webmaster were also discussed.

      The engine for Micro*scope is UBIO; OBIS is considering using UBIO.

      The inclusion of basic biochemical information for cultured representatives of certain groups (along the lines as available in papers describing newly named taxa) was discussed.

      MICROBIS could take data such as phylogenetic trees, possibly via a link with ARB to allow search for nearest relative to new sequence data.

      It was suggested that only the simplest core information for research purposes should be provided in MICROBIS, rather than detailed analysis such as would be used for publication-quality data, because of the dangers associated with automated alignment of sequences (though it is noted that this is now becoming standard practice due to the increased data coming on line with genomic sequencing efforts). The technology group should address which molecular database(s) can optimally provide services and links to MICROBIS.

      Important questions:

      1. What is the depth of information to include for a census? For example, the database could search by sequence and provide the nearest relative or phylogenetic position, then lead (link) to a page with the information available on related types, including biochemistry. It would be desirable to link to physiological databases such as Bergey's and The Prokaryotes. The Technology group should address the optimal way to integrate with other molecular databases, and interoperability.
      2. How do you judge relatedness? By standard probabilistic methods. If sites (databases) already have information, use them in the framework (e.g. the meta-site concept): don't duplicate in MICROBIS, use existing resources, e.g. Tree of Life, Miracle (biogeochemistry linked to organisms), and biogeography sites. The database can be set up to search for the occurrence of a sequence by geographic or site data with contextual information. Structure of MICROBIS: phylotype, contextual information, and links.
      3. Important information:

        What are the contextual data required?

        Where is the quality control? It is not possible to be too precise. E.g., can the database hold controversial (arguable) information? Yes, but it is always up to the individual to make an assessment. FISH data and molecular probes could also be added at a later date; the database may be developed in phases. As new information is added, new searches will be required; this is part of the dynamic use of the data.

      4. Options

        1. rebuild its own phylogenetic trees
        2. use another database to update the phylogenetic basis; the latter is preferred if a formal link can be established
          1. Census data should be simple and not duplicate effort; use of links is better. MICROBIS would ship sequences to ARB for analysis. Note: ARB is always behind GENBANK.
          2. MICROBIS should be user friendly and census driven.
          3. Technical hierarchy: GENBANK : ARB : MICROBIS (the latter holds more contextual information and is the central repository for census information, images, interpretation and search tools; it points to ARB and GENBANK).

        Data entry:

        Data entry should establish a common set of minimal identifiers and contextual information. A mechanism is in place but not yet public; a password-protected system will be in place to prevent spurious information. Comments are possible, but this leads to some problems. Expert reviewers are needed to ensure the integrity of the data (check how other databases handle this), but this does slow down progress. Is the database going to include environmental parameters? Are databases with such information in existence? Within the NSF-funded Microbial Observatories there are examples of databases with physiological information. The Seamount database (through GERM) can be used to design this... links to information can be included. We should ask for information on sample sites and relevant conditions, and agree on a format, e.g. the ODP system; there is little in others, e.g. ChEss. This is a problem with GENBANK: contextual information is often lacking, only optional, and not encouraged. However, IODP can be asked to ask authors to include this information; RIDGE2000 is an example of good practice. Try to ensure contextual information is included in as many places as possible. IODP can insist on this information, and future cooperation can be based on this to help compliance. Steve D'Hondt asked that people let him know what is required and he will forward the request.

        Coffee discussion.

        Environmental data much more variable and difficult to collect, yet very useful in understanding the distribution and likely occurrence of organisms/sequences in specific environments.

        Ask ChEss what OBIS is capturing in terms of environmental data. (Answer: not much.)

        MICROBIS: key environmental parameters that should be added for context and linked to the sequence/isolate information.

        1. Location: latitude, longitude, and depth.
        2. Aerobic/ anaerobic (Oxygen concentration)
        3. pH
        4. Temperature
        5. Salinity
        6. Conductivity
        7. Broad characteristics: sediment, rock, water
        8. All other available information collected from a sampled site. Other databases with supporting information should be identified, linked and cross-referenced. MICROBIS or ICoMM could serve as a portal to information about potential sampling opportunities that might be afforded by planned cruises and the associated investigators (i.e., provide an International BFG). SAC, working with the secretariat, should identify those opportunities.
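The parameter list above can be read as a minimal record attached to each sequence or isolate. The sketch below shows one possible shape for such a record. It is not the MICROBIS schema: the class name, field names, units and validation rules are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

ALLOWED_SUBSTRATES = {"sediment", "rock", "water"}
OPTIONAL_FIELDS = ("oxygen_umol_per_l", "ph", "temperature_c",
                   "salinity_psu", "conductivity_s_per_m")


@dataclass
class SampleContext:
    """Hypothetical contextual record for one sampled site."""
    latitude_deg: float
    longitude_deg: float
    depth_m: float
    substrate: str                              # broad type: sediment, rock, water
    oxygen_umol_per_l: Optional[float] = None   # 0 would indicate anoxic conditions
    ph: Optional[float] = None
    temperature_c: Optional[float] = None
    salinity_psu: Optional[float] = None
    conductivity_s_per_m: Optional[float] = None
    other: dict = field(default_factory=dict)   # any further site information

    def validate(self):
        """Return a list of problems found in the required fields."""
        problems = []
        if not -90 <= self.latitude_deg <= 90:
            problems.append("latitude out of range")
        if not -180 <= self.longitude_deg <= 180:
            problems.append("longitude out of range")
        if self.depth_m < 0:
            problems.append("depth must not be negative")
        if self.substrate not in ALLOWED_SUBSTRATES:
            problems.append("unknown substrate")
        return problems

    def missing_fields(self):
        """Names of optional parameters that were not recorded."""
        return [name for name in OPTIONAL_FIELDS if getattr(self, name) is None]


if __name__ == "__main__":
    record = SampleContext(latitude_deg=30.0, longitude_deg=-42.0,
                           depth_m=3500.0, substrate="sediment",
                           temperature_c=2.5)
    print("problems:", record.validate())
    print("not recorded:", record.missing_fields())
```

Keeping the optional parameters explicit, rather than silently omitting them, makes it easy to report how often contextual information is missing, which is the problem the notes raise about existing sequence repositories.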

        Example programs that provide partial information:

        RIDGE2000
        Hawaii HURL
        IODP
        UNOLS
        EUROCEAN

        Relationships with other programs:

        1. RIDGE, ODP, Genomes to Life, Microbial Observatories, RCN
        2. What are the future opportunities that are currently funded for sampling? How can sample access be coordinated?
        3. Biodiversity Organization issues: Centralized, Coordinated, Distributed, Combinational Models

        How to Proceed:

        1. Funding Strategies
          1. What are the highest priorities?

          There are many priorities that should be addressed, depending on the benthic environment of interest; these are detailed below. As a premise, it should be underscored that life in extreme environments is a major existing driver for scientific research, one that is not always fully tapped in the deep ocean. For example, although microbiology is one of the 3 organizing themes of IODP, there have not been any microbiological legs besides Leg 201.

          Benthic Priorities:

          Subseafloor Priorities:

          1. E.g., 1: ponded sediment/young ridge flank at ~30°N MAR. Permits study of sedimentary communities (~30 m sediment) and recovery of young (7 Ma) hydrologically active crust.
          2. E.g., 2: low-activity (in upper water column) (30°S) vs. high activity sedimentary systems (in UWC) (~45-50°S) as a function of crust age in Southern Pacific Ocean sites.
          3. E.g., 3: Black Sea: comparing the modern microbial community with the sedimentary community, looking for evidence of degradation in-situ vs. preservation of water-column processes occurring in the present and past.
          4. E.g., 4: Cariaco basin: green varved sediments that record an excellent paleoceanographic history and have been studied intensively for this but never been drilled for microbiology.

          Coastal priorities:

          1. Estuaries - sites of high primary productivity and potentially autotrophic pathways. E.g., Venice estuary, highly impacted, productive, historical.
          2. Stromatolites: these give us a window into early evolutionary processes. Communities of functionally interactive microbes that display low diversity. High spatial gradients in chemistry (O2, salinity, biomass) that can be quantified in-situ. Enables functional as well as phylogenetic diversity to be assayed. Examples: Shark Bay, Bahamas.

          Deep Sea priorities:

          1. Exploration: vents and beyond. Current NSF funding is largely focused on long-term monitoring of select sites but does not support exploratory research. NOAA exploration is currently the only US platform that is engaged in significant exploratory research. ICoMM could help facilitate this research by emphasizing an exploratory program aimed at contributing to a microbiological survey.
          2. Rock communities at ridge axes in bare cold rock habitats. Studies are needed to determine the succession of endolithic microbial communities on bare rock off axis of ridges as a function of age.
          3. Benthic sediment traps are needed to address the issue of 'sediment drift' for the purposes of looking at benthic dispersal. Currently there are only sediment traps for looking at either planktonic sedimentation or deep-ocean dispersal of vent larvae. No studies have yet looked at the benthic dispersal of prokaryotes in off-axis environments.

          Arctic/Antarctic priorities:

          1. Sea ice communities: the floating benthos.

          Where could ICoMM seed monies be most effective towards initiating programs and collaborations in benthic systems?

          Promoting and supporting financial endeavors between international collaborators is a must. ICoMM should advocate for funding programs that support international efforts between different programs, for example, for a European analogue to Microbial Observatories to be developed.

          What are the key programs for benthic studies? How can benthic diversity research be better advocated within these or other programs for multi-institutional and international programs?

          In the US: RIDGE, Ocean Sciences, and Microbial Observatories. Internationally, IODP is the only effective one, and it supports little post-cruise science. NASA supports many post-cruise studies but has no mechanism for funding cruises.

          How to proceed:

          1. White Papers
            1. What audiences should be targeted? Societies (AAM, ASM, ASLO, EGS, AGU), plus European and international equivalents e.g. ISME, IUMS, FEMS, funding agencies, sea-going organizations (WHOI, Scripps).
            2. What are the important questions?
            3. Is everything everywhere? How are microbes dispersed in the deep ocean? What is the extent of microbiology in the sub-surface ocean? What is the total biomass supported in the deep and what supports it?
            4. What are the important messages?
              The microbiology of the benthos is almost entirely uncharted except in coastal systems, and even these habitats are undersampled. Because of the difficulty of obtaining deep-ocean samples, there is an absolutely critical need for contextual information in benthic surveys of any sort, and this needs to be an integral component of any integrated census effort.