Microbial communities play a central role in global environmental processes and the Earth's biogeochemistry by cycling nutrients and fixing carbon (Falkowski et al, 1998). However, because of the complexity of these communities and the lack of culturability of most of their members, the molecular and ecological details of these processes, as well as the factors influencing them, are still poorly understood. Environmental shotgun sequencing (metagenomics) has the potential to start unraveling the underlying complex interspecies ecological interactions and metabolic networks by quantifying the molecular functions (‘parts lists’) of all microbial communities on Earth (Tringe and Rubin, 2005a; Raes and Bork, 2008). However, despite a wide range of published metagenomics studies (see Liolios et al, 2006; Raes et al, 2007; Wooley et al, 2010, for an overview), our knowledge of the variation, functioning and ecology of complex microbial ecosystems remains limited, mostly because the resulting ‘parts lists’ could not be put into sufficiently detailed environmental context. Although previous studies have shown that the environment influences the parts list of various communities, the extent of this effect and the relative importance of a broad range of environmental factors (climate, nutrients, physicochemical parameters and so on) remain unknown (Tringe et al, 2005b; DeLong et al, 2006; Dinsdale et al, 2008; Kunin et al, 2008; Gianoulis et al, 2009), or have been investigated only with a focus on single species (Johnson et al, 2006) or specific gene families (Patel et al, 2010). That said, recent models predicting nutritional strategy from metagenomic data show great promise for understanding some of these relationships (Lauro et al, 2009).
Moreover, because microbial biogeography and ecology studies have mostly focused on phylogenetic patterns, little is known about the role of molecular traits (i.e., the genes and their products) in these areas (Martiny et al, 2006; McGill et al, 2006; Green et al, 2008). Likewise, the role of molecular trait variation in important ecosystem processes such as global primary production is far from clear (Falkowski et al, 1998). To start addressing these issues, we investigated the feasibility of molecular trait-based ecology by integrating large-scale marine metagenomics data with geochemical, meteorological and ecological measurements. We used this information to investigate (i) the relationship between environment and functional community composition (the metagenome-derived gene/pathway repertoire of an ecosystem), (ii) the factors influencing functional dispersal, i.e., the movement of functional traits through geographical space (defined here as the functional effects of species dispersal as well as horizontal gene transfer- and phage-mediated gene flow), (iii) the interplay between functional composition and primary production and (iv) the geographic variation in global functional diversity and its consequences. The correlations we found, despite several conceivable limitations of environmental sequence data (see below), indicate that molecular functional composition, as derived from metagenomes, can serve as a powerful marker and predictor of ecological processes.