Fig. S1. Separation of the taxonomic groups by the pentamer PCA distances (PCA scores) for the 16S rRNA gene data.

Fig. S2. Correspondence between alignment-based and pentamer PCA (score) distances for the 16S rRNA gene data.

Fig. S3. Distribution of TIGRFAM coverage.

Fig. S4. Influence of the TIGRFAM categories on the co-occurring gene clusters.

Table S1. Loadings for the co-occurring gene clusters for genes showing a gene-cluster-specific distribution.

