Niche partitioning of bacterial communities in biological soil crusts and subsoil under grasses and trees in the semi-arid Kalahari

David R Elliott, Andrew D Thomas, Stephen R Hoon, Robin Sen

This HTML file is generated from the source file kalahari_BSC_bacteria.Rmd, which is the basis of all analyses presented in the above paper. The .Rmd file contains the R commands used in the analyses.

Summary of Online Resources (supplementary data)

Data preparation

Load data from QIIME

## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 2922 taxa and 48 samples ]
## sample_data() Sample Data:       [ 48 samples by 23 sample variables ]
## tax_table()   Taxonomy Table:    [ 2922 taxa by 7 taxonomic ranks ]

Chloroplast and mitochondrial sequences were removed after import; they accounted for very few reads.
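The filtering step above can be pictured with a minimal sketch. It is written in Python for illustration only and is not the R code from the .Rmd file. The OTU names, taxonomy labels, and counts are hypothetical, and the ranks at which organelle labels appear depend on the reference taxonomy used.

```python
# Hypothetical taxonomy and counts; labels and rank names are assumptions.
taxonomy = {
    "OTU_1": {"Class": "Chloroplast", "Family": "unassigned"},
    "OTU_2": {"Class": "Alphaproteobacteria", "Family": "mitochondria"},
    "OTU_3": {"Class": "Actinobacteria", "Family": "Geodermatophilaceae"},
}
counts = {"OTU_1": [3, 0], "OTU_2": [1, 2], "OTU_3": [120, 95]}


def is_organelle(ranks):
    """True if any rank is labelled as chloroplast or mitochondria."""
    labels = {str(value).lower() for value in ranks.values()}
    return "chloroplast" in labels or "mitochondria" in labels


def remove_organelles(counts, taxonomy):
    """Keep only OTUs whose taxonomy carries no organelle label."""
    return {otu: c for otu, c in counts.items()
            if not is_organelle(taxonomy[otu])}


filtered = remove_organelles(counts, taxonomy)
```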

Pre-process data for each analysis

Richness / diversity

The full data set was used with the original count data and without rarefaction, as recommended by the phyloseq and vegan documentation. Rarefaction was used in an earlier version of the analysis. With rarefaction to 437 sequences (which excluded 2 samples), the results and statistical tests (see later) followed almost exactly the same profile, although the absolute values changed slightly.
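For readers unfamiliar with rarefaction, the following Python sketch shows the general idea: each sample is randomly subsampled without replacement to a common depth, and samples with fewer reads than that depth are dropped. It is an illustration under assumed inputs, not the R procedure used in the earlier version of the analysis; the function name and the seed are hypothetical.

```python
import random


def rarefy(sample_counts, depth, seed=0):
    """Subsample one sample to `depth` reads without replacement.

    sample_counts maps OTU id -> read count. Returns None when the
    sample has fewer than `depth` reads (such a sample is excluded).
    """
    total = sum(sample_counts.values())
    if total < depth:
        return None
    # Expand counts into one entry per read, then draw `depth` of them.
    pool = [otu for otu, n in sample_counts.items() for _ in range(n)]
    drawn = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for otu in drawn:
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


deep_sample = rarefy({"OTU_1": 500, "OTU_2": 100}, depth=437)
shallow_sample = rarefy({"OTU_1": 10}, depth=437)  # too few reads
```

Because the draw is random, rare OTUs can disappear from a rarefied sample, which is one reason richness estimates differ between rarefied and unrarefied data.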

OTU abundance plots and comparisons

Sequence observations were converted to percentage for each sample, to allow comparison between samples of different sequencing depth.
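The conversion described above amounts to dividing each count by the sample total. A minimal Python sketch, with a hypothetical function name and made-up counts (the analysis itself was run in R):

```python
def to_percent(sample_counts):
    """Convert one sample's OTU counts to percentages of that sample's reads."""
    total = sum(sample_counts.values())
    if total == 0:
        # An empty sample has no defined proportions; report zeros.
        return {otu: 0.0 for otu in sample_counts}
    return {otu: 100.0 * n / total for otu, n in sample_counts.items()}


percentages = to_percent({"OTU_1": 1, "OTU_2": 3})
```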

Ordination

OTUs accounting for < 0.01% of sequences found in the study were excluded from ordinations.
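One way to read this threshold is as a share of all reads in the study. The Python sketch below applies that reading; it is an assumption that the percentage is taken over the pooled reads of all samples, and the table contents are hypothetical.

```python
def filter_rare_otus(table, min_percent=0.01):
    """Drop OTUs that make up less than `min_percent` of all reads.

    table maps OTU id -> list of per-sample counts. The threshold is
    applied to each OTU's total across samples (an assumed reading).
    """
    grand_total = sum(sum(c) for c in table.values())
    threshold = grand_total * min_percent / 100.0
    return {otu: c for otu, c in table.items() if sum(c) >= threshold}


# 20000 reads in total, so the cut-off is 2 reads.
table = {"OTU_1": [10000, 9994], "OTU_2": [5, 0], "OTU_3": [1, 0]}
kept = filter_rare_otus(table)
```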

Main figures

Figure 2. Total carbon and nitrogen in crust and subsoil samples at each site (n=6). Boxes represent the interquartile range (IQR), and error bars extend to the most extreme values within 1.5 * IQR of the box. Median values are shown as a line within the box and outliers are shown as black spots. Sample coding: AG=annual grass, PG=perennial grass, S=shrub, T=tree.
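The box and whisker convention used in the captions can be made concrete with a short Python sketch. The quartile method (`inclusive`, which matches R's default quantile type) is an assumption about how the plotting software computed the box, and the six values are hypothetical.

```python
import statistics


def box_stats(values):
    """Quartiles, whisker ends, and outliers under the 1.5 * IQR rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers reach the most extreme observations inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values
                           if v < low_fence or v > high_fence),
    }


stats = box_stats([1, 2, 3, 4, 5, 20])  # n=6, as at each site
```

Here the value 20 lies beyond the upper fence (4.75 + 1.5 * 2.5 = 8.5), so it would be drawn as a black spot and the upper whisker would stop at 5.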

Figure 3. OTU richness estimation (Chao1) and diversity index (Shannon) in crust and subsoil. a. Comparison of measures at each site; b. Individual sample richness/diversity with respect to sample carbon content. Boxes represent the interquartile range (IQR), and error bars extend to the most extreme values within 1.5 * IQR of the box. Median values are shown as a line within the box and outliers are shown as black spots. Sample coding: AG=annual grass, PG=perennial grass, S=shrub, T=tree.
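As a reminder of what the two measures compute, here is a minimal Python sketch. It assumes the bias-corrected form of Chao1 and the natural-log Shannon index; whether these match the exact variants used in the R analysis should be checked against the .Rmd file. The example counts are hypothetical.

```python
import math


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs seen exactly once and twice.
    """
    observed = [n for n in counts if n > 0]
    f1 = sum(1 for n in observed if n == 1)
    f2 = sum(1 for n in observed if n == 2)
    return len(observed) + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


richness = chao1([10, 5, 1, 1, 1, 2, 0])  # 6 observed, F1=3, F2=1
diversity = shannon([5, 5])               # two equally abundant OTUs
```

Chao1 depends on the counts of singletons and doubletons, so it is sensitive to any step that alters rare-OTU counts, such as subsampling.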

Figure 4. Phylum abundance by (a) site and (b) soil carbon content. Boxes represent the interquartile range (IQR), and error bars extend to the most extreme values within 1.5 * IQR of the box. Median values are shown as a line within the box and outliers are shown as black spots. Sample coding: AG=annual grass, PG=perennial grass, S=shrub, T=tree. Significance and direction of correlation between phylum abundance and soil carbon are indicated by + or - (determined by Spearman test). Significance codes for positive correlation: +++ p < 0.001; ++ p < 0.01; + p < 0.05. Similar plots for less abundant phyla are included in supplementary data.
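The annotation scheme in this caption can be sketched as follows. This Python illustration computes Spearman's rho as a Pearson correlation of average ranks and maps a p-value to the plotted symbols. Two points are assumptions: the p-value here comes from a simple permutation test, swapped in for whatever the R Spearman test computed, and negative correlations are assumed to use the same thresholds with minus signs. All data values are hypothetical.

```python
import random


def average_ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; inputs must not be constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p(x, y, n_perm=2000, seed=1):
    """Two-sided permutation p-value for rho (a swapped-in method)."""
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = abs(pearson(rx, ry))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(ry)  # break the pairing between x and y
        if abs(pearson(rx, ry)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def significance_code(rho, p):
    """Map direction and p-value to symbols such as '+++' or '-'."""
    sign = "+" if rho > 0 else "-"
    for threshold, repeats in ((0.001, 3), (0.01, 2), (0.05, 1)):
        if p < threshold:
            return sign * repeats
    return ""


carbon = [0.2, 0.3, 0.5, 0.6, 0.8, 1.1, 1.4, 1.9, 2.3, 3.0]
abundance = [1.0, 1.2, 2.5, 2.6, 4.0, 4.1, 6.3, 7.7, 8.0, 9.9]
rho = spearman_rho(carbon, abundance)
p_value = permutation_p(carbon, abundance)
```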

Figure 5. Correspondence analysis of the microbial community. Coloured markers indicate individual samples, and dispersion ellipses show the 95% standard deviation confidence interval for crust/subsoil and canopy/open classifications. OTU identification numbers are shown in different colours for the 9 most abundant OTUs belonging to the following groups: full dataset (black), phylum Cyanobacteria (green), and phylum Bacteroidetes (blue). Environmental variables with significance p < 0.05 are shown as biplotted vectors (based on permutation tests; n=1000).

Figure 6. Relative abundance of the top 9 OTUs detected in the study, by (a) site and (b) soil carbon content. Boxes represent the interquartile range (IQR), and error bars extend to the most extreme values within 1.5 * IQR of the box. Median values are shown as a line within the box and outliers are shown as black spots. Significance and direction of correlation between OTU abundance and soil carbon are indicated by + or - (determined by Spearman test). Significance codes for positive correlation: +++ p < 0.001; ++ p < 0.01; + p < 0.05. Sample coding: AG=annual grass, PG=perennial grass, S=shrub, T=tree.