This page holds the code for some of the bioinformatics and statistical methods used in the article "The gut microbiota and depressive symptoms across ethnic groups".

1 ASV inference

### process reads

# merge paired reads: max 30 mismatches and min 80% identity in the overlap alignment
#~/usearch64 -fastq_mergepairs $data_dir/*_R1*.fastq -relabel @ -fastq_maxdiffs 30 -fastq_pctid 80  -fastqout ./merged.fastq 2>&1 | tee -a helius.log

# filter reads to max expected error 1 per read
#~/usearch64  -fastq_filter ./merged.fastq  -fastq_maxee 1 -fastaout ./filtered.fasta 2>&1 | tee -a helius.log
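
As an illustration of what the `-fastq_maxee 1` threshold means, the expected number of errors in a read is the sum of the per-base error probabilities implied by its Phred quality scores. The sketch below is not part of the pipeline; the function names are hypothetical and Phred+33 encoding is assumed.

```python
def expected_errors(quality: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities for one read.

    `quality` is the FASTQ quality string; Phred+33 encoding is an assumption.
    """
    # Phred score Q corresponds to an error probability of 10^(-Q/10)
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def passes_maxee(quality: str, max_ee: float = 1.0) -> bool:
    """True when the read would be kept under a max-expected-error threshold."""
    return expected_errors(quality) <= max_ee
```

For example, a read of ten bases at Q40 has an expected error of about 0.001 and is kept, while twenty bases at Q10 sum to about 2 expected errors and are discarded.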

# dereplicate 
#~/usearch64 -fastx_uniques ./filtered.fasta -fastaout ./uniques.fa -sizeout -relabel Uniq 2>&1 | tee -a helius.log
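
Dereplication collapses identical sequences into one record that carries its abundance. A minimal sketch of the idea follows; it is not the USEARCH implementation, the function name is hypothetical, and the `Uniq1;size=3;` label layout is an assumption about what `-relabel Uniq` combined with `-sizeout` produces.

```python
from collections import Counter


def dereplicate(sequences, prefix="Uniq"):
    """Collapse exact duplicate sequences and annotate each with its count.

    Returns (label, sequence) pairs sorted by decreasing abundance.
    """
    counts = Counter(sequences)
    # most_common() orders by decreasing count; ties keep first-seen order
    return [
        (f"{prefix}{i};size={n};", seq)
        for i, (seq, n) in enumerate(counts.most_common(), start=1)
    ]
```

Calling `dereplicate(["ACGT", "TTTT", "ACGT"])` gives one record for `ACGT` with `size=2` followed by one for `TTTT` with `size=1`.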



###### ASV pipeline

# UNOISE3 denoising
#~/usearch64 -unoise3 ./uniques.fa -zotus ./zotus.fa 2>&1 | tee -a helius.log

# make ASV table
~/usearch64 -otutab ./merged.fastq -zotus ./zotus.fa -otutabout ./helius.unoise3.ASV.table_FINAL.txt 2>&1 | tee -a helius.log


 
 
###### OTU pipeline

# UPARSE OTU clustering
~/usearch64 -cluster_otus ./uniques.fa -otus ./otus.fa -relabel Otu 2>&1 | tee -a helius.log

# make OTU table
~/usearch64 -otutab ./merged.fastq -otus ./otus.fa -otutabout ./helius.uparse.OTU.table_FINAL.txt  2>&1 | tee -a helius.log
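
The count tables written by `-otutabout` are typically converted to relative abundances before statistical testing. The sketch below shows one way to do that; it assumes a tab-separated table with a header row (`#OTU ID` followed by sample names) and one row per OTU or ASV, and the function name is hypothetical. The normalisation actually used for the article is not described on this page.

```python
import csv
import io


def relative_abundances(table_text):
    """Convert a tab-separated count table into per-sample relative abundances.

    Returns {otu_id: {sample: fraction}}; samples with zero reads get 0.0.
    """
    rows = [r for r in csv.reader(io.StringIO(table_text), delimiter="\t") if r]
    samples = rows[0][1:]                       # header: ID column, then samples
    counts = {r[0]: [float(x) for x in r[1:]] for r in rows[1:]}
    totals = [sum(col) for col in zip(*counts.values())]  # reads per sample
    return {
        otu: {s: (v / t if t else 0.0) for s, v, t in zip(samples, vals, totals)}
        for otu, vals in counts.items()
    }
```

To use it on a file, read the file contents into a string first, for example with `open(path).read()`.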

## assign taxonomy with DADA2 (SILVA, Greengenes)

## align ASVs with MAFFT

## make tree with IQ-Tree / FastTree Dbl

2 Statistical analysis

2.1 Generic SPSS code Linear regression Models

2.2 Generic SPSS code Linear regression Models 1 - 3

REGRESSION

/MISSING LISTWISE

/STATISTICS COEFF OUTS CI(95) R ANOVA CHANGE ZPP

/CRITERIA=PIN(.05) POUT(.10)

/NOORIGIN

/DEPENDENT [Dependent variable]

/METHOD=ENTER [Predictor variable]

/METHOD=ENTER [Covariate list Model 1]

/METHOD=ENTER [Covariate list Model 2]

/METHOD=ENTER [Covariate list Model 3]

/SCATTERPLOT=([Dependent variable] ,*ZRESID)

/RESIDUALS HISTOGRAM(ZRESID) NORMPROB(ZRESID).

2.3 Generic SPSS code ANOVA (testing interaction ethnicity by alpha diversity or relative abundances)

UNIANOVA H1_PHQ9_sumscore BY H1_EtnTotaal H1_geslacht WITH shannon H1_lft

/METHOD=SSTYPE(3)

/INTERCEPT=INCLUDE

/PRINT ETASQ DESCRIPTIVE PARAMETER

/CRITERIA=ALPHA(.05)

/DESIGN=H1_EtnTotaal*shannon H1_EtnTotaal shannon H1_geslacht H1_lft.
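
The `shannon` covariate in the model above is an alpha-diversity index. For reference, a minimal sketch of the Shannon index for one sample is given here; the natural logarithm is an assumption (the base used for the article is not stated on this page) and the function name is hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    proportions = (c / total for c in counts if c > 0)
    return sum(-p * math.log(p) for p in proportions)
```

A sample with two equally abundant taxa gives `ln(2)`, and a sample dominated by a single taxon gives 0.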

2.4 SPSS code Linear regression forward selection PCoA

REGRESSION

/MISSING LISTWISE

/STATISTICS COEFF OUTS CI(95) R ANOVA CHANGE ZPP

/CRITERIA=PIN(.05) POUT(.10)

/NOORIGIN

/DEPENDENT H1_PHQ9_sumscore

/METHOD=FORWARD

PC01BC PC02BC PC03BC PC04BC PC05BC PC06BC PC07BC PC08BC PC09BC PC010BC PC011BC PC012BC PC013BC PC014BC PC015BC PC016BC PC017BC PC018BC PC019BC PC020BC .

Retained components (weighted UniFrac): PC02 PC03 PC07 PC011 PC014 PC019

Retained components (Bray-Curtis): PC02BC PC03BC PC04BC PC05BC PC014BC PC016BC PC018BC
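
The `BC` principal coordinates are derived from pairwise Bray-Curtis dissimilarities between samples. As a reminder of what that distance measures, here is a minimal sketch for two samples; the function name is hypothetical, and whether the article used raw counts, rarefied counts, or relative abundances as input is not stated on this page.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length.

    0 means identical composition; 1 means no shared taxa.
    """
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0  # two empty samples: treated as identical in this sketch
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator
```

For instance, `bray_curtis([6, 4], [4, 6])` is 4 / 20 = 0.2.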

2.5 Generic SPSS code Linear regression Models Beta-Diversity

REGRESSION

/MISSING LISTWISE

/STATISTICS COEFF OUTS CI(95) R ANOVA CHANGE ZPP

/CRITERIA=PIN(.05) POUT(.10)

/NOORIGIN

/DEPENDENT [Dependent variable]

/METHOD=ENTER [Ethnicity or empty]

/METHOD=ENTER [Covariate list Model 1 - 3]

/METHOD=ENTER [retained principal components Beta-diversity]

/SCATTERPLOT=([Dependent variable] ,*ZRESID)

/RESIDUALS HISTOGRAM(ZRESID) NORMPROB(ZRESID).