This page holds the code for some of the bioinformatics and statistical methods used in the article "The gut microbiota and depressive symptoms across ethnic groups".
### process reads
# merge paired reads, allowing at most 30 mismatches and at least 80% identity in the alignment
#~/usearch64 -fastq_mergepairs $data_dir/*_R1*.fastq -relabel @ -fastq_maxdiffs 30 -fastq_pctid 80 -fastqout ./merged.fastq 2>&1 | tee -a helius.log
# filter reads to max expected error 1 per read
#~/usearch64 -fastq_filter ./merged.fastq -fastq_maxee 1 -fastaout ./filtered.fasta 2>&1 | tee -a helius.log
# dereplicate
#~/usearch64 -fastx_uniques ./filtered.fasta -fastaout ./uniques.fa -sizeout -relabel Uniq 2>&1 | tee -a helius.log
###### ASV pipeline
# UNOISE3 denoising
#~/usearch64 -unoise3 ./uniques.fa -zotus ./zotus.fa 2>&1 | tee -a helius.log
# make ASV table
~/usearch64 -otutab ./merged.fastq -zotus ./zotus.fa -otutabout ./helius.unoise3.ASV.table_FINAL.txt 2>&1 | tee -a helius.log
###### OTU pipeline
# UPARSE OTU clustering
~/usearch64 -cluster_otus ./uniques.fa -otus ./otus.fa -relabel Otu 2>&1 | tee -a helius.log
# make OTU table
~/usearch64 -otutab ./merged.fastq -otus ./otus.fa -otutabout ./helius.uparse.OTU.table_FINAL.txt 2>&1 | tee -a helius.log
## assign taxonomy with DADA2 (SILVA, Greengenes)
## align ASVs with MAFFT
## make tree with IQ-Tree / FastTree Dbl
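The `-fastq_maxee 1` filter above keeps a read only when its expected number of errors, the sum of the per-base error probabilities implied by the Phred quality scores, does not exceed 1. A minimal sketch of that calculation follows. The function names are hypothetical, and the Phred+33 quality encoding is an assumption about the input FASTQ files, not something stated on this page.

```python
def expected_errors(quality: str, phred_offset: int = 33) -> float:
    """Sum of per-base error probabilities for one read's quality string.

    Each character encodes a Phred score Q (assumed Phred+33), and the
    error probability of that base is 10 ** (-Q / 10).
    """
    return sum(10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality)


def passes_maxee(quality: str, max_ee: float = 1.0) -> bool:
    """True when the read is kept under a maximum expected error threshold."""
    return expected_errors(quality) <= max_ee


if __name__ == "__main__":
    high_quality = "I" * 250   # Q40 at every position
    low_quality = "+" * 20     # Q10 at every position
    print(passes_maxee(high_quality), passes_maxee(low_quality))
```

A long read of uniformly high quality stays well under the threshold, while a short read of uniformly low quality exceeds it, which is why filtering on expected errors is stricter than filtering on mean quality.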
Variable | Role | Parameter | Variable name in data file |
--- | --- | --- | --- |
Depression | Dependent / Predictor | Depression score (PHQ-9) | H1_PHQ9 |
Alpha diversity | Dependent / Predictor | Shannon | shannon |
Relative abundance | Predictor | Counts ranked | RZOTU[number] |
Beta diversity | Predictor | PCoA Bray-Curtis | PC[#1-20]BC |
Beta diversity | Predictor | PCoA weighted UniFrac | PC[#1-20] |
Ethnicity | Covariate model 1 | Ethnic group dummy coded (Regression) | Eth_1 to Eth_6 |
Ethnicity | Covariate model 1 | Variable level (ANOVA) | H1_EtnTotaal |
Sex | Covariate model 1 | female 0, male 1 | H1_geslacht |
Age | Covariate model 1 | Years | H1_lft |
Education | Covariate model 1 | Educational attainment | H1_opleid |
Smoker | Covariate model 2 | 0 no, 1 yes | current_smoker |
Physical activity | Covariate model 2 | SQUASH total score | H1_Squash_totscor |
Alcohol | Covariate model 2 | AUDIT Alcohol consumption | H1_Auditalcohol |
Body mass index | Covariate model 2 | BMI | H1_LO_BMI |
Diagnosed with gastrointestinal disorder | Covariate model 3 | 0 no, 1 yes | darmstoornis_diagnose |
Diabetes | Covariate model 3 | 0 no, 1 yes | H1_Diabetes_SelfGlucMedHba1c |
Diarrhea in past week | Covariate model 3 | 0 no, 1 yes | Diarr_week |
Antibiotic use in past 2 weeks | Covariate model 3 | 0 no, 1 yes | antibiotica_2wk |
Proton pump inhibitor use | Covariate model 3 | 0 no, 1 yes | PPI |
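For readers unfamiliar with the diversity measures in the table, the sketch below computes the Shannon index for one sample and the Bray-Curtis dissimilarity between two samples from raw count vectors. It is only an illustration: the function names and toy counts are made up, the natural logarithm is an assumption, and this page does not state whether counts were rarefied or otherwise normalised before these measures were calculated.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    if len(a) != len(b):
        raise ValueError("samples must cover the same taxa")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


if __name__ == "__main__":
    sample_1 = [10, 10, 10, 10]   # toy counts, not study data
    sample_2 = [40, 0, 0, 0]
    print(shannon(sample_1), shannon(sample_2))
    print(bray_curtis(sample_1, sample_2))
```

An evenly distributed sample gives the highest Shannon value for its number of taxa, a sample dominated by a single taxon gives zero, and Bray-Curtis ranges from 0 (identical counts) to 1 (no shared taxa). The principal coordinates in the table are derived from a matrix of such pairwise dissimilarities.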
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS CI(95) R ANOVA CHANGE ZPP
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT [Dependent variable]
/METHOD=ENTER [Predictor variable]
/METHOD=ENTER [Covariate list Model 1]
/METHOD=ENTER [Covariate list Model 2]
/METHOD=ENTER [Covariate list Model 3]
/SCATTERPLOT=([Dependent variable] ,*ZRESID)
/RESIDUALS HISTOGRAM(ZRESID) NORMPROB(ZRESID).
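The REGRESSION command above enters the predictor first and then each covariate list as a separate block, and the CHANGE keyword reports how much R² increases with each block. The sketch below illustrates that blockwise logic with ordinary least squares in plain Python. It is not the analysis code used for the article: all function names are hypothetical, the data are toy values, and it reports only R² and its change, not the coefficients, confidence intervals or F tests that SPSS prints.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(predictors, y):
    """R^2 of an OLS fit of y on the given predictor columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def r2_by_block(blocks, y):
    """Cumulative R^2 and R^2 change as each block of predictors is entered."""
    entered, previous, steps = [], 0.0, []
    for block in blocks:
        entered = entered + block
        r2 = r_squared(entered, y)
        steps.append((r2, r2 - previous))
        previous = r2
    return steps


if __name__ == "__main__":
    # Toy data only: y is constructed as 2 * x1 + 3 * x2.
    x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
    x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
    y = [8.0, 7.0, 18.0, 17.0, 28.0]
    for r2, change in r2_by_block([[x1], [x2]], y):
        print(round(r2, 4), round(change, 4))
```

Each entry in `blocks` is a list of predictor columns, mirroring one `/METHOD=ENTER` line. Solving the normal equations directly is adequate for a small illustration but is numerically fragile with many correlated predictors, where a QR-based solver would be preferable.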
UNIANOVA H1_PHQ9_sumscore BY H1_EtnTotaal H1_geslacht WITH shannon H1_lft
/METHOD=SSTYPE(3)
/INTERCEPT=INCLUDE
/PRINT ETASQ DESCRIPTIVE PARAMETER
/CRITERIA=ALPHA(.05)
/DESIGN=H1_EtnTotaal*shannon H1_EtnTotaal shannon H1_geslacht H1_lft.
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS CI(95) R ANOVA CHANGE ZPP
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT H1_PHQ9_sumscore
/METHOD=FORWARD
PC01BC PC02BC PC03BC PC04BC PC05BC PC06BC PC07BC PC08BC PC09BC PC010BC PC011BC PC012BC PC013BC PC014BC PC015BC PC016BC PC017BC PC018BC PC019BC PC020BC .
Retained principal components:
Weighted UniFrac: PC02 PC03 PC07 PC011 PC014 PC019
Bray-Curtis: PC02BC PC03BC PC04BC PC05BC PC014BC PC016BC PC018BC
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS CI(95) R ANOVA CHANGE ZPP
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT [Dependent variable]
/METHOD=ENTER [Ethnicity or empty]
/METHOD=ENTER [Covariate lists Models 1-3]
/METHOD=ENTER [Retained principal components beta diversity]
/SCATTERPLOT=([Dependent variable] ,*ZRESID)
/RESIDUALS HISTOGRAM(ZRESID) NORMPROB(ZRESID).