Volume 41, Issue 3
RESEARCH ARTICLE

A small‐sample multivariate kernel machine test for microbiome association studies

Xiang Zhan

Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA

These authors are the joint first authors.

Xingwei Tong

School of Mathematical Sciences, Beijing Normal University, Beijing, China

These authors are the joint first authors.

Ni Zhao

Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA

Arnab Maity

Department of Statistics, North Carolina State University, Raleigh, NC, USA

Michael C. Wu

Corresponding Author

E-mail address: mcwu@fhcrc.org

Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA

Correspondence

Michael C. Wu, Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA.

Email: mcwu@fhcrc.org

Jun Chen, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA.

Email: Chen.Jun2@mayo.edu.

Jun Chen

Corresponding Author

E-mail address: Chen.Jun2@mayo.edu

Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA

First published: 26 December 2016
Citations: 13

ABSTRACT

High‐throughput sequencing technologies have enabled large‐scale studies of the role of the human microbiome in health conditions and diseases. Microbial community‐level association testing, a critical step in establishing the connection between overall microbiome composition and an outcome of interest, is now routinely performed in many studies. However, current microbiome association tests all focus on a single outcome, whereas it has become increasingly common for a microbiome study to collect multiple, possibly related, outcomes to maximize the power of discovery. Because these outcomes may share common mechanisms, analyzing them jointly can amplify the association signal and improve statistical power to detect potential associations. We propose the multivariate microbiome regression‐based kernel association test (MMiRKAT) for testing the association between multiple continuous outcomes and overall microbiome composition, where the kernel used in MMiRKAT is based on the Bray‐Curtis or UniFrac distance. MMiRKAT directly regresses all outcomes on the microbiome profiles via a semiparametric kernel machine regression framework, which allows for covariate adjustment, and evaluates the association via a variance‐component score test. Because most current microbiome studies have small sample sizes, MMiRKAT implements a novel small‐sample correction procedure to correct for the conservativeness of the association test when the sample size is small or moderate. The proposed method is assessed via simulation studies and an application to a real data set examining the association between host gene expression and mucosal microbiome composition. We demonstrate that MMiRKAT is more powerful than the large‐sample‐based multivariate kernel association test while controlling the type I error. A free implementation of MMiRKAT in the R language is available at http://research.fhcrc.org/wu/en.html.
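
To make the ingredients named in the abstract concrete, the following is a minimal Python sketch of a distance‐based kernel and a multivariate score‐type statistic. It is not the authors' R implementation. Several pieces are assumptions made for illustration: the function names are hypothetical, the statistic is an unweighted trace form that ignores correlation among outcomes, the kernel is built by Gower centering the squared Bray‐Curtis distances with negative eigenvalues clipped to zero, and the p‐value comes from a simple permutation of samples rather than from the small‐sample correction procedure proposed in the article.

```python
import numpy as np

def bray_curtis(counts):
    """Pairwise Bray-Curtis dissimilarity between samples (rows) of a count table."""
    counts = np.asarray(counts, dtype=float)
    diff = np.abs(counts[:, None, :] - counts[None, :, :]).sum(axis=2)
    total = (counts[:, None, :] + counts[None, :, :]).sum(axis=2)
    return diff / total  # assumes no pair of samples is entirely zero

def distance_to_kernel(dist):
    """Gower-center the squared distances, then clip negative eigenvalues to zero."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    kernel = -0.5 * centering @ (dist ** 2) @ centering
    eigval, eigvec = np.linalg.eigh((kernel + kernel.T) / 2)
    return (eigvec * np.clip(eigval, 0.0, None)) @ eigvec.T

def score_statistic(outcomes, covariates, kernel):
    """Q = trace(R' K R), with R the outcome residuals after covariate adjustment."""
    design = np.column_stack([np.ones(len(outcomes)), covariates])
    coef = np.linalg.lstsq(design, outcomes, rcond=None)[0]
    resid = outcomes - design @ coef
    return float(np.trace(resid.T @ kernel @ resid))

def permutation_pvalue(outcomes, covariates, kernel, n_perm=199, seed=0):
    """Permutation p-value (a stand-in, NOT the article's small-sample correction)."""
    rng = np.random.default_rng(seed)
    observed = score_statistic(outcomes, covariates, kernel)
    n = kernel.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(n)  # shuffle which microbiome profile goes with which subject
        permuted = score_statistic(outcomes, covariates, kernel[np.ix_(idx, idx)])
        exceed += permuted >= observed
    return (exceed + 1) / (n_perm + 1)

# Simulated data under the null: 30 samples, 20 taxa, 1 covariate, 3 outcomes.
rng = np.random.default_rng(1)
counts = rng.poisson(5.0, size=(30, 20)) + 1
covariates = rng.normal(size=(30, 1))
outcomes = rng.normal(size=(30, 3))

dist = bray_curtis(counts)
kernel = distance_to_kernel(dist)
q_stat = score_statistic(outcomes, covariates, kernel)
p_value = permutation_pvalue(outcomes, covariates, kernel)
print(f"Q = {q_stat:.3f}, permutation p = {p_value:.3f}")
```

Permuting the kernel rows and columns together is only approximately valid when the covariates are themselves related to the microbiome profile, which is one reason an analytic small‐sample procedure such as the one described in the abstract may be preferable in practice.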

