phoenics
is an R package designed to
perform a differential analysis for metabolomics datasets, while
accounting for metabolomic pathway information. More precisely,
phoenics
performs a test at pathway level based on
metabolite quantifications and information on pathway metabolite
composition. The automatic query of metabolic pathways is implemented in
the package.
library("phoenics")
The dataset MTBLS422
contains data obtained from the
study Choo et al., 2017. Raw data have been
made available on MetaboLights (with the id MTBLS422 https://www.ebi.ac.uk/metabolights/editor/MTBLS422).
Metabolite quantifications were obtained based on the raw signal using
ASICS
package and are provided in the object quantif
included in
the dataset.
data("MTBLS422")
ls()
## [1] "design" "pathways" "quantif"
head(quantif)
## 1071 1151 1231 1281 1371
## Glycerol 0.000000000 0.000000000 0 0.000000000 0.000000000
## D-Glucose-6-Phosphate 0.002339491 0.001251202 0 0.001958586 0.002652924
## D-Galactose 0.001628629 0.003448431 0 0.001535844 0.002693243
## D-Sorbitol 0.000000000 0.000000000 0 0.000000000 0.000000000
## Myo-Inositol 0.001190331 0.001929542 0 0.001325843 0.001760645
## D-GlucuronicAcid 0.001495646 0.003459446 0 0.001377114 0.001855756
## 1531 1681 1751 2081
## Glycerol 0.001403664 0.0000000000 0.0008294523 0.000000000
## D-Glucose-6-Phosphate 0.002877861 0.0035782998 0.0025688724 0.002273193
## D-Galactose 0.001952493 0.0037286043 0.0000000000 0.001240772
## D-Sorbitol 0.001066357 0.0009889747 0.0012056168 0.000000000
## Myo-Inositol 0.001430719 0.0028295101 0.0011248417 0.002085898
## D-GlucuronicAcid 0.001791299 0.0007645118 0.0000000000 0.001816359
## 2161 2311
## Glycerol 0.000000000 0.000000000
## D-Glucose-6-Phosphate 0.001106493 0.002437845
## D-Galactose 0.002072136 0.001392737
## D-Sorbitol 0.000000000 0.000000000
## Myo-Inositol 0.001382456 0.001893727
## D-GlucuronicAcid 0.002410063 0.001502470
They can be transformed into a data format suitable for
phoenics
with:
quantif_f <- from_ASICS_to_PHOENICS(quantif)
head(quantif_f)
## C00116 C00092 C00984 C00794 C00137 C00191
## 1071 0.000000000 0.002339491 0.001628629 0.000000000 0.001190331 0.001495646
## 1151 0.000000000 0.001251202 0.003448431 0.000000000 0.001929542 0.003459446
## 1231 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## 1281 0.000000000 0.001958586 0.001535844 0.000000000 0.001325843 0.001377114
## 1371 0.000000000 0.002652924 0.002693243 0.000000000 0.001760645 0.001855756
## 1531 0.001403664 0.002877861 0.001952493 0.001066357 0.001430719 0.001791299
## C00864 C01697 C00029 C00003
## 1071 0.0005098139 0.0000000000 0.0005008620 0.0000000000
## 1151 0.0007075218 0.0005573846 0.0000000000 0.0000000000
## 1231 0.0000000000 0.0000000000 0.0000000000 0.0009024399
## 1281 0.0004129427 0.0005796955 0.0004431261 0.0000000000
## 1371 0.0007317685 0.0005080792 0.0000000000 0.0000000000
## 1531 0.0002865527 0.0000000000 0.0006293917 0.0000000000
In addition, pathway information was obtained using KEGGREST
package and is included in pathways
:
head(pathways)
## metabolite_code metabolite_name pathway_code pathway_name
## 24 C00029 UDP-glucose mmu00052 Galactose metabolism
## 25 C00116 Glycerol mmu00052 Galactose metabolism
## 26 C00137 myo-Inositol mmu00052 Galactose metabolism
## 27 C00794 D-Sorbitol mmu00052 Galactose metabolism
## 28 C00984 alpha-D-Galactose mmu00052 Galactose metabolism
## 29 C01697 Galactitol mmu00052 Galactose metabolism
The user can also query this information using the function
pathway_search
:
pathways <- pathway_search(metab = colnames(quantif), organism = "mmu")
phoenics
testThe test implemented in phoenics
is a mixed model with
fixed and random effects. The current study design is provided in
design
:
head(design)
## Id Age Treatment Mouse
## 1071 T1_G1R1 5.5 Control G1R1
## 1151 T1_G1R2 5.5 Control G1R2
## 1231 T2_G1R2 7.5 Control G1R2
## 1281 T2_G1R1 7.5 Control G1R1
## 1371 T1_G3R2 5.5 Vanco-Imi G3R2
## 1531 T3_G1R1 9.0 Control G1R1
The Age
and Treatment
are thus used as
fixed effects and the Mouse
is used as a random effect:
out_test <- test_pathway(quantif_f, design, pathways,
fixed = c("Age", "Treatment"), random = "Mouse",
npc = 2, model = "blmer")
out_test
## phoenics method based on PCA
## A list of pathways which contains for each pathway:
## $pathway_name: Pathway name
## $pathway_code: Pathway code
## $metabolites: Metabolites in the pathway
## $PCA: Factor analysis results
## $model: Model results
## $test_pathway: Pathway test results
Results are organized as a list where each element of the list corresponds to a tested pathway:
names(out_test)
## [1] "Galactose metabolism" "Inositol phosphate metabolism"
## [3] "Vitamin digestion and absorption"
The function extract
can be used to extract results on a
given pathway:
res_1 <- extract(out_test, "Galactose metabolism")
Fixed effect \(p\)-values and full
model are provided, respectively, in entries test_pathway
and model
of this object.
res_1[[1]]$test_pathway
## Fixed_effect pval
## 1 Age 0.2145888
## 2 Treatment 0.3831807
res_1[[1]]$model
## $mmu00052_PC1
## Cov prior : Mouse ~ wishart(df = 3.5, scale = Inf, posterior.scale = cov, common.scale = TRUE)
## Prior dev : -0.6945
##
## Linear mixed model fit by REML ['blmerMod']
## Formula: mmu00052_PC1 ~ Age + Treatment + (1 | Mouse)
## Data: score_path
## REML criterion at convergence: 40.441
## Random effects:
## Groups Name Std.Dev.
## Mouse (Intercept) 2.054
## Residual 1.629
## Number of obs: 11, groups: Mouse, 4
## Fixed Effects:
## (Intercept) Age TreatmentVanco-Imi
## 2.0008 -0.3463 1.1450
##
## $mmu00052_PC2
## Cov prior : Mouse ~ wishart(df = 3.5, scale = Inf, posterior.scale = cov, common.scale = TRUE)
## Prior dev : -1.996
##
## Linear mixed model fit by REML ['blmerMod']
## Formula: mmu00052_PC2 ~ Age + Treatment + (1 | Mouse)
## Data: score_path
## REML criterion at convergence: 36.1712
## Random effects:
## Groups Name Std.Dev.
## Mouse (Intercept) 2.209
## Residual 1.136
## Number of obs: 11, groups: Mouse, 4
## Fixed Effects:
## (Intercept) Age TreatmentVanco-Imi
## -3.1797 0.3949 0.5002
In addition, adjusted \(p\)-values
(BH method) across pathways can be directly obtained
with the function adjust_pval
:
adjust_pval(out_test)
## pathway_name pathway_code Fixed_effect pval
## 1 Galactose metabolism mmu00052 Age 0.214588838
## 3 Inositol phosphate metabolism mmu00562 Age 0.037233449
## 5 Vitamin digestion and absorption mmu04977 Age 0.009257926
## 2 Galactose metabolism mmu00052 Treatment 0.383180685
## 4 Inositol phosphate metabolism mmu00562 Treatment 0.485524008
## 6 Vitamin digestion and absorption mmu04977 Treatment 0.419680043
## adjusted_pval
## 1 0.21458884
## 3 0.07446690
## 5 0.02777378
## 2 1.00000000
## 4 1.00000000
## 6 1.00000000
PCA results can be explored using the entry PCA:
res_1[[1]]$PCA
## **Results for the Principal Component Analysis (PCA)**
## The analysis was performed on 11 individuals, described by 10 variables
## *The results are available in the following objects:
##
## name description
## 1 "$eig" "eigenvalues"
## 2 "$var" "results for the variables"
## 3 "$var$coord" "coord. for the variables"
## 4 "$var$cor" "correlations variables - dimensions"
## 5 "$var$cos2" "cos2 for the variables"
## 6 "$var$contrib" "contributions of the variables"
## 7 "$ind" "results for the individuals"
## 8 "$ind$coord" "coord. for the individuals"
## 9 "$ind$cos2" "cos2 for the individuals"
## 10 "$ind$contrib" "contributions of the individuals"
## 11 "$quali.sup" "results for the supplementary categorical variables"
## 12 "$quali.sup$coord" "coord. for the supplementary categories"
## 13 "$quali.sup$v.test" "v-test of the supplementary categories"
## 14 "$call" "summary statistics"
## 15 "$call$centre" "mean of the variables"
## 16 "$call$ecart.type" "standard error of the variables"
## 17 "$call$row.w" "weights for the individuals"
## 18 "$call$col.w" "weights for the variables"
In particular, phoenics
implements some plots that can
be directly obtained using
plot(out_test, pathway_id = "Galactose metabolism", plot = "var")
plot(out_test, pathway_id = "Galactose metabolism", plot = "ind",
habillage = "Age")
plot(out_test, pathway_id = "Galactose metabolism", plot = "eig")
plot(out_test, pathway_id = "Galactose metabolism", plot = "var")
plot(out_test, pathway_id = "Galactose metabolism", plot = "group")
phoenics
main test functionA MFA step, more suited to can be used in place of the PCA. In this case, the levels first fixed effect are used to split the effects into different tables which are matched by the levels of the first random effect (the typical case is when the first fixed effect contains information on time points in a repeated measurement study where samples are provided by the first random effect).
out_test <- test_pathway(quantif_f, design, pathways,
fixed = c("Age", "Treatment"), random = "Mouse",
npc = 2, model = "blmer", analysis = "MFA")
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
out_test
## phoenics method based on MFA
## A list of pathways which contains for each pathway:
## $pathway_name: Pathway name
## $pathway_code: Pathway code
## $metabolites: Metabolites in the pathway
## $PCA: Factor analysis results
## $model: Model results
## $test_pathway: Pathway test results
The option pathways = "auto"
can also be used to
automatically query KEGGREST instead of performing the two step analysis
relying on test_pathway
:
out_test3 <- test_pathway(quantif, design, pathways = "auto",
fixed = c("Age", "Treatment"), random = "Mouse",
npc = 2, model = "blmer", organism = "mmu")
out_test3
[1] Choo J. M., Kanno T., Zain N. M. M., Leong L. E. X., Abell G. C. J., Keeble J. E., Bruce K. D., Mason A. J., Rogers G. B. (2017). Divergent relationships between fecal microbiota and metabolome following distinct antibiotic-induced disruptions. mSphere, 2(1). DOI: 10.1128/msphere.00005-17.
[2] Benjamini Y, Hochberg Y (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57(1): 289–300. DOI: 10.1111/j.2517-6161.1995.tb02031.x
sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Europe/Paris
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] phoenics_0.5
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 xfun_0.49 bslib_0.8.0
## [4] ggplot2_3.5.1 htmlwidgets_1.6.4 ggrepel_0.9.6
## [7] rstatix_0.7.2 lattice_0.22-6 vctrs_0.6.5
## [10] tools_4.4.2 generics_0.1.3 sandwich_3.1-1
## [13] tibble_3.2.1 cluster_2.1.6 pkgconfig_2.0.3
## [16] Matrix_1.7-1 scatterplot3d_0.3-44 lifecycle_1.0.4
## [19] factoextra_1.0.7 farver_2.1.2 compiler_4.4.2
## [22] munsell_0.5.1 leaps_3.2 codetools_0.2-20
## [25] carData_3.0-5 htmltools_0.5.8.1 sass_0.4.9
## [28] yaml_2.3.10 car_3.1-2 pillar_1.10.0
## [31] ggpubr_0.6.0 nloptr_2.1.1 FactoMineR_2.11
## [34] jquerylib_0.1.4 tidyr_1.3.1 MASS_7.3-61
## [37] flashClust_1.01-2 DT_0.33 cachem_1.1.0
## [40] abind_1.4-8 boot_1.3-31 multcomp_1.4-26
## [43] nlme_3.1-166 tidyselect_1.2.1 digest_0.6.37
## [46] mvtnorm_1.3-2 dplyr_1.1.4 purrr_1.0.2
## [49] labeling_0.4.3 splines_4.4.2 fastmap_1.2.0
## [52] grid_4.4.2 colorspace_2.1-1 cli_3.6.3
## [55] magrittr_2.0.3 survival_3.7-0 broom_1.0.7
## [58] TH.data_1.1-2 withr_3.0.2 backports_1.5.0
## [61] scales_1.3.0 estimability_1.5.1 rmarkdown_2.29
## [64] emmeans_1.10.6 lme4_1.1-35.5 blme_1.0-6
## [67] ggsignif_0.6.4 zoo_1.8-12 coda_0.19-4.1
## [70] evaluate_1.0.1 knitr_1.49 rlang_1.1.4
## [73] Rcpp_1.0.13-1 xtable_1.8-4 glue_1.8.0
## [76] minqa_1.2.8 jsonlite_1.8.9 R6_2.5.1
## [79] multcompView_0.1-10