This is probably a simple issue, but I would appreciate some help, please!
I have some data:
data
structure(list(Samid = c("AD001", "AD002", "AD004", "AD005",
"AD008", "AD010", "AD011", "AD012", "AD013", "AD014", "AD015",
"AD016", "AD017", "AD018", "AD019", "AD020", "AD021", "AD022",
"AD023", "AD024", "AD025", "AD026", "AD027", "AD028"), GATA3 = c(0.07850703,
0.07850703, 0.4477987, 0.07850703, 0.2362246, 0.44779867, 0.46578259,
0, 0.46578259, 0.44779867, 0.24396914, 0.46578259, 0.23622459,
0.24396914, 0.07850703, 0.07850703, 1.25391517, 0.82224747, 0.07850703,
0.07850703, 0.07850703, 0.07850703, 0.83507423, 0.07850703),
IL4 = c(0, 0, 0, 0, 0, 0, 0, 1.26781758, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0), IL4R = c(1.65301611, 0.14846188,
1.6307388, 0.14846188, 0.2073535, 0.14846188, 0.4656834,
1.48227697, 0.65075963, 0.17073914, 0.14846188, 0.14846188,
0.37809262, 0.17073914, 1.65301611, 0.14846188, 1.55269688,
0.14846188, 2.15320576, 0.17073914, 0.44340614, 0.17073914,
0, 0.44340614), IRF4 = c(0, 0, 0, 0, 0, 0, 2.83446844, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), CD207 = c(0.80003601,
0.33421377, 3.4723849, 2.32021021, 0.5828276, 0.94797393,
0.13406957, 0.70861984, 2.25418614, 1.4883206, 2.38978722,
3.47193671, 0.32452279, 2.31827895, 0.80003601, 0.80003601,
0.50751017, 2.32021021, 3.0989443, 2.0619054, 1.05640955,
3.31881563, 3.37422811, 2.32021021), IL1B = c(0.20787567,
0, 0, 0.20787567, 0, 0.20787567, 0, 0, 0, 0.20787567, 0.20787567,
0, 0, 0, 0, 0, 0, 0.20787567, 0, 0.20787567, 0.20787567,
0.61415248, 0, 0), Clinical.diagnosis = structure(c(2L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 3L, 3L, 2L, 2L, 3L, 3L, 3L,
3L, 3L, 3L, 2L, 3L, 3L, 2L), .Label = c("irritated", "negative",
"positive"), class = "factor")), row.names = c(NA, -24L), class = "data.frame")
I want to run a Mann-Whitney U test on each gene (columns 2:7), comparing the values between the groups in the last column, Clinical.diagnosis.
I can do this individually:
wilcox.test(GATA3 ~ Clinical.diagnosis, data = data)
However, I want to iterate through each gene. I've tried this, but I'm having an issue passing the gene into the function:
results <- data %>% mutate_at(vars(GATA3:IL1B), ~ wilcox.test(. ~ Clinical.diagnosis))
Sadly, this doesn't work. I want "." to refer to the contents of each gene column. Finally, once I have the results (which will be a list), I'd like to append them to the original data frame, for example as two columns (W = result, p-value = result).
My priority is getting the test run for every gene, though.
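To make the goal concrete, here is a minimal base R sketch of a per-gene loop. It assumes the grouping column has exactly two levels in use (as with "negative" and "positive" here). The helper name `wilcox_by_gene` and the one-row-per-gene output layout are illustrative choices, not something from an existing package. Because each test produces one W and one p-value per gene rather than per sample, the sketch returns a separate summary table instead of adding columns to the 24-row data frame.

```r
# Hypothetical helper (name and output layout are assumptions):
# run a two-group Wilcoxon rank-sum (Mann-Whitney U) test per gene column.
wilcox_by_gene <- function(df, genes, group_col) {
  res <- lapply(genes, function(gene) {
    # Build the formula "<gene> ~ <group_col>" from the column names.
    f <- reformulate(group_col, response = gene)
    # exact = FALSE requests the normal approximation, which avoids the
    # "cannot compute exact p-value with ties" warning on tied values.
    test <- wilcox.test(f, data = df, exact = FALSE)
    data.frame(gene = gene,
               W = unname(test$statistic),
               p.value = test$p.value)
  })
  # Stack the one-row results into a single data frame.
  do.call(rbind, res)
}
```

With the data above, a call might look like `results <- wilcox_by_gene(data, names(data)[2:7], "Clinical.diagnosis")`, which should give one row per gene with columns `gene`, `W` and `p.value`.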
Many thanks in advance for your help!