2

I want to apply a function to each column of a data frame in R. Suppose the following is the data frame (3 rows by n columns):

df <- data.frame(
  h1 = c(1,2,3),
  h2 = c(2,3,1),
  h3 = c(3,2,1),
  h4 = c(1,2,3),
  h5 = c(1,2,3)
)
rownames(df) <- c("e1", "e2", "e3")
df
#    h1 h2 h3 h4 h5
# e1  1  2  3  1  1
# e2  2  3  2  2  2
# e3  3  1  1  3  3

Suppose we want to check, for each column (h1, h2, ...), whether the first two elements match given values (e1 == 1 and e2 == 2). How can we apply such a checking function to each column in the data frame?

5
  • Please do not post an image of code/data/errors: it cannot be copied or searched (SEO), it breaks screen-readers, and it may not fit well on some mobile devices. Ref: meta.stackoverflow.com/a/285557/3358272 (and xkcd.com/2116). Please just include the code or data (e.g., dput(head(x)) or data.frame(...)) directly. Commented Oct 31, 2019 at 17:10
  • @r2evans When it was posted first, it was not an image though. I think it got edited Commented Oct 31, 2019 at 17:11
  • You don't have permissions yet to show an image. But if you put it in with that intent, typically somebody edits your question to actually show the image. But my point is that an image of data does me (and others) no good, and I categorically won't spend time transcribing data from an image into usable code or data. It is just as easy (perhaps easier) for you to copy text from your R console and paste into a code-block than to get a screenshot and post it in as an image. Commented Oct 31, 2019 at 17:12
  • 1
    In general, "apply function to each column" is literally lapply(dataframe, myfunc). akrun's suggestion to use colSums is one of the special cases, and is much more efficient in this situation. Commented Oct 31, 2019 at 17:12
  • For the record, after taking the sample data in alex_jwb90's answer (and changing to data.frame), this question is a bit more easily reproducible. I kept the row names solely because you referenced them as e1==1, etc; note that many operations on frames will not preserve row names, including just about everything within the tidyverse meta-package; so while I can see some utility in row names in general (and it can be a polarizing opinion for some), I normally don't use or rely on them. Commented Oct 31, 2019 at 17:47

3 Answers

3

Subset the rows of interest, either by row name or with head(), and compare them with == against a vector of the expected values. colSums() of the resulting logical matrix counts the matches per column; a column passes when that count equals 2, i.e. when both elements are TRUE.

colSums(df[c("e1", "e2"), ] == c(1, 2)) == 2

Or with apply(), loop over the columns (MARGIN = 2) with an anonymous function and check that all of the comparisons are TRUE:

apply(head(df, 2), 2, function(x) all(x == c(1, 2)))
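
As a quick check, both approaches can be run against the data frame from the question. This is only a minimal sketch: the names `target`, `by_colsums` and `by_apply` are illustrative and not part of the original answer.

```r
# Sample data from the question
df <- data.frame(
  h1 = c(1, 2, 3),
  h2 = c(2, 3, 1),
  h3 = c(3, 2, 1),
  h4 = c(1, 2, 3),
  h5 = c(1, 2, 3)
)
rownames(df) <- c("e1", "e2", "e3")

target <- c(1, 2)  # expected values for rows e1 and e2 (illustrative name)

# Vectorised: count the matches per column, then require all of them to match
by_colsums <- colSums(df[c("e1", "e2"), ] == target) == length(target)

# Column-wise loop: apply() coerces the data frame to a matrix first
by_apply <- apply(head(df, 2), 2, function(x) all(x == target))

by_colsums
#    h1    h2    h3    h4    h5 
#  TRUE FALSE FALSE  TRUE  TRUE 
```

Both vectors should agree element by element; the colSums() form avoids an R-level loop, which is why it tends to be the faster of the two on wide data.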

3 Comments

Thank you for your answer. But I have to do something like the following, and hence need to use an apply/lapply/sapply function in R: rank.shape = function(x) { # i here ranges from 1 to 5 which is the number of columns df = NA if (x[1,][i]==2 && x[2,][i]==1){ //.... } else if(x[1,][i]==2 && x[2,][i]==1.5){ //....) } list.shape = lapply(matrixRanks[1,], rank.shape)
@ChimiWangmo I answered for the question posted
Chimi, that much code in a comment doesn't always format well. Further, you explicitly say you need to use one of the apply family of functions, so please be clear in your question. Using "apply" as a verb does not clearly indicate using apply as a function.
3

Using @alex_jwb90's data,

lapply(df, function(a) a[1:2] == 1:2)
# $h1
# [1] TRUE TRUE
# $h2
# [1] FALSE FALSE
# $h3
# [1] FALSE  TRUE
# $h4
# [1] TRUE TRUE
# $h5
# [1] TRUE TRUE

lapply(df, function(a) all(a[1:2] == 1:2))
# $h1
# [1] TRUE
# $h2
# [1] FALSE
# $h3
# [1] FALSE
# $h4
# [1] TRUE
# $h5
# [1] TRUE

sapply(df, function(a) all(a[1:2] == 1:2))
#    h1    h2    h3    h4    h5 
#  TRUE FALSE FALSE  TRUE  TRUE 
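
A follow-up comment on this page asks about a per-column function with several branches. One way that could look is sketched below, assuming each column arrives as a plain vector. The function name `classify_column`, the second condition and the labels it returns are hypothetical placeholders, since the branches in the comment are elided.

```r
# Sample data from the question
df <- data.frame(
  h1 = c(1, 2, 3),
  h2 = c(2, 3, 1),
  h3 = c(3, 2, 1),
  h4 = c(1, 2, 3),
  h5 = c(1, 2, 3)
)

# Hypothetical helper: receives one column as a vector and returns a label.
# The conditions and labels are placeholders for the real branching logic.
classify_column <- function(col) {
  if (col[1] == 1 && col[2] == 2) {
    "matches"
  } else if (col[1] == 2 && col[2] == 3) {
    "shifted"
  } else {
    "other"
  }
}

# vapply() works like sapply() but enforces that each result is one string
labels <- vapply(df, classify_column, character(1))
labels
# h1 = "matches", h2 = "shifted", h3 = "other", h4 = "matches", h5 = "matches"
```

Because lapply(), sapply() and vapply() hand the function one whole column at a time, the function body can index into it directly (`col[1]`, `col[2]`) and needs no column counter.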

0

You can try this (extensible to check more than two rows if you remove the & row_number() <= 2)

library(dplyr)

df = tibble(
  h1 = c(1,2,3),
  h2 = c(2,3,1),
  h3 = c(3,2,1),
  h4 = c(1,2,3),
  h5 = c(1,2,3)
)

df %>%
  mutate_all(
    list(equals_rownum = ~.==row_number() & row_number() <= 2)
  )

If you would rather overwrite the h1, h2, ... columns than create new <col>_equals_rownum columns, just drop the name in the list() call.
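
For readers who prefer not to depend on dplyr, roughly the same element-wise check can be sketched in base R. This is an alternative to the pipeline in this answer, not a translation of it; `n_check`, `row_id`, `flags` and `result` are illustrative names.

```r
# Sample data from the question
df <- data.frame(
  h1 = c(1, 2, 3),
  h2 = c(2, 3, 1),
  h3 = c(3, 2, 1),
  h4 = c(1, 2, 3),
  h5 = c(1, 2, 3)
)

n_check <- 2                  # number of leading rows to check (assumption)
row_id  <- seq_len(nrow(df))  # 1, 2, 3, ...

# TRUE where a value equals its row number within the first n_check rows
flags <- as.data.frame(
  lapply(df, function(col) col == row_id & row_id <= n_check)
)
names(flags) <- paste0(names(df), "_equals_rownum")

# Append the flags as new columns next to the originals
result <- cbind(df, flags)
result$h3_equals_rownum
# [1] FALSE  TRUE FALSE
```

To overwrite the original columns instead of appending, assigning `df[] <- lapply(df, ...)` with the same function keeps the data frame shape and column names.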
