I want to loop through a list of datasets with lapply and, within each dataset, through only those columns whose names are stored in a vector called vector_test. I would like to center these variables, i.e. subtract from each selected variable its weighted mean within that dataset.

Let's assume I have the following 3 datasets saved in a list:

df1<-data.frame(v1=c(1,2,3,4,5,6,7),
                v2=c(9,8,7,6,5,4,3),
                v3=c(4,5,6,7,4,4,3),
                v4=c(5,6,4,5,6,5,6))

df2<-data.frame(v1=c(1,5,3,4,9,6,7),
                diff_var=c(1,3,4,6,2,3,4),
                v2=c(9,8,2,6,3,4,3),
                v3=c(4,5,6,7,3,4,3),
                v4=c(5,2,4,4,6,1,6))

df3<-data.frame(v1=c(1,5,8,4,2,6,1),
                v2=c(1,8,1,6,2,4,7),
                v3=c(1,5,2,5,3,4,3),
                v4=c(5,9,4,5,6,2,6))

test_liste<-list(df1,df2,df3)

Further, I have names of variables saved in a vector:

vector_test<-c("v3","v4")

I tried a for loop and sapply nested inside lapply, but I cannot figure out how to pick only the variables in each dataset whose names match those in the vector.

If any clarification or additional code is needed, please let me know!

Thanks in advance!

1 Answer

Using lapply you could do:

lapply(test_liste, function(df) {
  # Center only the columns named in vector_test (plain, unweighted mean)
  df[vector_test] <- lapply(df[vector_test], function(col) col - mean(col))
  df
})
#> [[1]]
#>   v1 v2         v3         v4
#> 1  1  9 -0.7142857 -0.2857143
#> 2  2  8  0.2857143  0.7142857
#> 3  3  7  1.2857143 -1.2857143
#> 4  4  6  2.2857143 -0.2857143
#> 5  5  5 -0.7142857  0.7142857
#> 6  6  4 -0.7142857 -0.2857143
#> 7  7  3 -1.7142857  0.7142857
#> 
#> [[2]]
#>   v1 diff_var v2         v3 v4
#> 1  1        1  9 -0.5714286  1
#> 2  5        3  8  0.4285714 -2
#> 3  3        4  2  1.4285714  0
#> 4  4        6  6  2.4285714  0
#> 5  9        2  3 -1.5714286  2
#> 6  6        3  4 -0.5714286 -3
#> 7  7        4  3 -1.5714286  2
#> 
#> [[3]]
#>   v1 v2         v3         v4
#> 1  1  1 -2.2857143 -0.2857143
#> 2  5  8  1.7142857  3.7142857
#> 3  8  1 -1.2857143 -1.2857143
#> 4  4  6  1.7142857 -0.2857143
#> 5  2  2 -0.2857143  0.7142857
#> 6  6  4  0.7142857 -3.2857143
#> 7  1  7 -0.2857143  0.7142857
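
A side note on the column selection: x[vector_test] assumes every dataset contains all of the names in vector_test, which is true for the three example datasets. If a dataset might lack one of them, a minimal sketch along the following lines could skip the missing columns instead of failing. The helper name center_cols and the small demo data are made up for illustration and are not part of the question.

```r
# Hypothetical helper: center only those columns of `df` whose names
# appear in `vars`; names in `vars` that `df` lacks are ignored.
center_cols <- function(df, vars) {
  present <- intersect(vars, names(df))   # names found in both
  if (length(present) == 0) return(df)    # nothing to center
  df[present] <- lapply(df[present], function(col) col - mean(col))
  df
}

# Illustrative data (assumption): the second data frame has no v4 column.
demo_list <- list(
  data.frame(v1 = 1:3, v3 = c(4, 5, 6), v4 = c(5, 6, 4)),
  data.frame(v1 = 1:3, v3 = c(1, 2, 6))
)

demo_centered <- lapply(demo_list, center_cols, vars = c("v3", "v4"))
```

Whether silently skipping missing columns is what you want depends on the data; stopping with an error may be safer if every dataset is supposed to contain all of the variables.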

1 Comment

Perfect, thank you! One addition: I adapted the code slightly to use the weighted mean (weighted.mean from the stats package), with v1 as the weights. The code is as follows: `test_liste_neu <- lapply(test_liste, function(y) { y[vector_test] <- lapply(y[vector_test], function(x) x - weighted.mean(x, y$v1, na.rm = TRUE)); y })`
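
For anyone wanting to check the weighted version, here is a small sketch on a single data frame. It assumes, as in the comment above, that v1 holds the weights, and it uses the v1, v3 and v4 columns of df1 from the question. After centering, the weighted mean of each selected column should be zero up to floating-point error.

```r
# First dataset from the question, trimmed to the relevant columns
df1 <- data.frame(v1 = c(1, 2, 3, 4, 5, 6, 7),
                  v3 = c(4, 5, 6, 7, 4, 4, 3),
                  v4 = c(5, 6, 4, 5, 6, 5, 6))
vector_test <- c("v3", "v4")

# Subtract the v1-weighted mean from each selected column
centered <- df1
centered[vector_test] <- lapply(df1[vector_test], function(col) {
  col - weighted.mean(col, w = df1$v1, na.rm = TRUE)
})

# Sanity check: weighted means of the centered columns
check <- sapply(centered[vector_test], weighted.mean, w = centered$v1)
```

Note that na.rm = TRUE only drops missing values in the variable being centered; as far as I know, missing values in the weights still give NA as the result.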
