
I have a list of data frames to which I am trying to apply a function. The function should iterate 3 times, and after each iteration the results should be saved in the results list.

My data frames have numeric content and different column names, except for the last 6 columns (which have the same names).

My code is as follows:

# suppose I have three data frames with the following names
myfirstdf
myseconddf
mythirddf

mydflist # a list containing the 3 data frames
for (i in 1:3){
  results[[i]] <- lapply(mydflist, function(x) {
    longdata <- ncol(x)-i
    sum(x[,1:longdata])
  })

  names(results[[i]]) <- sprintf("results[[i]]", 1:length(results))
}

What I want is to access the results for each data frame by appending the iteration number i, something like results$mydflist$myfirstdfi, where i is the number of the iteration, so results$mydflist$myfirstdf1. But with my code I get results$results1$results1.

  • Unless I've misunderstood something, it might be more convenient to reverse the order of your for loop and lapply; i.e. (1) build a function that takes a "data.frame", iterates 3 times, adds a column each time, and returns the result, (2) lapply your function over "mydflist". Perhaps an example mydflist and the wanted output would help. Commented Jan 28, 2016 at 14:13
  • I've just edited my question; I hope it is clearer now. Commented Jan 28, 2016 at 14:43

1 Answer


Your code produces a list of length 3, the number of iterations, and each of the three list items is again a list of length 3, the number of dataframes in mydflist. But from the formulation

What I want to do is to access to the results of each dataframe by adding the ith number of iteration, something like: results$mydflist$myfirstdfi where i will be the number of iteration so results$mydflist$myfirstdf1.

in your question, I guess that what you really want is a flat list of length 9, containing one item for each iteration step and each dataframe in mydflist, named

"myfirstdf1" "myseconddf1" "mythirddf1"

"myfirstdf2" "myseconddf2" "mythirddf2"

"myfirstdf3" "myseconddf3" "mythirddf3".
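As a minimal sketch of where these nine names come from, they can be built by pasting each dataframe name to each iteration number. The variables dfNames, nIter and flatNames are illustrative only and are not used elsewhere in this answer.

```r
# Names of the three dataframes, as given in the question
dfNames <- c("myfirstdf", "myseconddf", "mythirddf")
nIter   <- 3

# Repeat the names once per iteration and pair each block with its iteration number
flatNames <- paste0(
  rep(dfNames, times = nIter),
  rep(seq_len(nIter), each = length(dfNames))
)
print(flatNames)
```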

The following function can handle both cases:

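iteration <- function( dfList, fnct, numberOfIterations, flat=TRUE )
{
  L <- list()

  for (i in seq_len(numberOfIterations)){
    # apply fnct to every dataframe; i is passed on as its second argument
    L[[i]] <- lapply( dfList, fnct, i )
    # name each result after its dataframe plus the iteration number
    names(L[[i]]) <- paste0( names(dfList), i )
  }
  # flat=TRUE removes one level of nesting; flat=FALSE keeps one sublist per iteration
  return( if (flat) unlist(L,recursive=FALSE) else L )
}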
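The function relies on lapply handing any extra arguments on to the function it applies, which is how the iteration number reaches fnct. A small sketch of that mechanism on toy data (toyList and addOffset are made-up names for illustration):

```r
# A toy list standing in for the list of dataframes
toyList <- list(a = 1:3, b = 4:6)

# A two-argument function: the list element comes first, the extra value second
addOffset <- function(x, k) sum(x) + k

# The third argument of lapply (10) is passed on to addOffset as k
res <- lapply(toyList, addOffset, 10)
str(res)
```

The names of the input list are kept on the result, which is why names(dfList) can be reused when renaming the results.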

Example:

mydflist <- list(
  myfirstdf  = data.frame(matrix(1:20,4,5)),
  myseconddf = data.frame(matrix(1:12,2,6)),
  mythirddf  = data.frame(matrix(1:15,3,5))
)

f <- function(df,i)
{
  longdata <- ncol(df)-i
  sum(df[,1:longdata])
}

results      <- iteration(mydflist,f,4,FALSE)
results_flat <- iteration(mydflist,f,4)

(I've changed the number of iterations from 3 to 4, to avoid confusion with the number of dataframes.) Here is the resulting list results, which is not flat:

> results
[[1]]
[[1]]$myfirstdf1
[1] 136

[[1]]$myseconddf1
[1] 55

[[1]]$mythirddf1
[1] 78


[[2]]
[[2]]$myfirstdf2
[1] 78

[[2]]$myseconddf2
[1] 36

[[2]]$mythirddf2
[1] 45


[[3]]
[[3]]$myfirstdf3
[1] 36

[[3]]$myseconddf3
[1] 21

[[3]]$mythirddf3
[1] 21

[[4]]
[[4]]$myfirstdf4
[1] 10

[[4]]$myseconddf4
[1] 10

[[4]]$mythirddf4
[1] 6
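One caveat about the iteration count: in iteration 4 only the first column of the 5-column dataframes is left. With one more iteration, ncol(df) - i would be 0, and 1:0 counts down as c(1, 0) instead of being empty, so the column selection would not be empty either. A guarded variant might look like the sketch below; f_safe is a hypothetical name, and returning NA when no columns are left is my assumption about the desired behaviour.

```r
# Hypothetical guarded variant of f: returns NA when no columns are left
f_safe <- function(df, i) {
  longdata <- ncol(df) - i
  if (longdata < 1) return(NA_real_)
  # seq_len(longdata) never counts down; drop = FALSE keeps a data frame
  sum(df[, seq_len(longdata), drop = FALSE])
}

df <- data.frame(matrix(1:20, 4, 5))

1:0            # counts down: 1 0
seq_len(0)     # empty: integer(0)
f_safe(df, 4)  # sum of the first column only
f_safe(df, 5)  # no columns left
```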

Notice that the number of the iteration step appears twice in the nested list results: once as the list index and once in the item name. For example, the result for the third dataframe in the first iteration step is

> results[[1]]$mythirddf1
[1] 78

In the names of the flat list results_flat the number of the iteration step appears only once:

> results_flat
$myfirstdf1
[1] 136

$myseconddf1
[1] 55

$mythirddf1
[1] 78

$myfirstdf2
[1] 78

$myseconddf2
[1] 36

$mythirddf2
[1] 45

$myfirstdf3
[1] 36

$myseconddf3
[1] 21

$mythirddf3
[1] 21

$myfirstdf4
[1] 10

$myseconddf4
[1] 10

$mythirddf4
[1] 6

E.g. the result for the third dataframe in the first iteration step is

> results_flat$mythirddf1
[1] 78
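If the iteration number is held in a variable, the $ operator cannot be used, because it takes the name literally. A sketch of building the name with paste0 and indexing with [[ instead; to stay self-contained, the two small lists below only hold the first-iteration values copied from the output above, and key is an illustrative variable name.

```r
# First-iteration values copied from the output above
results_flat <- list(myfirstdf1 = 136, myseconddf1 = 55, mythirddf1 = 78)
results      <- list(results_flat)   # nested layout: one sublist per iteration

i   <- 1
key <- paste0("mythirddf", i)        # "mythirddf1"

results_flat[[key]]    # flat list: the iteration number appears once
results[[i]][[key]]    # nested list: the iteration number appears twice
```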

If you want to access this result via results$mydflist$mythirddf1, then build a one-component list results as follows:

> results <- list(mydflist=iteration(mydflist,f,4))

The one and only component of this list results is the list results_flat above, and its name is mydflist:

> results
$mydflist
$mydflist$myfirstdf1
[1] 136

$mydflist$myseconddf1
[1] 55

$mydflist$mythirddf1
[1] 78

$mydflist$myfirstdf2
[1] 78

$mydflist$myseconddf2
[1] 36

$mydflist$mythirddf2
[1] 45

$mydflist$myfirstdf3
[1] 36

$mydflist$myseconddf3
[1] 21

$mydflist$mythirddf3
[1] 21

$mydflist$myfirstdf4
[1] 10

$mydflist$myseconddf4
[1] 10

$mydflist$mythirddf4
[1] 6
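A different layout, along the lines suggested in the comments on the question, is to swap the loops: apply over the dataframes and run the iterations inside. This is only a sketch of an alternative, not what the question asked for; byDf is a made-up name, and mydflist and f are repeated from the example above so that the block runs on its own.

```r
mydflist <- list(
  myfirstdf  = data.frame(matrix(1:20, 4, 5)),
  myseconddf = data.frame(matrix(1:12, 2, 6)),
  mythirddf  = data.frame(matrix(1:15, 3, 5))
)

f <- function(df, i) {
  longdata <- ncol(df) - i
  sum(df[, 1:longdata])
}

# One vector per dataframe; position i holds the result of iteration i
byDf <- lapply(mydflist, function(df) {
  sapply(seq_len(4), function(i) f(df, i))
})

byDf$mythirddf[1]   # third dataframe, first iteration
```

Here the iteration number becomes an index rather than part of a name, so byDf$mythirddf[1] holds the same value as results$mydflist$mythirddf1.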

2 Comments

I've edited and enhanced my answer. Hopefully now it contains the desired results.
The edited answer works perfectly! Thanks! May I ask: is the only difference between results and results_flat in how I access each list?
