0

I've got a large data frame finaldata and I would like to produce a bunch of smaller data frames explanatory1, explanatory2, etc., each consisting of 10 columns from finaldata.

I'm trying to do this using a for loop, but it's throwing an error: attempt to apply non-function

for(i in 1:length(finaldata)/10) {
  nam <- paste("explanatory", i, sep = "")
  assign(nam, finaldata[,10(i):10(i)+10])
}

I have also tried

for(i in 1:length(finaldata)/10){
  assign(paste("explanatory",i,sep=""),finaldata[,10(i):10(i)+10])}

But this gave me the same error. From what I understand, the error is caused by my passing finaldata[,10(i):10(i)+10] as an argument to assign, but I don't see why it wouldn't work in a for loop, or be any different from passing finaldata[,10:10+10].

Any help would be greatly appreciated!

Comment: To clarify, do you want to split up finaldata into multiple data.frames, with columns 1:10, 11:20, 21:30, ..., respectively? (Commented Jun 9, 2014 at 12:43)

4 Answers

4

Using split:

ll <- lapply(split(colnames(finaldata),rep(seq_len(ncol(finaldata)/10),each=10)),
       function(x)finaldata[,x])

This will create a list. You can also extract its elements into separate variables (not recommended):

ll <- setNames(ll, paste0("explanatory", seq_along(ll)))
# Without envir, list2env() fills a new environment instead of the workspace
list2env(ll, envir = .GlobalEnv)
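
If the number of columns is not a multiple of 10, the grouping vector above will not line up with the columns. A hedged variant, assuming a toy 33-column finaldata built here purely for illustration, derives the group index from each column's position and passes drop = FALSE so that a one-column chunk would still be a data frame:

```r
# Toy stand-in for finaldata (an assumption for this example): 4 rows, 33 columns.
finaldata <- as.data.frame(matrix(seq_len(4 * 33), nrow = 4, ncol = 33))

chunk_size <- 10
# Group index per column: 1 for columns 1-10, 2 for columns 11-20, and so on.
groups <- ceiling(seq_along(finaldata) / chunk_size)

# Split the column positions by group, then subset finaldata for each group.
ll <- lapply(split(seq_along(finaldata), groups),
             function(idx) finaldata[, idx, drop = FALSE])
names(ll) <- paste0("explanatory", seq_along(ll))

# Number of columns in each chunk
sapply(ll, ncol)
```

The pieces can then be reached as ll$explanatory1 or ll[["explanatory4"]] without creating separate variables.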

1 Comment

This does the first part, but doesn't produce separate lists, instead produces a data frame of lists.
2

Create sample data to play with:

df <- data.frame(matrix(vector(), 10, 33))

Find the number of dataframes you're going to create:

number_of_dataframes <- ceiling(ncol(df) / 10)

Loop through the dataframes, finding a range of columns to use for creating that individual dataframe. Use assign to give each one a unique name:

current_column <- 1
for (i in 1:number_of_dataframes) {
  start_column <- current_column
  end_column <- min(current_column + 9, ncol(df))
  assign(paste0("df",i), df[ , start_column:end_column])
  current_column <- end_column + 1
}

The min check makes sure you don't attempt to assign more columns than existed in the original dataframe.
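
To see what the min check does, the chunk boundaries for the 33-column sample can be computed up front. This is only an illustrative sketch; n_cols, starts and ends are names introduced here, not part of the loop above:

```r
n_cols <- 33

# First column of each chunk: 1, 11, 21, 31
starts <- seq(1, n_cols, by = 10)

# Last column of each chunk, capped at the final column of the data frame
ends <- pmin(starts + 9, n_cols)

data.frame(start = starts, end = ends)
```

The last row ends at column 33 rather than 40, which is the same capping that min performs inside the loop.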


1

You were almost there... Try this...

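for (i in 1:ceiling(ncol(finaldata) / 10)) {
  # Build a distinct name per chunk: explanatory1, explanatory2, ...
  nam <- paste0("explanatory", i)
  if ((10 * (i - 1) + 10) > ncol(finaldata)) {
    # Last chunk: take whatever columns remain
    assign(nam, finaldata[, (10 * (i - 1) + 1):ncol(finaldata)])
  } else {
    assign(nam, finaldata[, (10 * (i - 1) + 1):(10 * (i - 1) + 10)])
  }
}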
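
As for why the original loop failed, here is a minimal sketch with illustrative values (i <- 2 and the 30-column example are assumptions, not taken from the question's data). R has no implicit multiplication, so 10(i) is evaluated as a call to 10 as if it were a function. Separately, the : operator binds more tightly than + and /, so ranges need explicit parentheses:

```r
i <- 2

# 10(i) is treated as a function call on the number 10, which fails at run time.
msg <- tryCatch(10(i), error = function(e) conditionMessage(e))
print(msg)

# ':' is evaluated before '/' and '+'.
wrong_seq <- 1:30 / 10        # (1:30) / 10: thirty fractional values
right_seq <- 1:(30 / 10)      # 1, 2, 3
wrong_rng <- 10:10 + 10       # (10:10) + 10, a single value
right_rng <- (10 * (i - 1) + 1):(10 * i)   # columns 11 to 20 for i = 2

length(wrong_seq)
right_seq
wrong_rng
right_rng
```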


1

This did exactly it, thanks to @canary_in_the_data_mine. Pick how many columns each data frame should hold (5 here), derive number_of_dataframes from that, then:

number_of_dataframes <- ceiling(ncol(finaldata) / 5)
current_column <- 1
for (i in 1:number_of_dataframes) {
  start_column <- current_column
  # 5 columns per chunk, capped at the last column of finaldata
  end_column <- min(current_column + 4, ncol(finaldata))
  assign(paste0("explanatory", i), finaldata[, start_column:end_column])
  current_column <- end_column + 1
}

The only change I made was to end_column.
