I have 10 datasets in a folder, each with 4 columns, which I wish to read in as separate data frames in R. I currently use the following:

temp = list.files(pattern="*.csv")
for(i in 1:length(temp)){
  assign(paste("name",i,sep = ""), as.data.frame(read.table(temp[i])))
}

Then, if I want to change the column names and also add a new column V5 <- V3**2, in either the same loop or a different one, how could this be done?

The other suggestions for changing column names that I've seen here on Stack Overflow create a list of data frames and then change the names inside that list. But they don't change the data frames in the global environment.

Could any of you help with this?

Many thanks.

  • I discourage the use of assign in almost every situation. In this case, I'd suggest the data be in a list, ala x <- lapply(temp, read.table). If you need to add columns, you can do x <- lapply(x, function(L) transform(L, V5=V3^2)). Commented Feb 16, 2019 at 18:09
  • Thanks. Could one then also just use lapply to change the column names in those files? Commented Feb 16, 2019 at 18:14
  • Certainly. You can do whatever you want. If you want to change the names in just one of them, then you can do colnames(x[[3]]) <- c(...). If you want to change the second column name in all of them, then x <- lapply(x, function(L) { colnames(L)[2] <- "quux"; L; }). Commented Feb 16, 2019 at 18:24
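
Putting the comments above together, a minimal sketch of the list-based approach might look like the following. The folder, the file names (a.csv, b.csv) and the new column names are assumptions made so the example is self-contained. The files are written without a header so that read.table produces its default V1..V4 names, as in the question.

```r
# Create two small header-less CSV files in a temporary folder (illustration only).
dir_path <- file.path(tempdir(), "csv_demo")   # hypothetical folder
dir.create(dir_path, showWarnings = FALSE)
for (f in c("a.csv", "b.csv")) {
  write.table(data.frame(1:3, 4:6, 7:9, 10:12),
              file.path(dir_path, f),
              sep = ",", row.names = FALSE, col.names = FALSE)
}

# Read every file into one list of data frames instead of using assign().
temp <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)
x <- lapply(temp, read.table, sep = ",")

# Add the derived column while the default names V1..V4 are still in place.
x <- lapply(x, function(L) transform(L, V5 = V3^2))

# Rename all five columns in every data frame (the names are placeholders).
x <- lapply(x, function(L) {
  colnames(L) <- c("id", "a", "b", "c", "b_squared")
  L
})

# Name the list elements after their files so they can be looked up by name.
names(x) <- tools::file_path_sans_ext(basename(temp))
```

Individual frames are then available as x$a or x[["b"]], and any further change can be applied to all of them with one more lapply call.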

2 Answers

The following reads the .csv files in "path", gives them uniform column names, adds a computed column, and then combines them all into a single data frame.

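path <- "."  # folder containing the .csv files (here, the working directory)
temp <- list.files(path = path, pattern = "\\.csv$", full.names = TRUE)
dfs <- lapply(temp, function(x) {
  # header = TRUE is the read.csv default, so the first row is treated as column names
  df <- read.csv(x, stringsAsFactors = FALSE, col.names = c("col1", "col2", "col3", "col4"))
  df$col5 <- df$col3^2  # computed column, as asked for in the question
  df
})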

do.call("rbind",dfs)

1 Comment

I think there is a slight blur of concepts here. The OP asked about dealing with multiple frames but never suggested that they be combined, so the do.call at the end is presumptuous. (It is certainly valid in many situations, I don't know that it is here.) Good safe use of full.names, I always encourage it in a defensive coding posture.
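
If the frames should stay separate, as the comment above suggests, the list returned by lapply can simply be kept and the do.call step skipped. The following is a hedged sketch with made-up data standing in for the files. setNames gives each element a name, and list2env (base R) copies the elements into an environment as individual variables, should stand-alone objects really be needed. The names name1 and name2 follow the question's paste("name", i) scheme.

```r
# Stand-in for the list of data frames produced by lapply() (made-up data).
dfs <- list(
  data.frame(col1 = 1:2,  col2 = 3:4,   col3 = 5:6,   col4 = 7:8),
  data.frame(col1 = 9:10, col2 = 11:12, col3 = 13:14, col4 = 15:16)
)

# Add the computed column to each frame.
dfs <- lapply(dfs, function(df) {
  df$col5 <- df$col3^2
  df
})

# Keep the frames separate but addressable by name.
dfs <- setNames(dfs, paste0("name", seq_along(dfs)))

# Optional: copy each element into an environment as its own variable.
# A fresh environment is used here; pass envir = globalenv() to create globals.
env <- new.env()
list2env(dfs, envir = env)
ls(env)
```

Working on the list first and exporting at the end means the global objects already carry the new names and columns.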

Rename the datasets in a numbered sequence such as df-01, df-02, ..., df-10, then read them as follows:

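   dfs <- list()
   for (ii in 1:10) {
       # Adjust the pattern so that it matches the actual file names
       input_csv <- sprintf('sample_-%02d.csv', ii)
       df <- read.csv(input_csv, stringsAsFactors = FALSE, col.names = c("col1", "col2", "col3", "col4"))
       print(input_csv)
       df$V5 <- df$col3^2   # the third column is named col3 after col.names
       dfs[[ii]] <- df      # keep each frame instead of overwriting df
   }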
