0

I am reading several Excel files from a directory, and I want each data frame I read to be named dynamically according to a vector of strings.

I have a character vector holding the country names: cnts <- c("de", "ar", "fr")

Then I read an Excel file whose path is already stored in a vector (file): df <- read.xlsx(file[1], 1). Now I want to rename df to the first element of the countries vector, so I do cnts[1] <- df

But this does not work and gives me an error

In cnts[2] <- df number of items to replace is not a multiple of replacement length

I want df to be renamed to de. I know what the problem is: R is trying to write the whole df into a string vector at position 1. But how can I dynamically rename data frames?

1
  • I don't understand what you need. Do you want a variable named "de" that contains the data frame df (and a second variable "ar" that contains the second data frame, etc.)? Commented Jun 26, 2019 at 13:01

3 Answers

2

cnts[1] <- df tries to store a whole data frame in a single element of a character vector; it is not interpreted as "de" <- df.

You can use assign, but you should read up on why using assign is considered bad practice:

cnts <- c("de", "ar", "fr")

df <- data.frame(a=1:5)

assign(cnts[1], df)
de

A better practice would be to use a list with one element per entry in cnts and assign each data frame to the matching element of the list.
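As a minimal sketch of that list-based approach, the example below uses made-up in-memory data frames in place of the Excel imports (the values and the helper variable names code, first and via_get are assumptions for illustration only). It also shows that the assign route needs get for the same lookup by name:

```r
cnts <- c("de", "ar", "fr")

# Hypothetical stand-in data; in practice each element would come from read.xlsx().
df_list <- list(
  data.frame(a = 1:5),
  data.frame(a = 6:10),
  data.frame(a = 11:15)
)
names(df_list) <- cnts

# Look up a data frame by a country code held in a variable.
code <- cnts[1]
first <- df_list[[code]]

# The assign() route needs get() to do the same lookup programmatically.
assign(code, data.frame(a = 1:5))
via_get <- get(code)
```

With the list, every data frame stays reachable through one object, so later steps can loop over df_list instead of juggling separate variable names.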


1 Comment

Totally in line with you. OP should look into lists or other data structures.
2

With cnts[1] <- df you are telling R to store a data frame in the first element of the character vector cnts, which isn't possible. You can use assign to achieve what you want, but the general consensus is that assign should be avoided, particularly when programmatically importing multiple files. It might be a bit counterintuitive at first, but it often makes more sense to put your data frames in a named list, e.g.:

cnts <- c("de", "ar", "fr")

# Create an empty list with names from `cnts`.
df_list <- vector(mode = "list", length = length(cnts))
names(df_list) <- cnts

# Read in the XLSX (sheet 1, as in the question) and add it to the appropriate list element.
df_list[[cnts[1]]] <- read.xlsx(file[1], 1)

Instead of df_list <- vector(mode = "list", length = length(cnts)) you could also just use df_list <- list(), but the former is more efficient, particularly as your lists get longer. You can use either in your case, but it's never too early to learn good habits that will spare you some frustration down the road.

You'll end up with something like the following object:

$de
  one two
1   1   3
2   2   4

$ar
  one two
1   1   3
2   2   4

$fr
  one two
1   1   3
2   2   4
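An object like the one above could be built by filling the preallocated list in a loop. This is only a sketch: read_country is a hypothetical stand-in for read.xlsx(file[i], 1), so that the example runs without any Excel files, and its return values are made up.

```r
cnts <- c("de", "ar", "fr")

# Preallocate a named list with one slot per country.
df_list <- vector(mode = "list", length = length(cnts))
names(df_list) <- cnts

# Hypothetical reader standing in for read.xlsx(file[i], 1).
read_country <- function(i) data.frame(one = 1:2, two = 3:4)

# Fill each slot by name.
for (i in seq_along(cnts)) {
  df_list[[cnts[i]]] <- read_country(i)
}

df_list
```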

If you want to be super efficient you can also do something like this, assuming the names in cnts and the file names in file match positionally:

df_list <- lapply(file, read.xlsx)
names(df_list) <- cnts
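To try the lapply pattern without Excel files or an extra package, here is a self-contained sketch that swaps in small CSV files written to a temporary directory and uses read.csv as the reader. The file contents and the names tmp_dir, n_rows and combined are assumptions for illustration only.

```r
cnts <- c("de", "ar", "fr")

# Write one small CSV per country into a temporary directory (illustrative data only).
tmp_dir <- tempfile("cnts_")
dir.create(tmp_dir)
file <- file.path(tmp_dir, paste0(cnts, ".csv"))
for (i in seq_along(file)) {
  write.csv(data.frame(one = 1:2, two = 3:4), file[i], row.names = FALSE)
}

# Read every file in one pass and name the elements positionally.
df_list <- setNames(lapply(file, read.csv), cnts)

# Work on all data frames at once, e.g. count the rows per country.
n_rows <- vapply(df_list, nrow, integer(1))

# Or stack them into one data frame with a column recording the country.
combined <- do.call(
  rbind,
  Map(function(d, nm) cbind(country = nm, d), df_list, cnts)
)
```

Keeping the data frames in one named list is what makes the last two steps one-liners; with separately assigned variables each would need get or mget.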

2 Comments

I like your advice, but for someone new to lists, `names<-`(vector(mode = "list", length = length(cnts)), cnts) would be extremely confusing--and muddies the fact that generally the way to initialize a list would be df_list <- list() (or use lapply). Why not just name the list after filling it, with a nicely standard names(df_list) <- cnts?
@Gregor good point. Fixed the part with names<-(), but I disagree about using vector as it's not great practice to grow vectors. I did, however, expand on the part with lapply.
0

Another option is to read all the datasets into a list, set the names of the list elements from cnts (assuming the files are in the same order), and then pollute the global environment with lots of objects using list2env:

list2env(setNames(lapply(file, read.xlsx), cnts), .GlobalEnv)
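If separate objects are really wanted, one way to limit the clutter is to copy them into a dedicated environment rather than .GlobalEnv. The sketch below assumes made-up data frames in place of the imported sheets, and country_env is a hypothetical name.

```r
cnts <- c("de", "ar", "fr")

# Illustrative data frames standing in for the imported Excel sheets.
df_list <- setNames(
  lapply(seq_along(cnts), function(i) data.frame(id = i)),
  cnts
)

# Copy each list element into a fresh environment instead of .GlobalEnv.
country_env <- new.env()
list2env(df_list, envir = country_env)

# The objects are reachable by name inside that environment.
sort(ls(country_env))
country_env$de
```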
