Change name of dataframe based on a string vector

Question

I am reading a couple of Excel files from a directory, and I want each data frame that is read to be named dynamically according to a vector of strings.

I have a string vector containing country names: cnts <- c("de", "ar", "fr")

Then I read an Excel file, whose path is already stored in a vector (file): df <- read.xlsx(file[1], 1). Now I want to rename df to the first element of the countries vector, so I do cnts[1] <- df

But this does not work and gives me an error:

In cnts[2] <- df number of items to replace is not a multiple of replacement length

I want df to be renamed to de. I know the problem: it is trying to write the whole df into the string vector at position 1. But how can I dynamically rename data frames?

I don't understand what you need. Do you want a variable named "de" that contains the data frame df (and a second variable named "ar" that contains the second data frame, and so on)?
– gdevaux, Commented Jun 26, 2019 at 13:01

Clemsang · Accepted Answer · 2019-06-26 13:04:41Z

2

cnts[1] <- df tries to store a whole data frame in the first element of the character vector cnts, which holds the length-1 string "de". It does not create a variable named de.

You can use assign, but you should read up on why using assign is considered bad practice.

cnts <- c("de", "ar", "fr")

df <- data.frame(a=1:5)

assign(cnts[1], df)
de

A better practice would be to use a list with one element per entry in cnts and assign each data frame to the matching element of the list.
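The list approach can be sketched roughly as follows. This is a minimal illustration rather than part of the original answer: the list name dfs is hypothetical, and df is the same toy data frame used above.

```r
cnts <- c("de", "ar", "fr")
df <- data.frame(a = 1:5)

# Start with an empty list (dfs is a hypothetical name) and store the
# data frame under the name held in cnts[1].
dfs <- list()
dfs[[cnts[1]]] <- df

# The data frame is now reachable by name, either programmatically
# with [[ ]] or interactively with $.
names(dfs)
identical(dfs[["de"]], df)
identical(dfs$de, df)
```

Because the name is just a string used as an index, the same line works unchanged inside a loop over cnts.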

edited Jun 26, 2019 at 13:04

answered Jun 26, 2019 at 13:03

Clemsang


1 Comment

boski Over a year ago

Totally in line with you. OP should look into lists or other data structures.

score 2 · Accepted Answer · 2019-06-26 13:35:11Z

2

With cnts[1] <- df you are telling R to put a data frame into the first element of the character vector cnts, which isn't possible. You can use assign to achieve what you want, but the general consensus is that assign should be avoided, particularly when programmatically importing multiple files. It might be a bit counterintuitive at first, but it often makes more sense to put your data frames in a named list, e.g.:

cnts <- c("de", "ar", "fr")

# Create an empty list with names from `cnts`.
df_list <- vector(mode = "list", length = length(cnts))
names(df_list) <- cnts

# Read in the XLSX and add to appropriate list element.
df_list[[cnts[1]]] <- read.xlsx(file[1])

Instead of df_list <- vector(mode = "list", length = length(cnts)) you could also just use df_list <- list(), but the former is more efficient, particularly as your lists get longer. You can use either in your case, but it's never too early to learn good habits that will spare you some frustration down the road.

You'll end up with something like the following object:

$de
  one two
1   1   3
2   2   4

$ar
  one two
1   1   3
2   2   4

$fr
  one two
1   1   3
2   2   4
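To fill every element rather than just the first, the same pattern can be put in a loop. The sketch below is an illustration under assumptions: read_stub is a hypothetical stand-in for read.xlsx(file[i]) so that the example runs without any Excel files or extra packages.

```r
cnts <- c("de", "ar", "fr")

# Preallocate a list with one slot per country code, then name the slots.
df_list <- vector(mode = "list", length = length(cnts))
names(df_list) <- cnts

# Hypothetical stand-in for read.xlsx(file[i]); it just builds a small
# data frame so the example is self-contained.
read_stub <- function(i) data.frame(one = 1:2, two = 3:4)

# Fill each slot by name, using the position i to pick the matching input.
for (i in seq_along(cnts)) {
  df_list[[cnts[i]]] <- read_stub(i)
}

str(df_list, max.level = 1)
```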

If you want to be super efficient you can also do something like this, assuming the names in cnts and the file names in file match positionally:

df_list <- lapply(file, read.xlsx)
names(df_list) <- cnts
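A runnable version of the lapply pattern might look like the sketch below. To keep it free of package dependencies it swaps in CSV files and base R's read.csv for the Excel files and read.xlsx; the temporary files and the name paths are assumptions made purely for illustration.

```r
cnts <- c("de", "ar", "fr")

# Write three small CSV files to a temporary directory. These are
# stand-ins for the Excel files; paths plays the role of `file`.
paths <- file.path(tempdir(), paste0(cnts, ".csv"))
for (p in paths) {
  write.csv(data.frame(one = 1:2, two = 3:4), p, row.names = FALSE)
}

# Read every file in one call, then name the results positionally.
df_list <- lapply(paths, read.csv)
names(df_list) <- cnts

df_list[["ar"]]
```

The naming step relies entirely on paths and cnts being in the same order, so it is worth checking that before trusting the labels.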

edited Jun 26, 2019 at 13:35

answered Jun 26, 2019 at 13:12

user10191355

2 Comments

Gregor Thomas Over a year ago

I like your advice, but for someone new to lists, `names<-`(vector(mode = "list", length = length(cnts)), cnts) would be extremely confusing--and muddies the fact that generally the way to initialize a list would be df_list <- list() (or use lapply). Why not just name the list after filling it, with a nicely standard names(df_list) <- cnts?

user10191355 Over a year ago

@Gregor good point. Fixed the part with names<-(), but I disagree about using vector as it's not great practice to grow vectors. I did, however, expand on the part with lapply.

akrun · Accepted Answer · 2019-06-26 13:20:41Z

0

Another option is to read all the datasets into a list, set the names of the list elements with 'cnts' (assuming it is the same order) and pollute the global environment with lots of objects (list2env)

list2env(setNames(lapply(file, read.xlsx), cnts), .GlobalEnv)
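To see what list2env does without touching the global environment, here is a small sketch that targets a fresh environment instead. The toy data frames and the names dfs and env are assumptions for illustration only.

```r
cnts <- c("de", "ar", "fr")

# A named list of toy data frames (stand-ins for the imported files).
dfs <- setNames(
  list(data.frame(a = 1), data.frame(a = 2), data.frame(a = 3)),
  cnts
)

# Copy each list element into a new environment as a separate object.
# Passing .GlobalEnv instead would create de, ar and fr in the workspace.
env <- new.env()
list2env(dfs, envir = env)

sort(ls(env))
env$de
```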

answered Jun 26, 2019 at 13:20

akrun
