I have two dataframes that I want to rbind together. They contain similar information, but not in the same order and not with the same column names (the columns are a mix of strings, integers, and real numbers, so matrices will not work).

What I need to do, then, is convert one of the dataframes (we'll call it new_df) into the same structure as the other dataframe (we'll call it old_df).

I want to create an empty dataframe with the column structure of old_df AND the same number of rows as new_df.

I know I can create the first part of that with empty_df <- old_df[0,], but how can I specify the number of rows?

I know the number of rows I want to end up with, so I'd like to specify that. I cannot find this anywhere.

What I want is something like this (if this worked):

empty_df <- old_df[rep(0,nrow(new_df)),]

I tried:

empty_df <- old_df[rep(0,nrow(new_df)),]

  • This gives the same result as old_df[0,]: a dataframe with 0 rows

empty_df <- old_df[0,]

empty_df$ID <- new_df$ids

  • Obviously that doesn't work as I am trying to add a different number of rows
  • something like this? as.data.frame(matrix(,nrow(df),ncol(df))) Commented May 10, 2019 at 18:47
  • This is an XY Problem: you ask for help with your Y solution (an empty data frame) but not with the X problem of why you need such an object. My guess is that you will then iterate through loops to assign rows and columns, whereas vectorized methods may be available. Commented May 10, 2019 at 18:52
  • @Wimpel Unfortunately, I cannot use matrices to initialize because I need to preserve the various classes of the "old_df" dataframe. Commented May 10, 2019 at 19:10
  • @Parfait I am trying to initialize the number of rows for the exact reason that I don't want to loop through anything. I've stated the "why" in the question (I need to rbind these two dataframes). Commented May 10, 2019 at 19:10
  • I'm confused. Then why do you need an empty data frame? Please show us a sample of both data frames. Commented May 10, 2019 at 19:26

1 Answer

If I understand the question correctly, the following hack will do what the OP wants. It sets the number of rows by assigning the row.names attribute directly. Because a dataframe's row count is derived from its row names, the dataframe then reports that many rows, although the columns themselves stay empty until values are assigned to them.

empty_df <- old_df[0, ]
# seq_len() also behaves correctly when new_df has zero rows (1:0 would give c(1, 0))
attr(empty_df, 'row.names') <- seq_len(nrow(new_df))

str(empty_df)
#'data.frame':  300 obs. of  5 variables:
# $ Sepal.Length: num 
# $ Sepal.Width : num 
# $ Petal.Length: num 
# $ Petal.Width : num 
# $ Species     : Factor w/ 3 levels "setosa","versicolor",..:

The dataframe empty_df now has 300 rows.
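
The str() output above lists every column without any values, so the columns are still zero-length until something is assigned to them. If actual placeholder rows are preferred, one possible alternative (a sketch, not part of the original answer) is to index the rows with NA_integer_, which should give one all-NA row per index while keeping each column's class and factor levels. Here iris stands in for old_df and n is an assumed stand-in for nrow(new_df).

```r
# Assumptions: iris plays the role of old_df; n plays the role of nrow(new_df).
old_df <- iris
n <- 300L

# Indexing rows with NA_integer_ returns one all-NA row per index
# and keeps every column's class (including factor levels).
empty_df <- old_df[rep(NA_integer_, n), ]

# Reset the row names to the automatic 1..n sequence.
rownames(empty_df) <- NULL

nrow(empty_df)           # number of placeholder rows
sapply(empty_df, class)  # column classes, same as old_df
```

Unlike the row.names hack, every column here really has n elements, so the result can be passed to functions that inspect the column vectors directly.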

Data creation code.

The test data creation code uses the built-in dataset iris.

set.seed(1234)

old_df <- iris
new_df <- rbind(iris, iris)
new_df <- new_df[, sample(ncol(new_df))]
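
With test data like this, the end goal from the question (getting new_df into old_df's structure so the two can be combined with rbind) might also be reachable without an empty dataframe, by renaming and reordering columns. The sketch below assumes a one-to-one correspondence between the columns; name_map and the short column names in it are hypothetical and only introduced here so that the column names differ, as they do in the question.

```r
set.seed(1234)
old_df <- iris
new_df <- rbind(iris, iris)
new_df <- new_df[, sample(ncol(new_df))]

# Hypothetical mapping: names are new_df's column names, values are old_df's.
name_map <- c(sl = "Sepal.Length", sw = "Sepal.Width",
              pl = "Petal.Length", pw = "Petal.Width", sp = "Species")

# Give new_df the hypothetical short names to mimic the question's situation.
names(new_df) <- names(name_map)[match(names(new_df), name_map)]

# Rename back to old_df's names, then reorder the columns to match old_df.
converted <- new_df
names(converted) <- unname(name_map[names(converted)])
converted <- converted[, names(old_df)]

combined <- rbind(old_df, converted)
dim(combined)
```

Column classes are carried over from new_df as they are, so any column whose class differs from old_df's would still need an explicit conversion before rbind.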

1 Comment

Perfect! Thank you! That is exactly what I need.
