R: add a column with missing values to a data frame

Question

I am using financial data and the row names of my main dataframe are dates.

   > assets[1:3,1:5]
            ALD   SFN  TCO KIM   CTX
2003-01-03 48.1 23.98 23.5  23 22.34
2003-01-06 48.1 23.98 23.5  23 22.34
2003-01-07 48.1 23.98 23.5  23 22.34

I would like to add a column (here, FOC$Close to assets) from a data frame of the same type in which some dates are missing:

   > FOC[1:3,1:2]
           Close Adj.Close
2003-01-03   510       510
2003-01-07   518       518

The missing values should simply be NA, so the result would look like this:

   > assets[1:3,1:6]
            ALD   SFN  TCO KIM   CTX FOC
2003-01-03 48.1 23.98 23.5  23 22.34 510
2003-01-06 48.1 23.98 23.5  23 22.34 NA
2003-01-07 48.1 23.98 23.5  23 22.34 518

Is there a nice way to do that? I managed to do something similar with rows by doing something like

> rowtoadd <- list(ALD=18.1,...)
> dataframe[nrow(dataframe) + 1, names(rowtoadd)] <- rowtoadd

but I am not able to do this for columns.

slushy · Accepted Answer · 2014-04-18 16:03:05Z


You can use the merge method.

I think you are using xts time-series objects. For these, merge aligns the rows on their date index automatically. According to help(merge.xts), the join argument controls how the merge is done; it defaults to 'outer'. Example with a left join:

> dat = merge(assets[1:3,], FOC[,1:2], join='left')
> dat
            ALD   SFN  TCO KIM   CTX Close Adj.Close
2003-01-03 48.1 23.98 23.5  23 22.34   510       510
2003-01-06 48.1 23.98 23.5  23 22.34    NA        NA
2003-01-07 48.1 23.98 23.5  23 22.34   518       518

edited Apr 18, 2014 at 16:03

answered Apr 18, 2014 at 15:42

slushy


3 Comments

Robert Krzyzanowski Over a year ago

I don't think that is what the OP was looking for. The problem is that the row names are still ignored. (see the last comment in my answer)

Lemko Over a year ago

I just found what I needed: dat <- merge(assets,FOC,by="row.names",all.x=T) does the trick. I didn't know I could use all.x=T to get the NAs. Thanks for the help.

slushy Over a year ago

@Robert Krzyzanowski Thank you -- I updated the response to make it clear that merge for xts operates on the row names (which must be dates).
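
The merge(..., by="row.names", all.x=T) call mentioned in the comments can be sketched as follows for plain data frames. This is a minimal example that assumes the data from the question. The restoring of row names and the renaming of Close to FOC are illustrative additions, not part of the original comment. Note that merge moves the keys into a Row.names column, so they have to be put back by hand.

```r
# Example data mirroring the question (plain data frames, dates as row names)
assets <- data.frame(
  ALD = rep(48.1, 3), SFN = rep(23.98, 3), TCO = rep(23.5, 3),
  KIM = rep(23, 3), CTX = rep(22.34, 3),
  row.names = c("2003-01-03", "2003-01-06", "2003-01-07")
)
FOC <- data.frame(
  Close = c(510, 518), Adj.Close = c(510, 518),
  row.names = c("2003-01-03", "2003-01-07")
)

# Left join on row names: all.x = TRUE keeps every row of assets and
# fills the dates that are absent from FOC with NA
dat <- merge(assets, FOC["Close"], by = "row.names", all.x = TRUE)

# merge() returns the keys in a "Row.names" column; restore the order of
# assets and put the dates back as row names
dat <- dat[match(rownames(assets), as.character(dat$Row.names)), ]
rownames(dat) <- as.character(dat$Row.names)
dat$Row.names <- NULL

# Rename the merged column to match the desired output
names(dat)[names(dat) == "Close"] <- "FOC"
dat
```

Passing FOC["Close"] rather than FOC keeps only the one column of interest, so Adj.Close is not carried along.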

Robert Krzyzanowski · Accepted Answer · 2014-04-18 15:43:07Z

You could fill in the missing dates with NA first and then cbind:

# Example data
df <- data.frame(list(split(rep(c(48.1, 23.98, 23.5, 23, 22.34), each = 3), rep(1:5, each = 3))))
colnames(df) <- c('ALD', 'SFN', 'TCO', 'KIM', 'CTX')
row.names(df) <- paste0('2003-01-0', c(3, 6, 7))
df <- df[order(as.POSIXct(row.names(df))), ] # This is important for cbind to work right
FOC <- data.frame(Close = c(510, 518), Adj.Close = c(510, 518))
row.names(FOC) <- paste0('2003-01-0', c(3, 7))

# Fill in NAs
FOC[setdiff(row.names(df), row.names(FOC)), ] <- NA
df <- cbind(df, FOC[order(as.POSIXct(row.names(FOC))), 1])
colnames(df)[length(df)] <- 'FOC'

The result:

            ALD   SFN  TCO KIM   CTX FOC
2003-01-03 48.1 23.98 23.5  23 22.34 510
2003-01-06 48.1 23.98 23.5  23 22.34 NA
2003-01-07 48.1 23.98 23.5  23 22.34 518

Sorting by as.POSIXct(row.names(..)) is important because cbind binds rows by position and does not check that the row names match. Without it, we would get

            ALD   SFN  TCO KIM   CTX FOC
2003-01-03 48.1 23.98 23.5  23 22.34 510
2003-01-06 48.1 23.98 23.5  23 22.34 518
2003-01-07 48.1 23.98 23.5  23 22.34 NA
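
Since the cbind approach depends on both data frames being in the same order, one alternative is to look up each date explicitly. The sketch below is not from the original answer. It assumes the same example data and uses base R's match() to align FOC to df by row name, so the order of FOC does not matter.

```r
# Example data; FOC is deliberately out of date order
df <- data.frame(
  ALD = rep(48.1, 3), SFN = rep(23.98, 3), TCO = rep(23.5, 3),
  KIM = rep(23, 3), CTX = rep(22.34, 3),
  row.names = c("2003-01-03", "2003-01-06", "2003-01-07")
)
FOC <- data.frame(
  Close = c(518, 510), Adj.Close = c(518, 510),
  row.names = c("2003-01-07", "2003-01-03")
)

# For each date in df, find its position in FOC (NA when the date is absent)
idx <- match(row.names(df), row.names(FOC))

# Indexing with NA yields NA, so missing dates are filled automatically
df$FOC <- FOC$Close[idx]
df
```

Because the lookup is by name, no sorting or NA padding of FOC is needed beforehand.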
