1

I have a data frame (DF) with 3 columns with values and 200 columns with only a heading and NA values. As an an example:

  Row Price  Qty   2.10  2.15  2.20  2.25  ....
    1  2.10  100      0     0     0     0
    2  2.15  200      0     0     0     0   
    3  2.25  100      0     0     0     0    
    4  2.10  100      0     0     0     0  
    5  2.25  300      0     0     0     0  

I am attempting to use dplyr and mutate_each to pass a function to all columns 4:n that uses the column headers name. The function will mutate each column so that for in the row where Price = Column Name, then the Qty amount is applied. See desired results bellow:

  Row Price  Qty   2.10  2.15  2.20  2.25  ....
    1  2.10  100    100     0     0     0
    2  2.15  200      0   200     0     0   
    3  2.25  100      0     0     0   100    
    4  2.10  100    100     0     0     0  
    5  2.25  300      0     0     0   300  

Any thoughts on how to execute this?


# input data
DF <- structure(list(Row = 1:5, Price = c(2.1, 2.15, 2.25, 2.1, 2.25
), Qty = c(100L, 200L, 100L, 100L, 300L), X2.10 = c(0L, 0L, 0L, 
0L, 0L), X2.15 = c(0L, 0L, 0L, 0L, 0L), X2.20 = c(0L, 0L, 0L, 
0L, 0L), X2.25 = c(0L, 0L, 0L, 0L, 0L)), .Names = c("Row", "Price", 
"Qty", "2.10", "2.15", "2.20", "2.25"), class = "data.frame", row.names = c(NA, 
-5L))   

2 Answers 2

6

You can create the price cols by reshaping:

library(reshape2)
dcast(DF[1:3], Row+Price+Qty ~ Price, value.var = "Qty", fill = 0) 

#   Row Price Qty 2.1 2.15 2.25
# 1   1  2.10 100 100    0    0
# 2   2  2.15 200   0  200    0
# 3   3  2.25 100   0    0  100
# 4   4  2.10 100 100    0    0
# 5   5  2.25 300   0    0  300

Prices that don't show up in the second column will not get their own columns using the approach above. If that edge case was important, I would use data.table:

library(data.table)
setDT(DF)
for (j in names(DF)[-(1:3)]){  
  ii = which( DF$Price == as.numeric(j) )
  set(DF, i=ii, j=j, v=DF$Qty[ii]) }

#   Row Price Qty 2.10 2.15 2.20 2.25
# 1   1  2.10 100  100    0    0    0
# 2   2  2.15 200    0  200    0    0
# 3   3  2.25 100    0    0    0  100
# 4   4  2.10 100  100    0    0    0
# 5   5  2.25 300    0    0    0  300

I'm sure something similar can be done with mutate_each.

Sign up to request clarification or add additional context in comments.

3 Comments

I don't see a straightforward way to handle such a case with mutate_each() as it won't capture the column names for each . and find the corresponding Qty
@StevenBeaupré Okay. Well, if you know some other dplyr-ish way to do it, I'd be curious to see.
I couldn't come up with an elegant approach.. I have: data.frame(Price = as.numeric(grep("\\.", names(DF), value = TRUE))) %>% full_join(DF %>% select(-contains(".")), .) %>% tidyr::spread(Price, Qty, drop = FALSE, fill = 0)
1

Another way is to do a join as follows:

require(data.table) # v1.9.6+
setDT(DF)
for (p in tail(names(DF), -3L))
    DF[.(Price=as.numeric(p)), (p) := Qty, on="Price"]
DF[]
#    Row Price Qty 2.10 2.15 2.20 2.25
# 1:   1  2.10 100  100    0    0    0
# 2:   2  2.15 200    0  200    0    0
# 3:   3  2.25 100    0    0    0  100
# 4:   4  2.10 100  100    0    0    0
# 5:   5  2.25 300    0    0    0  300

Joins can be combined with aggregations and updates (just like subsets are) with data.table's syntax. The idea is to extract the rows that match using the i argument, and update those rows for that column with the corresponding value of Qty.

Note that there are no copies made here. The original data.table is updated by reference.

1 Comment

Even though you're joining on Price multiple times, you don't set the key. Is this because the autoindexing takes care of those efficiencies? Also, I'm wondering if there's a tradeoff between this and the set way... like would you opt for the latter if there were a ton more columns? The syntax here is certainly nicer.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.